Sparse Signal Processing for Massive MIMO Communications 9819953936, 9789819953936

The book focuses on utilizing sparse signal processing techniques in designing massive MIMO communication systems. As th


English Pages 233 [227] Year 2023


Table of contents :
Contents
Acronyms
1 Introduction
1.1 Compressive Sensing Theory
1.2 Massive MIMO Systems
1.2.1 Massive MIMO Schemes
1.2.2 Massive SM-MIMO Schemes
1.3 Prior Work
1.4 Book Organization
References
2 Subspace-Based Super-Resolution Sparse Channel Estimation in MIMO-OFDM Systems
2.1 Introduction
2.2 Sparse MIMO Channel Model
2.2.1 Channel Sparsity
2.2.2 Spatial Correlation
2.2.3 Temporal Correlation
2.3 Sparse MIMO-OFDM CE
2.3.1 Pilot Pattern
2.3.2 Super-Resolution CE
2.3.3 Discussion on Pilot Overhead
2.4 Simulation Results
2.5 Summary
References
3 Compressive Sensing Sparse Channel Estimation in FDD Massive MIMO Systems
3.1 Introduction
3.2 Spatio-Temporal Common Sparsity of Delay-Domain
3.3 Proposed SCS-Based Spatio-Temporal Joint Channel Estimation Scheme
3.3.1 Non-orthogonal Pilot Scheme at the BS
3.3.2 SCS-Based CE at the User
3.3.3 Space-Time Adaptive Pilot Scheme
3.3.4 CE in Multi-Cell Massive MIMO
3.4 Performance Analysis
3.4.1 Non-Orthogonal Pilot Design Under the Framework of CS Theory
3.4.2 Convergence Analysis of Proposed ASSP Algorithm
3.4.3 Computational Complexity of ASSP Algorithm
3.5 Simulation Results
3.6 Summary
References
4 Compressive Sensing CSI Acquisition and Feedback in FDD Massive MIMO Systems
4.1 Introduction
4.2 System Model
4.2.1 Massive MIMO in the Downlink
4.2.2 Massive MIMO Channels in Virtual Angular Domain
4.2.3 Temporal Correlation of Wireless Channels
4.2.4 Challenges of CE and Feedback
4.3 Spatially Common Sparsity Based Adaptive Channel Estimation and Feedback Scheme
4.3.1 Non-orthogonal Pilot for Downlink CE
4.3.2 CS Based Adaptive CSI Acquisition Scheme
4.3.3 Proposed DSAMP Algorithm for CE
4.3.4 Closed-Loop Channel Tracking with Adaptive Pilot Design
4.4 Performance Analysis
4.4.1 Non-orthogonal Pilot Design for CS Based Adaptive CSI Acquisition
4.4.2 Time Slot Overhead for CS Based Adaptive CSI Acquisition
4.4.3 Frequency-Domain Placement of Pilot Signals
4.4.4 Performance Analysis of Proposed DSAMP Algorithm
4.4.5 Performance Bound of CE
4.4.6 Adaptive Pilot Design and Required Time Slot Overhead for Closed-Loop Channel Tracking
4.4.7 Selection of Thresholds for Algorithms 4.1 and 4.2
4.5 Simulation Results
4.6 Summary
References
5 Compressive Sensing Sparse Channel Estimation in Broadband Millimeter-Wave Massive MIMO Systems
5.1 Introduction
5.2 System Model
5.3 DCS-Based CE Scheme
5.3.1 UL Pilot Training
5.3.2 DCS-Based CE
5.3.3 Pilot Design According to DCS Theory
5.4 Simulation Results
5.5 Summary
References
6 Subspace-Based Super-Resolution Sparse Channel Estimation in Millimeter-Wave Massive MIMO Systems
6.1 Introduction
6.2 Subspace-Based Super-Resolution Sparse Channel Estimation in Narrowband Millimeter-Wave Massive MIMO Systems
6.2.1 System Model
6.2.2 Proposed 2D Unitary Esprit Based Super-Resolution Channel Estimation Scheme
6.2.3 Simulation Results
6.3 Subspace-Based Super-Resolution Sparse Channel Estimation in Wideband Millimeter-Wave Massive MIMO Systems
6.3.1 Downlink CE Stage
6.3.2 UL Channel Estimation Stage
6.3.3 MDU-ESPRIT Algorithm
6.3.4 ML Pairing and Path Gains Estimation
6.3.5 Performance Evaluation
6.4 Summary
References
7 Compressive Sensing Single-User Signal Detection in Massive MIMO Systems with Spatial Modulation
7.1 Introduction
7.2 System Model
7.3 SCS-Based Signal Detector for Massive SM-MIMO
7.3.1 Transmitter Design
7.3.2 SCS-Based Signal Detector at the Receiver
7.4 Performance Analysis
7.4.1 Superiority of SCS-Based Signal Detectors
7.4.2 Benefits from SM Signal Interleaving
7.4.3 Computational Complexity
7.5 Simulation Results
7.6 Summary
References
8 Compressive Sensing Multi-User Detection in Massive MIMO Systems with Spatial Modulation
8.1 Introduction
8.2 System Model
8.2.1 Multi-User Spatial Modulation Scheme for Massive MIMO Systems
8.2.2 Uplink Transmission
8.3 Multi-User Detection for Massive MIMO Systems with Spatial Modulation
8.3.1 Transmitter Design at the Users
8.3.2 SCS-Based MUD at the BS
8.3.3 Computational Complexity
8.4 Simulation Results
8.5 Summary
References
9 Compressive Sensing Massive IoT Access in Massive MIMO Systems with Media Modulation
9.1 Introduction
9.2 System Model
9.2.1 Media Modulation Aided mMTC
9.2.2 Transmission Model
9.3 CS-Based Massive Access Scheme
9.3.1 The StrOMP Algorithm for AUD
9.3.2 SIC-SSP Algorithm for Data Detection
9.3.3 Computational Complexity
9.4 Simulation Results
9.5 Summary
References
10 Sparse Channel Estimation in TDS-OFDM Systems
10.1 Introduction
10.2 System Model
10.3 PA-IHT Based Channel Estimation
10.3.1 The Proposed PA-IHT Based CE Method
10.3.2 Convergence Properties
10.3.3 Computational Complexity
10.4 Simulation Results
10.5 Summary
References
Correction to: Sparse Signal Processing for Massive MIMO Communications
Correction to: Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3
Appendix A Proof of Theorem 3.1
Appendix B Proof of (A.2)
Appendix C Proof of (A.3)
Appendix D Derivation of Eq. (6.7)

Zhen Gao · Yikun Mei · Li Qiao

Sparse Signal Processing for Massive MIMO Communications


Zhen Gao Advanced Research Institute of Multidisciplinary Sciences Beijing Institute of Technology Beijing, China

Yikun Mei Beijing Institute of Technology Beijing, China

Li Qiao Beijing Institute of Technology Beijing, China

ISBN 978-981-99-5393-6 ISBN 978-981-99-5394-3 (eBook) https://doi.org/10.1007/978-981-99-5394-3 Jointly published with Beijing Institute of Technology Press The print edition is not for sale in China (Mainland). Customers from China (Mainland) please order the print book from: Beijing Institute of Technology Press. © Beijing Institute of Technology Press 2024, corrected publication 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publishers, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publishers nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publishers remain neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

The original version of the book was revised: Text corrections have been updated in the chapter content. The correction to the book is available at https://doi.org/10.1007/978-981-99-5394-3_11


Acronyms

16-QAM: 16-quadrature amplitude modulation
2D: Two dimensional
3-D: Three dimensional
ADCs: Analog-to-digital converters
AEs: Antenna elements
AGMP: Adaptive grid matching pursuit
AoA/AoD: Angles of arrival or departure
ASE: Average spectral efficiency
ASP: Adaptive subspace pursuit
ASSP: Adaptive structured subspace pursuit
AUD: Active user detection
AWGN: Additive white Gaussian noise
BER: Bit-error-rate
BP: Basis pursuit
BPDN: Basis pursuit de-noising
BS: Base station
CC: Computational complexity
CDT-8: China digital television test 8th channel model
CFR: Channel frequency response
CIR: Channel impulse response
CoSaMP: Compressive sampling matching pursuit
CP: Cyclic prefix
CPSC: Cyclic-prefix single carrier
CRLB: Cramer–Rao lower bound
CS: Compressive sensing
CSI: Channel state information
DCS: Distributed compressive sensing
DFT: Discrete Fourier transform
DGMP: Distributed grid matching pursuit
DOA: Direction-of-arrival
DPN-OFDM: Dual pseudo-noise OFDM
DSAMP: Distributed sparsity adaptive matching pursuit
DTMB: Digital terrestrial multimedia broadcasting standard
DTTB: Digital terrestrial television broadcasting
DVB-T2: Digital video broadcasting standard
ESPRIT: Estimating signal parameters via rotational invariance techniques
EVD: Eigenvalue decomposition
FD: Full dimensional
FDD: Frequency division duplex
FDM: Frequency division multiplexing
FFT: Fast Fourier transformation
FRI: Finite rate of innovation
FSF: Frequency-selective fading
GMMV: Generalized MMV
GSP: Group subspace pursuit
IBI: Inter-block-interference
IHT: Iterative hard threshold
ITU-VA: International Telecommunication Union Vehicular A
ITU-VB: International Telecommunication Union Vehicular B
JD: Joint diagonalization
J-OMP: Joint OMP
LMMSE: Linear minimum mean square error
LOS: Line-of-sight
LS: Least squares
LS-MIMO: Large-scale MIMO
LTE-A: Long-term evolution-advanced
MAPs: Mirror activation patterns
MDU-ESPRIT: Multi-dimensional unitary ESPRIT
MIMO: Multiple-input multiple-output
ML: Maximum likelihood
mMIMO: Massive multi-input multi-output
MMSE: Minimum mean square error
mMTC: Massive machine-type communications
MMV: Multiple vector measurement
MPCs: Multipath components
MS: Mobile station
MSE: Mean square error
MTDs: Machine type devices
MUD: Multi-user detector
NCS: Normalized compressive sensing
NLOS: Non-line-of-sight
NMSE: Normalized mean square error
OFDM: Orthogonal frequency division multiplexing
OMP: Orthogonal matching pursuit
PA: Priori-information aided
PCA: Principal component analysis
PDF: Probability density function
PN: Pseudo-noise
PSK: Phase shift keying
PSN: Phase shift network
R-D: R-dimensional
RF: Radio frequency
RIP: Restricted isometry property
RMT: Random matrix theory
RVP: Real-valued processing
SAMP: Sparsity adaptive matching pursuit
SCS: Structured compressive sensing
SCSER: Spatial constellation symbol error rate
SD: Sphere decoding
SIC: Successive interference cancellation
SIC-SSP: Successive interference cancellation-based structured subspace pursuit
SIES: Shift-invariance equation solving
SM: Spatial modulation
SM-MIMO: Spatial modulation MIMO
SMMP: Spatial modulation matching pursuit
SMV: Single-measurement vector
SNR: Signal-to-noise ratio
SOMP: Simultaneous orthogonal matching pursuit
SP: Subspace pursuit
SRIP: Structured restricted isometry property
SSA: Signal subspace approximation
SSD: Simultaneous Schur decomposition
SSP: Structured subspace pursuit
SSPP: Spatial smoothing preprocessing
StrOMP: Structured orthogonal matching pursuit
SV: Signal vector
SVD: Singular value decomposition
SW: Simultaneous weighted
TAs: Transmit antennas
TDD: Time division duplex
TDM: Time division multiplexing
TDS: Time-domain synchronous
TFJ: Time-frequency joint
TLS: Total least squares
TLS-ESPRIT: Total least squares estimating signal parameters via rotational invariance techniques
TLSSCS: Two-level sparse structure-based CS
ToA: Time of arrival
TS: Training sequence
TTOP: Time-domain training-based orthogonal pilot
UD: User devices
UEs: User equipments
UL: Uplink
ULA: Uniform linear array
UPA: Uniform planar array
ZF: Zero forcing

Chapter 1

Introduction

Abstract Massive multiple-input multiple-output (MIMO) technology has emerged as a promising approach for the next generation of wireless communication systems, as it offers increased spectral efficiency, higher energy efficiency, and improved user experience. However, as the number of antennas at the base station (BS) grows significantly, the computational complexity and the amount of data to be processed also increase, leading to significant challenges in signal processing. Sparse signal processing techniques have been proposed to address these challenges by exploiting the sparsity of the wireless channels, which often have a small number of significant paths. By exploiting this sparsity, sparse signal processing can reduce the computational complexity and improve the performance of massive MIMO systems. Some of the most popular sparse signal processing techniques used in massive MIMO systems include compressed sensing, matching pursuit, and message passing algorithms. These techniques can be used to design efficient algorithms for channel estimation (CE), detection, and precoding, and can significantly reduce the computational complexity of massive MIMO systems while maintaining high performance. Overall, the combination of massive MIMO and sparse signal processing offers a powerful framework for addressing the challenges of next-generation wireless communication systems.

© Beijing Institute of Technology Press 2024, corrected publication 2024. Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_1

1.1 Compressive Sensing Theory

Most continuous real-world signals exhibit some inherent redundancy or correlation, which implies that the effective amount of information they convey is typically lower than the maximum amount carried by uncorrelated signals in the same bandwidth [1]. This is exemplified by the inter-sample correlation of so-called voiced speech segments, by adjacent video pixels, correlated fading channel envelopes, etc. Hence the number of effective degrees of freedom of the corresponding sampled discrete-time signals can be much smaller than that potentially allowed by their dimensions. This indicates that these correlated time-domain signals can typically be represented by far fewer samples in the frequency domain [1], because correlated signals have only a few non-negligible low-frequency frequency-domain components. As a simple example, a sinusoidal signal can be represented by a single non-zero frequency-domain tone after transformation by the Fast Fourier transformation (FFT). Sometimes this is also referred to as the energy-compaction property of the FFT.

Against this background, compressive sensing (CS) theory has been developed and applied in diverse fields. It shows that the sparsity of a signal can indeed be exploited to recover a replica of the original signal from fewer samples than required by the classic Nyquist sampling theorem.

To briefly introduce CS theory, we consider the sparse signal $x \in \mathbb{C}^{n \times 1}$ with sparsity level $k$ (i.e., $x$ has only $k \ll n$ non-zero elements), which is measured through the measurement matrix $\Phi \in \mathbb{C}^{m \times n}$ with $m \ll n$, where $y = \Phi x \in \mathbb{C}^{m \times 1}$ is the measured signal. In CS theory, the key issue is how to recover $x$ by solving the under-determined set of equations $y = \Phi x$, given $y$ and $\Phi$. Generally, $x$ may not exhibit sparsity itself, but it may exhibit sparsity in some transformed domain, which is formulated as $x = \Psi s$, where $\Psi$ is the transform matrix and $s$ is the sparse signal with sparsity level $k$. Hence we can formulate the standard CS Model (1) of Table 1.1.

Additionally, we can infer from the standard CS Model (1) of Table 1.1 the equally important Models (2)–(4) of Table 1.1, which can provide more reliable compression and recovery of sparse signals when some of the specific sparse properties of practical applications are considered. Specifically, Model (2) is capable of separating multiple sparse signals $\{s_p\}_{p=1}^{P}$ associated with different measurement matrices $\{\Theta_p\}_{p=1}^{P}$ by recovering the aggregate sparse signal $s = [s_1^T, s_2^T, \ldots, s_P^T]^T$; Model (3) has the potential of improving the estimation performance of $s$ by exploiting the block sparsity of $s$, as shown in Table 1.1; Model (4) is capable of enhancing the estimation performance of the $P$ sparse signals $\{s_p\}_{p=1}^{P}$ when their identical or partially common sparsity pattern is exploited.

Considering the standard CS model, we arrive at the three fundamental elements of CS theory as follows.

(1) Sparse transformation is essential for CS, since a suitable transform matrix $\Psi$ can efficiently transform the original (non-sparse) signal $x$ into the sparse signal $s$.

(2) Sparse signal compression refers to the design of $\Phi$ or $\Theta = \Phi\Psi$. $\Theta$ should reduce the dimension of the measurements while minimizing the information loss imposed, which can be quantified in terms of the coherence or restricted isometry property (RIP) of $\Phi$ or $\Theta$ [1].

(3) Sparse signal recovery algorithms are important for the reliable reconstruction of $x$ or $s$ from the measured signal $y$.

In particular, the CS algorithms widely applied in wireless communications can be mainly divided into three categories as follows. (i) Convex relaxation algorithms, such as basis pursuit (BP) and BP de-noising (BPDN), formulate the CS problem as a convex optimization problem and solve it using convex optimization software like CVX [2]. For instance, the CS problem for Model (1) of Table 1.1 can be formulated as a Lagrangian relaxation of a quadratic program as

$$\hat{s} = \arg\min_{s} \|s\|_1 + \lambda \|y - \Theta s\|_2 \qquad (1.2)$$

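Table 1.1 Typical CS models [81]

Model (1): Standard CS model
Mathematical expression: $\mathbf{y} = \boldsymbol{\Phi}\mathbf{x} = \boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{s} = \boldsymbol{\Theta}\mathbf{s}$
Illustration: $\boldsymbol{\Theta} = \boldsymbol{\Phi}\boldsymbol{\Psi}$

Model (2): Signal separation by sparse representations
Mathematical expression: $\tilde{\mathbf{y}} = \sum_{p=1}^{P} \boldsymbol{\Theta}_p \mathbf{s}_p = \boldsymbol{\Theta}_1 \mathbf{s}_1 + \underbrace{\sum_{p=2}^{P} \boldsymbol{\Theta}_p \mathbf{s}_p}_{\text{interference}} = \boldsymbol{\Theta}\mathbf{s}$, with $\boldsymbol{\Theta} = [\boldsymbol{\Theta}_1, \boldsymbol{\Theta}_2, \ldots, \boldsymbol{\Theta}_P]$ and $\mathbf{s} = [\mathbf{s}_1^T, \mathbf{s}_2^T, \ldots, \mathbf{s}_P^T]^T$
Illustration: $\mathbf{s}_p$ and $\boldsymbol{\Theta}_p$ are the $p$th sparse signal and the $p$th measurement matrix, respectively; $\mathbf{s}$ is the sparse aggregate signal

Model (3): Block sparse signal
Mathematical expression: $\mathbf{y} = \boldsymbol{\Theta}\mathbf{s}$, with $\mathbf{s} = [\underbrace{s_1 \cdots s_d}_{\mathbf{s}^T[1]} \; \underbrace{s_{d+1} \cdots s_{2d}}_{\mathbf{s}^T[2]} \; \cdots \; \underbrace{s_{N-d+1} \cdots s_N}_{\mathbf{s}^T[L]}]^T$
Illustration: $\mathbf{s}$ exhibits block sparsity, $dL = N$, and $\mathbf{s}^T[l]$ for $1 \le l \le L$ has a non-zero Euclidean norm for at most $k$ indices $l$

Model (4): Multiple vector measurement
Mathematical expression: $[\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_P] = \boldsymbol{\Theta}[\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_P]$
Illustration: $\mathbf{s}_p$ and $\mathbf{y}_p$ for $1 \le p \le P$ are the sparse signal and the measured signal associated with the $p$th observation, respectively; $\{\mathbf{s}_p\}_{p=1}^{P}$ share an identical or partially common sparsity pattern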

with $\|\cdot\|_1$ and $\|\cdot\|_2$ being the $\ell_1$-norm and $\ell_2$-norm operators, respectively, and $\lambda > 0$; the resultant algorithms belong to the BPDN family. These algorithms usually require a small number of measurements, but they are complex, e.g., the complexity of the BP algorithm is on the order of $O(m^2 n^{3/2})$ [1]. (ii) Greedy iterative algorithms identify the support set in a greedy iterative manner. They have a low complexity and a fast speed of recovery, but suffer from a performance loss when the signals are not very sparse. Representatives of these algorithms are orthogonal matching pursuit (OMP), compressive sampling matching pursuit (CoSaMP), and subspace pursuit (SP), which have a complexity of $O(kmn)$ [1]. (iii) Bayesian inference algorithms, such as sparse Bayesian learning and approximate message passing, infer the sparse unknown signal from a Bayesian viewpoint by considering a sparse prior. The complexity of these algorithms varies from one algorithm to another. For example, the complexity of Bayesian CS via belief propagation is $O(n \log^2 n)$ [1]. Note that the algorithms mentioned above have to be further developed for Models (2)–(4) of Table 1.1. For example, the group-sparse BPDN, the simultaneous OMP (SOMP), and the group-sparse Bayesian CS algorithms tailored for the multiple vector measurement (MMV) Model (4) are promising future candidates [1].

Since the conception of CS theory in 2004, it has been extensively developed, extended, and applied to practical systems. Indeed, prototypes for MIMO radar, cognitive radar, ultra-wide band, and so on based on CS theory have been reported by Eldar’s research group [1]. Undoubtedly, the emerging CS theory provides us with a revolutionary tool for reconstructing signals, despite using sub-Nyquist sampling rates [1]. Therefore, how to exploit CS theory in the emerging 5G wireless networks has become a hot research topic [2–13]. By exploring and exploiting the inherent sparsity in all aspects of wireless networks, we can create more efficient 5G networks. In the following sections, we will explore and exploit the sparsity inherent in massive MIMO-based 5G wireless networks.
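The greedy support identification described above, together with the role of coherence in the design of the measurement matrix, can be illustrated with a minimal Python sketch of OMP for the standard CS Model (1). Everything in it is an assumption made for illustration rather than a design taken from this book: the dictionary $[\mathbf{I}, \mathbf{H}/\sqrt{64}]$ built from an identity matrix and a Sylvester Hadamard matrix, the helper names `hadamard`, `mutual_coherence`, and `omp`, and the noise-free 3-sparse test signal. The dictionary is chosen only because its mutual coherence is exactly 1/8, so the classic coherence-based guarantee $k < (1 + 1/\mu)/2$ covers the sparsity level $k = 3$ used here.

```python
import numpy as np


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), h)
    return h


def mutual_coherence(theta):
    """Largest absolute inner product between distinct normalized columns."""
    cols = theta / np.linalg.norm(theta, axis=0)
    gram = np.abs(cols.conj().T @ cols)
    np.fill_diagonal(gram, 0.0)
    return float(gram.max())


def omp(theta, y, k):
    """Orthogonal matching pursuit: estimate a k-sparse s from y = theta @ s."""
    support = []
    residual = y.copy()
    coef = np.zeros(0)
    for _ in range(k):
        # Greedy step: pick the column most correlated with the residual.
        corr = np.abs(theta.conj().T @ residual)
        if support:
            corr[support] = 0.0  # never select the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(theta[:, support], y, rcond=None)
        residual = y - theta[:, support] @ coef
    s_hat = np.zeros(theta.shape[1], dtype=coef.dtype)
    s_hat[support] = coef
    return s_hat, sorted(support)


# Hypothetical 64 x 128 dictionary with unit-norm columns.
m = 64
theta = np.hstack([np.eye(m), hadamard(m) / np.sqrt(m)])
mu = mutual_coherence(theta)

# Hypothetical 3-sparse signal and its noise-free measurement.
s_true = np.zeros(2 * m)
s_true[[5, 70, 100]] = [1.0, -2.0, 0.5]
y = theta @ s_true

s_hat, support = omp(theta, y, k=3)
print("coherence:", mu)
print("recovered support:", support)
print("max error:", np.max(np.abs(s_hat - s_true)))
```

With noisy measurements or a generic random measurement matrix, recovery is no longer guaranteed in this simple way, and the sparsity level `k` would typically have to be estimated rather than passed in.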

1.2 Massive MIMO Systems

1.2.1 Massive MIMO Schemes

Massive MIMO, which employs hundreds of antennas at the BS, is capable of simultaneously serving multiple users with improved spectral and energy efficiency [2, 3]. Although massive MIMO indeed exhibits attractive advantages, a challenging issue that hinders the evolution from the current frequency division duplex (FDD) cellular networks to FDD massive MIMO is the indispensable estimation and feedback of the downlink FDD channels to the transmitter. In FDD massive MIMO, the users have to estimate the downlink channels associated with hundreds of transmit and receive antenna pairs, which results in a prohibitively high pilot overhead. Moreover, even if the users have succeeded in acquiring accurate downlink channel state information (CSI), its feedback to the BS requires a high feedback rate. Hence codebook-based CSI quantization and feedback remains challenging, while the overhead of analog CSI feedback is simply unaffordable [3]. By contrast, in time division duplex (TDD) massive MIMO, the downlink CSI can be acquired from the uplink (UL) CSI by exploiting the channel’s reciprocity, provided that the interference is also similar at both ends of the link. Nevertheless, pilot contamination may significantly degrade the system’s performance, because the limited number of orthogonal pilots have to be reused in adjacent cells [2]. Fortunately, recent experiments have shown that, due to the limited number of significant scatterers in the propagation environment and owing to the strong spatial correlation inherent in the co-located antennas at the BS, massive MIMO channels exhibit sparsity in the delay domain [2], in the angular domain, or in both [3].
For massive MIMO channels observed in the delay domain, the number of paths containing the majority of the received energy is usually much smaller than the total number of channel impulse response (CIR) taps. This implies that the massive MIMO CIRs exhibit sparsity in the delay domain and can be estimated using the standard CS Model (1) of Table 1.1, where $\mathbf{s}$ is the sparse delay-domain CIR, $\boldsymbol{\Theta}$ consists of pilot signals, and $\mathbf{y}$ is the received signal [2]. Due to the co-located nature of the antenna elements (AEs), the CIRs associated with different transmit and receive antenna pairs further exhibit structured sparsity, which manifests itself in the block-sparsity Model (3) of Table 1.1 [2]. Moreover, the BS antennas are usually found at elevated locations with far fewer scatterers around, while the users roam at ground level and experience rich scattering. Therefore, the massive MIMO CIRs seen from the BS exhibit only a limited angular spread, which indicates that the CIRs exhibit sparsity in the angular domain [3]. Due to the common scatterers shared by multiple users close to each other, the massive multi-user MIMO channels further exhibit structured sparsity and can be jointly estimated using the MMV Model (4) of Table 1.1 [3]. Additionally, this sparsity can also be exploited for mitigating the pilot contamination in TDD massive MIMO, where the CSI of the adjacent cells can be estimated with the aid of the signal separation Model (2) of Table 1.1 for further interference mitigation or for multi-point cooperation.

Remark: Exploiting the sparsity of massive MIMO channels with the aid of CS theory to reduce the overhead required for CE and feedback is expected to solve various open challenges and constitutes a hot topic in the field of massive MIMO [2, 3]. However, if the pilot signals of CS-based solutions are tailored to a sub-Nyquist sampling rate, ensuring their compatibility with existing systems based on the classic Nyquist sampling rate requires further research.
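The angular-domain sparsity discussed above can be made concrete with a small Python sketch. It is only an idealized illustration under stated assumptions: a 64-element uniform linear array at the BS, three hypothetical paths whose spatial frequencies fall exactly on the DFT grid, and made-up complex gains. Under these assumptions the normalized DFT of the spatial-domain channel has exactly three non-zero entries; for off-grid directions the energy would leak into neighbouring bins and the channel would only be approximately sparse.

```python
import numpy as np

n_ant = 64                                   # BS antennas in a ULA (assumed)
grid_idx = [3, 4, 5]                         # hypothetical on-grid directions
gains = np.array([1.0, 0.6 - 0.3j, 0.2j])    # hypothetical complex path gains

# Spatial-domain channel: superposition of one steering vector per path.
ant = np.arange(n_ant)
h = sum(g * np.exp(2j * np.pi * q * ant / n_ant) for g, q in zip(gains, grid_idx))

# Virtual angular-domain representation via a normalized DFT.
h_ang = np.fft.fft(h) / n_ant
support = np.flatnonzero(np.abs(h_ang) > 1e-9)

print("non-zero angular bins:", support.tolist())
print("fraction of non-zero entries:", support.size / n_ant)
```

The adjacent bin indices mimic a narrow angular spread as seen from an elevated BS; a channel of 64 coefficients is then described by only 3 unknowns, which is what a CS-based estimator tries to exploit.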

1.2.2 Massive SM-MIMO Schemes

In massive MIMO systems, each antenna requires a dedicated radio frequency (RF) chain, which substantially increases the power consumption of the RF circuits when the number of BS antennas becomes large. To circumvent this issue, as shown in Fig. 1.1, the BS of massive spatial modulation MIMO (SM-MIMO) employs hundreds of antennas, but activates a much smaller number of RF chains and antennas for transmission. Explicitly, only a small fraction of the antennas is selected for the transmission of classic modulated signals in each time slot. For massive SM-MIMO, a three-dimensional (3-D) constellation diagram including the classic signal constellation and the spatial constellation is exploited. Moreover, massive SM-MIMO can also be used in the UL [4], where multiple users, each equipped with a single RF chain but multiple antennas, can simultaneously transmit their spatial modulation (SM) signals to the BS. In this way, the UL throughput can also be improved by using SM, albeit at the cost of having no transmit diversity gain. This problem can be mitigated by activating a limited fraction of the antennas.

Fig. 1.1 The SM signals in massive SM-MIMO systems are sparse [81]

Due to the potentially higher number of transmit antennas (TAs) than the number of activated receive antennas, signal detection and CE in massive SM-MIMO can be a large-scale under-determined problem. The family of optimal maximum likelihood (ML) or near-optimal sphere decoding (SD) algorithms suffers from a potentially excessive complexity. By contrast, the conventional low-complexity linear algorithms, such as the linear minimum mean square error (LMMSE) algorithm, suffer from the obvious performance loss inflicted by under-determined rank-deficient systems. Fortunately, in the downlink of massive SM-MIMO, only a fraction of the TAs are active in each time slot, so the downlink SM signals are sparse in the signal domain. Hence, we can use the standard CS Model (1) of Table 1.1 for developing SM signal detection, where $\mathbf{s}$ is the sparse SM signal, $\boldsymbol{\Theta}$ is the MIMO channel matrix, and $\mathbf{y}$ is the received signal. Moreover, observe in Fig. 1.1 that for the UL of massive SM-MIMO, each user’s UL SM signal also exhibits sparsity, and thus the aggregated SM signal incorporating all of the users’ UL SM signals exhibits sparsity. Therefore, it is expected that by exploiting the sparsity of the aggregated SM signals, we can use the signal separation Model (2) of Table 1.1 to develop a low-complexity, high-accuracy signal detector for improved UL signal detection [4].

Remark: The sparsity of SM signals can be exploited for reducing the computational complexity of signal detection at the receiver. To elaborate a little further, CE in massive SM-MIMO is more challenging than that in massive MIMO, since only a fraction of the antennas are active in each time slot. Hence, how to further explore the intrinsic sparsity of massive SM-MIMO channels and how to exploit the estimated CSI associated with the active antennas to reconstruct the complete CSI is a challenging problem requiring further investigation.
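To see why SM signals are sparse, the following Python sketch maps a bit string onto an SM transmit vector with a single active antenna. It is a simplified illustration, not the book's transmitter: the function name `sm_map`, the QPSK labelling, the choice of 64 TAs, and the use of exactly one active antenna are all assumptions. The first $\log_2 N_t$ bits select the active antenna (the spatial constellation) and the remaining bits select the symbol (the signal constellation).

```python
import math

# Hypothetical QPSK labelling; any fixed bit-to-symbol map would do here.
QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]


def sm_map(bits, n_tx, constellation):
    """Map a bit string to an SM transmit vector with exactly one active antenna."""
    n_ant_bits = int(math.log2(n_tx))
    n_sym_bits = int(math.log2(len(constellation)))
    if len(bits) != n_ant_bits + n_sym_bits:
        raise ValueError("wrong number of bits for this configuration")
    active = int(bits[:n_ant_bits], 2)                  # spatial constellation point
    symbol = constellation[int(bits[n_ant_bits:], 2)]   # signal constellation point
    x = [0j] * n_tx
    x[active] = symbol
    return x


x = sm_map("00010110", n_tx=64, constellation=QPSK)
nonzero = [i for i, v in enumerate(x) if v != 0]
print("bits per SM symbol:", int(math.log2(64)) + int(math.log2(len(QPSK))))
print("active antenna indices:", nonzero)
```

Only one of the 64 entries of `x` is non-zero, so the transmit vector plays the role of the sparse signal in Model (1), with the MIMO channel matrix acting as the measurement matrix.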

1.3 Prior Work

An accurate acquisition of CSI is crucial for signal detection, beamforming, resource allocation, and other functions in massive MIMO systems. However, because of the large number of antennas at the BS, each user has to estimate the channels associated with hundreds of TAs, resulting in a prohibitively high pilot overhead. Therefore, the challenging problem of achieving accurate CE with an affordable pilot overhead arises, particularly for FDD massive MIMO systems [14]. Previous studies have extensively explored CE for conventional small-scale FDD MIMO systems [15–21]. It has been proven that equi-spaced and equi-power orthogonal pilots are optimal for estimating non-correlated Rayleigh MIMO channels for one orthogonal frequency division multiplexing (OFDM) symbol, where the required pilot overhead increases with the number of TAs [18]. By exploiting the spatial correlation of MIMO channels, the pilot overhead required to estimate Rician MIMO channels can be reduced [19]. Furthermore, by utilizing the temporal channel correlation, the pilot overhead required to estimate the MIMO channels associated with multiple OFDM symbols can be reduced even further [15, 16]. Currently, orthogonal pilots are widely used in existing MIMO systems, where the pilot overhead is not a significant issue due to the small number of TAs (e.g., up to eight antennas in the Long Term Evolution-Advanced (LTE-A) system) [17, 20, 21]. However, the overhead of orthogonal pilots can become critical in massive MIMO systems, which have a large number of antennas at the BS (e.g., 128 antennas or more [22]). For FDD massive MIMO systems, an approach that exploits the temporal correlation and sparsity of delay-domain channels to reduce the pilot overhead has been proposed in [23], but the interference cancellation between the training sequences (TS) of different TAs becomes difficult when the number of TAs is large.
The authors of [24–26] leveraged the spatial correlation and sparsity of delay-domain MIMO channels to estimate channels with a reduced pilot overhead, but the assumption that the channel sparsity level is known at the user is unrealistic. By exploiting the spatial channel correlation, CS-based CE schemes were proposed in [27–29], but the leveraged spatial correlation can be impaired by a non-ideal antenna array [14, 30]. A pilot design for downlink CE exploiting the channel statistics has been proposed in [31], although the acquisition of the downlink channel covariance matrix can be challenging in practice. The authors of [32] proposed open-loop and closed-loop CE schemes for massive MIMO, but perfect knowledge of the long-term channel statistics at the user can be difficult to obtain. Furthermore, previous studies [23–27, 31, 32] did not consider the channel feedback to the BS. Conventional codebook-based CSI feedback schemes may not be able to capture fine-grained spatial channel structures, because the dimension of the codebook can be enormous in massive MIMO systems, making the design, storage, and encoding of the high-dimensional codebook difficult [33]. CS-based channel feedback schemes have been proposed for massive MIMO to reduce the feedback overhead by exploiting the spatial correlation of CSI [33, 34]. However, these schemes did not consider downlink CE. To tackle this issue, [35, 36] proposed a joint OMP-based CSI acquisition scheme that exploits the spatially joint sparsity of multiple users’ channel matrices. However, this scheme cannot adaptively adjust the required overhead according to the sparsity level of the channels. Moreover, the spatially joint sparsity may disappear when the users are not spatially close. Even when multiple users’ channel matrices share spatially common sparsity, the sparse CSI acquisition problem may be an MMV problem, where the reduction in the required overhead is limited.

On the other hand, millimeter-wave (mmWave) massive MIMO is a promising approach for boosting data rates, owing to the large under-utilized bandwidth in the mmWave frequency band [37]. In mmWave massive MIMO systems, a large number of small form-factor antennas can be deployed at the BS to achieve a large array gain or spatial multiplexing. However, conventional fully digital precoding can be unaffordable for a large antenna array in mmWave massive MIMO, as each antenna requires an expensive RF chain and high-power-consumption analog-to-digital converters (ADCs) [8, 37]. To reduce the hardware cost and power consumption while achieving spatial multiplexing, hybrid precoding with a much smaller number of RF chains than the number of antennas has been proposed, which consists of digital precoding at baseband and analog precoding at the RF front end [38, 39]. For mmWave MIMO with hybrid precoding, several approaches for acquiring CSI have been proposed in the literature for narrowband mmWave communications, including codebook-based beam training [40–43] and compressed sensing (CS)-based CE [44, 45]. The beam training approaches were initially adopted in analog beamforming schemes such as the IEEE standards 802.11ad [40] and 802.15.3c [46], in which the transceiver exhaustively searches for the optimal beam pair from a predefined codebook to maximize the received signal-to-noise ratio (SNR) for improved transmission performance.
To reduce the search dimension of codebooks for achieving lower training overhead, the multi-stage overlapped beam patterns were designed in [41]. However, these schemes only consider analog beamforming with single-stream transmission. For hybrid beamforming with multi-stream transmission, the beam training solutions with hierarchical multi-beam codebooks were proposed in [42, 43], where the optimal multi-beam pairs can be acquired after hierarchical beam search with gradually finer and narrower beams. However, the training overhead of a beam training scheme is often proportional to the dimension of the codebook, which is challenging for full-dimensional (FD) MIMO with a large number of antennas. By exploiting the inherent angle-domain sparsity of mmWave MIMO channels, several CS-based CE schemes have been proposed to reduce the CE overhead [44, 45]. In [44], the OMP algorithm was considered to estimate sparse mmWave channels, by formulating the CSI acquisition problem as a sparse signal recovery problem, where a redundant dictionary with non-uniformly quantized angle-domain grids was designed for improved performance. Furthermore, a Bayesian CS-based CE scheme was proposed in [45] by considering the impact of transceiver hardware impairments. Besides, by leveraging the low-rank property of mmWave channels, a CANDECOMP/PARAFAC decomposition-based CE scheme [47] was proposed, which has further improved performance.
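The exhaustive beam training mentioned above, and the reason its overhead scales with the codebook size, can be sketched in a few lines of Python. This is a toy illustration under strong assumptions that are not taken from the cited schemes: DFT codebooks at both ends, a noise-free single-path channel that is perfectly aligned with one beam pair, and hypothetical names such as `dft_codebook`.

```python
import numpy as np


def dft_codebook(n):
    """Columns are unit-norm beams steering to n on-grid spatial frequencies."""
    idx = np.arange(n)
    return np.exp(2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)


n_tx, n_rx = 16, 8
f_book, w_book = dft_codebook(n_tx), dft_codebook(n_rx)

# Hypothetical noise-free single-path channel aligned with TX beam 11 and RX beam 3.
h = np.sqrt(n_tx * n_rx) * np.outer(w_book[:, 3], f_book[:, 11].conj())

# Exhaustive sweep: one measurement per (RX beam, TX beam) pair.
power = np.abs(w_book.conj().T @ h @ f_book) ** 2
best_rx, best_tx = np.unravel_index(np.argmax(power), power.shape)
n_measurements = power.size

print("best (RX, TX) beam pair:", (int(best_rx), int(best_tx)))
print("measurements used:", n_measurements)
```

Even in this small example the sweep needs 8 × 16 = 128 measurements to locate a single path, which is the scaling that hierarchical codebooks and CS-based CE aim to avoid.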


The aforementioned solutions [41–45, 47] only consider frequency-flat mmWave channels. However, practical mmWave channels can be frequency-selective due to their large system bandwidth in the mmWave frequency band and the distinct delay spreads of multipath components (MPCs) [8]. In [48], a distributed grid matching pursuit (DGMP) algorithm was proposed to estimate time-dispersive channels, where the OFDM is considered. An adaptive grid matching pursuit (AGMP) algorithm, developed from the DGMP, was proposed in [49] to reduce power leakage by using an adaptive grid matching solution. In [50], the sparse mmWave channels at different subcarriers were estimated separately by utilizing the OMP, but the computational complexity is high as the number of subcarriers is typically large. To reduce complexity, a simultaneous weighted (SW)-OMP based scheme was proposed in [51], which exploits the angle-domain common sparsity of channels at different subcarriers to improve performance. By leveraging the common sparsity of delay-domain channels among transceiver antenna pairs, a block CS-based CE solution was proposed for mmWave fully-digital MIMO systems [52], where the TS are designed to improve CE performance. Moreover, these works [41–45, 47–53] usually focus on the ideal uniform linear array (ULA) and seldom investigate the practical uniform planar array (UPA). Compared to the ULA, the UPA offers a more compact array with 3-D beamforming in both horizontal and vertical directions [54, 55], leading to the FD-MIMO. Although mmWave FD-MIMO CE has been investigated in [54, 55], they only considered either fully-digital MIMO or frequency-flat channels. Compared with small-scale SM-MIMO which only introduces the limited gain in spectrum efficiency, massive SM-MIMO is proposed by integrating SM-MIMO with massive MIMO working at 3–6 GHz to achieve higher spectrum efficiency [56]. 
For massive SM-MIMO, due to the small number of receive antennas at the user and the massive number of antennas at the BS, signal detection is a challenging large-scale under-determined problem. When the number of TAs becomes large, the optimal ML signal detector suffers from a prohibitively high complexity [57]. A low-complexity signal vector (SV)-based detector has been proposed for SM-MIMO [57], but it is confined to SM-MIMO with a single transmit RF chain. In [58–60], SM is generalized so that more than one active antenna is used to transmit independent signal constellation symbols for spatial multiplexing. The LMMSE-based signal detector [56] and the SD-based detector [61] can be used for SM-MIMO systems with multiple transmit RF chains. However, they are only suitable for well-determined or over-determined SM-MIMO with $N_r \ge N_t$, and suffer from a significant performance loss in under-determined SM-MIMO systems with $N_r < N_t$, where $N_t$ and $N_r$ are the numbers of transmit and receive antennas, respectively. Due to the limited number of RF chains, SM signals are inherently sparse, which can be exploited with the aid of CS theory [62] for improved signal detection performance. So far, CS-based signal detectors have been proposed for under-determined small-scale SM-MIMO [63, 64]. However, their bit-error-rate (BER) performance still exhibits a significant gap compared with that of the optimal ML detector, especially in massive SM-MIMO with large $N_t$, $N_r$, and $N_r \ll N_t$.

For massive machine-type communications (mMTC), the emerging grant-free approach has gained significant attention for enabling the massive access of machine-type devices (MTDs), as it simplifies the access procedure by directly delivering data without scheduling [65–70]. In [65, 66], the authors proposed CS solutions for joint active device and data detection by exploiting the block sparsity of mMTC, and a maximum a posteriori probability-based scheme was proposed in [67] to improve the performance. Moreover, due to the slowly varying activity, MTDs tend to exhibit partial block sparsity, hence a modified OMP solution was developed in [68]. Similarly, a modified SP algorithm was proposed in [69]. However, these studies [65–69] only considered single-antenna configurations at both the MTDs and the BS. To achieve higher efficiency and more reliable detection, the authors of [4, 70] considered multi-antenna-aided MTDs using SM and massive MIMO at the BS, and proposed a structured CS detector and a two-level sparse structure-based CS (TLSSCS) detector in [4, 70], respectively. Nevertheless, increasing the data rate of SM by one bit requires doubling the number of TAs [71, 72], which violates the low-cost requirement of MTDs. To improve the uplink (UL) throughput at a low cost and power consumption, the authors of [73, 74] proposed to employ media modulation at the MTDs. Correspondingly, an iterative interference cancellation detector and a CS detector were designed for multi-user detection in [73, 74], respectively. However, these authors did not consider active user detection (AUD).

1.4 Book Organization

In this book, we offer a comprehensive overview of recent advances in compressive signal processing for massive MIMO wireless communication systems, and summarize the limitations of the existing solutions. Then, several promising CS-based schemes are introduced to overcome these limitations by exploiting the distinct structured sparsity in massive MIMO systems. Moreover, we provide extensive simulation results to verify the superiority of the proposed methods.

This book is organized as follows. Chapter 2 presents a super-resolution sparse MIMO-OFDM CE scheme based on spatial and temporal correlations. Chapter 3 investigates the spatio-temporal joint CE for FDD massive MIMO by exploiting the spatio-temporal common sparsity of delay-domain MIMO channels. In Chap. 4, the spatially common sparsity based adaptive CE and feedback for FDD massive MIMO is discussed. Chapter 5 studies the CE for mmWave massive MIMO with hybrid precoding over frequency-selective fading (FSF) channels. Chapter 6 presents super-resolution sparse CE schemes for mmWave massive MIMO with hybrid precoding in both narrowband and wideband systems. Chapter 7 provides a structured compressive sensing (SCS) based near-optimal signal detector for massive SM-MIMO systems. Chapter 8 studies the CS-based multi-user detector (MUD) for the large-scale SM-MIMO UL. Chapter 9 discusses CS-based massive access for the Internet-of-Things (IoT) relying on media modulation aided machine-type communications. Finally, Chap. 10 explores low-complexity, high-accuracy CS-based CE for time-domain synchronous (TDS)-OFDM systems. The content presented in Chaps. 2 through 10 is derived from the respective original research papers: [25], [2], [28], [48], [75] and [76] (for Chap. 6), [77–79], and [80].


References 1. Eldar, Y.C.: Sampling Theory: Beyond Bandlimited Systems. Cambridge University Press (2015) 2. Gao, Z., Dai, L., Dai, W., Shim, B., Wang, Z.: Structured compressive sensing-based spatiotemporal joint channel estimation for FDD massive MIMO. IEEE Trans. Commun. 64(2), 601–617 (2015) 3. Liu, A., Zhu, F., Lau, V.K.: Closed-loop autonomous pilot and compressive CSIT feedback resource adaptation in multi-user FDD massive MIMO systems. IEEE Trans. Signal Process. 65(1), 173–183 (2016) 4. Gao, Z., Dai, L., Wang, Z., Chen, S., Hanzo, L.: Compressive-sensing-based multiuser detector for the large-scale SM-MIMO uplink. IEEE Trans. Veh. Technol. 65(10), 8725–8730 (2015) 5. Qin, Z., Gao, Y., Plumbley, M.D., Parini, C.G.: Wideband spectrum sensing on real-time signals at sub-Nyquist sampling rates in single and cooperative multiple nodes. IEEE Trans. Signal Process. 64(12), 3106–3117 (2015) 6. Qin, Z., Gao, Y., Parini, C.G.: Data-assisted low complexity compressive spectrum sensing on real-time signals under sub-Nyquist rate. IEEE Trans. Wirel. Commun. 15(2), 1174–1185 (2015) 7. Cheng, X., Wang, M., Guan, Y.L.: Ultrawideband channel estimation: a Bayesian compressive sensing strategy based on statistical sparsity. IEEE Trans. Veh. Technol. 64(5), 1819–1832 (2014) 8. Heath, R.W., Gonzalez-Prelcic, N., Rangan, S., Roh, W., Sayeed, A.M.: An overview of signal processing techniques for millimeter wave MIMO systems. IEEE J. Sel. Topics Signal Process. 10(3), 436–453 (2016) 9. Zhou, Z., Fang, J., Yang, L., Li, H., Chen, Z., Blum, R.S.: Low-rank tensor decomposition-aided channel estimation for millimeter wave MIMO-OFDM systems. IEEE J. Sel. Areas Commun. 35(7), 1524–1538 (2017) 10. Rajamohan, N., Joshi, A., Kannu, A.P.: Joint block sparse signal recovery problem and applications in LTE cell search. IEEE Trans. Veh. Technol. 66(2), 1130–1143 (2016) 11. 
Liu, J., Liu, A., Lau, V.K.: Compressive interference mitigation and data recovery in cloud radio access networks with limited fronthaul. IEEE Trans. Signal Process. 65(6), 1437–1446 (2016) 12. Jiang, D., Nie, L., Lv, Z., Song, H.: Spatio-temporal Kronecker compressive sensing for traffic matrix recovery. IEEE Access 4, 3046–3053 (2016) 13. Gishkori, S., Lottici, V., Leus, G.: Compressive sampling-based multiple symbol differential detection for UWB communications. IEEE Trans. Wirel. Commun. 13(7), 3778–3790 (2014) 14. Rusek, F., Persson, D., Lau, B.K., Larsson, E.G., Marzetta, T.L., Edfors, O., Tufvesson, F.: Scaling up MIMO: opportunities and challenges with very large arrays. IEEE Signal Process. Mag. 30(1), 40–60 (2012) 15. Barhumi, I., Leus, G., Moonen, M.: Optimal training design for MIMO OFDM systems in mobile wireless channels. IEEE Trans. Signal Process. 51(6), 1615–1624 (2003) 16. Minn, H., Al-Dhahir, N.: Optimal training signals for MIMO OFDM channel estimation. IEEE Trans. Wirel. Commun. 5(5), 1158–1168 (2006) 17. Nam, Y.H., Akimoto, Y., Kim, Y., Lee, M.i., Bhattad, K., Ekpenyong, A.: Evolution of reference signals for LTE-advanced systems. IEEE Commun. Mag. 50(2), 132–138 (2012) 18. Hassibi, B., Hochwald, B.M.: How much training is needed in multiple-antenna wireless links? IEEE Trans. Inf. Theory 49(4), 951–963 (2003) 19. Bjornson, E., Ottersten, B.: A framework for training-based estimation in arbitrarily correlated Rician MIMO channels with Rician disturbance. IEEE Trans. Signal Process. 58(3), 1807–1820 (2009) 20. Technical specification group radio access network; evolved universal terrestrial radio access (E-UTRA); physical channels and modulation. Release 13), TS 36.211 V13 2 (2016) 21. Correia, L.M.: Mobile Broadband Multimedia Networks: Techniques, Models and Tools for 4G. Elsevier (2010)


22. Lu, L., Li, G.Y., Swindlehurst, A.L., Ashikhmin, A., Zhang, R.: An overview of massive MIMO: benefits and challenges. IEEE J. Sel. Topics Signal Process. 8(5), 742–758 (2014) 23. Dai, L., Wang, Z., Yang, Z.: Spectrally efficient time-frequency training OFDM for mobile large-scale MIMO systems. IEEE J. Sel. Areas Commun. 31(2), 251–263 (2013) 24. Qi, C., Wu, L.: Uplink channel estimation for massive MIMO systems exploring joint channel sparsity. Electron. Lett. 50(23), 1770–1772 (2014) 25. Gao, Z., Dai, L., Lu, Z., Yuen, C., Wang, Z.: Super-resolution sparse MIMO-OFDM channel estimation based on spatial and temporal correlations. IEEE Commun. Lett. 18(7), 1266–1269 (2014) 26. Gao, Z., Dai, L., Wang, Z.: Structured compressive sensing based superimposed pilot design in downlink large-scale MIMO systems. Electron. Lett. 50(12), 896–898 (2014) 27. Nguyen, S.L.H., Ghrayeb, A.: Compressive sensing-based channel estimation for massive multiuser MIMO systems. In: 2013 IEEE Wireless Communications and Networking Conference (WCNC), pp. 2890–2895. IEEE (2013) 28. Gao, Z., Dai, L., Wang, Z., Chen, S.: Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO. IEEE Trans. Signal Process. 63(23), 6169–6183 (2015) 29. Shen, W., Dai, L., Shim, B., Mumtaz, S., Wang, Z.: Joint CSIT acquisition based on low-rank matrix completion for FDD massive MIMO systems. IEEE Commun. Lett. 19(12), 2178–2181 (2015) 30. Björnson, E., Hoydis, J., Kountouris, M., Debbah, M.: Massive MIMO systems with non-ideal hardware: energy efficiency, estimation, and capacity limits. IEEE Trans. Inf. Theory 60(11), 7112–7139 (2014) 31. Noh, S., Zoltowski, M.D., Sung, Y., Love, D.J.: Pilot beam pattern design for channel estimation in massive MIMO systems. IEEE J. Sel. Topics Signal Process. 8(5), 787–801 (2014) 32. Choi, J., Love, D.J., Bidigare, P.: Downlink training techniques for FDD massive MIMO systems: open-loop and closed-loop training with memory. IEEE J. Sel. 
Topics Signal Process. 8(5), 802–814 (2014) 33. Cheng, P., Chen, Z.: Multidimensional compressive sensing based analog CSI feedback for massive MIMO-OFDM systems. In: 2014 IEEE 80th Vehicular Technology Conference (VTC2014Fall), pp. 1–6. IEEE (2014) 34. Kuo, P.H., Kung, H., Ting, P.A.: Compressive sensing based channel feedback protocols for spatially-correlated massive antenna arrays. In: 2012 IEEE Wireless Communications and Networking Conference (WCNC), pp. 492–497. IEEE (2012) 35. Rao, X., Lau, V.K.: Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems. IEEE Trans. Signal Process. 62(12), 3261–3271 (2014) 36. Rao, X., Lau, V.K., Kong, X.: CSIT estimation and feedback for FDD multi-user massive MIMO systems. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3157–3161. IEEE (2014) 37. Gao, X., Dai, L., Han, S., Chih-Lin, I., Heath, R.W.: Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays. IEEE J. Sel. Areas Commun. 34(4), 998–1009 (2016) 38. Alkhateeb, A., Mo, J., Gonzalez-Prelcic, N., Heath, R.W.: MIMO precoding and combining solutions for millimeter-wave systems. IEEE Commun. Mag. 52(12), 122–131 (2014) 39. Méndez-Rial, R., Rusu, C., González-Prelcic, N., Alkhateeb, A., Heath, R.W.: Hybrid MIMO architectures for millimeter wave communications: phase shifters or switches? IEEE Access 4, 247–267 (2016) 40. Committee, I.L.S., et al.: Wireless LAN media access control (MAC) and physical layer (PHY) specifications. http://standardsieee.org/getieee802/ (2009) 41. Kokshoorn, M., Chen, H., Wang, P., Li, Y., Vucetic, B.: Millimeter wave MIMO channel estimation using overlapped beam patterns and rate adaptation. IEEE Trans. Signal Process. 65(3), 601–616 (2016) 42. Xiao, Z., Xia, P., Xia, X.G.: Codebook design for millimeter-wave channel estimation with hybrid precoding structure. IEEE Trans. Wirel. Commun. 16(1), 141–153 (2016)


43. Xiao, Z., Xia, P., Xia, X.G.: Channel estimation and hybrid precoding for millimeter-wave MIMO systems: a low-complexity overall solution. IEEE Access 5, 16100–16110 (2017) 44. Lee, J., Gil, G.T., Lee, Y.H.: Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications. IEEE Trans. Commun. 64(6), 2370–2386 (2016) 45. Wu, Y., Gu, Y., Wang, Z.: Channel estimation for mmWave MIMO with transmitter hardware impairments. IEEE Commun. Lett. 22(2), 320–323 (2017) 46. Gao, Z., Dai, L., Mi, D., Wang, Z., Imran, M.A., Shakir, M.Z.: mmWave massive-MIMO-based wireless backhaul for the 5G ultra-dense network. IEEE Wirel. Commun. 22(5), 13–21 (2015) 47. Zhou, Z., Fang, J., Yang, L., Li, H., Chen, Z., Li, S.: Channel estimation for millimeter-wave multiuser MIMO systems via PARAFAC decomposition. IEEE Trans. Wirel. Commun. 15(11), 7501–7516 (2016) 48. Gao, Z., Hu, C., Dai, L., Wang, Z.: Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selective fading channels. IEEE Commun. Lett. 20(6), 1259–1262 (2016) 49. Dong, Y., Chen, C., Yi, N., Lu, G., Jin, Y.: Channel estimation using low-resolution PSS for wideband mmWave systems. In: 2017 IEEE 85th Vehicular Technology Conference (VTC Spring), pp. 1–5. IEEE (2017) 50. Venugopal, K., Alkhateeb, A., Prelcic, N.G., Heath, R.W.: Channel estimation for hybrid architecture-based wideband millimeter wave systems. IEEE J. Sel. Areas Commun. 35(9), 1996–2009 (2017) 51. Rodríguez-Fernández, J., González-Prelcic, N., Venugopal, K., Heath, R.W.: Frequencydomain compressive channel estimation for frequency-selective hybrid millimeter wave MIMO systems. IEEE Trans. Wirel. Commun. 17(5), 2946–2960 (2018) 52. Ma, X., Yang, F., Liu, S., Song, J., Han, Z.: Design and optimization on training sequence for mmWave communications: a new approach for sparse channel estimation in massive MIMO. IEEE J. Sel. Areas Commun. 35(7), 1486–1497 (2017) 53. 
Wang, B., Gao, F., Jin, S., Lin, H., Li, G.Y.: Spatial- and frequency-wideband effects in millimeter-wave massive MIMO systems. IEEE Trans. Signal Process. 66(13), 3393–3406 (2018) 54. Tsai, Y., Zheng, L., Wang, X.: Millimeter-wave beamformed full-dimensional MIMO channel estimation based on atomic norm minimization. IEEE Trans. Commun. 66(12), 6150–6163 (2018) 55. Hu, C., Dai, L., Mir, T., Gao, Z., Fang, J.: Super-resolution channel estimation for mmWave massive MIMO with hybrid precoding. IEEE Trans. Veh. Technol. 67(9), 8954–8958 (2018) 56. Di Renzo, M., Haas, H., Ghrayeb, A., Sugiura, S., Hanzo, L.: Spatial modulation for generalized MIMO: challenges, opportunities, and implementation. Proc. IEEE 102(1), 56–103 (2013) 57. Zheng, J.: Signal vector based list detection for spatial modulation. IEEE Wirel. Commun. Lett. 1(4), 265–267 (2012) 58. Legnain, R.M., Hafez, R.H., Legnain, A.M.: Improved spatial modulation for high spectral efficiency. ArXiv Preprint arXiv:1204.1414 (2012) 59. Legnain, R.M., Hafez, R.H., Marsland, I.D., Legnain, A.M.: A novel spatial modulation using MIMO spatial multiplexing. In: 2013 1st International Conference on Communications, Signal Processing, and Their Applications (ICCSPA), pp. 1–4. IEEE (2013) 60. Wang, J., Jia, S., Song, J.: Generalised spatial modulation system with multiple active transmit antennas and low complexity detection scheme. IEEE Trans. Wirel. Commun. 11(4), 1605– 1615 (2012) 61. Cal-Braz, J.A., Sampaio-Neto, R.: Low-complexity sphere decoding detector for generalized spatial modulation systems. IEEE Commun. Lett. 18(6), 949–952 (2014) 62. Duarte, M.F., Eldar, Y.C.: Structured compressed sensing: From theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011) 63. Liu, W., Wang, N., Jin, M., Xu, H.: Denoising detection for the generalized spatial modulation system using sparse property. IEEE Commun. Lett. 18(1), 22–25 (2013)

14

1 Introduction

64. Yu, C.M., Hsieh, S.H., Liang, H.W., Lu, C.S., Chung, W.H., Kuo, S.Y., Pei, S.C.: Compressed sensing detector design for space shift keying in MIMO systems. IEEE Commun. Lett. 16(10), 1556–1559 (2012) 65. Wang, B., Dai, L., Mir, T., Wang, Z.: Joint user activity and data detection based on structured compressive sensing for NOMA. IEEE Commun. Lett. 20(7), 1473–1476 (2016) 66. Du, Y., Cheng, C., Dong, B., Chen, Z., Wang, X., Fang, J., Li, S.: Block-sparsity-based multiuser detection for uplink grant-free NOMA. IEEE Trans. Wirel. Commun. 17(12), 7894–7909 (2018) 67. Jeong, B.K., Shim, B., Lee, K.B.: Map-based active user and data detection for massive machine-type communications. IEEE Trans. Veh. Technol. 67(9), 8481–8494 (2018) 68. Wang, B., Dai, L., Zhang, Y., Mir, T., Li, J.: Dynamic compressive sensing-based multi-user detection for uplink grant-free NOMA. IEEE Commun. Lett. 20(11), 2320–2323 (2016) 69. Du, Y., Dong, B., Chen, Z., Wang, X., Liu, Z., Gao, P., Li, S.: Efficient multi-user detection for uplink grant-free NOMA: Prior-information aided adaptive compressive sensing perspective. IEEE J. Sel. Areas Commun. 35(12), 2812–2828 (2017) 70. Ma, X., Kim, J., Yuan, D., Liu, H.: Two-level sparse structure-based compressive sensing detector for uplink spatial modulation with massive connectivity. IEEE Commun. Lett. 23(9), 1594–1597 (2019) 71. Xiao, L., Yang, P., Xiao, Y., Fan, S., Di Renzo, M., Xiang, W., Li, S.: Efficient compressive sensing detectors for generalized spatial modulation systems. IEEE Trans. Veh. Technol. 66(2), 1284–1298 (2016) 72. Xiao, L., Xiao, Y., Yang, P., Liu, J., Li, S., Xiang, W.: Space-time block coded differential spatial modulation. IEEE Trans. Veh. Technol. 66(10), 8821–8834 (2017) 73. Zhang, L., Zhao, M., Li, L.: Low-complexity multi-user detection for MBM in uplink largescale MIMO systems. IEEE Commun. Lett. 22(8), 1568–1571 (2018) 74. 
Shamasundar, B., Jacob, S., Theagarajan, L.N., Chockalingam, A.: Media-based modulation for the uplink in massive MIMO systems. IEEE Trans. Veh. Technol. 67(9), 8169–8183 (2018) 75. Liao, A., Gao, Z., Wu, Y., Wang, H., Alouini, M.S.: 2D unitary ESPRIT based super-resolution channel estimation for millimeter-wave massive MIMO with hybrid precoding. IEEE Access 5, 24747–24757 (2017) 76. Liao, A., Gao, Z., Wang, H., Chen, S., Alouini, M.S., Yin, H.: Closed-loop sparse channel estimation for wideband millimeter-wave full-dimensional MIMO systems. IEEE Trans. Commun. 67(12) 8329–8345 (2019) 77. Gao, Z., Dai, L., Qi, C., Yuen, C., Wang, Z.: Near-optimal signal detector based on structured compressive sensing for massive SM-MIMO. IEEE Trans. Veh. Technol. 66(2) 1860–1865 (2017) 78. Gao, Z., Dai, L., Wang, Z., Chen, S., Hanzo, L.: Compressive-sensing-based multiuser detector for the large-scale SM-MIMO uplink. IEEE Trans. Veh. Tech. 65(10) 8725–8730 (2016) 79. Qiao, L., Zhang, J., Gao, Z., Chen, S., Hanzo, L.: Compressive sensing based massive access for IoT relying on media modulation aided machine type communications. IEEE Trans. Veh. Technol. 69(9) 10391–10396 (2020) 80. Gao, Z., Zhang, C., Wang, Z., Chen, S.: Priori-information aided iterative hard threshold: A low-complexity high-accuracy compressive sensing based channel estimation for TDS-OFDM. IEEE Trans. Wirel. Commun. 14(1) 242–251 (2015) 81. Gao, Z., Dai, L., Han, S., Chih-Lin, I., Wang, Z., Hanzo, L.: Compressive sensing techniques for next-generation wireless communications. IEEE Wirel. Commun. 25(3), 144–153 (2018)

Chapter 2

Subspace-Based Super-Resolution Sparse Channel Estimation in MIMO-OFDM Systems

Abstract This chapter introduces a parametric sparse MIMO-OFDM CE scheme that enables super-resolution estimation of path delays with arbitrary values based on the finite rate of innovation (FRI) theory. Since wireless MIMO channels exhibit sparsity, compressive sensing methods can be employed to achieve effective channel estimation. Furthermore, the spatial and temporal correlations of MIMO channels can be exploited to enhance the estimation performance. Specifically, the spatial channel correlation leads to a common sparse pattern on the path delays of different antennas, while the temporal channel correlation leads to a nearly unchanged sparsity pattern over several OFDM symbols. Taking advantage of these characteristics, the considered scheme outperforms existing state-of-the-art methods and reduces the pilot overhead through joint signal processing across different antennas.¹

¹ The work introduced in this chapter is based on the reference [11].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_2

2.1 Introduction

For MIMO-OFDM systems, accurate CE is essential to guarantee the system performance [1]. Generally, there are two categories of CE schemes for MIMO-OFDM systems. The first is the non-parametric scheme, which adopts orthogonal frequency-domain pilots or orthogonal time-domain TS to convert the CE in MIMO systems into that in single-antenna systems [1]. However, such a scheme suffers from high pilot overhead when the number of TAs increases. The second category is the parametric CE scheme, which exploits the sparsity of wireless channels to reduce the pilot overhead [2, 3]. The parametric scheme is more favorable for future wireless systems as it can achieve higher spectral efficiency. However, existing parametric schemes assume that the path delays of sparse channels are located at integer multiples of the sampling period, which is usually unrealistic in practice.

In this chapter, a more practical sparse MIMO-OFDM CE scheme based on the spatial and temporal correlations of sparse wireless MIMO channels is proposed to deal with arbitrary path delays. Specifically, the proposed scheme can achieve super-resolution estimates of arbitrary path delays, which is more suitable for wireless channels in practice. Because the transmit and receive antenna arrays are small compared to the long signal transmission distance in typical MIMO antenna geometry, the CIRs of different transmit-receive antenna pairs share common path delays [4], which translates into a common sparse pattern of CIRs due to the spatial correlation of MIMO channels. Meanwhile, this common sparse pattern is nearly unchanged over several adjacent OFDM symbols due to the temporal correlation of wireless channels [5, 6]. Previous work either simply extends the sparse CE scheme for single-antenna systems to MIMO by exploiting the spatial correlation of MIMO channels [4], or only considers the temporal correlation for single-antenna systems [5, 6]. In contrast, the proposed scheme exploits both spatial and temporal correlations to improve the CE accuracy. In addition, we reduce the pilot overhead by using the FRI theory [7], which can recover an analog sparse signal from a very low sampling rate. As a result, the average pilot overhead per antenna depends only on the channel sparsity level instead of the channel length.

Notation: $(\cdot)^{\dagger}$ and $(\cdot)^{H}$ are the Moore-Penrose matrix inversion operation and the matrix conjugate transpose operation, respectively. $\mathrm{diag}\{\mathbf{x}\}$ is a diagonal matrix with the vector $\mathbf{x}$ on its diagonal. The operator $*$ denotes the linear convolution.

2.2 Sparse MIMO Channel Model

The MIMO channel is shown in Fig. 2.1, and the following characteristics are considered in this chapter.

Fig. 2.1 Spatial and temporal correlations of wireless MIMO channels [11]

2.2.1 Channel Sparsity

In typical outdoor communication scenarios, the CIR is intrinsically sparse because there are only several significant scatterers [2, 4]. For an $N_t \times N_r$ MIMO system, the CIR $h^{(i,j)}(t)$ between the $i$th TA and the $j$th receive antenna can be modelled as [8]

$$h^{(i,j)}(t) = \sum_{p=1}^{P} \alpha_p^{(i,j)}\, \delta\big(t - \tau_p^{(i,j)}\big), \quad 1 \le i \le N_t,\ 1 \le j \le N_r, \tag{2.1}$$

where $\delta(\cdot)$ is the Dirac function, $P$ is the total number of resolvable propagation paths, and $\tau_p^{(i,j)}$ and $\alpha_p^{(i,j)}$ denote the path delay and path gain of the $p$th path, respectively.

2.2.2 Spatial Correlation

Because the transmit or receive antenna array is very small compared to the long signal transmission distance, the channels of different transmit-receive antenna pairs share very similar scatterers. Meanwhile, for most communication systems, the difference between the path delays that the same scatterer produces at different antenna pairs is far less than the system sampling period. Therefore, the CIRs of different transmit-receive antenna pairs share a common sparse pattern, although the corresponding path gains may be quite different [4].
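As a rough plausibility check of this argument, the following sketch bounds the delay difference that a single array can introduce for one scatterer and compares it with the sampling period. The 1 m aperture is an assumed, purely illustrative value that does not come from this chapter; the 10 MHz bandwidth is the one used later in Sect. 2.4.

```python
# Rough check: delay difference across one antenna array versus sampling period.
C = 299_792_458.0  # speed of light in m/s


def max_delay_difference(aperture_m: float) -> float:
    """Upper bound (in seconds) on the delay difference between two antennas
    of one array for the same scatterer: the path lengths can differ by at
    most the array aperture (triangle inequality)."""
    return aperture_m / C


fs = 10e6        # system bandwidth in Hz (value of Sect. 2.4)
Ts = 1.0 / fs    # sampling period in seconds
aperture = 1.0   # ASSUMED array aperture in metres (hypothetical)

dt = max_delay_difference(aperture)
ratio = dt / Ts  # delay difference as a fraction of the sampling period

print(f"max delay difference: {dt * 1e9:.2f} ns")
print(f"sampling period:      {Ts * 1e9:.0f} ns")
print(f"ratio dt / Ts:        {ratio:.4f}")
```

Under these assumptions the bound is a few nanoseconds against a sampling period of 100 ns, which is in line with treating the path delays as common to all antenna pairs.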

2.2.3 Temporal Correlation

For wireless channels, the path delays vary much more slowly than the path gains, and the path gains vary continuously [5]. Thus, the channel sparse pattern is nearly unchanged during several adjacent OFDM symbols, and the path gains are also correlated [6].

2.3 Sparse MIMO-OFDM CE

In this section, the widely used pilot pattern is first briefly introduced, based on which a super-resolution sparse MIMO-OFDM CE method is then applied. Finally, the required number of pilots is discussed under the framework of the FRI theory.

2.3.1 Pilot Pattern

The pilot pattern widely used in common MIMO-OFDM systems is illustrated in Fig. 2.2. In the frequency domain, $N_p$ pilots are uniformly spaced with the pilot interval $D$ (e.g., $D = 4$ in Fig. 2.2). Meanwhile, every pilot is allocated a pilot index $l$ for $0 \le l \le N_p - 1$, which increases with the subcarrier index. Furthermore, to distinguish MIMO channels associated with different TAs, each TA uses a unique subcarrier index initial phase $\theta_i$ for $1 \le i \le N_t$ and $(N_t - 1)N_p$ zero subcarriers to ensure the orthogonality of pilots [3]. Therefore, for the $i$th TA, the subcarrier index of the $l$th pilot is

$$I_{\mathrm{pilot}}^{i}(l) = \theta_i + lD, \quad 0 \le l \le N_p - 1. \tag{2.2}$$

Consequently, the total pilot overhead per TA is $N_{p\_\mathrm{total}} = N_t N_p$, and thus $N_p$ can also be referred to as the average pilot overhead per TA in this chapter.

Fig. 2.2 Pilot pattern. Note that the specific values $N_t = 2$, $D = 4$, $N_p = 4$, $N_{p\_\mathrm{total}} = 8$ are used for illustration purposes [11]
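The index rule (2.2) can be sketched in a few lines. The parameters are those quoted for Fig. 2.2; the offsets $\theta_i = i - 1$ and the function name `pilot_indices` are assumptions made only for this illustration, since the chapter does not state the actual offsets.

```python
import numpy as np


def pilot_indices(theta_i: int, D: int, Np: int) -> np.ndarray:
    """Subcarrier indices of the Np pilots of one TA, following (2.2)."""
    return theta_i + D * np.arange(Np)


# Values quoted for Fig. 2.2; the offsets theta_i = i - 1 are an ASSUMPTION.
Nt, D, Np = 2, 4, 4
thetas = list(range(Nt))

pilots = [pilot_indices(theta, D, Np) for theta in thetas]
occupied = np.concatenate(pilots)  # every subcarrier reserved for training

for i, idx in enumerate(pilots, start=1):
    # TA i sends pilots on idx and zeros on the pilot subcarriers of the other TAs.
    print(f"TA {i}: pilot subcarriers {idx.tolist()}")
print("subcarriers reserved in total:", occupied.size)
```

With distinct offsets smaller than $D$, the index sets of different TAs never collide, and the number of reserved subcarriers equals $N_t N_p$.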

2.3.2 Super-Resolution CE

At the receiver, the equivalent baseband channel frequency response (CFR) $H(f)$ can be expressed as

$$H(f) = \sum_{p=1}^{P} \alpha_p e^{-j2\pi f \tau_p}, \quad -f_s/2 \le f \le f_s/2, \tag{2.3}$$

where the superscripts $i$ and $j$ in (2.1) are omitted for convenience, $f_s = 1/T_s$ is the system bandwidth, and $T_s$ is the sampling period. Meanwhile, the $N$-point discrete Fourier transform (DFT) of the time-domain equivalent baseband channel can be expressed as [4]

$$H[k] = H\!\left(\frac{k f_s}{N}\right), \quad 0 \le k \le N - 1. \tag{2.4}$$
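Equations (2.3) and (2.4) can be evaluated directly once delays and gains are given. The sketch below is only an illustration: the function names and the two-path toy channel are assumptions, and the delays are placed on the sampling grid solely so that the result can be compared with an ordinary FFT of the tap vector.

```python
import numpy as np


def cfr(f, delays, gains):
    """Evaluate H(f) of (2.3) at the frequencies f (in Hz)."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    delays = np.asarray(delays, dtype=float)
    gains = np.asarray(gains, dtype=complex)
    # One row per frequency, one column per path.
    return np.exp(-2j * np.pi * np.outer(f, delays)) @ gains


def cfr_on_subcarriers(N, fs, delays, gains):
    """Sample H(f) at f = k * fs / N for k = 0, ..., N - 1, as in (2.4)."""
    return cfr(np.arange(N) * fs / N, delays, gains)


# Toy channel (ASSUMED values): two paths at 0 and exactly one sampling period.
fs, N = 10e6, 8
Ts = 1.0 / fs
delays = [0.0, 1.0 * Ts]
gains = [1.0, 0.5]

H = cfr_on_subcarriers(N, fs, delays, gains)

# For on-grid delays, (2.4) coincides with the DFT of the discrete tap vector.
taps = np.zeros(N, dtype=complex)
taps[0], taps[1] = gains
H_fft = np.fft.fft(taps)
print(np.allclose(H, H_fft))
```

The same two functions accept delays that are not multiples of $T_s$, which is the case the chapter is concerned with; only the FFT comparison is restricted to on-grid delays.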

Therefore, for the $(i,j)$th transmit-receive antenna pair, according to (2.2)–(2.4), the estimated CFRs over pilots can be written as

$$\hat{H}^{(i,j)}[l] = H\big[I_{\mathrm{pilot}}^{i}(l)\big] = H\!\left(\frac{(\theta_i + lD) f_s}{N}\right) = \sum_{p=1}^{P} \alpha_p^{(i,j)} e^{-j2\pi \frac{(\theta_i + lD) f_s}{N} \tau_p^{(i,j)}} + W^{(i,j)}[l], \tag{2.5}$$

where $\hat{H}^{(i,j)}[l]$ for $0 \le l \le N_p - 1$ can be obtained by using the conventional minimum mean square error (MMSE) or least square (LS) method [1], and $W^{(i,j)}[l]$ is the additive white Gaussian noise (AWGN). Equation (2.5) can also be written in vector form as

$$\hat{H}^{(i,j)}[l] = \big(\mathbf{v}^{(i,j)}[l]\big)^{T} \mathbf{a}^{(i,j)} + W^{(i,j)}[l], \tag{2.6}$$

where $\mathbf{v}^{(i,j)}[l] = \big[\gamma^{lD\tau_1^{(i,j)}}, \gamma^{lD\tau_2^{(i,j)}}, \ldots, \gamma^{lD\tau_P^{(i,j)}}\big]^{T}$, $\mathbf{a}^{(i,j)} = \big[\alpha_1^{(i,j)}\gamma^{\theta_i\tau_1^{(i,j)}}, \alpha_2^{(i,j)}\gamma^{\theta_i\tau_2^{(i,j)}}, \ldots, \alpha_P^{(i,j)}\gamma^{\theta_i\tau_P^{(i,j)}}\big]^{T}$, and $\gamma = e^{-j2\pi f_s/N}$.

Because the wireless channel is inherently sparse and the small scale of the transmit or receive antenna array is negligible compared to the long signal transmission distance, the CIRs of different transmit-receive antenna pairs share common path delays, which is equivalently translated into a common sparse pattern of CIRs due to the spatial correlation of MIMO channels [4], i.e., $\tau_p^{(i,j)} = \tau_p$ and $\mathbf{v}^{(i,j)}[l] = \mathbf{v}[l]$ for $1 \le p \le P$, $1 \le i \le N_t$, $1 \le j \le N_r$. Hence, by exploiting this spatially common sparse pattern shared among the different receive antennas associated with the $i$th TA, we have

$$\hat{\mathbf{H}}^{i} = \mathbf{V}\mathbf{A}^{i} + \mathbf{W}^{i}, \quad 1 \le i \le N_t, \tag{2.7}$$

where the $N_p \times N_r$ measurement matrix $\hat{\mathbf{H}}^{i}$ is

$$\hat{\mathbf{H}}^{i} = \begin{bmatrix} \hat{H}^{(i,1)}[0] & \hat{H}^{(i,2)}[0] & \cdots & \hat{H}^{(i,N_r)}[0] \\ \hat{H}^{(i,1)}[1] & \hat{H}^{(i,2)}[1] & \cdots & \hat{H}^{(i,N_r)}[1] \\ \vdots & \vdots & \ddots & \vdots \\ \hat{H}^{(i,1)}[N_p - 1] & \hat{H}^{(i,2)}[N_p - 1] & \cdots & \hat{H}^{(i,N_r)}[N_p - 1] \end{bmatrix},$$

$\mathbf{V} = \big[\mathbf{v}[0], \mathbf{v}[1], \ldots, \mathbf{v}[N_p - 1]\big]^{T}$ is a Vandermonde matrix of size $N_p \times P$ (each $\mathbf{v}[l]$ has $P$ entries), $\mathbf{A}^{i} = \big[\mathbf{a}^{(i,1)}, \mathbf{a}^{(i,2)}, \ldots, \mathbf{a}^{(i,N_r)}\big]$ is of size $P \times N_r$, and $\mathbf{W}^{i}$ is an $N_p \times N_r$ matrix with $W^{(i,j)}[l]$ in its $j$th column and $(l+1)$th row. When all $N_t$ TAs are considered based on (2.7), we have

$$\hat{\mathbf{H}} = \mathbf{V}\mathbf{A} + \mathbf{W}, \tag{2.8}$$

where $\hat{\mathbf{H}} = \big[\hat{\mathbf{H}}^{1}, \hat{\mathbf{H}}^{2}, \ldots, \hat{\mathbf{H}}^{N_t}\big]$ is of size $N_p \times N_t N_r$, $\mathbf{A} = \big[\mathbf{A}^{1}, \mathbf{A}^{2}, \ldots, \mathbf{A}^{N_t}\big]$, and $\mathbf{W} = \big[\mathbf{W}^{1}, \mathbf{W}^{2}, \ldots, \mathbf{W}^{N_t}\big]$.
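To make the stacked model (2.8) concrete, the sketch below builds $\mathbf{V}$, $\mathbf{A}$ and a noisy $\hat{\mathbf{H}}$ for a toy configuration and checks one entry against the scalar expression (2.5). Each $\mathbf{v}[l]$ has $P$ entries, so the sketch builds $\mathbf{V}$ with $P$ columns. All numerical values, the offsets `thetas` and the function names are assumptions chosen for illustration only.

```python
import numpy as np


def steering_matrix(delays, D, Np, fs, N):
    """V of (2.7): entry (l, p) equals gamma**(l*D*tau_p), gamma = exp(-j*2*pi*fs/N)."""
    l = np.arange(Np)[:, None]
    tau = np.asarray(delays, dtype=float)[None, :]
    # The exponent is evaluated directly; raising the complex number gamma to a
    # real power would use the principal branch and give wrong phases.
    return np.exp(-2j * np.pi * fs / N * l * D * tau)


def gain_matrix(alpha, thetas, delays, fs, N):
    """A = [A^1, ..., A^Nt]; alpha has shape (Nt, Nr, P)."""
    tau = np.asarray(delays, dtype=float)
    blocks = []
    for i in range(alpha.shape[0]):
        phase = np.exp(-2j * np.pi * fs / N * thetas[i] * tau)  # gamma**(theta_i*tau_p)
        blocks.append((alpha[i] * phase[None, :]).T)            # block of size P x Nr
    return np.hstack(blocks)                                    # size P x (Nt*Nr)


rng = np.random.default_rng(0)
fs, N, D, Np = 10e6, 1024, 4, 16        # ASSUMED toy parameters
Nt, Nr, P = 2, 2, 3
thetas = [0, 1]                          # ASSUMED pilot offsets
delays = np.array([0.35, 2.6, 7.2]) / fs  # off-grid delays in seconds

alpha = (rng.standard_normal((Nt, Nr, P))
         + 1j * rng.standard_normal((Nt, Nr, P))) / np.sqrt(2)
V = steering_matrix(delays, D, Np, fs, N)
A = gain_matrix(alpha, thetas, delays, fs, N)
W = 0.01 * (rng.standard_normal((Np, Nt * Nr))
            + 1j * rng.standard_normal((Np, Nt * Nr))) / np.sqrt(2)
H_hat = V @ A + W                        # measurement model (2.8)

# Cross-check one noiseless entry against (2.5): TA i = 2, receive antenna j = 1, l = 5.
i, j, l = 1, 0, 5                        # zero-based indices
direct = np.sum(alpha[i, j] * np.exp(-2j * np.pi * (thetas[i] + l * D) * fs * delays / N))
print(V.shape, A.shape, H_hat.shape, np.isclose((V @ A)[l, i * Nr + j], direct))
```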

Comparing the formulated problem (2.8) with the classical direction-of-arrival (DOA) problem [9], we find that they are mathematically equivalent. Specifically, the traditional DOA problem is to estimate the DOAs of $P$ sources from a set of time-domain measurements, which are obtained from the outputs of $N_p$ sensors at $N_t N_r$ distinct time instants (time-domain samples). In contrast, in our problem (2.8) we estimate the path delays of $P$ multipath components from a set of frequency-domain measurements, which are acquired from $N_p$ pilots of $N_t N_r$ distinct antenna pairs (antenna-domain samples). It has been verified in [10] that the total least squares estimation of signal parameters via rotational invariance techniques (TLS-ESPRIT) algorithm in [9] can be applied to (2.8) to efficiently estimate path delays with arbitrary values.

By using the TLS-ESPRIT algorithm, we can obtain super-resolution estimates of the path delays, i.e., $\hat{\tau}_p$ for $1 \le p \le P$, and thus $\hat{\mathbf{V}}$ can be obtained accordingly. Then the path gains can be acquired by the LS method [6], i.e.,

$$\hat{\mathbf{A}} = \hat{\mathbf{V}}^{\dagger}\hat{\mathbf{H}} = \big(\hat{\mathbf{V}}^{H}\hat{\mathbf{V}}\big)^{-1}\hat{\mathbf{V}}^{H}\hat{\mathbf{H}}. \tag{2.9}$$
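The two steps above, delay estimation followed by the LS fit (2.9), can be sketched as follows. This is a minimal textbook-style TLS-ESPRIT written for illustration, not the implementation used in [9–11]; the function names, the toy parameters and the noise level are assumptions. The sketch further assumes that $P$ is known, that at least $P$ antenna pairs are available, and that $f_s D \tau_{\max}/N < 1$ so that the delays can be read unambiguously from the estimated phases.

```python
import numpy as np


def steering_matrix(delays, D, Np, fs, N):
    """Np x P matrix with entries exp(-j*2*pi*fs/N * l*D*tau_p)."""
    l = np.arange(Np)[:, None]
    tau = np.asarray(delays, dtype=float)[None, :]
    return np.exp(-2j * np.pi * fs / N * l * D * tau)


def tls_esprit_delays(H_hat, P, D, fs, N):
    """Estimate P path delays from the Np x K pilot measurements H_hat."""
    # Signal subspace: the P dominant left singular vectors.
    U, _, _ = np.linalg.svd(H_hat, full_matrices=False)
    Us = U[:, :P]
    U1, U2 = Us[:-1, :], Us[1:, :]          # two row-shifted sub-arrays
    # Total least squares solution of U1 @ Psi = U2.
    _, _, Eh = np.linalg.svd(np.hstack([U1, U2]))
    E = Eh.conj().T
    E12, E22 = E[:P, P:], E[P:, P:]
    Psi = -E12 @ np.linalg.inv(E22)
    # Eigenvalues of Psi estimate exp(-j*2*pi*fs*D*tau_p/N).
    lam = np.linalg.eigvals(Psi)
    phase = np.mod(-np.angle(lam), 2 * np.pi)
    return np.sort(phase * N / (2 * np.pi * fs * D))


def ls_gains(H_hat, delays_hat, D, fs, N):
    """LS estimate of A as in (2.9), via the pseudo-inverse of V_hat."""
    V_hat = steering_matrix(delays_hat, D, H_hat.shape[0], fs, N)
    return np.linalg.pinv(V_hat) @ H_hat


rng = np.random.default_rng(1)
fs, N, D, Np, P, K = 10e6, 1024, 32, 16, 3, 16     # ASSUMED toy setup, K = Nt*Nr
true_delays = np.array([1.3, 6.7, 13.45]) / fs      # off-grid delays, ascending
A = (rng.standard_normal((P, K)) + 1j * rng.standard_normal((P, K))) / np.sqrt(2)
W = 1e-3 * (rng.standard_normal((Np, K)) + 1j * rng.standard_normal((Np, K))) / np.sqrt(2)
H_hat = steering_matrix(true_delays, D, Np, fs, N) @ A + W

delays_hat = tls_esprit_delays(H_hat, P, D, fs, N)
A_hat = ls_gains(H_hat, delays_hat, D, fs, N)
print("estimated delays in units of Ts:", delays_hat * fs)
```

Because the delays are obtained from phases rather than from a grid search, values such as 13.45 sampling periods are handled in the same way as integer ones. Note that the columns of `A` here already contain the known factors $\gamma^{\theta_i \tau_p}$, so a further division by these factors would be needed to isolate the path gains.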

For a certain entry of $\hat{\mathbf{A}}$, i.e., $\hat{\alpha}_p^{(i,j)}\gamma^{\theta_i\hat{\tau}_p}$, because $\theta_i$ is known at the receiver and $\hat{\tau}_p$ has been estimated by the TLS-ESPRIT algorithm, we can easily obtain the estimate of the path gain $\hat{\alpha}_p^{(i,j)}$ for $1 \le p \le P$, $1 \le i \le N_t$, $1 \le j \le N_r$. Finally, the complete CFR estimate over all OFDM subcarriers can be obtained based on (2.3) and (2.4).

Furthermore, we can also exploit the temporal correlation of wireless channels to improve the accuracy of the CE. First, the path delays of CIRs during several adjacent OFDM symbols are nearly unchanged [5, 6], which is equivalently referred to as a common sparse pattern of CIRs due to the temporal correlation of MIMO channels. Thus, the Vandermonde matrix $\mathbf{V}$ in (2.8) remains unchanged across several adjacent OFDM symbols. Moreover, the path gains during adjacent OFDM symbols are also correlated owing to the temporal continuity of the CIR, so the matrices $\mathbf{A}$ in (2.8) for several adjacent OFDM symbols are also correlated. Therefore, when estimating the CIRs of the $q$th OFDM symbol, we can jointly exploit the matrices $\hat{\mathbf{H}}$ of several adjacent OFDM symbols based on (2.8), i.e.,

$$\frac{\sum_{\rho=q-R}^{q+R}\hat{\mathbf{H}}_{\rho}}{2R+1} = \mathbf{V}_{q}\,\frac{\sum_{\rho=q-R}^{q+R}\mathbf{A}_{\rho}}{2R+1} + \frac{\sum_{\rho=q-R}^{q+R}\mathbf{W}_{\rho}}{2R+1}, \tag{2.10}$$

where the subscript $\rho$ denotes the index of the OFDM symbol, and the common sparse pattern of CIRs is assumed over $2R + 1$ adjacent OFDM symbols [6]. In this way, the effective noise is reduced, so improved CE accuracy is expected.

In contrast to the existing non-parametric scheme, which estimates the channel by interpolating or predicting based on the CFRs over pilots [1, 8], the proposed scheme exploits the sparsity as well as the spatial and temporal correlations of wireless MIMO channels to first acquire estimates of the channel parameters, including path delays and gains, and then obtains the estimate of the CFR according to (2.3) and (2.4).
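The noise-averaging effect of (2.10) can be illustrated numerically. In the sketch below the delay matrix is kept fixed over $2R + 1$ symbols, the gain matrices drift slightly from symbol to symbol, and independent noise is added to each symbol. All numerical values, including the delays, the drift and the noise level, are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)


def cgauss(shape, scale=1.0):
    """Circular complex Gaussian samples with variance scale**2."""
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# ASSUMED toy setup: Np pilots, P paths, K = Nt*Nr antenna pairs, 2R+1 symbols.
Np, P, K, R = 64, 6, 16, 4
fs, N, D = 10e6, 4096, 16
delays = np.array([0.4, 1.9, 5.3, 9.1, 12.6, 17.8]) * 1e-6   # hypothetical delays

l = np.arange(Np)[:, None]
V = np.exp(-2j * np.pi * fs / N * l * D * delays[None, :])   # common to all symbols

A0 = cgauss((P, K))
A_list = [A0 + cgauss((P, K), 0.05) for _ in range(2 * R + 1)]   # correlated gains
W_list = [cgauss((Np, K), 0.1) for _ in range(2 * R + 1)]        # independent noise
H_list = [V @ A + W for A, W in zip(A_list, W_list)]

# Left-hand side of (2.10) and the three averaged terms of its right-hand side.
H_avg = np.mean(H_list, axis=0)
A_avg = np.mean(A_list, axis=0)
W_avg = np.mean(W_list, axis=0)

noise_single = np.mean(np.abs(np.stack(W_list)) ** 2)   # per-symbol noise power
noise_avg = np.mean(np.abs(W_avg) ** 2)                  # noise power after averaging
ratio = noise_avg / noise_single
print(f"noise power ratio: {ratio:.3f} (1/(2R+1) = {1 / (2 * R + 1):.3f})")
```

Since $\mathbf{V}$ is shared, averaging the measurements is equivalent to averaging the gain and noise matrices, and for independent noise the effective noise power drops by roughly a factor of $2R + 1$. The averaged gain matrix is a smoothed version of the individual ones, which is why the approach relies on the gains being correlated over the window.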

2.3.3 Discussion on Pilot Overhead

Comparing with the model of the multiple filters bank based on the FRI theory [10], we find that the CIRs of the $N_t N_r$ transmit-receive antenna pairs are equivalent to $N_t N_r$ semi-period sparse subspaces, and the $N_p$ pilots are equivalent to $N_p$ multichannel filters. Therefore, according to the FRI theory, the smallest required number of pilots for each TA is $N_p = 2P$ in a noiseless scenario. For practical channels with maximum delay spread $\tau_{\max}$, although the normalized channel length $L = \tau_{\max}/T_s$ is usually very large, the sparsity level $P$ is small, i.e., $P \ll L$ [2]. Consequently, in contrast to the non-parametric CE method, where the required number of pilots heavily depends on $L$, the proposed parametric scheme only needs $2P$ pilots in theory. Note that in practice more than $2P$ pilots are used to improve the CE accuracy in the presence of AWGN.
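A short calculation makes the gap between $L$ and $2P$ tangible. The sketch below plugs in the channel parameters that are quoted in Sect. 2.4 ($\tau_{\max} = 20$ μs, $f_s = 10$ MHz, $P = 6$); the variable names are chosen for this illustration only.

```python
# Channel length versus FRI pilot bound, using the values quoted in Sect. 2.4.
tau_max = 20e-6   # maximum delay spread in seconds
fs = 10e6         # system bandwidth in Hz
P = 6             # number of paths

Ts = 1.0 / fs                 # sampling period
L = round(tau_max / Ts)       # normalized channel length L = tau_max / Ts
min_pilots = 2 * P            # noiseless FRI bound N_p = 2P per TA

print(f"normalized channel length L = {L}")
print(f"FRI lower bound on pilots   = {min_pilots}")
print(f"L / (2P)                    = {L / min_pilots:.1f}")
```

For these values the channel spans 200 samples while the noiseless bound is 12 pilots per TA, which illustrates why a pilot budget tied to $P$ rather than to $L$ is attractive.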

2.4 Simulation Results

A simulation study was carried out to compare the performance of the proposed scheme with that of the existing state-of-the-art methods for MIMO-OFDM systems. The conventional comb-type pilot and time-domain training based orthogonal pilot (TTOP) [1] schemes were selected as typical examples of the non-parametric CE scheme, while the recent time-frequency joint (TFJ) CE scheme [3] was selected as an example of the conventional parametric scheme. The system parameters were set as follows: the carrier frequency is $f_c = 1$ GHz, the system bandwidth is $f_s = 10$ MHz, the size of the OFDM symbol is $N = 4096$, and the guard interval length is $N_g = 256$, which can combat channels whose maximum delay spread is up to 25.6 μs. The International Telecommunication Union Vehicular B (ITU-VB) channel model with a maximum delay spread of 20 μs and $P = 6$ paths [3] was considered.

Figure 2.3 compares the mean square error (MSE) performance of the different CE schemes. Both the static ITU-VB channel and the time-varying ITU-VB channel with a mobile speed of 90 km/h in a 4 × 4 MIMO system were considered. The comb-type pilot based scheme used $N_p = 256$ pilots; the TTOP scheme used $N_p = 64$ pilots with $T$ adjacent OFDM symbols for training, where $T = 4$ for the time-varying channel and $T = 8$ for the static channel to achieve better performance; the TFJ scheme adopted a time-domain TS of length 256 and $N_p = 64$ pilots; and the proposed scheme used $N_p = 64$ pilots with $R = 4$ for fair comparison.

Fig. 2.3 MSE performance comparison of different schemes in a 4 × 4 MIMO system [11]. a Static channel; b time-varying channel with the mobile speed of 90 km/h

From Fig. 2.3, we can observe that the conventional parametric TFJ scheme is clearly inferior to the other three schemes. Meanwhile, for the static ITU-VB channel, the MSE performance of the proposed parametric scheme is more than 2 dB and 5 dB better than that of the TTOP and comb-type pilot based schemes, respectively. Moreover, for the time-varying ITU-VB channel, the advantage of the proposed parametric scheme over the conventional non-parametric schemes is even more pronounced. The existing sparse CE scheme [3] does not work, because the path delays of practical channels may not be located at integer multiples of the sampling period. The TTOP scheme works well over static channels, but it performs poorly over fast time-varying channels, since it assumes that the channel is static during the adjacent OFDM symbols. Finally, the comb-type pilot based scheme performs worse than the proposed scheme, and it also suffers from much higher pilot overhead.

Figure 2.4 compares the MSE performance of the proposed scheme in 4 × 4, 8 × 8 and 12 × 12 MIMO systems. We can observe that the MSE performance of the proposed scheme in the 12 × 12 MIMO system is superior to that in the 8 × 8 MIMO system by 5 dB with the same $N_p$, and outperforms that in the 4 × 4 MIMO system with a reduced $N_p$. These simulations indicate that, as the number of antennas increases, the MSE performance improves for the same $N_p$. Equivalently, to achieve the same CE accuracy, the required number of pilots $N_p$ can be reduced. As a result, the total pilot overhead $N_{p\_\mathrm{total}}$ in the proposed scheme does not increase linearly with the number of TAs $N_t$, because the required $N_p$ decreases as $N_t$ increases. The reason is that, with an increased number of antennas, the dimension of the measurement matrix (e.g., $\hat{\mathbf{H}}$ in (2.8)) in the TLS-ESPRIT algorithm, or the number of samples in the model of the multiple filters bank [10], increases, and thus the accuracy of the path delay estimate improves accordingly.

Fig. 2.4 MSE performance of the proposed scheme in 4 × 4, 8 × 8 and 12 × 12 MIMO systems [11]

The superior performance of the proposed scheme can be attributed to the following reasons. First, the spatially common sparse pattern shared among the CIRs of different transmit-receive antenna pairs is exploited in the proposed scheme, such that we can employ the TLS-ESPRIT algorithm to obtain super-resolution estimates of path delays with arbitrary values. Meanwhile, the FRI theory indicates that the smallest required number of pilots is $N_p = 2P$ in a noiseless scenario. Therefore, the pilot overhead can be reduced compared with conventional non-parametric schemes. Second, the proposed scheme exploits the temporal correlation of wireless channels, namely, across several adjacent OFDM symbols, the sparse pattern of the CIR remains unchanged, and the path gains are also correlated. Accordingly, by jointly processing the signals of adjacent OFDM symbols based on (2.8), the effective noise can be reduced and thus the accuracy of the CE is further improved.
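As a quick consistency check of the simulation setup in this section, the guard interval duration and the total pilot overhead of the proposed scheme in the 4 × 4 case follow directly from the quoted parameters. The symbol $T_g$ for the guard interval duration is introduced here only for this calculation, and the overhead uses $N_{p\_\mathrm{total}} = N_t N_p$ from Sect. 2.3.1:

$$T_g = \frac{N_g}{f_s} = \frac{256}{10\ \text{MHz}} = 25.6\ \mu\text{s} \ \ge\ \tau_{\max} = 20\ \mu\text{s}, \qquad N_{p\_\mathrm{total}} = N_t N_p = 4 \times 64 = 256, \qquad \frac{N_{p\_\mathrm{total}}}{N} = \frac{256}{4096} = 6.25\%.$$

The guard interval therefore covers the 20 μs delay spread of the ITU-VB channel, and under this reading the pilots occupy 6.25% of the subcarriers of an OFDM symbol.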

2.5 Summary

This chapter investigates a super-resolution sparse MIMO CE scheme that exploits the sparsity as well as the spatial and temporal correlations of wireless MIMO channels. It can achieve super-resolution estimates of path delays with arbitrary values, and has higher CE accuracy than conventional schemes. Under the framework of the FRI theory, the required number of pilots in the proposed scheme is clearly smaller than that in non-parametric CE schemes. Moreover, simulations demonstrate that the average pilot overhead per TA can be reduced as the number of antennas increases.


References

1. Barhumi, I., Leus, G., Moonen, M.: Optimal training design for MIMO OFDM systems in mobile wireless channels. IEEE Trans. Signal Process. 51(6), 1615–1624 (2003)
2. Bajwa, W.U., Haupt, J., Sayeed, A.M., Nowak, R.: Compressed channel sensing: a new approach to estimating sparse multipath channels. Proc. IEEE 98(6), 1058–1076 (2010)
3. Dai, L., Wang, Z., Yang, Z.: Spectrally efficient time-frequency training OFDM for mobile large-scale MIMO systems. IEEE J. Sel. Areas Commun. 31(2), 251–263 (2013)
4. Barbotin, Y., Hormati, A., Rangan, S., Vetterli, M.: Estimation of sparse MIMO channels with common support. IEEE Trans. Commun. 60(12), 3705–3716 (2012)
5. Telatar, I.E., Tse, D.N.C.: Capacity and mutual information of wideband multipath fading channels. IEEE Trans. Inf. Theory 46(4), 1384–1400 (2000)
6. Dai, L., Wang, J., Wang, Z., Tsiaflakis, P., Moonen, M.: Spectrum- and energy-efficient OFDM based on simultaneous multi-channel reconstruction. IEEE Trans. Signal Process. 61(23), 6047–6059 (2013)
7. Dragotti, P.L., Vetterli, M., Blu, T.: Sampling moments and reconstructing signals of finite rate of innovation: Shannon meets Strang-Fix. IEEE Trans. Signal Process. 55(5), 1741–1757 (2007)
8. Stuber, G.L., Barry, J.R., Mclaughlin, S.W., Li, Y., Ingram, M.A., Pratt, T.G.: Broadband MIMO-OFDM wireless communications. Proc. IEEE 92(2), 271–294 (2004)
9. Roy, R., Kailath, T.: ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 37(7), 984–995 (1989)
10. Gedalyahu, K., Eldar, Y.C.: Time-delay estimation from low-rate samples: a union of subspaces approach. IEEE Trans. Signal Process. 58(6), 3017–3031 (2010)
11. Gao, Z., Dai, L., Lu, Z., Yuen, C., Wang, Z.: Super-resolution sparse MIMO-OFDM channel estimation based on spatial and temporal correlations. IEEE Commun. Lett. 18(7), 1266–1269 (2014)

Chapter 3

Compressive Sensing Sparse Channel Estimation in FDD Massive MIMO Systems

Abstract Precise channel estimation (CE) is essential to fully realize the potential performance benefits of massive MIMO technology. However, the pilot overhead required by conventional CE schemes is unacceptable due to the massive number of antennas, especially for FDD massive MIMO. To address this issue, this chapter introduces a structured compressive sensing-based joint CE scheme exploiting the spatio-temporal common sparsity of delay-domain MIMO channels. It starts by developing non-orthogonal pilots at the base station, guided by compressive sensing theory, to reduce the pilot overhead. Subsequently, an Adaptive Structured Subspace Pursuit (ASSP) algorithm is considered at the user end to jointly estimate the channels of multiple OFDM symbols, which leverages the spatio-temporal common sparsity of MIMO channels to enhance the CE accuracy. By exploiting the temporal channel correlation, a space-time adaptive pilot scheme is introduced to further reduce the pilot overhead. Additionally, the discussion of the CE scheme is extended from the single-cell scenario to the multi-cell scenario for wider applications. Numerical results confirm that the considered scheme achieves accurate channel estimation with limited pilot overhead, approaching the performance of the optimal oracle least-square estimator.¹

¹ The work introduced in this chapter is based on the reference [32].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_3

3.1 Introduction

Massive MIMO, which employs a large number of antennas at the BS to simultaneously serve multiple users, is an attractive approach to realizing high-throughput green wireless communications [1]. By exploiting the large number of spatial degrees of freedom, massive MIMO can boost the system capacity and energy efficiency by orders of magnitude. Therefore, massive MIMO has been widely recognized as a key enabling technique for spectrum- and energy-efficient 5G communications [2].

In massive MIMO systems, accurate acquisition of the CSI is essential for signal detection, beamforming, resource allocation, etc. However, due to the massive number of antennas at the BS, each user has to estimate the channels associated with hundreds of TAs, which results in prohibitively high pilot overhead. Hence, how to realize accurate CE with affordable pilot overhead becomes a challenging problem, especially for FDD massive MIMO systems [3]. Some previous studies on massive MIMO sidestep this challenge by assuming the TDD protocol, where the CSI in the UL can be more easily acquired at the BS due to the small number of single-antenna users and the powerful processing capability of the BS, and the channel reciprocity property can then be leveraged to directly obtain the CSI in the downlink [4]. However, due to the calibration error of RF chains and the limited coherence time, the CSI acquired in the UL may not be accurate for the downlink [5, 6]. More importantly, compared with TDD systems, FDD systems can provide more efficient communications with low latency [7], and they have dominated current cellular systems. Therefore, it is important to explore the challenging problem of CE for FDD massive MIMO systems, which can help massive MIMO to be backward compatible with current FDD-dominated cellular networks.

On the other hand, for typical broadband wireless communication systems, delay-domain channels are intrinsically sparse due to the limited number of significant scatterers in the propagation environment and the large channel delay spread [8–15]. Meanwhile, for MIMO systems with a co-located antenna array at the BS, the channels between one user and different TAs at the BS exhibit very similar path delays due to very similar scatterers in the propagation environment, which indicates that the delay-domain channels between the user and different TAs at the BS share a common sparsity when the aperture of the antenna array is not very large [3, 16]. Moreover, since the path delays vary much more slowly than the path gains due to the temporal channel correlation, this sparsity is almost unchanged during the coherence time [17]. In this chapter, these properties of MIMO channels are referred to as the spatio-temporal common sparsity, which most existing work does not consider.

In this chapter, by exploiting the spatio-temporal common sparsity of delay-domain MIMO channels, we propose an SCS-based spatio-temporal joint CE scheme with significantly reduced pilot overhead for FDD massive MIMO systems. Specifically, at the BS, we propose a non-orthogonal pilot scheme under the framework of CS theory, which is essentially different from the widely used orthogonal pilots under the framework of the classical Nyquist sampling theorem. Compared with conventional orthogonal pilots, the proposed non-orthogonal pilot scheme can substantially reduce the pilot overhead required for CE. At the user side, we propose an ASSP algorithm for CE, whereby the spatio-temporal common sparsity of delay-domain MIMO channels is leveraged to improve the CE performance obtained from a limited number of pilots. Furthermore, by leveraging the temporal channel correlation, we propose a space-time adaptive pilot scheme to realize accurate CE with further reduced pilot overhead, where the specific pilot signals should take into account the geometry of the antenna array at the BS and the mobility of the served users. Additionally, we extend the proposed CE scheme from the single-cell scenario to the multi-cell scenario. Finally, simulation results verify that the proposed scheme outperforms its conventional counterparts with reduced pilot overhead, and that the performance of the SCS-based CE scheme approaches that of the oracle LS estimator.


Notation: Boldface lower- and upper-case symbols represent column vectors and matrices, respectively. The operator $\circ$ represents the Hadamard product, $\lfloor\cdot\rfloor$ denotes the integer floor operator, and $\mathrm{diag}\{\mathbf{x}\}$ is a diagonal matrix with the elements of $\mathbf{x}$ on its diagonal. The matrix inversion, transpose, and Hermitian transpose operations are denoted by $(\cdot)^{-1}$, $(\cdot)^{T}$, and $(\cdot)^{H}$, respectively, while $(\cdot)^{\dagger}$ denotes the Moore-Penrose matrix inversion. $|\cdot|_c$ denotes the cardinality of a set, and the $l_2$-norm and Frobenius-norm operations are given by $\|\cdot\|_2$ and $\|\cdot\|_F$, respectively. $\Omega^c$ denotes the complementary set of the set $\Omega$. $\mathrm{Tr}\{\cdot\}$ is the trace of a matrix. $\langle\cdot,\cdot\rangle$ is the Frobenius inner product, with $\langle\mathbf{A},\mathbf{B}\rangle = \mathrm{Tr}\{\mathbf{A}^{H}\mathbf{B}\}$. Finally, $\boldsymbol{\Phi}^{(l)}$ denotes the $l$th column vector of the matrix $\boldsymbol{\Phi}$.

3.2 Spatio-Temporal Common Sparsity of Delay-Domain

Extensive experimental studies have shown that wireless broadband channels exhibit sparsity in the delay domain. This is because the number of multipath components carrying the majority of the channel energy is small, owing to the limited number of significant scatterers in the propagation environment, while the channel delay spread can be large due to the large difference between the time of arrival (ToA) of the earliest multipath and the ToA of the latest multipath [8–15]. Specifically, in the downlink, the delay-domain CIR between the $m$th TA at the BS and one user can be expressed as

$$\mathbf{h}_{m,r} = \left[h_{m,r}[1], h_{m,r}[2], \cdots, h_{m,r}[L]\right]^{T}, \quad 1 \le m \le M, \tag{3.1}$$

where $r$ is the index of the OFDM symbol in the time domain, $L$ is the equivalent channel length, $\mathcal{D}_{m,r} = \mathrm{supp}\{\mathbf{h}_{m,r}\} = \{l : |h_{m,r}[l]| > p_{\mathrm{th}},\ 1 \le l \le L\}$ is the support set of $\mathbf{h}_{m,r}$, and $p_{\mathrm{th}}$ is the noise floor according to [18]. The sparsity level of wireless channels is denoted as $P_{m,r} = |\mathcal{D}_{m,r}|_c$, and we have $P_{m,r} \ll L$ due to the sparse nature of delay-domain channels [8–11].$^{2}$

$^{2}$ The sparse delay-domain channels may exhibit power leakage due to non-integer normalized path delays. Off-the-shelf techniques exist to mitigate this power leakage [12, 13]. For convenience, we consider the sparse channel model in the equivalent discrete-time baseband, which is widely used in CS-based CE [8–11].

Moreover, there are measurements showing that CIRs between different TAs and one user exhibit very similar path delays [3, 16]. The reason is that, in typical massive MIMO geometry, the scale of the compact antenna array at the BS is relatively small compared with the large signal transmission distance, and channels associated with different transmit-receive antenna pairs share common scatterers. Therefore, the sparsity patterns of CIRs of different transmit-receive antenna pairs have a large overlap. Moreover, for MIMO systems with not very large $M$, these CIRs can share a common sparse pattern [3, 16, 19], i.e.,

$$\mathcal{D}_{1,r} = \mathcal{D}_{2,r} = \cdots = \mathcal{D}_{M,r}, \tag{3.2}$$

which is referred to as the spatial common sparsity of wireless MIMO channels. For example, we consider the LTE-A system working at a carrier frequency of $f_c = 2$ GHz with a signal bandwidth of $f_s = 10$ MHz, and the ULA with an antenna spacing of half a wavelength. For two TAs separated by 8 half-wavelengths, their maximum difference of path delays from a common scatterer is $\frac{8\lambda/2}{c} = 4/f_c = 0.002$ µs, which is negligible compared with the system sample period $T_s = 1/f_s = 0.1$ µs, where $\lambda$ and $c$ are the wavelength and the velocity of light, respectively. It should be pointed out that the path gains of different transmit-receive antenna pairs from the same scatterer can be different or even uncorrelated due to the non-isotropic antennas$^{3}$ [5].

Finally, practical wireless channels also exhibit temporal correlation even in fast time-varying scenarios [17]. It has been demonstrated that the path delays usually vary much more slowly than the path gains [17]. In other words, although the path gains can vary significantly from one OFDM symbol to another, the path delays remain almost unchanged during several successive OFDM symbols. This is due to the fact that the coherence time of path gains over time-varying channels is inversely proportional to the system carrier frequency, while the duration for path delay variation is inversely proportional to the system bandwidth [17]. For example, in the LTE-A system with $f_c = 2$ GHz and $f_s = 10$ MHz, the path delays vary at a rate that is about several hundred times slower than that of the path gains [10]. That is to say, during the coherence time of path delays, CIRs associated with $R$ successive OFDM symbols have a common sparsity due to the almost unchanged path delays, i.e.,

$$\mathcal{D}_{m,r} = \mathcal{D}_{m,r+1} = \cdots = \mathcal{D}_{m,r+R-1}, \quad 1 \le m \le M. \tag{3.3}$$

This temporal correlation of wireless channels is also referred to as the temporal common sparsity of wireless channels in this chapter. The spatial and temporal channel correlations discussed above are jointly referred to as the spatio-temporal common sparsity of delay-domain MIMO channels, as illustrated in Fig. 3.1. This channel property is usually not considered in existing CE schemes. In this chapter, we exploit it to address the challenging problem of CE for FDD massive MIMO.
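The two properties above can be illustrated numerically. The following is a minimal sketch, not part of the original scheme: the function names, the speed-of-light constant, and the toy dimensions (M = 8, R = 4, L = 64, P = 6) are assumptions chosen only for illustration. It reproduces the delay-difference arithmetic of the LTE-A example and draws CIRs whose support is shared across antennas and OFDM symbols while the path gains are independent.

```python
import numpy as np

C_LIGHT = 3.0e8   # velocity of light in m/s (assumed value)
F_C = 2.0e9       # carrier frequency of the LTE-A example (Hz)
F_S = 10.0e6      # signal bandwidth of the LTE-A example (Hz)

def max_delay_difference(n_half_wavelengths, f_c=F_C):
    """Largest path-delay difference (s) between two antennas that are
    n_half_wavelengths * (lambda / 2) apart, for a common scatterer."""
    wavelength = C_LIGHT / f_c
    return n_half_wavelengths * (wavelength / 2.0) / C_LIGHT

def common_sparse_cirs(M, R, L, P, rng):
    """Draw CIRs h[m, r, :] whose support is shared by all M antennas and
    all R OFDM symbols, with independent complex Gaussian path gains."""
    support = np.sort(rng.choice(L, size=P, replace=False))
    gains = (rng.standard_normal((M, R, P))
             + 1j * rng.standard_normal((M, R, P))) / np.sqrt(2.0)
    h = np.zeros((M, R, L), dtype=complex)
    h[:, :, support] = gains
    return h, support

delta_tau = max_delay_difference(8)   # 8 half-wavelengths, as in the text
t_s = 1.0 / F_S                       # system sample period

rng = np.random.default_rng(0)
h, support = common_sparse_cirs(M=8, R=4, L=64, P=6, rng=rng)
supports_equal = all(
    np.array_equal(np.flatnonzero(h[m, r]), support)
    for m in range(8) for r in range(4)
)
```

Here `delta_tau` evaluates to 4/f_c = 2 ns, about 2% of the sample period `t_s` = 100 ns, and `supports_equal` confirms that (3.2) and (3.3) hold for the synthetic channels.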

$^{3}$ For practical massive MIMO systems, different antennas at the BS with different directivities can destroy the spatial correlation of path gains over different transmit-receive pairs from the same scatterer and improve the system capacity [3]. However, this spatial channel correlation is usually exploited in conventional CE schemes for reduced pilot overhead, which can be unrealistic.


Fig. 3.1 Spatio-temporal common sparsity of delay-domain MIMO channels [32]: a wireless channels exhibit the sparse nature due to the limited number of scatterers; b delay-domain MIMO channels between the co-located antenna array and one user exhibit the spatio-temporal common sparsity

3.3 Proposed SCS-Based Spatio-Temporal Joint Channel Estimation Scheme

In this section, the SCS-based spatio-temporal joint CE scheme is proposed for FDD massive MIMO. First, we propose the non-orthogonal pilot scheme at the BS to reduce the pilot overhead. Then, we propose the ASSP algorithm at the user for reliable CE. Moreover, we propose the space-time adaptive pilot scheme for further reduction of the pilot overhead. Finally, we briefly discuss the extension of the proposed CE scheme to the multi-cell scenario.


Fig. 3.2 Pilot designs for massive MIMO with M = 64 in one time-frequency resource block [32]. a conventional orthogonal pilot design; b proposed non-orthogonal pilot design

3.3.1 Non-orthogonal Pilot Scheme at the BS

The design of conventional orthogonal pilots is based on the framework of the classical Nyquist sampling theorem, and this design has been widely used in existing MIMO systems. The orthogonal pilots are illustrated in Fig. 3.2a, where pilots associated with different TAs occupy different subcarriers. For massive MIMO systems with hundreds of TAs, such orthogonal pilots suffer from prohibitively high pilot overhead. In contrast, the design of the proposed non-orthogonal pilot scheme, as shown in Fig. 3.2b, is based on CS theory, and it allows pilots of different TAs to occupy exactly the same subcarriers. By leveraging the sparse nature of channels, the pilots used for CE can be reduced substantially.

For the proposed non-orthogonal pilot scheme, we first consider the MIMO CE for one OFDM symbol as an example. Particularly, we denote the index set of subcarriers allocated to pilots as $\xi$, which is uniquely selected from the set $\{1, 2, \ldots, N\}$ and identical for all TAs. Here $N_p = |\xi|_c$ is the number of pilot subcarriers in one OFDM symbol, and $N$ is the number of subcarriers in one OFDM symbol. Moreover, we denote the pilot sequence of the $m$th TA as $\mathbf{p}_m \in \mathbb{C}^{N_p \times 1}$. The specific design of $\xi$ and $\{\mathbf{p}_m\}_{m=1}^{M}$ will be detailed in Sect. 3.4.1.

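3.3.2 SCS-Based CE at the User

At the user, after the removal of the guard interval and DFT, the received pilot sequence $\mathbf{y}_r \in \mathbb{C}^{N_p \times 1}$ of the $r$th OFDM symbol can be expressed as

$$\mathbf{y}_r = \sum_{m=1}^{M} \mathrm{diag}\{\mathbf{p}_m\}\, \mathbf{F}|_{\xi} \begin{bmatrix} \mathbf{h}_{m,r} \\ \mathbf{0}_{(N-L)\times 1} \end{bmatrix} + \mathbf{w}_r = \sum_{m=1}^{M} \mathbf{P}_m \mathbf{F}_L|_{\xi}\, \mathbf{h}_{m,r} + \mathbf{w}_r = \sum_{m=1}^{M} \boldsymbol{\Phi}_m \mathbf{h}_{m,r} + \mathbf{w}_r, \tag{3.4}$$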

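where $\mathbf{P}_m = \mathrm{diag}\{\mathbf{p}_m\}$, $\mathbf{F} \in \mathbb{C}^{N \times N}$ is a DFT matrix, $\mathbf{F}_L \in \mathbb{C}^{N \times L}$ is a partial DFT matrix consisting of the first $L$ columns of $\mathbf{F}$, $\mathbf{F}|_{\xi} \in \mathbb{C}^{N_p \times N}$ and $\mathbf{F}_L|_{\xi} \in \mathbb{C}^{N_p \times L}$ are the sub-matrices obtained by selecting the rows of $\mathbf{F}$ and $\mathbf{F}_L$ according to $\xi$, respectively, $\mathbf{w}_r \in \mathbb{C}^{N_p \times 1}$ is the AWGN vector in the $r$th OFDM symbol, and $\boldsymbol{\Phi}_m = \mathbf{P}_m \mathbf{F}_L|_{\xi}$. Moreover, (3.4) can be rewritten in a more compact form as

$$\mathbf{y}_r = \boldsymbol{\Phi} \tilde{\mathbf{h}}_r + \mathbf{w}_r, \tag{3.5}$$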

where $\boldsymbol{\Phi} = [\boldsymbol{\Phi}_1, \boldsymbol{\Phi}_2, \cdots, \boldsymbol{\Phi}_M] \in \mathbb{C}^{N_p \times ML}$, and $\tilde{\mathbf{h}}_r = [\mathbf{h}_{1,r}^{T}, \mathbf{h}_{2,r}^{T}, \cdots, \mathbf{h}_{M,r}^{T}]^{T} \in \mathbb{C}^{ML \times 1}$ is an aggregate CIR vector. For massive MIMO systems, we usually have $N_p \ll ML$ due to the large number of TAs $M$ and the limited number of pilots $N_p$. This indicates that we cannot reliably estimate $\tilde{\mathbf{h}}_r$ from $\mathbf{y}_r$ using conventional CE schemes, since (3.5) is an underdetermined system. However, the observation that $\tilde{\mathbf{h}}_r$ is a sparse signal due to the sparsity of $\{\mathbf{h}_{m,r}\}_{m=1}^{M}$ inspires us to estimate the high-dimensional sparse signal $\tilde{\mathbf{h}}_r$ from the low-dimensional received pilot sequence $\mathbf{y}_r$ under the framework of CS theory [20]. Moreover, the inherent spatial common sparsity of wireless MIMO channels can also be exploited for performance enhancement. Specifically, we rearrange the aggregate CIR vector $\tilde{\mathbf{h}}_r$ to obtain the equivalent CIR vector $\tilde{\mathbf{d}}_r$ as

$$\tilde{\mathbf{d}}_r = [\mathbf{d}_{1,r}^{T}, \mathbf{d}_{2,r}^{T}, \cdots, \mathbf{d}_{L,r}^{T}]^{T} \in \mathbb{C}^{ML \times 1}, \tag{3.6}$$

where $\mathbf{d}_{l,r} = \left[h_{1,r}[l], h_{2,r}[l], \cdots, h_{M,r}[l]\right]^{T}$ for $1 \le l \le L$. Similarly, $\boldsymbol{\Phi}$ can be rearranged as $\boldsymbol{\Psi}$, i.e.,

$$\boldsymbol{\Psi} = [\boldsymbol{\Psi}_1, \boldsymbol{\Psi}_2, \cdots, \boldsymbol{\Psi}_L] \in \mathbb{C}^{N_p \times ML}, \tag{3.7}$$

where $\boldsymbol{\Psi}_l = [\boldsymbol{\Phi}_1^{(l)}, \boldsymbol{\Phi}_2^{(l)}, \cdots, \boldsymbol{\Phi}_M^{(l)}] = [\boldsymbol{\psi}_{1,l}, \boldsymbol{\psi}_{2,l}, \cdots, \boldsymbol{\psi}_{M,l}] \in \mathbb{C}^{N_p \times M}$. In this way, (3.5) can be reformulated as

$$\mathbf{y}_r = \boldsymbol{\Psi} \tilde{\mathbf{d}}_r + \mathbf{w}_r. \tag{3.8}$$

From (3.8), it can be observed that due to the spatial common sparsity of wireless MIMO channels, the equivalent CIR vector $\tilde{\mathbf{d}}_r$ exhibits structured sparsity [20]. Furthermore, the temporal correlation of wireless channels indicates that such spatial common sparsity in MIMO systems remains virtually unchanged over $R$ successive OFDM symbols, where $R$ is determined by the coherence time of the path delays [10]. Hence, wireless MIMO channels exhibit the spatio-temporal common sparsity during $R$ successive OFDM symbols. Considering (3.8) during $R$ adjacent OFDM symbols with the same pilot pattern, we have

$$\mathbf{Y} = \boldsymbol{\Psi} \mathbf{D} + \mathbf{W}, \tag{3.9}$$

where $\mathbf{Y} = [\mathbf{y}_r, \mathbf{y}_{r+1}, \ldots, \mathbf{y}_{r+R-1}] \in \mathbb{C}^{N_p \times R}$ is the measurement matrix, $\mathbf{D} = [\tilde{\mathbf{d}}_r, \tilde{\mathbf{d}}_{r+1}, \ldots, \tilde{\mathbf{d}}_{r+R-1}] \in \mathbb{C}^{ML \times R}$ is the equivalent CIR matrix, and $\mathbf{W} = [\mathbf{w}_r, \mathbf{w}_{r+1}, \ldots, \mathbf{w}_{r+R-1}] \in \mathbb{C}^{N_p \times R}$ is the AWGN matrix. It should be pointed out that $\mathbf{D}$ can be expressed as

$$\mathbf{D} = [\mathbf{D}_1^{T}, \mathbf{D}_2^{T}, \ldots, \mathbf{D}_L^{T}]^{T}, \tag{3.10}$$

where $\mathbf{D}_l$ for $1 \le l \le L$ has the size of $M \times R$, and the element in the $m$th row and $r$th column of $\mathbf{D}_l$ is the channel gain of the $l$th path delay associated with the $m$th TA in the $r$th OFDM symbol. It is clear that the equivalent CIR matrix $\mathbf{D}$ in (3.10) exhibits structured sparsity due to the spatio-temporal common sparsity of wireless MIMO channels, and this intrinsic sparsity in $\mathbf{D}$ can be exploited for better CE performance. In this way, we can jointly estimate channels associated with $M$ TAs in $R$ OFDM symbols by jointly processing the received pilots of $R$ OFDM symbols.

By exploiting the structured sparsity of $\mathbf{D}$ in (3.9), we propose the ASSP algorithm as described in Algorithm 3.1 [32] to estimate channels for massive MIMO systems. Developed from the classical SP algorithm [21], the proposed ASSP algorithm exploits the structured sparsity of $\mathbf{D}$ for further improved sparse signal recovery performance.

For Algorithm 3.1, some notations should be further detailed. First, both $\mathbf{Z} \in \mathbb{C}^{ML \times R}$ and $\breve{\mathbf{D}} \in \mathbb{C}^{ML \times R}$ consist of $L$ sub-matrices of equal size $M \times R$, i.e., $\mathbf{Z} = [\mathbf{Z}_1^{T}, \mathbf{Z}_2^{T}, \ldots, \mathbf{Z}_L^{T}]^{T}$ and $\breve{\mathbf{D}} = [\breve{\mathbf{D}}_1^{T}, \breve{\mathbf{D}}_2^{T}, \ldots, \breve{\mathbf{D}}_L^{T}]^{T}$. Second, we have $\breve{\mathbf{D}}_{\tilde{\Omega}} = [\breve{\mathbf{D}}_{\tilde{\Omega}(1)}^{T}, \breve{\mathbf{D}}_{\tilde{\Omega}(2)}^{T}, \ldots, \breve{\mathbf{D}}_{\tilde{\Omega}(|\tilde{\Omega}|_c)}^{T}]^{T}$ and $\boldsymbol{\Psi}_{\tilde{\Omega}} = [\boldsymbol{\Psi}_{\tilde{\Omega}(1)}, \boldsymbol{\Psi}_{\tilde{\Omega}(2)}, \ldots, \boldsymbol{\Psi}_{\tilde{\Omega}(|\tilde{\Omega}|_c)}]$, where $\tilde{\Omega}(1) < \tilde{\Omega}(2) < \cdots < \tilde{\Omega}(|\tilde{\Omega}|_c)$ are the elements of the set $\tilde{\Omega}$. Third, $\Pi^{s}(\cdot)$ is a set whose elements are the indices of the largest $s$ elements of its argument. Finally, to reliably acquire the channel sparsity level, we stop the iteration if $\|\mathbf{R}^{k}\|_F > \|\mathbf{R}^{s-1}\|_F$ or $\|\hat{\mathbf{D}}_{\tilde{l}}\|_F \le \sqrt{MR}\, p_{\mathrm{th}}$, where $\|\hat{\mathbf{D}}_{\tilde{l}}\|_F$ is the smallest $\|\hat{\mathbf{D}}_{l}\|_F$ for $l \in \tilde{\Omega}^{k}$, and $p_{\mathrm{th}}$ is the noise floor according to [18]. The proposed stopping criteria will be further discussed in Sect. 3.4.2.

Here we further explain the main steps in Algorithm 3.1 as follows. First, in Steps 2.1 to 2.7, the ASSP algorithm aims to acquire the solution $\hat{\mathbf{D}}$ to (3.9) with the fixed sparsity level $s$ in a greedy way, which is similar to the classical SP algorithm. Second, $\|\mathbf{R}^{k-1}\|_F \le \|\mathbf{R}^{k}\|_F$ indicates that the $s$-sparse solution $\hat{\mathbf{D}}$ to (3.9) has been obtained, and then the sparsity level is updated to find the $(s+1)$-sparse solution $\hat{\mathbf{D}}$. Finally, if the stopping criteria are met, the iteration quits, and we consider the estimated solution to (3.9) with the last sparsity level as the estimated channels, i.e.,

$$\hat{\mathbf{D}} = \hat{\mathbf{D}}^{s-1}.$$
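The relation between the antenna-major model (3.5) and the tap-major model (3.8) can be checked numerically. The sketch below is an illustration under stated assumptions, not the book's code: it uses 0-based subcarrier indices, NumPy's FFT sign convention for the DFT matrix, unit-modulus random-phase pilots, and small toy dimensions; the helper name `sensing_matrices` is hypothetical.

```python
import numpy as np

def sensing_matrices(pilots, xi, N, L):
    """Build Phi = [Phi_1, ..., Phi_M] with Phi_m = diag(p_m) F_L|xi, and its
    tap-major rearrangement Psi = [Psi_1, ..., Psi_L] as in (3.7)."""
    M, Np = pilots.shape
    # Rows xi of the first L columns of the N-point DFT matrix (assumed sign: -j).
    F_L_xi = np.exp(-2j * np.pi * np.outer(xi, np.arange(L)) / N)
    Phi = np.hstack([pilots[m][:, None] * F_L_xi for m in range(M)])
    # Column m*L + l of Phi becomes column l*M + m of Psi.
    Psi = Phi.reshape(Np, M, L).transpose(0, 2, 1).reshape(Np, L * M)
    return Phi, Psi

rng = np.random.default_rng(1)
N, L, M, Np = 256, 16, 4, 32
xi = np.arange(0, N, N // Np)                 # equally spaced pilot subcarriers
pilots = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(M, Np)))
h = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))

Phi, Psi = sensing_matrices(pilots, xi, N, L)
h_tilde = h.reshape(-1)       # [h_1^T, ..., h_M^T]^T, antenna-major
d_tilde = h.T.reshape(-1)     # [d_1^T, ..., d_L^T]^T, tap-major as in (3.6)

y_phi = Phi @ h_tilde                                   # noiseless (3.5)
y_psi = Psi @ d_tilde                                   # noiseless (3.8)
# First form of (3.4): pilots times the N-point DFT of the zero-padded CIR.
y_fft = sum(pilots[m] * np.fft.fft(h[m], n=N)[xi] for m in range(M))
```

All three vectors agree up to rounding, which confirms that the rearrangement only permutes columns and unknowns. The permutation groups the M gains of each delay tap into one block, and this is the block structure that the structured sparsity of D refers to.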


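Algorithm 3.1 Proposed ASSP Algorithm
Input: Noisy measurement matrix $\mathbf{Y}$ and sensing matrix $\boldsymbol{\Psi}$.
Output: The estimation of channels $\{\hat{\mathbf{h}}_{m,t}\}_{m=1,t=r}^{m=M,t=r+R-1}$.
• Step 1 (Initialization) The initial channel sparsity level $s = 1$, the iterative index $k = 1$, the support set $\Omega^{k-1} = \emptyset$, and the residual matrices $\mathbf{R}^{k-1} = \mathbf{Y}$ and $\|\mathbf{R}^{s-1}\|_F = +\inf$.
• Step 2 (Solve the Structured Sparse Matrix $\mathbf{D}$ to (3.9))
repeat
  1. (Correlation) $\mathbf{Z} = \boldsymbol{\Psi}^{H} \mathbf{R}^{k-1}$;
  2. (Support Estimate) $\tilde{\Omega}'^{k} = \Omega^{k-1} \cup \Pi^{s}\left(\{\|\mathbf{Z}_l\|_F\}_{l=1}^{L}\right)$;
  3. (Support Pruning) $\breve{\mathbf{D}}_{\tilde{\Omega}'^{k}} = \boldsymbol{\Psi}_{\tilde{\Omega}'^{k}}^{\dagger} \mathbf{Y}$; $\breve{\mathbf{D}}_{(\tilde{\Omega}'^{k})^{c}} = \mathbf{0}$; $\tilde{\Omega}^{k} = \Pi^{s}\left(\{\|\breve{\mathbf{D}}_l\|_F\}_{l=1}^{L}\right)$;
  4. (Matrix Estimate) $\bar{\mathbf{D}}_{\tilde{\Omega}^{k}} = \boldsymbol{\Psi}_{\tilde{\Omega}^{k}}^{\dagger} \mathbf{Y}$; $\bar{\mathbf{D}}_{(\tilde{\Omega}^{k})^{c}} = \mathbf{0}$;
  5. (Residue Update) $\mathbf{R}^{k} = \mathbf{Y} - \boldsymbol{\Psi} \bar{\mathbf{D}}$;
  6. (Matrix Update) $\hat{\mathbf{D}}^{k} = \bar{\mathbf{D}}$;
  if $\|\mathbf{R}^{k-1}\|_F > \|\mathbf{R}^{k}\|_F$
    7. (Iteration with Fixed Sparsity Level) $\Omega^{k} = \tilde{\Omega}^{k}$; $k = k + 1$;
  else
    8. (Update Sparsity Level) $\hat{\mathbf{D}}^{s} = \hat{\mathbf{D}}^{k-1}$; $\mathbf{R}^{s} = \mathbf{R}^{k-1}$; $\Omega^{s} = \Omega^{k-1}$; $s = s + 1$;
  end if
until stopping criteria are met
• Step 3 (Obtain Channels) $\hat{\mathbf{D}} = \hat{\mathbf{D}}^{s-1}$ and obtain the estimation of channels $\{\hat{\mathbf{h}}_{m,t}\}_{m=1,t=r}^{m=M,t=r+R-1}$ according to (3.4)–(3.9).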
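The steps of Algorithm 3.1 can be sketched in NumPy as follows. This is one possible reading of the listing rather than the authors' implementation. The assumptions are: the stopping criteria are evaluated on each new iterate before the accept/reject branch, the residual norm of the previous sparsity level is stored as `res_last`, and an iteration cap `max_iter` is added as a safeguard. The function names, the toy dimensions, the noise level, and the threshold `p_th = 0.05` are all hypothetical.

```python
import numpy as np

def block_norms(X, L, M):
    """Frobenius norm of each of the L stacked (M x R) sub-matrices of X."""
    return np.linalg.norm(X.reshape(L, M, -1), axis=(1, 2))

def top_blocks(norms, s):
    """Indices of the s largest entries of `norms`, returned as a set."""
    return set(np.argsort(norms)[-s:].tolist())

def ls_on_support(Psi, Y, support, L, M):
    """Least-squares fit of Y using only the column blocks in `support`."""
    D = np.zeros((L * M, Y.shape[1]), dtype=complex)
    blocks = np.array(sorted(support), dtype=int)
    idx = (blocks[:, None] * M + np.arange(M)).reshape(-1)
    D[idx] = np.linalg.pinv(Psi[:, idx]) @ Y
    return D

def assp(Y, Psi, L, M, p_th, max_iter=500):
    R = Y.shape[1]
    s = 1
    support = set()                                  # accepted support
    D_acc = np.zeros((L * M, R), dtype=complex)      # accepted estimate
    res_acc = Y.copy()                               # accepted residual
    D_last, res_last = D_acc, np.inf                 # results of level s - 1
    for _ in range(max_iter):
        if s > L:
            break
        Z = Psi.conj().T @ res_acc                                 # 1. correlation
        merged = support | top_blocks(block_norms(Z, L, M), s)     # 2. support estimate
        D_tmp = ls_on_support(Psi, Y, merged, L, M)                # 3. support pruning
        pruned = top_blocks(block_norms(D_tmp, L, M), s)
        D_new = ls_on_support(Psi, Y, pruned, L, M)                # 4. matrix estimate
        res_new = Y - Psi @ D_new                                  # 5. residue update
        weakest = block_norms(D_new, L, M)[sorted(pruned)].min()
        if (np.linalg.norm(res_new) > res_last
                or weakest <= np.sqrt(M * R) * p_th):
            break                                                  # stopping criteria
        if np.linalg.norm(res_acc) > np.linalg.norm(res_new):
            support, D_acc, res_acc = pruned, D_new, res_new       # 7. same level
        else:
            D_last, res_last = D_acc, np.linalg.norm(res_acc)      # 8. next level
            s += 1
    return D_last                                    # estimate of level s - 1

# Toy experiment: M = 4 antennas, L = 32 taps, P = 3 common paths, R = 4 symbols.
rng = np.random.default_rng(0)
N, Np, L, M, R, P = 256, 64, 32, 4, 4, 3
xi = np.arange(0, N, N // Np)
pilots = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(M, Np)))
F_L_xi = np.exp(-2j * np.pi * np.outer(xi, np.arange(L)) / N)
Phi = np.hstack([pilots[m][:, None] * F_L_xi for m in range(M)])
Psi = Phi.reshape(Np, M, L).transpose(0, 2, 1).reshape(Np, L * M)

true_support = set(rng.choice(L, size=P, replace=False).tolist())
D_true = np.zeros((L * M, R), dtype=complex)
for l in true_support:
    D_true[l * M:(l + 1) * M] = (rng.standard_normal((M, R))
                                 + 1j * rng.standard_normal((M, R))) / np.sqrt(2.0)
noise = 0.01 * (rng.standard_normal((Np, R))
                + 1j * rng.standard_normal((Np, R))) / np.sqrt(2.0)
Y = Psi @ D_true + noise

D_hat = assp(Y, Psi, L, M, p_th=0.05)
est_support = set(np.flatnonzero(block_norms(D_hat, L, M) > 0).tolist())
nmse = np.linalg.norm(D_hat - D_true) ** 2 / np.linalg.norm(D_true) ** 2
```

In this toy setting the system has ML = 128 unknowns but only 64 pilot observations, so it is underdetermined. The sketch neither takes the sparsity level P as an input nor fixes it in advance: the level grows from 1 until a newly added block falls below the noise-floor threshold, at which point the estimate of the previous level is returned.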

Compared to the SP algorithm and the model-based SP algorithm [22], the proposed ASSP algorithm has the following distinctive features:
• The classical SP algorithm reconstructs one high-dimensional sparse vector from one low-dimensional measurement vector without exploiting the structured sparsity of the sparse vector. The model-based SP algorithm reconstructs one high-dimensional sparse vector from one low-dimensional measurement vector by exploiting the structured sparsity of the sparse vector for improved performance. In contrast, the proposed ASSP algorithm recovers the high-dimensional sparse matrix with inherently structured sparsity from the low-dimensional measurement matrix, whereby the inherently structured sparsity of the sparse matrix is exploited for improved matrix reconstruction performance.
• Both the classical SP algorithm and the model-based SP algorithm require the sparsity level as a priori information for reliable sparse signal reconstruction. In contrast, the proposed ASSP algorithm does not need this a priori information, since it can adaptively acquire the sparsity level of the structured sparse matrix. By exploiting the practical physical property of wireless channels, the proposed stopping criteria enable the ASSP algorithm to estimate channels with good MSE performance, which will be detailed in Sect. 3.4.2. Moreover, simulation results in Sect. 3.5 verify its accurate acquisition of the channel sparsity level. Hence, the conventional SP algorithm and the model-based SP algorithm can be considered as two special cases of the proposed ASSP algorithm.

It should be pointed out that most state-of-the-art CS-based CE schemes require the channel sparsity level as a priori information for reliable CE [10, 11, 19]. In contrast, the proposed ASSP algorithm removes this unrealistic assumption, since it can adaptively acquire the sparsity level of wireless MIMO channels.

Fig. 3.3 Space-time adaptive pilot scheme, where M = 128, N G = 2, f d = 4, and the adjacent antenna spacing λ/2 are considered as an example [32]. a 2-D antenna array at the BS; b space-time adaptive pilot scheme

3.3.3 Space-Time Adaptive Pilot Scheme

As we have demonstrated in Sect. 3.2, the spatial common sparsity of MIMO channels is due to the co-located antenna array at the BS. However, for massive MIMO with a large antenna array, such common sparsity may not be ensured for antennas spaced far apart. To address this problem, we propose that the $M$ TAs are divided into $N_G$ antenna groups, where $M_G = M/N_G$ antennas that are close to each other in the spatial domain are assigned to the same antenna group, so that the spatial common sparsity of wireless MIMO channels in each antenna group can be guaranteed. For example, we consider the $M = 128$ planar antenna array as shown in Fig. 3.3a, which can be divided into two antenna groups according to the criterion above. If we consider $f_c = 2$ GHz, $f_s = 10$ MHz, and the maximum distance for a pair of antennas in each antenna group as shown in Fig. 3.3a is $4\sqrt{2}\lambda$, their maximum difference of path delays from the common scatterer is $\frac{4\sqrt{2}\lambda}{c} = 4\sqrt{2}/f_c = 0.0028$ µs, which is negligible compared with the system sample period $T_s = 1/f_s = 0.1$ µs. For a certain antenna group, pilots of different TAs are non-orthogonal and occupy identical subcarriers, while pilots of different antenna groups are orthogonal in the time domain or frequency domain, as illustrated in Fig. 3.3b. For the specific parameter $N_G$, we should consider the geometry and scale of the antenna array at the BS, $f_c$, and $f_s$.

On the other hand, wireless MIMO channels exhibit temporal correlation. Such temporal channel correlation indicates that during the coherence time of path gains, channels in several successive OFDM symbols can be considered to be quasi-static, and the CE in one OFDM symbol can be used to estimate channels of several adjacent OFDM symbols. This motivates us to further reduce the pilot overhead and increase the available spectrum and energy resources for effective data transmission. To be specific, as illustrated in Fig. 3.3, every $f_d$ adjacent OFDM symbols share common pilots, where $f_d$ is determined by the coherence time of path gains or the mobility of served users. By exploiting such temporal channel correlation, we can use a large $f_d$ to reduce the pilot overhead.
To estimate channels of OFDM symbols without pilots, we can use interpolation algorithms according to the estimated channels of adjacent OFDM symbols with pilots, e.g., we can adopt the linear interpolation algorithm as follows

$$\hat{\mathbf{h}}_{m,r} = \left[(f_p + 1 - r)\hat{\mathbf{h}}_{m,1} + (r - 1)\hat{\mathbf{h}}_{m,f_p+1}\right] / f_p, \tag{3.11}$$

where $1 < r \le f_p$, $\hat{\mathbf{h}}_{m,1}$ and $\hat{\mathbf{h}}_{m,f_p+1}$ are the estimated channels of the first and $(f_p + 1)$th OFDM symbols, respectively, and $\hat{\mathbf{h}}_{m,r}$ is the interpolated CE of the $r$th OFDM symbol. The proposed space-time adaptive pilot scheme considers both the geometry of the antenna array at the BS and the mobility of served users, which can achieve reliable CE and further reduce the required pilot overhead. For the space-time adaptive pilot scheme, the proposed ASSP algorithm is used at the user to estimate channels associated with different TAs in each antenna group, where the received pilots associated with different antenna groups are processed separately. In Sect. 3.5, the simulation results will show that the proposed space-time adaptive pilot scheme can further reduce the required pilot overhead with a negligible performance loss, even for the high-speed scenario where the users' velocity is 60 km/h.
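The linear interpolation in (3.11) can be written as a short NumPy function. This is a minimal sketch: the function name `interpolate_cir` and the example values are hypothetical, and for convenience it also returns the two end points r = 1 and r = f_p + 1, which (3.11) reproduces exactly.

```python
import numpy as np

def interpolate_cir(h_first, h_next, f_p):
    """Channel estimates for OFDM symbols r = 1, ..., f_p + 1 following (3.11).

    h_first: estimated CIR of symbol 1 (pilot symbol), shape (L,)
    h_next:  estimated CIR of symbol f_p + 1 (next pilot symbol), shape (L,)
    Returns an array of shape (f_p + 1, L); row r - 1 holds symbol r.
    """
    r = np.arange(1, f_p + 2).reshape(-1, 1)
    return ((f_p + 1 - r) * h_first + (r - 1) * h_next) / f_p

h_1 = np.array([1.0 + 0.0j, 0.0, 2.0])
h_5 = np.array([3.0 + 0.0j, 4.0j, 2.0])
h_interp = interpolate_cir(h_1, h_5, f_p=4)
```

With f_p = 4, the third row of `h_interp` (symbol r = 3) is the plain average of the two pilot-based estimates, and the weights of the other rows shift linearly towards the nearer pilot symbol.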


3.3.4 CE in Multi-Cell Massive MIMO

In this subsection, we extend the proposed CE scheme from the single-cell scenario to the multi-cell scenario. We consider a cellular network composed of L = 7 hexagonal cells, each consisting of a central M-antenna BS and K single-antenna users that share the same bandwidth, where the users of the central target cell suffer from the interference of the surrounding L − 1 interfering cells. One straightforward solution to the pilot contamination from the interfering cells is the frequency-division multiplexing (FDM) scheme, i.e., pilots of adjacent cells are orthogonal in the frequency domain. The FDM scheme can perfectly mitigate the pilot contamination if the training time used for CE is less than the channel coherence time, but its pilot overhead in a multi-cell system is L times that in a single-cell system. An alternative solution is the time-division multiplexing (TDM) scheme [23], where pilots of adjacent cells are transmitted in different time slots. The pilot overhead of the TDM scheme in the multi-cell scenario is the same as that in the single-cell scenario. However, the downlink precoded data from adjacent cells may degrade the CE performance of users in the target cell. In Sect. 3.5, we will verify that the TDM scheme can be a viable approach to mitigate the pilot contamination in multi-cell FDD massive MIMO systems, owing to its clearly reduced pilot overhead and only slight performance loss compared to the FDM scheme.

3.4 Performance Analysis

In this section, we first provide the design of the proposed non-orthogonal pilot scheme for reliable CE under the framework of CS theory. Then we analyze the convergence and the computational complexity of the proposed ASSP algorithm.

3.4.1 Non-Orthogonal Pilot Design Under the Framework of CS Theory

In CS theory, the design of the sensing matrix $\boldsymbol{\Psi}$ in (3.9) is very important to effectively and reliably compress the high-dimensional sparse signal $\mathbf{D}$. For the problem of CE, the design of $\boldsymbol{\Psi}$ is converted to the design of the pilot placement $\xi$ and the pilot sequences $\{\mathbf{p}_m\}_{m=1}^{M}$, since the sensing matrix $\boldsymbol{\Psi}$ is only determined by the parameters $\xi$ and $\{\mathbf{p}_m\}_{m=1}^{M}$. According to CS theory, a small column correlation of $\boldsymbol{\Psi}$ is desired for reliable sparse signal recovery [20], which motivates us to appropriately design $\xi$ and $\{\mathbf{p}_m\}_{m=1}^{M}$.

For the specific pilot design, we commence by considering the design of $\{\mathbf{p}_m\}_{m=1}^{M}$ to achieve a small cross-correlation for columns of $\boldsymbol{\Psi}_l$ given any $l$, since this kind of cross-correlation is only determined by $\{\mathbf{p}_m\}_{m=1}^{M}$, i.e.,

$$(\boldsymbol{\psi}_{m_1,l})^{H} \boldsymbol{\psi}_{m_2,l} = (\boldsymbol{\Psi}_l^{(m_1)})^{H} \boldsymbol{\Psi}_l^{(m_2)} = (\boldsymbol{\Phi}_{m_1}^{(l)})^{H} \boldsymbol{\Phi}_{m_2}^{(l)} = (\mathbf{p}_{m_1} \circ \mathbf{F}_p^{(l)})^{H} (\mathbf{p}_{m_2} \circ \mathbf{F}_p^{(l)}) = (\mathbf{p}_{m_1})^{H} \mathbf{p}_{m_2}, \tag{3.12}$$

where $\mathbf{F}_p = \mathbf{F}_L|_{\xi}$ and $1 \le m_1 < m_2 \le M$.

To achieve a small $|(\boldsymbol{\psi}_{m_1,l})^{H} \boldsymbol{\psi}_{m_2,l}|$, we consider $\{\theta_{\kappa,m}\}_{\kappa=1,m=1}^{N_p,M}$ to follow the independent and identically distributed (i.i.d.) uniform distribution $U[0, 2\pi)$, where $e^{j\theta_{\kappa,m}}$ denotes the $\kappa$th element of $\mathbf{p}_m \in \mathbb{C}^{N_p \times 1}$. For the proposed pilot sequences, the $l_2$-norm of each column of $\boldsymbol{\Psi}$ is a constant, i.e., $\|\boldsymbol{\psi}_{m,l}\|_2 = \sqrt{N_p}$. Meanwhile, we have

$$\lim_{N_p \to \infty} \frac{|(\boldsymbol{\psi}_{m_1,l})^{H} \boldsymbol{\psi}_{m_2,l}|}{\|\boldsymbol{\psi}_{m_1,l}\|_2 \|\boldsymbol{\psi}_{m_2,l}\|_2} = \lim_{N_p \to \infty} \frac{|(\mathbf{p}_{m_1})^{H} \mathbf{p}_{m_2}|}{N_p} = 0, \tag{3.13}$$

which indicates that for the limited $N_p$ in practice, the proposed pilot sequences can achieve good cross-correlation properties for the columns of $\boldsymbol{\Psi}_l$ for any $l$ according to random matrix theory (RMT).

Given the proposed $\{\mathbf{p}_m\}_{m=1}^{M}$, we further investigate the cross-correlation of $\boldsymbol{\psi}_{m_1,l_1}$ and $\boldsymbol{\psi}_{m_2,l_2}$ with $l_1 \ne l_2$, which motivates us to design $\xi$ to achieve a small $|(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2}|$. In typical massive MIMO systems (e.g., $M \ge 64$), we usually have $N_p > L$, for the two following reasons. First, since the number of pilots for estimating the channel associated with one TA is at least one, the total pilot overhead $N_p$ can be at least 64. Second, since the maximum channel delay spread is 3 ∼ 5 µs and the typical system bandwidth is 10 MHz if we refer to the LTE-A system parameters, we have $L \le 64$ [24]. Based on the condition $N_p > L$, we propose to adopt the widely used uniformly-spaced pilots with the pilot interval $\lfloor N/N_p \rfloor$ to acquire a small $|(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2}|$. Specifically, we consider that $\xi$ is selected from the set $\{1, 2, \ldots, N\}$ with equal interval, and the inner product of $\boldsymbol{\psi}_{m_1,l_1}$ and $\boldsymbol{\psi}_{m_2,l_2}$ can be expressed as

$$\begin{aligned} (\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2} &= (\boldsymbol{\Phi}_{m_1}^{(l_1)})^{H} \boldsymbol{\Phi}_{m_2}^{(l_2)} = (\mathbf{p}_{m_1} \circ \mathbf{F}_p^{(l_1)})^{H} (\mathbf{p}_{m_2} \circ \mathbf{F}_p^{(l_2)}) \\ &= \sum_{\kappa=1}^{N_p} \left[\exp\left(j\frac{2\pi}{N} l_1 I(\kappa) + j\theta_{\kappa,m_1}\right)\right]^{H} \exp\left(j\frac{2\pi}{N} l_2 I(\kappa) + j\theta_{\kappa,m_2}\right) \\ &= \sum_{\kappa=1}^{N_p} \exp\left(j\frac{2\pi}{N} \tilde{l} I(\kappa) + j\Delta\theta_{\kappa,m}\right), \end{aligned} \tag{3.14}$$

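where $\{I(\kappa)\}_{\kappa=1}^{N_p} = \xi$ is the index set of pilot subcarriers, $1 \le \tilde{l} = l_2 - l_1 \le L - 1$, and $\Delta\theta_{\kappa,m} = \theta_{\kappa,m_2} - \theta_{\kappa,m_1}$. Furthermore, since $\{I(\kappa)\}_{\kappa=1}^{N_p}$ is selected from the set $\{1, 2, \ldots, N\}$ with the equal interval $\lfloor N/N_p \rfloor$, we have $I(\kappa) = I_0 + (\kappa - 1)\lfloor N/N_p \rfloor$ for $1 \le \kappa \le N_p$, where $I_0$ is the subcarrier index of the first pilot with $1 \le I_0 < \lfloor N/N_p \rfloor$. Hence, (3.14) can also be expressed as

$$(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2} = \sum_{\kappa=1}^{N_p} \exp\left(j\frac{2\pi}{N} \tilde{l} \left(I_0 + (\kappa - 1)\left\lfloor \frac{N}{N_p} \right\rfloor\right) + j\Delta\theta_{\kappa,m}\right). \tag{3.15}$$

Let $\varepsilon = \frac{N}{N_p} - \left\lfloor \frac{N}{N_p} \right\rfloor$ with $\varepsilon \in [0, 1)$; we can further obtain

$$(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2} = c_0 \sum_{\kappa=1}^{N_p} \exp\left(j\frac{2\pi}{N} \tilde{l} \kappa \left(\frac{N}{N_p} - \varepsilon\right) + j\Delta\theta_{\kappa,m}\right), \tag{3.16}$$

where $c_0 = \exp\left(j\frac{2\pi}{N} \tilde{l} \left(I_0 - \left\lfloor \frac{N}{N_p} \right\rfloor\right)\right)$. To investigate $|(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2}|$ with $l_1 \ne l_2$, we consider the following two cases.

For the first case, if $m_1 = m_2$, then $\Delta\theta_{\kappa,m} = 0$, and (3.16) can be simplified as

$$(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2} = c_0 \sum_{\kappa=1}^{N_p} \exp\left(j\frac{2\pi}{N_p} \tilde{l} \kappa (1 - \eta\varepsilon)\right), \tag{3.17}$$

where $\eta = \frac{N_p}{N} < 1$ denotes the pilot occupation ratio. Thus, $\eta\varepsilon \approx 0$, and we can obtain

$$\lim_{N_p \to \infty} \frac{|(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2}|}{N_p} = \lim_{N_p \to \infty} \frac{\left|c_0 \left(1 - e^{j2\pi \tilde{l}(1 - \eta\varepsilon)}\right)\right|}{N_p \left|1 - e^{j\frac{2\pi}{N_p} \tilde{l}(1 - \eta\varepsilon)}\right|} = 0, \tag{3.18}$$

where $e^{j\frac{2\pi}{N_p} \tilde{l}(1 - \eta\varepsilon)} \approx e^{j\frac{2\pi}{N_p} \tilde{l}} \ne 1$ guarantees the validity of (3.18) due to $1 \le \tilde{l} \le L - 1$ and $L < N_p$.

For the second case, if $m_1 \ne m_2$, then (3.16) can be expressed as

$$(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2} = \sum_{\kappa=1}^{N_p} \exp\left(j\tilde{\theta}_{\kappa}\right), \tag{3.19}$$

where $\tilde{\theta}_{\kappa} = \frac{2\pi}{N} \tilde{l} I(\kappa) + \Delta\theta_{\kappa,m}$ for $1 \le \kappa \le N_p$ follow the i.i.d. distribution $U[0, 2\pi)$. Similar to (3.13), we further have

$$\lim_{N_p \to \infty} \frac{|(\boldsymbol{\psi}_{m_1,l_1})^{H} \boldsymbol{\psi}_{m_2,l_2}|}{N_p} = \lim_{N_p \to \infty} \frac{\left|\sum_{\kappa=1}^{N_p} \exp\left(j\tilde{\theta}_{\kappa}\right)\right|}{N_p} = 0. \tag{3.20}$$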
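The chain (3.15) to (3.17) and the two cases above can be checked numerically. The following is a minimal sketch under stated assumptions: the values N = 4096, N_p = 390, L = 64, and I_0 = 1 are illustrative choices, not parameters fixed by the text at this point, and the function names are hypothetical. It evaluates the direct sum, the factored form with c_0, and the simplified form for the same antenna, then measures the normalized correlation in both cases.

```python
import numpy as np

N, Np, L = 4096, 390, 64          # assumed example values
step = N // Np                    # pilot interval floor(N / Np)
eps = N / Np - step               # epsilon in [0, 1)
eta = Np / N                      # pilot occupation ratio
I0 = 1                            # first pilot index, 1 <= I0 < step
kappa = np.arange(1, Np + 1)
I = I0 + (kappa - 1) * step       # uniformly spaced pilot indices I(kappa)

def direct_sum(l_tilde, dtheta):
    """Inner product written as in (3.15)."""
    return np.sum(np.exp(1j * (2 * np.pi / N * l_tilde * I + dtheta)))

def factored_sum(l_tilde, dtheta):
    """Same inner product written as in (3.16), with the factor c0 pulled out."""
    c0 = np.exp(1j * 2 * np.pi / N * l_tilde * (I0 - step))
    phase = 2 * np.pi / N * l_tilde * kappa * (N / Np - eps) + dtheta
    return c0 * np.sum(np.exp(1j * phase))

def simplified_sum(l_tilde):
    """Same-antenna case (dtheta = 0) written as in (3.17)."""
    c0 = np.exp(1j * 2 * np.pi / N * l_tilde * (I0 - step))
    return c0 * np.sum(np.exp(1j * 2 * np.pi / Np * l_tilde * kappa * (1 - eta * eps)))

# Case 1: same antenna, different taps (no random phase difference).
same_antenna = max(abs(direct_sum(l, 0.0)) / Np for l in range(1, L))

# Case 2: different antennas, i.i.d. uniform pilot phases.
rng = np.random.default_rng(3)
dtheta = rng.uniform(0, 2 * np.pi, Np) - rng.uniform(0, 2 * np.pi, Np)
diff_antenna = abs(direct_sum(5, dtheta)) / Np
```

For these values the same-antenna correlation stays at roughly ηε/(1 − ηε), about 0.05, and vanishes when N/N_p is an integer (ε = 0). The different-antenna correlation is on the order of 1/√N_p.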

According to RMT, the asymptotic orthogonality of (3.13), (3.18), and (3.20) indicates that the proposed $\xi$ and $\{\mathbf{p}_m\}_{m=1}^{M}$ can achieve good cross-correlation properties between any two columns of $\boldsymbol{\Psi}$ with the limited $N_p$ in practice. Moreover, compared with the conventional random pilot placement scheme widely used in CS-based CE schemes [8], the proposed uniformly-spaced pilot placement scheme can be more easily implemented in practical systems due to its regular pattern. It can also help massive MIMO remain backward compatible with current cellular networks, since the uniformly-spaced pilot placement scheme has been widely used in existing cellular networks [25]. Finally, its reliable sparse signal recovery performance can be verified through simulations in Sect. 3.5.
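As a complement to the asymptotic argument, the worst-case correlation over all column pairs of the sensing matrix can be computed for a finite N_p. The sketch below is illustrative only: the dimensions (N = 1024, N_p = 128, L = 32, M = 8), the 0-based pilot indices, and NumPy's DFT sign convention are assumptions, and N/N_p is chosen to be an integer so that ε = 0.

```python
import numpy as np

rng = np.random.default_rng(2)
N, Np, L, M = 1024, 128, 32, 8                      # assumed toy dimensions
xi = np.arange(0, N, N // Np)                       # uniformly spaced pilots
pilots = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(M, Np)))

F_L_xi = np.exp(-2j * np.pi * np.outer(xi, np.arange(L)) / N)
Phi = np.hstack([pilots[m][:, None] * F_L_xi for m in range(M)])
Psi = Phi.reshape(Np, M, L).transpose(0, 2, 1).reshape(Np, L * M)

# Normalized magnitudes of all pairwise column inner products.
col_norms = np.linalg.norm(Psi, axis=0)
gram = np.abs(Psi.conj().T @ Psi) / np.outer(col_norms, col_norms)
np.fill_diagonal(gram, 0.0)
coherence = gram.max()

# Same antenna, different taps: column (l, m) sits at index l * M + m.
same_antenna = max(
    gram[l1 * M + m, l2 * M + m]
    for m in range(M) for l1 in range(L) for l2 in range(l1 + 1, L)
)
```

Every column has norm √N_p, same-antenna columns are orthogonal up to rounding because ε = 0 and L < N_p, and the largest correlation over all pairs remains well below one even though the matrix has twice as many columns as rows.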

3.4.2 Convergence Analysis of Proposed ASSP Algorithm

For the proposed ASSP algorithm, we first provide the convergence with the correct sparsity level $s = P$. Then we provide the convergence for the case of $s \ne P$, where the proposed stopping criteria are also discussed. It should be pointed out that the conventional SP algorithm and the model-based SP algorithm analyze the convergence for the recovery of a single sparse vector. By contrast, we provide the convergence for the reconstruction of a structured sparse matrix.

The convergence for the case of $s = P$ can be guaranteed due to the following theorem.

Theorem 3.1 For $\mathbf{Y} = \boldsymbol{\Psi}\mathbf{D} + \mathbf{W}$ and the ASSP algorithm with the sparsity level $s = P$, we have

$$\|\mathbf{D} - \hat{\mathbf{D}}\|_F \le c_P \|\mathbf{W}\|_F, \tag{3.21}$$

$$\|\mathbf{R}^{k}\|_F < c'_P \|\mathbf{R}^{k-1}\|_F + c''_P \|\mathbf{W}\|_F, \tag{3.22}$$

where $\hat{\mathbf{D}}$ is the estimation of $\mathbf{D}$ with $s = P$, and $c_P$, $c'_P$, and $c''_P$ are constants.

Here $c_P$, $c'_P$, and $c''_P$ are determined by the structured restricted isometry property (SRIP) constants $\delta_P$, $\delta_{2P}$, and $\delta_{3P}$, which will be further detailed in Appendix A. The proof of Theorem 3.1 will be provided in Appendix A.

Moreover, we investigate the convergence of the case with $s \ne P$. We consider $\mathbf{D} = \mathbf{D}_s + (\mathbf{D} - \mathbf{D}_s)$, where the matrix $\mathbf{D}_s$ preserves the largest $s$ sub-matrices of $\{\mathbf{D}_l\}_{l=1}^{L}$ according to their F-norms and sets the other sub-matrices to $\mathbf{0}$. In this way, (3.9) can be further expressed as

$$\mathbf{Y} = \boldsymbol{\Psi}\mathbf{D}_s + \boldsymbol{\Psi}(\mathbf{D} - \mathbf{D}_s) + \mathbf{W} = \boldsymbol{\Psi}\mathbf{D}_s + \mathbf{W}', \tag{3.23}$$

where $\mathbf{W}' = \boldsymbol{\Psi}(\mathbf{D} - \mathbf{D}_s) + \mathbf{W}$. For the case of $s \ne P$, we may not reliably reconstruct the $P$-sparse signal $\mathbf{D}$ even if the $s$-sparse signal $\mathbf{D}_s$ is estimated. However, with the appropriate SRIP, Theorem 3.1 indicates that we can acquire a partially correct support set from the estimated $s$-sparse matrix, i.e., $\Omega_s \cap \Omega_T \ne \phi$, where $\Omega_s$ is the support set of the estimated $s$-sparse matrix, $\Omega_T$ is the true support set of $\mathbf{D}$, and $\phi$ denotes the null set. Hence $\Omega_s \cap \Omega_T \ne \phi$ can reduce the number of iterations for the convergence with the sparsity level $s + 1$, since the first iteration with the sparsity level $s + 1$ uses $\Omega_s$ as a priori information (Step 2.2 in Algorithm 3.1). It should be pointed out that the proof of Theorem 3.1 does not rely on the estimated support set with the last sparsity level.

Additionally, by exploiting the practical channel property, the proposed stopping criteria enable the ASSP algorithm to achieve good MSE performance, and we discuss the proposed stopping criteria as follows. The stopping criterion $\|\mathbf{R}^{k}\|_F > \|\mathbf{R}^{s-1}\|_F$ is clear, as it implies that the residue of the current sparsity level is larger than that of the last sparsity level, and stopping the iteration can help the algorithm to acquire good MSE performance. On the other hand, the stopping criterion $\|\hat{\mathbf{D}}_{\tilde{l}}\|_F \le \sqrt{M_G R}\, p_{\mathrm{th}}$ implies that the $\tilde{l}$th path is dominated by AWGN. That is to say, the channel sparsity level is overestimated, although the MSE performance with the current sparsity level is better than that with the last sparsity level. In fact, this improvement of MSE performance is due to "reconstructing" noise.

3.4.3 Computational Complexity of ASSP Algorithm

In each iteration of the proposed ASSP algorithm, the computational complexity mainly comes from the following operations, where the space-time adaptive pilot scheme with $M_G$ TAs in each antenna group is considered. For Step 2.1, the correlation operation has the complexity of $O(R L M_G N_p)$. For Step 2.2, both the support merger and $\Pi^{s}(\cdot)$ have the complexity of $O(L)$ [26, 27], while the norm operation has the complexity of $O(R L M_G)$. For Step 2.3, the Moore-Penrose matrix inversion operation has the complexity of $O(2N_p (M_G s)^2 + (M_G s)^3)$ [28], $\Pi^{s}(\cdot)$ has the complexity of $O(L)$, and the norm operation has the complexity of $O(R L M_G)$. For Step 2.4, the Moore-Penrose matrix inversion operation has the complexity of $O(2N_p (M_G s)^2 + (M_G s)^3)$. For Step 2.5, the residue update has the complexity of $O(R L M_G N_p)$.

To quantitatively compare the computational complexity of different operations, we consider the parameters used in Fig. 3.4 when the performance of the proposed ASSP algorithm approaches that of the oracle LS algorithm. In this case, the ratios of the complexity of the correlation operation, the support merger or $\Pi^{s}(\cdot)$ operation, the norm operation, and the residue update to that of the Moore-Penrose matrix inversion operation are $2.3 \times 10^{-2}$, $1.7 \times 10^{-6}$, $5.7 \times 10^{-5}$, and $2.3 \times 10^{-2}$, respectively. Therefore, the main computational complexity of the ASSP algorithm comes from the Moore-Penrose matrix inversion operation with the complexity of $O(2N_p (M_G s)^2 + (M_G s)^3)$.
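The order-of-magnitude comparison above can be reproduced with a few lines of arithmetic. This is a rough sketch: the parameter values R = 1, L = 64, M_G = 32, N_p = 390, and s = 6 are assumptions inferred from the simulation setup (they are not stated together in the text), the function name is hypothetical, and the big-O expressions are simply evaluated with unit constants.

```python
def assp_step_costs(R, L, M_G, N_p, s):
    """Evaluate the per-iteration complexity expressions with unit constants."""
    return {
        "correlation": R * L * M_G * N_p,
        "support_merger_or_selection": L,
        "norm": R * L * M_G,
        "residue_update": R * L * M_G * N_p,
        "pseudo_inverse": 2 * N_p * (M_G * s) ** 2 + (M_G * s) ** 3,
    }

# Assumed values, chosen to resemble the single-symbol setting of Fig. 3.4.
costs = assp_step_costs(R=1, L=64, M_G=32, N_p=390, s=6)
ratios = {
    name: cost / costs["pseudo_inverse"]
    for name, cost in costs.items()
    if name != "pseudo_inverse"
}
```

Under these assumed values the ratios come out at about 2.2 × 10^-2 for the correlation and the residue update, 1.8 × 10^-6 for the support operations, and 5.7 × 10^-5 for the norm, the same orders of magnitude as the figures quoted above. The pseudo-inverse therefore dominates the per-iteration cost.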


Fig. 3.4 MSE performance comparison of different CE algorithms against pilot overhead ratio and SNR [32]

3.5 Simulation Results

In this section, a simulation study is carried out to investigate the performance of the proposed CE scheme for FDD massive MIMO systems. To provide a benchmark for performance comparison, we consider the oracle LS algorithm, which assumes that the true channel support set is known at the user, and the oracle ASSP algorithm⁴, which assumes that the true channel sparsity level is known at the user. Moreover, to investigate the performance gain from exploiting the spatial common sparsity of CIRs, we provide the MSE performance of the adaptive subspace pursuit (ASP) algorithm, which is a special case of the proposed ASSP algorithm that does not leverage such spatial common sparsity of CIRs. The simulation system parameters were set as follows: the carrier frequency was $f_c = 2$ GHz, the system bandwidth was $f_s = 10$ MHz, the DFT size was $N = 4096$, and the length of the guard interval was $N_g = 64$, which could combat a maximum delay spread of 6.4 µs [24, 29]. We consider a $4 \times 16$ planar antenna array ($M = 64$), with $M_G = 32$ chosen to guarantee the spatial common sparsity of channels in each antenna group. The number of pilots used to estimate the channels for one antenna group is $N_p$, and the pilot overhead ratio is $\eta_p = (N_p M)/(N f_p M_G)$. The International Telecommunications Union Vehicular-A (ITU-VA) channel model with

⁴ The oracle ASSP algorithm is a special case of the proposed ASSP algorithm, where the initial channel sparsity level $s$ is set to the true channel sparsity level, Step 2.8 is not performed, the stopping criterion is $\|\mathbf{R}^{k-1}\|_F \le \|\mathbf{R}^{k}\|_F$, and $\mathbf{D} = \mathbf{D}^{k-1}$ in Step 3.


Fig. 3.5 Estimated channel sparsity level of the proposed ASSP algorithm against SNR and pilot overhead ratio [32]

P = 6 paths was adopted [24]. Finally, $p_{\mathrm{th}}$ was set as 0.1, 0.08, 0.06, 0.05, and 0.04 for SNR = 10 dB, 15 dB, 20 dB, 25 dB, and 30 dB, respectively. Figure 3.4 compares the MSE performance of the ASSP algorithm, the oracle ASSP algorithm, the ASP algorithm, and the oracle LS algorithm over the static ITU-VA channel. In the simulation, we only consider the CE for one OFDM symbol with $R = 1$ and $f_p = 1$. From Fig. 3.4, it can be observed that the ASP algorithm performs poorly. The proposed ASSP algorithm outperforms the ASP algorithm, since the spatial common sparsity of MIMO channels is leveraged for enhanced CE performance. Moreover, for $\eta_p \ge 19.04\%$, the ASSP algorithm and the oracle ASSP algorithm have similar MSE performance, and their performance approaches that of the oracle LS algorithm. This indicates that the proposed ASSP algorithm can reliably acquire the channel sparsity level and the support set for $\eta_p \ge 19.04\%$. Moreover, the low pilot overhead implies that the average pilot overhead to estimate the channel associated with one TA is $N_{p\_avg} = N_p/M_G = 12.18$, which approaches $2P = 12$, the minimum number of observations needed to reliably recover a $P$-sparse signal [30]. Therefore, the good sparse signal recovery performance of the proposed non-orthogonal pilot scheme and the near-optimal CE performance of the proposed ASSP algorithm are confirmed. From Fig. 3.4, we also observe that the ASSP algorithm outperforms the oracle ASSP algorithm for $\eta_p < 19.04\%$, and its performance is even better than the performance bound obtained by the oracle LS algorithm with $N_{p\_avg} < 2P$ at SNR = 10 dB. This is because the ASSP algorithm can adaptively acquire the effective channel sparsity level, denoted by $P_{\mathrm{eff}}$, which can be used instead of $P$ to achieve better CE performance. Taking $\eta_p = 17.09\%$ at SNR = 10 dB as an example, we find from Fig. 3.5 that $P_{\mathrm{eff}} = 5$ with high probability for the ASSP algorithm. Hence, the average


Fig. 3.6 MSE performance comparison of the proposed pilot placement scheme and the conventional random pilot placement scheme [32]

pilot overhead for each TA, $N_{p\_avg} = N_p/M_G = 10.9$, is still larger than $2P_{\mathrm{eff}} = 10$. From the analysis above, we conclude that, when $N_p$ is insufficient to estimate channels with $P$ paths, the ASSP algorithm will estimate sparse channels with $P_{\mathrm{eff}} < P$, where the path gains accounting for the majority of the channel energy are estimated, while those with small energy are discarded as noise. It should be pointed out that the MSE performance fluctuation of the ASSP algorithm at SNR = 10 dB is caused by the fact that $P_{\mathrm{eff}}$ increases from 5 to 6 when $\eta_p$ increases, which causes some strong noise to be estimated as channel paths and thus degrades the MSE performance. Figure 3.5 depicts the estimated channel sparsity level of the proposed ASSP algorithm against SNR and pilot overhead ratio, where the vertical axis and the horizontal axis represent the used pilot overhead ratio and the adaptively estimated channel sparsity level, respectively, and the chroma denotes the probability of the estimated channel sparsity level. In the simulation, we consider $R = 1$ and $f_p = 1$ without exploiting the temporal channel correlation. Clearly, the proposed ASSP algorithm can acquire the true channel sparsity level with high probability when the SNR and the pilot overhead ratio increase. Moreover, even when the number of pilots is insufficient to guarantee the reliable recovery of sparse channels, the proposed ASSP algorithm can still acquire the channel sparsity level with only a slight deviation from the true channel sparsity level. Figure 3.6 compares the MSE performance of the proposed pilot placement scheme and the conventional random pilot placement scheme [8], where the proposed ASSP algorithm and the oracle LS algorithm are used. In the simulation, we


Fig. 3.7 MSE performance comparison of the ASSP algorithm with different R’s over time-varying ITU-VA channel with the mobile speed of 60 km/h [32]

consider $R = 1$, $f_p = 1$, and $\eta_p = 19.53\%$. Clearly, the proposed pilot placement scheme and the conventional random pilot placement scheme have very similar performance. Owing to its regular structure, the proposed uniformly-spaced pilot placement scheme can be implemented more easily in practical systems. Moreover, the uniformly-spaced pilot placement scheme has been used in LTE-A systems, which helps massive MIMO remain backward compatible with current cellular networks [25]. Figure 3.7 provides the MSE performance comparison of the proposed ASSP algorithm with ($R = 4$) and without ($R = 1$) exploiting the temporal common support of wireless channels, where the time-varying ITU-VA channel with the user's mobile speed of 60 km/h is considered. In the simulation, $f_p = 1$, and $R = 1$ or 4 denotes the joint processing of the received pilot signals in $R$ successive OFDM symbols. It can be observed that the CE exploiting the temporal channel correlation performs better than that without considering this channel property, especially at low SNR, since more measurements can be used for improved CE performance. Additionally, by jointly estimating the MIMO channels associated with multiple OFDM symbols, we can further reduce the required computational complexity. To be specific, the main computational burden comes from the Moore-Penrose matrix inversion operation as discussed in Sect. 3.4.3, and the joint processing of received pilot signals in $R$ OFDM symbols can share the Moore-Penrose matrix inversion operation, which


Fig. 3.8 MSE performance comparison of the ASSP algorithm with different $f_d$'s over time-varying ITU-VA channel with the mobile speed of 60 km/h [32]

indicates that the complexity can be reduced to $1/R$ of the complexity without using the temporal channel correlation. Figure 3.8 investigates the performance of the proposed space-time adaptive pilot scheme with different $f_p$'s in practical massive MIMO systems, where $R = 1$, the time-varying ITU-VA channel with the user's mobile speed of 60 km/h is considered, and the pilot overhead ratios with different $f_p$'s are provided. In the simulation, $f_d = 1$ and $f_d = 5$ are considered, and the linear interpolation algorithm is used to estimate channels for OFDM symbols without pilots. From Fig. 3.8, it can be observed that the case with $f_d = 5$ only suffers from a negligible performance loss compared to that with $f_d = 1$ at SNR = 30 dB. For SNR ≤ 20 dB, by contrast, the case with $f_d = 5$ is better than that with $f_d = 1$, since the linear interpolation can reduce the effective noise. By exploiting the temporal channel correlation, the proposed space-time adaptive pilot scheme can substantially reduce the required pilot overhead for CE without obvious performance loss. Figure 3.9 provides the MSE performance comparison of several CE schemes for FDD massive MIMO systems, where we consider the CE for one OFDM symbol with $R = 1$ and $f_p = 1$. The Cramer-Rao lower bound (CRLB) of conventional linear CE schemes (e.g., MMSE algorithm and LS algorithm) is also plotted as the performance benchmark, where $\mathrm{CRLB} = 1/\mathrm{SNR}$ [10]. The ASP algorithm does not


Fig. 3.9 MSE performance comparison of different CE schemes for FDD massive MIMO system [32]

perform well due to the insufficient pilots. The TFJ training based scheme [10] works poorly since the mutual interferences of the time-domain TS of different TAs degrade the CE performance when $M$ is large. Both the MMSE algorithm [31] and the proposed ASSP algorithm achieve a 9 dB gain over the scheme proposed in [10], and both of them approach the CRLB of conventional linear algorithms. It is worth mentioning that the proposed scheme enjoys a significantly reduced pilot overhead compared with the MMSE algorithm, since the MMSE algorithm works well only when (3.8) is well-determined or over-determined. Finally, since the proposed ASSP algorithm can adaptively acquire the channel sparsity level and discard the MPCs buried by the noise at low SNR for improved CE, the proposed scheme even works better than the oracle ASSP algorithm at low SNR. Figures 3.10 and 3.11 compare the downlink BER performance and the average achievable throughput per user, respectively, where the BS using zero-forcing (ZF) precoding is assumed to know the estimated downlink channels. In the simulations, the BS with $M = 64$ antennas simultaneously serves $K = 8$ users using 16-quadrature amplitude modulation (QAM), and the ZF precoding is based on the estimated channels corresponding to Fig. 3.9 under the same setup. It can be observed that the proposed CE scheme outperforms its counterparts. Figure 3.12 compares the average achievable throughput per user of different pilot decontamination schemes. In the simulations, we consider a multi-cell massive MIMO system with $L = 7$ cells sharing the same bandwidth, $M = 64$, and $K = 8$, where the average achievable throughput per user in the central target cell suffering from the


Fig. 3.10 BER performance comparison of different CE schemes for FDD massive MIMO systems [32]

Fig. 3.11 Comparison of average achievable throughput per user of different CE schemes for FDD massive MIMO systems [32]


Fig. 3.12 Comparison of average achievable throughput per user of different pilot decontamination schemes for multi-cell FDD massive MIMO systems [32]

pilot contamination is investigated. Moreover, we consider $R = 1$ and $f_d = 7$; the path loss factor is 3.8 dB/km, the cell radius is 1 km, the distance $D$ between the BS and its users ranges from 100 m to 1 km, the SNR for the cell-edge user is 10 dB (the power of the unprecoded signal from the BS is considered in the SNR), and the mobile speed of users is 3 km/h. The BSs using ZF precoding are assumed to know the estimated downlink channels obtained by the proposed ASSP algorithm. For the FDM scheme, the pilots of the $L = 7$ cells are orthogonal in the frequency domain. The optimal performance is achieved by the FDM scheme when the users are static. In the TDM scheme, the pilots of the $L = 7$ cells are transmitted in $L = 7$ successive different time slots, and the CE of users in the central target cell suffers from the precoded downlink data transmission of other cells, where two cases are considered. The "cell-edge" case indicates that, when users in the central target cell estimate the channels, the precoded downlink data transmission in other cells can guarantee SNR = 10 dB for their cell-edge users. The "ergodic" case indicates that, when users in the central target cell estimate the channels, the precoded downlink data transmission in other cells can guarantee SNR = 10 dB for their users with the ergodic distance $D$ from 100 m to 1 km. The negligible performance gap between the FDM scheme and the optimal one is due to the variation of time-varying channels, but the FDM scheme suffers from high pilot overhead. The TDM scheme with the "cell-edge" case performs worst, while the performance of the TDM scheme with the "ergodic" case approaches that of the optimal one. The simulation results in Fig. 3.12 indicate that the TDM scheme with low pilot overhead can achieve good performance when dealing with the pilot contamination in


multi-cell FDD massive MIMO systems. Moreover, if some appropriate scheduling strategies are considered [23], the performance of the TDM scheme can be further improved.
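The pilot-overhead arithmetic used in the discussion of Fig. 3.4 can be reproduced with a short sketch based on $\eta_p = (N_p M)/(N f_p M_G)$ and $N_{p\_avg} = N_p/M_G$. The values $N_p = 350$ and $N_p = 390$ below are assumptions back-solved from the quoted ratios of about 17.09% and 19.04%; they are not given in the text, and the function names are hypothetical.

```python
def pilot_overhead_ratio(N_p, M, N, f_p, M_G):
    """eta_p = (N_p * M) / (N * f_p * M_G)."""
    return (N_p * M) / (N * f_p * M_G)


def avg_pilots_per_antenna(N_p, M_G):
    """Average number of pilots available per transmit antenna."""
    return N_p / M_G


# Setup from Sect. 3.5: M = 64, M_G = 32, N = 4096, f_p = 1.
M, M_G, N, f_p = 64, 32, 4096, 1

# (assumed N_p, sparsity level used in the 2P comparison)
for N_p, P_used in [(350, 5), (390, 6)]:
    eta = pilot_overhead_ratio(N_p, M, N, f_p, M_G)
    avg = avg_pilots_per_antenna(N_p, M_G)
    print(f"N_p={N_p}: eta_p={100 * eta:.2f}%, N_p_avg={avg:.2f}, 2P={2 * P_used}")
```

With the smaller pilot budget, the per-antenna average falls below $2P = 12$ but stays above $2P_{\mathrm{eff}} = 10$, which matches the explanation of why the algorithm settles on five paths.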

3.6 Summary

In this chapter, we propose the SCS-based spatio-temporal joint CE scheme for FDD massive MIMO systems, whereby the intrinsically spatio-temporal common sparsity of wireless MIMO channels is exploited to reduce the pilot overhead. First, the non-orthogonal pilot scheme at the BS and the ASSP algorithm at the user can reliably estimate channels with significantly reduced pilot overhead. Then, the space-time adaptive pilot scheme can further reduce the required pilot overhead according to the mobility of users. Moreover, we discuss the proposed CE scheme in the multi-cell scenario. Additionally, we discuss the non-orthogonal pilot design to achieve reliable CE under the framework of CS theory, and the convergence analysis as well as the complexity analysis of the proposed ASSP algorithm are also provided. Simulation results show that the proposed CE scheme can achieve much better CE performance than its counterparts with substantially reduced pilot overhead, and it only suffers from a negligible performance loss when compared with the performance bound.

References

1. Larsson, E.G., Edfors, O., Tufvesson, F., Marzetta, T.L.: Massive MIMO for next generation wireless systems. IEEE Commun. Mag. 52(2), 186–195 (2014)
2. Lu, L., Li, G.Y., Swindlehurst, A.L., Ashikhmin, A., Zhang, R.: An overview of massive MIMO: benefits and challenges. IEEE J. Sel. Top. Signal Process. 8(5), 742–758 (2014)
3. Rusek, F., Persson, D., Lau, B.K., Larsson, E.G., Marzetta, T.L., Edfors, O., Tufvesson, F.: Scaling up MIMO: opportunities and challenges with very large arrays. IEEE Signal Process. Mag. 30(1), 40–60 (2012)
4. Zhang, J., Zhang, B., Chen, S., Mu, X., El-Hajjar, M., Hanzo, L.: Pilot contamination elimination for large-scale multiple-antenna aided OFDM systems. IEEE J. Sel. Top. Signal Process. 8(5), 759–772 (2014)
5. Björnson, E., Hoydis, J., Kountouris, M., Debbah, M.: Massive MIMO systems with non-ideal hardware: energy efficiency, estimation, and capacity limits. IEEE Trans. Inf. Theor. 60(11), 7112–7139 (2014)
6. Cho, Y.S., Kim, J., Yang, W.Y., Kang, C.G.: MIMO-OFDM wireless communications with MATLAB. Wiley (2010)
7. Xu, Y., Yue, G., Mao, S.: User grouping for massive MIMO in FDD systems: new design methods and analysis. IEEE Access 2, 947–959 (2014)
8. Bajwa, W.U., Haupt, J., Sayeed, A.M., Nowak, R.: Compressed channel sensing: a new approach to estimating sparse multipath channels. Proc. IEEE 98(6), 1058–1076 (2010)
9. Berger, C.R., Wang, Z., Huang, J., Zhou, S.: Application of compressive sensing to sparse channel estimation. IEEE Commun. Mag. 48(11), 164–174 (2010)
10. Dai, L., Wang, Z., Yang, Z.: Spectrally efficient time-frequency training OFDM for mobile large-scale MIMO systems. IEEE J. Sel. Areas Commun. 31(2), 251–263 (2013)
11. Gui, G., Adachi, F.: Stable adaptive sparse filtering algorithms for estimating multiple-input multiple-output channels. IET Commun. 8(7), 1032–1040 (2014)
12. Hu, D., Wang, X., He, L.: A new sparse channel estimation and tracking method for time-varying OFDM systems. IEEE Trans. Vehic. Technol. 62(9), 4648–4653 (2013)
13. Berger, C.R., Zhou, S., Preisig, J.C., Willett, P.: Sparse channel estimation for multicarrier underwater acoustic communication: from subspace methods to compressed sensing. IEEE Trans. Signal Process. 58(3), 1708–1721 (2009)
14. Gao, Z., Dai, L., Yuen, C., Wang, Z.: Asymptotic orthogonality analysis of time-domain sparse massive MIMO channels. IEEE Commun. Lett. 19(10), 1826–1829 (2015)
15. Gao, Z., Zhang, C., Wang, Z., Chen, S.: Priori-information aided iterative hard threshold: a low-complexity high-accuracy compressive sensing based channel estimation for TDS-OFDM. IEEE Trans. Wirel. Commun. 14(1), 242–251 (2014)
16. Santos, T., Karedal, J., Almers, P., Tufvesson, F., Molisch, A.F.: Modeling the ultra-wideband outdoor channel: measurements and parameter extraction method. IEEE Trans. Wirel. Commun. 9(1), 282–290 (2010)
17. Telatar, I.E., Tse, D.N.C.: Capacity and mutual information of wideband multipath fading channels. IEEE Trans. Inf. Theor. 46(4), 1384–1400 (2000)
18. Wan, F., Zhu, W.P., Swamy, M.: Semi-blind most significant tap detection for sparse channel estimation of OFDM systems. IEEE Trans. Circ. Syst. I Reg. Pap. 57(3), 703–713 (2009)
19. Qi, C., Wu, L.: Uplink channel estimation for massive MIMO systems exploring joint channel sparsity. Electron. Lett. 50(23), 1770–1772 (2014)
20. Duarte, M.F., Eldar, Y.C.: Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011)
21. Dai, W., Milenkovic, O.: Subspace pursuit for compressive sensing signal reconstruction. IEEE Trans. Inf. Theor. 55(5), 2230–2249 (2009)
22. Baraniuk, R.G., Cevher, V., Duarte, M.F., Hegde, C.: Model-based compressive sensing. IEEE Trans. Inf. Theor. 56(4), 1982–2001 (2010)
23. Fernandes, F., Ashikhmin, A., Marzetta, T.L.: Inter-cell interference in noncooperative TDD large scale antenna systems. IEEE J. Sel. Areas Commun. 31(2), 192–201 (2013)
24. Technical specification group radio access network; evolved universal terrestrial radio access (E-UTRA); physical channels and modulation. Release 13, TS 36.211 V13, 2 (2016)
25. Nam, Y.H., Akimoto, Y., Kim, Y., Lee, M.i., Bhattad, K., Ekpenyong, A.: Evolution of reference signals for LTE-advanced systems. IEEE Commun. Mag. 50(2), 132–138 (2012)
26. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. MIT Press (2022)
27. Gao, X., Dai, L., Hu, Y., Zhang, Y., Wang, Z.: Low-complexity signal detection for large-scale MIMO in optical wireless communications. IEEE J. Sel. Areas Commun. 33(9), 1903–1912 (2015)
28. Björck, Å., et al.: Numerical Methods in Matrix Computations, vol. 59. Springer (2015)
29. Dai, L., Gao, X., Su, X., Han, S., Chih-Lin, I., Wang, Z.: Low-complexity soft-output signal detection based on Gauss-Seidel method for uplink multiuser large-scale MIMO systems. IEEE Trans. Vehic. Technol. 64(10), 4839–4845 (2014)
30. Donoho, D.L., Elad, M.: Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization. Proc. Nat. Acad. Sci. 100(5), 2197–2202 (2003)
31. Barhumi, I., Leus, G., Moonen, M.: Optimal training design for MIMO OFDM systems in mobile wireless channels. IEEE Trans. Signal Process. 51(6), 1615–1624 (2003)
32. Gao, Z., Dai, L., Dai, W., Shim, B., Wang, Z.: Structured compressive sensing-based spatio-temporal joint channel estimation for FDD massive MIMO. IEEE Trans. Commun. 64(2), 601–617 (2016)

Chapter 4

Compressive Sensing CSI Acquisition and Feedback in FDD Massive MIMO Systems

Abstract This chapter presents an adaptive channel estimation (CE) and feedback scheme for FDD based massive MIMO systems. This scheme can adaptively adjust the training overhead and pilot design, aiming at accurately estimating and feeding back the downlink CSI with significantly reduced overhead. In particular, a compressive sensing based adaptive CSI acquisition scheme is introduced by exploiting the spatially common sparsity of massive MIMO channels, where the time slot overhead is adaptively controlled relying on the sparsity level of the channels. Furthermore, a distributed sparsity adaptive matching pursuit (DSAMP) algorithm is developed to jointly estimate the channels of multiple subcarriers. Then, a closed-loop channel tracking scheme based on the temporal channel correlation is provided to adaptively design the non-orthogonal pilot and enhance the CE. Besides, this chapter also provides the performance analysis of the considered scheme as theoretical support and guidance. Finally, simulation results indicate that the considered scheme outperforms its counterparts and can approach the performance bound, highlighting its effectiveness.¹

¹ The work introduced in this chapter is based on the reference [28].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_4

4.1 Introduction

By exploiting the increased degree of freedom in the spatial domain, massive multi-input multi-output (MIMO) can enhance the spectrum efficiency and energy efficiency by orders of magnitude [1, 2]. To harvest the benefits of massive MIMO, the BS needs accurate CSI in the downlink for beamforming, resource allocation, and other operations. However, it is challenging for the BS to acquire accurate downlink CSI in FDD based massive MIMO, since the overhead used for the downlink CE and feedback can be prohibitively high. Most studies sidestep this challenge by assuming the TDD protocol. In TDD based massive MIMO, the CSI in the UL can be more easily acquired at the BS due to the limited number of users, and the channel reciprocity property can be exploited to realize the downlink CE using the UL CE [1–4]. However, in TDD massive MIMO, the CSI acquired in the UL may not always be accurate for the downlink due to the calibration error of RF chains [5]. In addition, the FDD protocol still dominates current wireless networks, where the downlink CE is necessary, since the channel reciprocity does not hold. Thus, it is of great importance to explore an efficient approach to enable massive MIMO to be backward compatible with current wireless networks [6].

CEs in small-scale MIMO are usually based on orthogonal pilots [7–10]. In LTE-A, for example, pilots associated with different BS antennas occupy different frequency-domain subcarriers [9]. Pilot signals can also be orthogonal in the time or code domain. However, the overhead of orthogonal pilots increases with the number of BS antennas, which becomes unaffordable for massive MIMO. Besides, CS based channel feedback schemes for massive MIMO were proposed to reduce the feedback overhead by exploiting the spatial correlation of CSI [11, 12], but these schemes do not consider downlink CE.

Recent studies and experiments have shown that the wireless channels between the BS and users exhibit a small angle spread seen from the BS [4, 13, 14]. Due to the small angle spread and the large dimension of the channels, massive MIMO channels exhibit sparsity in the virtual angular domain [15]. Moreover, since the spatial propagation characteristics of the wireless channels within the system bandwidth are nearly unchanged, such sparsity is shared by the subchannels of different subcarriers when the widely used OFDM is considered. This phenomenon is referred to as the spatially common sparsity within the system bandwidth [16]. Besides, due to the temporal correlation of the channels [16], massive MIMO channels are quasi-static in several adjacent time slots or one time block consisting of multiple time slots. Moreover, the support set of the sparse channels in the virtual angular domain is almost unchanged in multiple time blocks, which is referred to as the spatially common sparsity during multiple time blocks.

By exploiting the spatially common sparsity and the temporal correlation of massive MIMO channels, this chapter proposes an adaptive CE and feedback scheme with low overhead. The proposed scheme consists of two stages: a CS based adaptive CSI acquisition with adaptive training overhead and a follow-up closed-loop channel tracking with adaptive pilot design. Specifically, the BS transmits the proposed non-orthogonal pilot. The users simply feed back the received non-orthogonal pilot signals to the BS. According to the feedback signals, the CS based adaptive CSI acquisition scheme acquires the downlink CSI at the BS with adaptive training time slot overhead. For this stage, a DSAMP algorithm is proposed to acquire the CSI, whereby the spatially common sparsity of massive MIMO channels within the system bandwidth is exploited. By exploiting the spatially common sparsity of massive MIMO channels during multiple time blocks, the closed-loop channel tracking scheme is proposed to track the channels in the second stage. For this stage, the BS can adaptively adjust the pilot signals according to the previously acquired CSI, and a simple LS algorithm is used to estimate the channels with improved performance. Additionally, we generalize the results for the conventional multiple-measurement-vectors (MMV) to the generalized MMV (GMMV) and provide the CRLB of the proposed scheme, which guides the design of the non-orthogonal pilot signals. Simulation results verify that the proposed scheme is superior to its counterparts, and it is capable of approaching the performance bound.


Notation: Scalar variables are denoted by normal-face letters, while boldface lower-case and upper-case symbols denote column vectors and matrices, respectively, and $j = \sqrt{-1}$ is the imaginary unit. The Moore-Penrose inversion, transpose and conjugate transpose operators are given by $(\cdot)^{\dagger}$, $(\cdot)^{T}$ and $(\cdot)^{*}$, respectively, while $\lceil \cdot \rceil$ is the integer ceiling operator and $(\cdot)^{-1}$ is the inverse operator. The $\ell_0$-norm and $\ell_2$-norm are given by $\|\cdot\|_0$ and $\|\cdot\|_2$, respectively, and $|\Omega|$ is the cardinality of the set $\Omega$. The support set of the vector $\mathbf{a}$ is denoted by $\mathrm{supp}\{\mathbf{a}\}$, $[\mathbf{a}]_i$ denotes the $i$th entry of the vector $\mathbf{a}$, and $[\mathbf{A}]_{i,j}$ denotes the $i$th-row and $j$th-column element of the matrix $\mathbf{A}$, while $\mathbf{I}_K$ is the $K \times K$ identity matrix. The rank of $\mathbf{A}$ is denoted by $\mathrm{rank}\{\mathbf{A}\}$ and $\mathrm{Tr}\{\cdot\}$ is the matrix trace operator, while $E\{\cdot\}$ is the expectation operator and $\mathrm{var}\{\cdot\}$ is the variance of a random variable. Finally, $(\mathbf{a})_{\Omega}$ denotes the entries of $\mathbf{a}$ whose indices are defined by $\Omega$, while $(\mathbf{A})_{\Omega}$ denotes a sub-matrix of $\mathbf{A}$ with column indices defined by $\Omega$.

4.2 System Model

4.2.1 Massive MIMO in the Downlink

In a typical massive MIMO system, the BS employing $M$ antennas simultaneously serves $K$ single-antenna users [2], where $M \gg K$. For the subchannel at the $n$th subcarrier, where $1 \le n \le N$ and $N$ is the size of the OFDM symbol, the received signal $y_{k,n}$ of the $k$th user can be expressed as

$$y_{k,n} = \mathbf{h}_{k,n}^{T} \mathbf{x}_n + w_{k,n}, \qquad (4.1)$$

where $\mathbf{h}_{k,n} \in \mathbb{C}^{M \times 1}$ denotes the downlink channel between the $k$th user and the $M$ antennas at the BS, $\mathbf{x}_n \in \mathbb{C}^{M \times 1}$ is the transmitted signal after precoding, and $w_{k,n}$ is the associated AWGN. The received signals of the $K$ users, $\mathbf{y}_n = [y_{1,n}\ y_{2,n}\ \cdots\ y_{K,n}]^{T} \in \mathbb{C}^{K \times 1}$, can be collected together as

$$\mathbf{y}_n = \mathbf{H}_n \mathbf{x}_n + \mathbf{w}_n, \qquad (4.2)$$

in which $\mathbf{H}_n = [\mathbf{h}_{1,n}\ \mathbf{h}_{2,n}\ \cdots\ \mathbf{h}_{K,n}]^{T} \in \mathbb{C}^{K \times M}$ is the downlink channel matrix, and $\mathbf{w}_n = [w_{1,n}\ w_{2,n}\ \cdots\ w_{K,n}]^{T} \in \mathbb{C}^{K \times 1}$ is the corresponding AWGN vector.
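The per-subcarrier model in (4.1) and (4.2) can be made concrete with a small NumPy sketch. It is only an illustration: the i.i.d. complex Gaussian channel, the random transmit vector, the noise variance, and all variable names are assumptions chosen for brevity and are not the sparse channel model used in the rest of the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 64, 8            # BS antennas and single-antenna users, with M >> K
noise_var = 0.01        # assumed AWGN variance


def crandn(*shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


H = crandn(K, M)                       # row k holds h_{k,n}^T
x = crandn(M, 1)                       # precoded transmit vector x_n
w = np.sqrt(noise_var) * crandn(K, 1)  # AWGN vector w_n

y = H @ x + w                          # stacked model (4.2)

# Per-user model (4.1) for one user, here k = 3 (zero-based index).
k = 3
y_k = H[k] @ x[:, 0] + w[k, 0]
```

Stacking the channel vectors as rows of $\mathbf{H}_n$ is what lets the $K$ scalar equations (4.1) collapse into the single matrix product in (4.2).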

4.2.2 Massive MIMO Channels in Virtual Angular Domain

We model the channel vector $\mathbf{h}_{k,n}$ by using the virtual angular domain representation [15, 16]

$$y_n = \mathbf{h}_n^{T} \mathbf{x}_n + w_n = \tilde{\mathbf{h}}_n^{T} \mathbf{A}_B^{*} \mathbf{x}_n + w_n, \qquad (4.3)$$


Fig. 4.1 Channel vector representation in the virtual angular domain, where the BS employs the ULA with half wave-length spacing, M = 8, and two clusters of scatterers are considered as an example [28]

where the user index $k$ in $y_{k,n}$, $\mathbf{h}_{k,n}$ and $w_{k,n}$ is dropped to simplify the notation, while $\mathbf{h}_n^{T} = \tilde{\mathbf{h}}_n^{T} \mathbf{A}_B^{*}$ and $\mathbf{A}_B \in \mathbb{C}^{M \times M}$ is the unitary matrix representing the transformation matrix of the virtual angular domain at the BS side. $\mathbf{A}_B$ is determined by the geometrical structure of the BS's antenna array. To intuitively explain the channel vector $\tilde{\mathbf{h}}_n$, a simple example is illustrated in Fig. 4.1, where the BS employs the ULA with the antenna spacing of $d = \lambda/2$ and $\lambda$ is the wavelength. In this case, $\mathbf{A}_B$ becomes the DFT matrix [16]. The channel vector in the virtual angular domain then simply means to 'sample' the channel in the angular domain at equi-spaced angular intervals at the BS side, or equivalently to represent the channel in the virtual angular domain coordinates. More specifically, the $m$th element of $\tilde{\mathbf{h}}_n$ is the channel gain consisting of the aggregation of all the paths whose transmit/receive directions are within an angular window around the $m$th angular coordinate [16]. As the BS is usually elevated high with few scatterers around, while users are located at low elevation with relatively rich local scatterers, the angle spread at the BS side is small [4, 13, 14]. Since the angle spread is limited at the BS, a small part of the elements in $\tilde{\mathbf{h}}_n$ contain almost all the multipath signals reflected, diffracted, or refracted by scatterers around the user. If we take the typical angular-domain spread of 10° and the ULA with $M = 128$ as an example [4], the uniform virtual angular domain sampling interval is $\varphi_s = 180°/M = 1.406°$ [15], and the vast majority of the channel energy is concentrated on around $8 = \lceil 10°/1.406° \rceil$ virtual angular domain coordinates, which is far smaller than the total dimension $M = 128$ of the channel vector. Consequently, $\tilde{\mathbf{h}}_n$ exhibits sparsity [16], namely,

$$|\Omega_n| = \left|\mathrm{supp}\{\tilde{\mathbf{h}}_n\}\right| = S_a \ll M, \qquad (4.4)$$

where $\Omega_n$ is the support set, and $S_a$ is the sparsity level. Moreover, since the spatial propagation characteristics of the channels within the system bandwidth (e.g. 10 MHz in typical LTE-A systems) are almost unchanged, the subchannels associated with different subcarriers share very similar scatterers in the propagation environment [16]. Hence the small angle spreads of the subchannels within the system bandwidth are very similar. Consequently, $\{\tilde{\mathbf{h}}_n\}_{n=1}^{N}$ have the common sparsity, namely,

$$\mathrm{supp}\{\tilde{\mathbf{h}}_1\} = \mathrm{supp}\{\tilde{\mathbf{h}}_2\} = \cdots = \mathrm{supp}\{\tilde{\mathbf{h}}_N\} = \Omega, \qquad (4.5)$$

which is illustrated in Fig. 4.2.

Fig. 4.2 The virtual angular-domain channel vectors within the system bandwidth exhibit the common sparsity [28]
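The sparsity argument above can be illustrated numerically. The sketch below builds a hypothetical ULA channel from 20 paths whose angles lie inside a 10° window, transforms it with a unitary DFT matrix playing the role of $\mathbf{A}_B$, and measures how much energy the strongest coordinates capture. The path count, the angle window of 20° to 30°, the random gains, and the choice of the 32 strongest coordinates are all assumptions made for illustration.

```python
import numpy as np

M = 128                        # ULA size from the example in the text
spread_deg = 10.0              # angle spread seen from the BS
phi_s = 180.0 / M              # uniform angular sampling interval in degrees
n_coords = int(np.ceil(spread_deg / phi_s))   # coordinate count used in the text

# Unitary DFT matrix as the angular transform A_B (ULA with d = lambda / 2).
idx = np.arange(M)
A_B = np.exp(-2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)

# Hypothetical channel: 20 paths with angles inside a 10-degree window.
rng = np.random.default_rng(1)
angles = np.deg2rad(rng.uniform(20.0, 20.0 + spread_deg, size=20))
gains = (rng.standard_normal(20) + 1j * rng.standard_normal(20)) / np.sqrt(2)
steering = np.exp(1j * np.pi * np.outer(idx, np.sin(angles)))  # 2*pi*(d/lambda) = pi
h = steering @ gains

# h^T = h_tilde^T A_B^*  and  A_B unitary  =>  h_tilde = A_B^T h
h_tilde = A_B.T @ h

energy = np.sort(np.abs(h_tilde) ** 2)[::-1]
frac_top = energy[:32].sum() / energy.sum()
print(f"phi_s = {phi_s:.3f} deg, coordinates from the text's rule: {n_coords}")
print(f"energy fraction in the 32 strongest of {M} coordinates: {frac_top:.3f}")
```

Because the DFT samples the angular domain uniformly in $\sin\theta$ rather than in $\theta$, and because off-grid paths leak into neighbouring coordinates, the number of significant coordinates in such an experiment is close to, but not exactly, the $\lceil 10°/1.406° \rceil$ estimate.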

4.2.3 Temporal Correlation of Wireless Channels

Since the user mobility is not very high in massive MIMO systems, the channels remain static for the duration of a block that consists of $J$ consecutive time slots, while the channels change from block to block. Here, one time slot represents one OFDM symbol. This block fading implies that $\mathbf{h}_n^{(q,t)} = \mathbf{h}_n^{(q)}$ for $1 \le t \le J$, where $\mathbf{h}_n^{(q,t)}$ is the channel at the $t$th time slot of the $q$th block and $\mathbf{h}_n^{(q)}$ denotes the quasi-static channel in the $q$th block. Similarly, there exists the quasi-static relationship $\tilde{\mathbf{h}}_n^{(q,t)} = \tilde{\mathbf{h}}_n^{(q)}$ for $1 \le t \le J$, with $\tilde{\mathbf{h}}_n^{(q,t)}$ and $\tilde{\mathbf{h}}_n^{(q)}$ being the virtual angular representations of $\mathbf{h}_n^{(q,t)}$ and $\mathbf{h}_n^{(q)}$, respectively. For massive MIMO channels, $J < M$ due to the limited coherence time and the large number of BS antennas. For example, consider massive MIMO systems with the carrier frequency $f_c = 2$ GHz, the system bandwidth $B_s = 10$ MHz, the OFDM size $N = 2048$, the number of BS antennas $M = 128$, and the maximum delay spread $\tau_{\max} = 6.4$ µs (requiring the guard interval $N_g = 64$) [2, 17]. Suppose that the maximum mobile velocity of the supported users is $v = 36$ km/h. Then the maximum Doppler frequency shift is $f_d = v f_c / c = 66.67$ Hz, where $c$ is the velocity of the electromagnetic wave. Hence the coherence time is $T_c = \sqrt{9/(16\pi f_d^2)} \approx 6.3$ ms [18], or the number of coherence time slots is $J = T_c B_s/(N + N_g) \approx 30$, which is much smaller than $M$. Since the channels change from block to block, they must be estimated in every time block, which may impose very high complexity and overhead. Fortunately, experiments and theoretical analysis have shown that although the channels vary continuously from one block to another, the variation rate of the channel angle spread is much lower than that of the associated channel gains [15]. This implies that

$$\mathrm{supp}\{\tilde{\mathbf{h}}_n^{(q)}\} = \mathrm{supp}\{\tilde{\mathbf{h}}_n^{(q+1)}\} = \cdots = \mathrm{supp}\{\tilde{\mathbf{h}}_n^{(q+Q-1)}\}, \qquad (4.6)$$

where $Q$ is the number of consecutive time blocks over which the common support of the virtual angular domain channels holds. For the example of Fig. 4.1, assume that the distance between the BS and the user is $L_{BU} = 250$ m and $v = 36$ km/h. Further assume that the user moves perpendicular to the direction connecting the BS and the user. Then, over the duration of $Q = 5$ successive time blocks, the maximum variation in the virtual angular domain is around $\theta = \arctan(Q T_c v / L_{BU}) \approx 0.072^\circ$. Such a small variation of the angle spread is negligible compared to the resolution of the virtual angular domain, $\varphi_s = 1.406^\circ$. If $v < 36$ km/h and/or the user does not move perpendicular to the direction connecting the BS and the user, $Q$ can be larger than 5.
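The numbers quoted in this subsection can be re-derived in a few lines. The sketch below simply evaluates the stated formulas; the rounded value $c = 3 \times 10^8$ m/s is an assumption, since the text does not give the value it uses.

```python
import math

c = 3.0e8          # speed of light in m/s (rounded value, assumed)
f_c = 2.0e9        # carrier frequency in Hz
B_s = 10.0e6       # system bandwidth in Hz
N, N_g = 2048, 64  # OFDM size and guard interval length
M = 128            # number of BS antennas
v = 36 / 3.6       # 36 km/h expressed in m/s
L_BU = 250.0       # BS-user distance in m
Q = 5              # number of consecutive time blocks

f_d = v * f_c / c                                     # maximum Doppler shift
T_c = math.sqrt(9.0 / (16.0 * math.pi * f_d ** 2))    # coherence time
J = T_c * B_s / (N + N_g)                             # coherence time in OFDM symbols
theta_deg = math.degrees(math.atan(Q * T_c * v / L_BU))
phi_s = 180.0 / M                                     # angular sampling interval

print(f"f_d = {f_d:.2f} Hz, T_c = {T_c * 1e3:.2f} ms, J = {J:.1f}")
print(f"theta = {theta_deg:.4f} deg versus phi_s = {phi_s:.3f} deg")
```

The angular drift over five blocks comes out more than an order of magnitude below the grid resolution $\varphi_s$, which is the comparison the argument for (4.6) relies on.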

4.2.4 Challenges of CE and Feedback

Consider the downlink CE in the $q$th time block. To reliably estimate the channel of the $n$th subcarrier, the user should jointly utilize the received pilot signals over several successive time slots, say, $G$ time slots, for CE. Let $y_n^{(q,t)}$ be the received pilot of (4.3) at the $n$th subcarrier in the $t$th time slot, and $y_n^{(q,t)}$ for $1 \le t \le G$ can be collected together in the vector $\mathbf{y}_n^{[q,G]} = \left[ y_n^{(q,1)}\ y_n^{(q,2)}\ \cdots\ y_n^{(q,G)} \right]^T \in \mathbb{C}^{G \times 1}$. Then

$$\mathbf{y}_n^{[q,G]} = \mathbf{X}_n^{[q,G]} \mathbf{h}_n^{(q)} + \mathbf{w}_n^{[q,G]}, \tag{4.7}$$

where $\mathbf{X}_n^{[q,G]} = \left[ \mathbf{x}_n^{(q,1)}\ \mathbf{x}_n^{(q,2)}\ \cdots\ \mathbf{x}_n^{(q,G)} \right]^T \in \mathbb{C}^{G \times M}$ with $\mathbf{x}_n^{(q,t)} \in \mathbb{C}^{M \times 1}$ being the transmitted pilot signals in the $t$th time slot, and $\mathbf{w}_n^{[q,G]} = \left[ w_n^{(q,1)}\ w_n^{(q,2)}\ \cdots\ w_n^{(q,G)} \right]^T \in \mathbb{C}^{G \times 1}$ is the corresponding AWGN vector. To accurately estimate the channel from (4.7), the value of $G$ used in conventional algorithms, such as the MMSE algorithm, is heavily dependent on the value of $M$. Usually, the required $G$ can be larger than $J$, which leads to poor CE performance [10]. Moreover, to minimize the MSE of the channel estimate, $\mathbf{X}_n^{[q,G]}$ should be a unitary matrix scaled by a transmit power factor [10]. Usually, $\mathbf{X}_n^{[q,G]}$ is a diagonal matrix with equal-power diagonal elements. Such a pilot design is illustrated in Fig. 4.3a, which is called the time-domain orthogonal pilot. It should be pointed out that in MIMO-OFDM systems, to estimate the channel associated with one TA, $P$ pilot subcarriers should be used, and usually $P = N_g$ is considered since $N_c = N/N_g$ adjacent subcarriers are correlated [10]. Hence the total pilot overhead to estimate the complete MIMO channel is $P_{\rm total} = PG = N_g M$. Similarly, LTE-A adopts the time-frequency orthogonal pilots as shown in Fig. 4.3b, which also needs $P_{\rm total} = PM = N_g M$. These two kinds of orthogonal pilots are equivalent, since both of them are based on the framework of the Nyquist sampling theorem and have the same pilot overhead. Hence we only consider the time-domain orthogonal pilot in this chapter, and we will propose an efficient non-orthogonal pilot scheme.

Fig. 4.3 a Time-domain orthogonal pilot [10], b time-frequency orthogonal pilot in LTE-A [9], and c proposed non-orthogonal pilot, assuming M = 8 [28]

Codebook based channel feedback schemes are widely adopted in small-scale MIMO systems. However, to obtain the fine-grain spatial channel structures in massive MIMO systems, the codebook size can be huge. Moreover, the storage and encoding of a large-dimension codebook at the user is challenging. To overcome this difficulty, we combine the CE and feedback, whereby the CSI acquisition is mainly realized at the BS, which has sufficient computation capability. By exploiting the spatially common sparsity and temporal correlation of massive MIMO channels, the proposed scheme can significantly reduce the required overhead and complexity for CE and feedback.

4.3 Spatially Common Sparsity Based Adaptive Channel Estimation and Feedback Scheme

The procedure of the proposed adaptive CE and feedback scheme is first summarized.

Step 1: In each time slot, the BS transmits a non-orthogonal pilot to the user, and the user directly feeds back the received pilot signal to the BS. Except for Step 4, the pilot signal is designed in advance.

Step 2: The BS uses the proposed DSAMP algorithm to jointly reconstruct multiple sparse virtual angular domain channels of high dimension from the feedback signals of low dimension collected in multiple time slots.

Step 3: The BS judges the reliability of the estimated sparse channels according to a pre-specified criterion. If the given criterion is met, the BS stops transmitting pilots in the following time slots, and the acquired CSI at the BS is used for precoding and user scheduling in the current time block. Otherwise, the BS goes back to Step 1 until the feedback signals are sufficient for acquiring the reliable CSI.

Step 4: Since the BS has acquired the estimated support set $\hat{\Omega}$ and the estimated sparsity level $\hat{S}_a$, it can directly use the LS algorithm to estimate the channels in every time block of the following $Q - 1$ blocks. Here, the time slot overhead required in Step 1 can be reduced to $G = \hat{S}_a$, and the pilot signals can be adaptively adjusted according to $\hat{\Omega}$ for further improving performance.

It is seen that the proposed adaptive CE and feedback scheme consists of two stages: the CS based adaptive CSI acquisition in the $q$th time block, which includes Step 1 to Step 3, and the closed-loop channel tracking in the following $Q - 1$ time blocks, which includes Step 1 and Step 4. We now detail all the key technical components.

4.3.1 Non-orthogonal Pilot for Downlink CE

The proposed non-orthogonal pilot scheme is illustrated in Fig. 4.3c. Similar to the time-domain orthogonal pilot scheme, $P$ subcarriers are dedicated to pilots in each OFDM symbol. However, the proposed scheme allows the non-orthogonal pilot signals associated with different BS antennas to occupy completely identical frequency-domain subcarriers. The orthogonal pilot based conventional designs usually require $G \ge M$. By contrast, the proposed non-orthogonal pilot for CS based adaptive CSI acquisition is capable of providing efficient compression and reliable recovery of sparse signals. Therefore, $G$ is mainly determined by $S_a \ll M$. The non-orthogonal pilot of the first stage is designed in advance, which will be discussed in Sect. 4.4.1. According to the CSI acquired in the first stage, the non-orthogonal pilot used for closed-loop channel tracking is adaptively designed to minimize both $G$ and the MSE of CSI acquisition, which will be illustrated in Sect. 4.3.4. For the placement of pilot subcarriers, the widely used equi-spaced pilot is considered, and the specific reason is given in Sect. 4.4.1.3. For convenience, we denote $\xi = \{\xi_1, \xi_2, \ldots, \xi_P\}$ as the index set of the pilot subcarriers, where $\xi_p$ for $1 \le p \le P$ denotes the subcarrier index dedicated to the $p$th pilot subcarrier. It is worth pointing out that the $p$th pilot subcarrier is shared by the pilot signals of the $M$ TAs as illustrated in Fig. 4.3c.


4.3.2 CS Based Adaptive CSI Acquisition Scheme

In the $q$th time block, as indicated in Step 1, the user directly feeds back the received pilot signals to the BS without performing downlink CE, where the feedback channel can be considered as the AWGN channel [11, 12, 19, 20]. According to (4.7), at the BS, the fed-back signal² (at the $\xi_p$th subcarrier) in the $t$th time slot can be expressed as

$$r_p^{(q,t)} = \left(\bar{\mathbf{h}}_p^{(q)}\right)^T \mathbf{A}_B^* \mathbf{s}_p^{(q,t)} + v_p^{(q,t)}, \quad 1 \le p \le P, \tag{4.8}$$

where $r_p^{(q,t)} = y_{\xi_p}^{(q,t)}$ is the $p$th feedback pilot signal in the $t$th time slot, $\bar{\mathbf{h}}_p^{(q)} = \tilde{\mathbf{h}}_{\xi_p}^{(q)}$ is the virtual angular domain channel vector associated with the $p$th pilot subcarrier, $\mathbf{s}_p^{(q,t)} = \mathbf{x}_{\xi_p}^{(q,t)}$ is the pilot SV transmitted by the $M$ BS antennas, and $v_p^{(q,t)} = w_{\xi_p}^{(q,t)}$ is the effective noise which aggregates both the downlink channel's AWGN and the feedback channel's AWGN. Due to the quasi-static property of the channel during one time block, the feedback signals in $G$ successive time slots can be jointly exploited to acquire the downlink CSI at the BS, which can be expressed as

$$\mathbf{r}_p^{[q,G]} = \mathbf{S}_p^{[q,G]} \left(\mathbf{A}_B^*\right)^T \bar{\mathbf{h}}_p^{(q)} + \mathbf{v}_p^{[q,G]} = \mathbf{\Phi}_p^{[q,G]} \bar{\mathbf{h}}_p^{(q)} + \mathbf{v}_p^{[q,G]}, \tag{4.9}$$

for $1 \le p \le P$, where $\mathbf{r}_p^{[q,G]} = \left[ r_p^{(q,1)}\ r_p^{(q,2)}\ \cdots\ r_p^{(q,G)} \right]^T$, $\mathbf{S}_p^{[q,G]} = \left[ \mathbf{s}_p^{(q,1)}\ \mathbf{s}_p^{(q,2)}\ \cdots\ \mathbf{s}_p^{(q,G)} \right]^T \in \mathbb{C}^{G \times M}$, $\mathbf{v}_p^{[q,G]} = \left[ v_p^{(q,1)}\ v_p^{(q,2)}\ \cdots\ v_p^{(q,G)} \right]^T$, and $\mathbf{\Phi}_p^{[q,G]} = \mathbf{S}_p^{[q,G]} \left(\mathbf{A}_B^*\right)^T \in \mathbb{C}^{G \times M}$. The system's SNR is defined as $\mathrm{SNR} = E\{\|\mathbf{\Phi}_p^{[q,G]} \bar{\mathbf{h}}_p^{(q)}\|_2^2\} / E\{\|\mathbf{v}_p^{[q,G]}\|_2^2\}$, according to (4.9). By exploiting the spatially common sparsity within the system bandwidth, the proposed DSAMP algorithm can reconstruct the sparse angular domain channels of multiple pilot subcarriers, as will be detailed in Sect. 4.3.3.

Algorithm 4.1 CS Based Adaptive CSI Acquisition Scheme
1: Determine the initial time slot overhead $G_0$, and set the iteration index $i = 0$.
2: repeat
3:   Collect $\mathbf{r}_p^{[q,G_i]}$ and $\mathbf{\Phi}_p^{[q,G_i]}$ in (4.9) for given $G_i$, $1 \le p \le P$. % $G_i$ is the required overhead at the $i$th iteration.
4:   Acquire the channel vectors $\hat{\bar{\mathbf{h}}}_p^{(q)}$ $\forall p$ by using the proposed DSAMP algorithm (Algorithm 4.2).
5:   $G_{i+1} = G_i + 1$; $i = i + 1$.
6: until $\sum_{p=1}^{P} \left\| \mathbf{r}_p^{[q,G_{i-1}]} - \mathbf{\Phi}_p^{[q,G_{i-1}]} \hat{\bar{\mathbf{h}}}_p^{(q)} \right\|_2^2 / (P G_{i-1}) \le \varepsilon$. % If the error is smaller than the threshold $\varepsilon$, end repeat; otherwise, continue transmitting the pilot in the next time slot.
7: $G_0 = G_i - 1$. % Optional, determine the initial time slot overhead for the next CS based adaptive CSI acquisition.

² The delay of the feedback signal is negligible, compared with the relatively long channel coherence time.


For practical massive MIMO systems, the sparsity level $S_a$ of the channels in the virtual angular domain can be time-varying. If $S_a$ is relatively small, a small time slot overhead $G$ is sufficient to acquire an accurate CSI estimate, while if $S_a$ is relatively large, a large $G$ is required to guarantee reliable sparse signal recovery. We propose the CS based adaptive CSI acquisition as presented in Algorithm 4.1 [28], which can adaptively adjust $G$ to acquire the reliable CSI at the BS efficiently. At the first CS based adaptive CSI acquisition, we need to empirically determine the initial time slot overhead $G_0$. Since the typical angle spread is about $10^\circ$ [4], for massive MIMO with $M = 128$, the effective sparsity level is $S_a = 8$. Thus, we may set $G_0 = 8$ to start. Given $G_i$, the DSAMP algorithm (Algorithm 4.2 [28]) acquires the set of channel vectors $\hat{\bar{\mathbf{h}}}_p^{(q)}$ $\forall p$. If $\sum_{p=1}^{P} \left\| \mathbf{r}_p^{[q,G_i]} - \mathbf{\Phi}_p^{[q,G_i]} \hat{\bar{\mathbf{h}}}_p^{(q)} \right\|_2^2 / (P G_i)$ is larger than the predefined threshold $\varepsilon$, the sparse signal recovery is judged to be unreliable. Hence, the number of training time slots increases by one, and a set of the feedback pilot signals and transmitted pilot signals, $r_p^{(q,G_{i+1})}$ and $\mathbf{s}_p^{(q,G_{i+1})}$ $\forall p$, are collected in the $(G_{i+1})$th time slot, which are combined with the previously collected $\mathbf{r}_p^{[q,G_i]}$ and $\mathbf{\Phi}_p^{[q,G_i]}$ to enlarge the dimension of the measurement vectors sequentially, yielding

$$\mathbf{r}_p^{[q,G_{i+1}]} = \begin{bmatrix} \mathbf{r}_p^{[q,G_i]} \\ r_p^{(q,G_{i+1})} \end{bmatrix} \quad \text{and} \quad \mathbf{\Phi}_p^{[q,G_{i+1}]} = \begin{bmatrix} \mathbf{\Phi}_p^{[q,G_i]} \\ \left(\mathbf{s}_p^{(q,G_{i+1})}\right)^T \left(\mathbf{A}_B^*\right)^T \end{bmatrix}$$

to improve the CE. Furthermore, an appropriate initial time slot overhead for the next CS based adaptive CSI acquisition is automatically determined at the end.
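Two ingredients of Algorithm 4.1 are easy to express in code: the normalized residual that is compared with $\varepsilon$, and the row-wise enlargement of the measurements when one more time slot is collected. The sketch below is illustrative only. The function names `residual_metric` and `append_time_slot`, the array layout (subcarrier index first), and the toy dimensions are assumptions rather than anything defined in the text.

```python
import numpy as np

def residual_metric(r, Phi, h_hat):
    """sum_p ||r_p - Phi_p h_p||_2^2 / (P * G), with r: (P, G), Phi: (P, G, M), h_hat: (P, M)."""
    P, G, _ = Phi.shape
    res = r - np.einsum('pgm,pm->pg', Phi, h_hat)
    return np.sum(np.abs(res) ** 2) / (P * G)

def append_time_slot(r, Phi, r_new, phi_new):
    """Stack one more time slot onto the measurements.
    r_new: (P,) new feedback samples; phi_new: (P, M) new sensing rows,
    i.e. s^T (A_B^*)^T for each pilot subcarrier."""
    r_ext = np.concatenate([r, r_new[:, None]], axis=1)
    Phi_ext = np.concatenate([Phi, phi_new[:, None, :]], axis=1)
    return r_ext, Phi_ext

# Toy noiseless data (dimensions are hypothetical)
rng = np.random.default_rng(5)
P, G, M = 4, 3, 16
Phi = rng.standard_normal((P, G, M)) + 1j * rng.standard_normal((P, G, M))
h = np.zeros((P, M), dtype=complex)
h[:, [2, 7]] = 1.0 + 1.0j                      # common support {2, 7}
r = np.einsum('pgm,pm->pg', Phi, h)

phi_new = rng.standard_normal((P, M)) + 1j * rng.standard_normal((P, M))
r_new = np.einsum('pm,pm->p', phi_new, h)
r_ext, Phi_ext = append_time_slot(r, Phi, r_new, phi_new)
print(r_ext.shape, Phi_ext.shape, residual_metric(r_ext, Phi_ext, h))
```

In a full loop, a sparse recovery routine would be called on `(r, Phi)` after each enlargement and the loop would stop once `residual_metric` drops to the threshold or below.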

4.3.3 Proposed DSAMP Algorithm for CE

Given the measurements (4.9), the CSI can be acquired by solving the following optimization

$$\min_{\bar{\mathbf{h}}_p^{(q)},\, 1 \le p \le P} \left( \sum_{p=1}^{P} \left\| \bar{\mathbf{h}}_p^{(q)} \right\|_0^2 \right)^{1/2} \quad \text{s.t.} \quad \mathbf{r}_p^{[q,G]} = \mathbf{\Phi}_p^{[q,G]} \bar{\mathbf{h}}_p^{(q)},\ \forall p, \ \text{and}\ \left\{ \bar{\mathbf{h}}_p^{(q)} \right\}_{p=1}^{P} \ \text{share the common sparse support set.} \tag{4.10}$$

The DSAMP algorithm, listed in Algorithm 4.2, is used to solve the optimization (4.10) to simultaneously acquire multiple sparse channel vectors at different pilot subcarriers. This algorithm is developed from the sparsity adaptive matching pursuit (SAMP) algorithm [21]. Specifically, for each stage with the fixed sparsity level $T$, line 8 selects the $T$ potential non-zero elements; line 9 estimates the values associated with the support set $\Omega^{i-1} \cup \Gamma$ using LS; line 10 selects the $T$ most likely supports. Lines 7–12 and line 21 together aim to find $T$ virtual angular domain coordinates which contain most of the channel energy. In particular, lines 7–12 remove wrong indices added in the previous iteration and add the indices associated with the potential true indices. If line 18 is triggered, the algorithm updates $T$ and begins a new stage. The algorithm is halted when the stopping criteria, indicated in lines 14–17 and line 23, are met.

Algorithm 4.2 Proposed DSAMP Algorithm
Input: Noisy feedback signals $\mathbf{r}_p^{[q,G]}$ and sensing matrices $\mathbf{\Phi}_p^{[q,G]}$ in (4.10), $1 \le p \le P$; termination threshold $p_{\rm th}$.
Output: Estimated channel vectors in the virtual angular domain at multiple pilot subcarriers $\hat{\bar{\mathbf{h}}}_p^{(q)}$, $\forall p$.
1: $T = 1$; $i = 1$; $j = 1$. % $T$, $i$, $j$ are the sparsity level of the current stage, iteration index, and stage index, respectively.
2: $\mathbf{c}_p = \mathbf{t}_p = \mathbf{c}_p^{\rm last} = \mathbf{0} \in \mathbb{C}^{M \times 1}$, $\forall p$. % $\mathbf{c}_p$ and $\mathbf{t}_p$ are intermediate variables, and $\mathbf{c}_p^{\rm last}$ is the CE of the last stage.
3: $\Omega^0 = \tilde{\Gamma} = \Gamma = \tilde{\Omega} = \Omega = \emptyset$; $l_{\min} = \tilde{l} = 0$. % $\Omega^i$ is the estimated support set in the $i$th iteration, $\tilde{\Omega}$, $\tilde{\Gamma}$, $\Gamma$, $\Omega$ are sets, and $l_{\min}$ and $\tilde{l}$ denote the support indexes.
4: $\mathbf{b}_p^0 = \mathbf{r}_p^{[q,G]} \in \mathbb{C}^{G \times 1}$, $\forall p$. % $\mathbf{b}_p^i$ is the residual of the $i$th iteration.
5: $\sum_{p=1}^{P} \left\| \mathbf{b}_p^{\rm last} \right\|_2^2 = +\infty$. % $\mathbf{b}_p^{\rm last}$ is the residual of the last stage.
6: repeat
7:   $\mathbf{a}_p = \left( \mathbf{\Phi}_p^{[q,G]} \right)^* \mathbf{b}_p^{i-1}$, $\forall p$. % Signal proxy is saved in $\mathbf{a}_p$.
8:   $\Gamma = \arg\max_{\tilde{\Gamma}} \left\{ \sum_{p=1}^{P} \left\| (\mathbf{a}_p)_{\tilde{\Gamma}} \right\|_2^2,\ |\tilde{\Gamma}| = T \right\}$. % Identify support.
9:   $(\mathbf{t}_p)_{\Omega^{i-1} \cup \Gamma} = \left( \left( \mathbf{\Phi}_p^{[q,G]} \right)_{\Omega^{i-1} \cup \Gamma} \right)^{\dagger} \mathbf{r}_p^{[q,G]}$, $\forall p$. % LS estimation.
10:  $\tilde{\Omega} = \arg\max_{\Omega} \left\{ \sum_{p=1}^{P} \left\| (\mathbf{t}_p)_{\Omega} \right\|_2^2,\ |\Omega| = T \right\}$. % Prune support.
11:  $(\mathbf{c}_p)_{\tilde{\Omega}} = \left( \left( \mathbf{\Phi}_p^{[q,G]} \right)_{\tilde{\Omega}} \right)^{\dagger} \mathbf{r}_p^{[q,G]}$, $\forall p$. % LS estimation.
12:  $\mathbf{b}_p = \mathbf{r}_p^{[q,G]} - \mathbf{\Phi}_p^{[q,G]} \mathbf{c}_p$, $\forall p$. % Compute the residual.
13:  $l_{\min} = \arg\min_{\tilde{l}} \left\{ \sum_{p=1}^{P} \left\| [\mathbf{c}_p]_{\tilde{l}} \right\|_2^2,\ \tilde{l} \in \tilde{\Omega} \right\}$. % Find the support index of the minimum average energy according to the estimated $\mathbf{c}_p$.
14:  if $\sum_{p=1}^{P} \left\| [\mathbf{c}_p]_{l_{\min}} \right\|_2^2 / P < p_{\rm th}$ then
15:    Quit iteration. % Support index associated with AWGN may be included in $\tilde{\Omega}$.
16:  else if $\sum_{p=1}^{P} \left\| \mathbf{b}_p^{\rm last} \right\|_2^2 < \sum_{p=1}^{P} \left\| \mathbf{b}_p \right\|_2^2$ then
17:    Quit iteration. % Larger residual of the current stage than that of the last stage indicates that it is unnecessary to continue the iteration.
18:  else if $\sum_{p=1}^{P} \left\| \mathbf{b}_p^{i-1} \right\|_2^2 \le \sum_{p=1}^{P} \left\| \mathbf{b}_p \right\|_2^2$ then
19:    $j = j + 1$; $T = j$; $\mathbf{c}_p^{\rm last} = \mathbf{c}_p$, $\mathbf{b}_p^{\rm last} = \mathbf{b}_p$, $\forall p$. % Begin a new stage. The larger residual of the current iteration than that of the last iteration indicates that the iteration at the current stage converges.
20:  else
21:    $\Omega^i = \tilde{\Omega}$; $\mathbf{b}_p^i = \mathbf{b}_p$, $\forall p$; $i = i + 1$. % Continue the iteration at the current stage.
22:  end if
23: until $\sum_{p=1}^{P} \left\| [\mathbf{c}_p]_{l_{\min}} \right\|_2^2 / P < p_{\rm th}$
24: $\hat{\bar{\mathbf{h}}}_p^{(q)} = \mathbf{c}_p^{\rm last}$, $\forall p$. % Obtain the final CE.

Compared to the classical SAMP algorithm [21], which recovers one high-dimensional sparse signal from a single low-dimensional received signal, the proposed DSAMP algorithm can simultaneously recover multiple high-dimensional sparse signals with the common support set by jointly processing multiple low-dimensional received signals. In terms of the termination condition, the SAMP algorithm stops the iteration once the residual is smaller than a threshold $p_{\rm th}$. By contrast, the proposed DSAMP algorithm has two halting criteria. Specifically, if the energy associated with one virtual angular coordinate in the estimated support set is smaller than $p_{\rm th}$, or the residual of the current stage is larger than that of the previous stage, the algorithm stops. The proposed halting criteria ensure robust signal recovery performance, which will be discussed in Sect. 4.4.4.2 and confirmed by simulations.

By using the DSAMP algorithm at the BS, we can acquire the estimates of the virtual angular domain channels at the pilot subcarriers, i.e., $\hat{\bar{\mathbf{h}}}_p^{(q)}$ for $1 \le p \le P$. Consequently, the actual channel at the $\xi_p$th subcarrier dedicated to the $p$th pilot signal can be acquired according to (4.8), yielding

$$\hat{\mathbf{h}}_{\xi_p}^{(q)} = \left( \mathbf{A}_B^* \right)^T \hat{\tilde{\mathbf{h}}}_{\xi_p}^{(q)} = \left( \mathbf{A}_B^* \right)^T \hat{\bar{\mathbf{h}}}_p^{(q)}. \tag{4.11}$$
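To make the control flow of Algorithm 4.2 concrete, the following NumPy sketch follows the listing line by line. It is an illustrative reading of the pseudocode, not the authors' implementation: the function name `dsamp`, the array layout (subcarrier index first), the `max_iter` safeguard, and the toy test data (noiseless measurements, unit-modulus random sensing matrices, $P = 8$, $G = 24$, $M = 64$, sparsity 4) are all assumptions introduced here.

```python
import numpy as np

def dsamp(r, Phi, p_th, max_iter=200):
    """Sketch of Algorithm 4.2. r: (P, G) feedback signals, Phi: (P, G, M)
    sensing matrices, p_th: energy threshold. Returns (P, M) channel estimates."""
    P, G, M = Phi.shape
    T = 1                                   # sparsity level of the current stage
    omega = np.array([], dtype=int)         # support kept from the previous iteration
    b_prev = r.astype(complex)              # residual of the previous iteration
    c_last = np.zeros((P, M), dtype=complex)
    e_last = np.inf                         # residual energy of the last stage

    def ls_on(support):
        """Per-subcarrier LS estimate restricted to the given support."""
        x = np.zeros((P, M), dtype=complex)
        for p in range(P):
            x[p, support] = np.linalg.lstsq(Phi[p][:, support], r[p], rcond=None)[0]
        return x

    def top(values, k):
        """Sorted indices of the k coordinates with the largest joint energy."""
        return np.sort(np.argsort(np.sum(np.abs(values) ** 2, axis=0))[-k:])

    for _ in range(max_iter):
        if T > min(G, M):                                    # safeguard, not in the listing
            break
        a = np.einsum('pgm,pg->pm', Phi.conj(), b_prev)      # line 7: signal proxy
        gamma = top(a, T)                                    # line 8: identify support
        t = ls_on(np.union1d(omega, gamma))                  # line 9: LS on merged support
        omega_new = top(t, T)                                # line 10: prune support
        c = ls_on(omega_new)                                 # line 11: LS on pruned support
        b = r - np.einsum('pgm,pm->pg', Phi, c)              # line 12: residual
        e_min = np.min(np.sum(np.abs(c[:, omega_new]) ** 2, axis=0)) / P   # line 13
        e_b, e_prev = np.sum(np.abs(b) ** 2), np.sum(np.abs(b_prev) ** 2)
        if e_min < p_th:                                     # lines 14-15
            break
        if e_last < e_b:                                     # lines 16-17
            break
        if e_prev <= e_b:                                    # lines 18-19: new stage
            T += 1
            c_last, e_last = c, e_b
        else:                                                # lines 20-21: same stage
            omega, b_prev = omega_new, b
    return c_last                                            # line 24

# Toy noiseless example with a common support across P pilot subcarriers
rng = np.random.default_rng(1)
P, G, M, S = 8, 24, 64, 4
support = np.sort(rng.choice(M, S, replace=False))
Phi = np.exp(2j * np.pi * rng.random((P, G, M)))
h = np.zeros((P, M), dtype=complex)
h[:, support] = (1 + rng.random((P, S))) * np.exp(2j * np.pi * rng.random((P, S)))
r = np.einsum('pgm,pm->pg', Phi, h)

h_hat = dsamp(r, Phi, p_th=1e-6)
est_support = np.flatnonzero(np.sum(np.abs(h_hat) ** 2, axis=0) > 1e-6)
print("true support:", support, "estimated support:", est_support)
```

The joint energy sums over the subcarrier axis in `top` are what distinguish this from running SAMP separately on each subcarrier: a coordinate is kept or dropped for all $P$ channel vectors at once.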

4.3.4 Closed-Loop Channel Tracking with Adaptive Pilot Design

Since the channels in $Q$ successive time blocks share the spatially common sparsity, in the following $Q - 1$ time blocks, we can use the simple LS algorithm to estimate the channels at the BS from the feedback pilot signals by utilizing the estimated support set $\hat{\Omega} = \mathrm{supp}\{\hat{\bar{\mathbf{h}}}_p^{(q)}\}$ and the sparsity level $\hat{S}_a = |\hat{\Omega}|$ acquired in the $q$th time block. Specifically, for the $q_b$th time block, where $q + 1 \le q_b \le q + Q - 1$, the BS first transmits a non-orthogonal pilot to the user, and the user again directly feeds back the received pilot signal to the BS. At the BS, similar to (4.9), the feedback pilot signal associated with the $p$th pilot subcarrier, $\mathbf{r}_p^{[q_b,G]}$, can be expressed as

$$\mathbf{r}_p^{[q_b,G]} = \mathbf{S}_p^{[q_b,G]} \left( \mathbf{A}_B^* \right)^T \bar{\mathbf{h}}_p^{(q_b)} + \mathbf{v}_p^{[q_b,G]} = \mathbf{\Phi}_p^{[q_b,G]} \bar{\mathbf{h}}_p^{(q_b)} + \mathbf{v}_p^{[q_b,G]}, \tag{4.12}$$

where $\mathbf{S}_p^{[q_b,G]}$, $\bar{\mathbf{h}}_p^{(q_b)}$ and $\mathbf{v}_p^{[q_b,G]}$ are the pilot signal matrix, virtual angular domain channel, and effective noise in the $q_b$th time block, respectively. If $\Omega$ and $S_a$ are known, the CSI can be acquired using the LS algorithm as

$$\left( \hat{\bar{\mathbf{h}}}_p^{(q_b)} \right)_{\Omega} = \left( \left( \mathbf{\Phi}_p^{[q_b,G]} \right)_{\Omega} \right)^{\dagger} \mathbf{r}_p^{[q_b,G]}, \tag{4.13}$$

which is an unbiased estimator for $\bar{\mathbf{h}}_p^{(q_b)}$ that is capable of approaching the CRLB [22]. The BS can use $\hat{\Omega}$ and $\hat{S}_a$, obtained in the $q$th time block, to calculate this LS estimate. As will be shown in Sect. 4.4.6, to acquire the estimate of $\bar{\mathbf{h}}_p^{(q_b)}$, the required time slot overhead can be reduced to $S_a$. Moreover, the non-orthogonal pilot used for channel tracking (for the time blocks of $q + 1 \le q_b \le q + Q - 1$) is very different from that used in the $q$th time block. Specifically, to minimize the MSE of the CE with $G = S_a$, $\left( \mathbf{\Phi}_p^{[q_b,G]} \right)_{\Omega}$ should be a unitary matrix scaled by a power factor $\sqrt{G}$. Therefore, for the closed-loop channel tracking, we can design the non-orthogonal pilot signal to guarantee this condition, and reduce $G$ to $S_a$ while attaining the best MSE performance for the CE. Specifically, let $G = S_a$ and let $\mathbf{U}_{S_a} \in \mathbb{C}^{S_a \times S_a}$ be a unitary matrix. Then

$$\left( \mathbf{\Phi}_p^{[q_b,G]} \right)_{\Omega} = \mathbf{S}_p^{[q_b,G]} \left( \left( \mathbf{A}_B^* \right)^T \right)_{\Omega} = \sqrt{G}\, \mathbf{U}_{S_a}, \tag{4.14}$$

which yields the required non-orthogonal pilot matrix $\mathbf{S}_p^{[q_b,G]} = \sqrt{G}\, \mathbf{U}_{S_a} \left( \left( \left( \mathbf{A}_B^* \right)^T \right)_{\Omega} \right)^{\dagger}$.
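The pilot construction in (4.14) and the LS step in (4.13) can be checked numerically. The sketch below is a toy verification under stated assumptions: $\mathbf{A}_B$ is taken to be the unitary DFT matrix, $(\cdot)^*$ is read as the conjugate transpose so that $(\mathbf{A}_B^*)^T$ is the element-wise conjugate of $\mathbf{A}_B$, the support is drawn at random, $\mathbf{U}_{S_a}$ is chosen as a small DFT matrix, and the dimensions $M = 32$, $S_a = 4$ are arbitrary.

```python
import numpy as np

M, S_a = 32, 4
G = S_a
rng = np.random.default_rng(2)

k = np.arange(M)
A_B = np.exp(-2j * np.pi * np.outer(k, k) / M) / np.sqrt(M)   # unitary DFT (assumed A_B)
T_ang = A_B.conj()        # stands for (A_B^*)^T under the conjugate-transpose reading

omega = np.sort(rng.choice(M, S_a, replace=False))   # support assumed known from stage one
ks = np.arange(S_a)
U = np.exp(-2j * np.pi * np.outer(ks, ks) / S_a) / np.sqrt(S_a)   # one possible unitary U

# Pilot design following (4.14): S = sqrt(G) * U * pinv(((A_B^*)^T)_Omega)
S = np.sqrt(G) * U @ np.linalg.pinv(T_ang[:, omega])
Phi = S @ T_ang                                       # G x M sensing matrix

# Noiseless tracking measurement and the LS estimate of (4.13)
h_bar = np.zeros(M, dtype=complex)
h_bar[omega] = rng.standard_normal(S_a) + 1j * rng.standard_normal(S_a)
r = Phi @ h_bar
h_hat = np.zeros(M, dtype=complex)
h_hat[omega] = np.linalg.pinv(Phi[:, omega]) @ r

print("max |Phi_Omega - sqrt(G) U| =", np.abs(Phi[:, omega] - np.sqrt(G) * U).max())
print("max estimation error        =", np.abs(h_hat - h_bar).max())
```

With a unitary $\mathbf{A}_B$, the columns of `Phi` outside the support come out numerically zero, so the designed pilot spends its energy only on the angular coordinates in the support.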

4.4 Performance Analysis

The performance analysis of the proposed scheme includes: (1) the non-orthogonal pilot design for the CS based adaptive CSI acquisition; (2) the theoretical limit of the required time slot overhead for the CS based adaptive CSI acquisition; (3) the placement of pilot subcarriers; (4) the computational complexity and convergence of the DSAMP algorithm; (5) the performance bound of the proposed scheme; (6) the required time slot overhead and the performance analysis for the adaptive non-orthogonal pilot based closed-loop channel tracking; and (7) the selection of thresholds for Algorithms 4.1 and 4.2.

4.4.1 Non-orthogonal Pilot Design for CS Based Adaptive CSI Acquisition

In the $q$th time block, the measurement matrices $\mathbf{\Phi}_p^{[q,G]}$ $\forall p$ in (4.9) are very important for guaranteeing reliable CSI acquisition at the BS. Usually, $G \ll M$. Since $\mathbf{\Phi}_p^{[q,G]} = \mathbf{S}_p^{[q,G]} \left( \mathbf{A}_B^* \right)^T$ and $\mathbf{A}_B$ is determined by the geometrical structure of the antenna array at the BS, the pilot signals $\mathbf{S}_p^{[q,G]}$ $\forall p$ transmitted by the BS should be designed to guarantee the desired robust CE and feedback.

4.4.1.1 Restricted Isometry Property

In CS theory, the RIP is used to evaluate the quality of the measurement matrix, in terms of the reliable compression and reconstruction of sparse signals. It is proven in [23] that a measurement matrix whose elements follow independent and identically distributed (i.i.d.) complex Gaussian distributions satisfies the RIP and performs well in compressing and recovering sparse signals.

4.4.1.2 Processing Multiple Measurement Vectors in Parallel

The optimization problem (4.10) is essentially different from the single-measurement-vector (SMV) and MMV problems in CS. The SMV recovers a single high-dimensional sparse signal $\mathbf{f}$ from its low-dimensional measurement signal $\mathbf{d}$, which may be formulated as $\mathbf{d} = \mathbf{\Phi} \mathbf{f}$, where $\mathbf{\Phi} \in \mathbb{C}^{D \times F}$, $D < F$, and the support set $\Omega = \mathrm{supp}\{\mathbf{f}\}$ with the sparsity level $|\Omega| = S \ll F$. On the other hand, the MMV can simultaneously recover multiple high-dimensional sparse signals with the common support set and common measurement matrix from multiple low-dimensional measurement signals, which may be formulated as $\mathbf{D} = \mathbf{\Phi} \mathbf{F}$, with $\mathbf{D} = [\mathbf{d}_1\ \mathbf{d}_2\ \cdots\ \mathbf{d}_L]$, $\mathbf{F} = [\mathbf{f}_1\ \mathbf{f}_2\ \cdots\ \mathbf{f}_L]$, $\mathrm{supp}\{\mathbf{f}_1\} = \mathrm{supp}\{\mathbf{f}_2\} = \cdots = \mathrm{supp}\{\mathbf{f}_L\} = \Omega$, and the sparsity level $|\Omega| = S$. By contrast, our problem (4.10) can jointly reconstruct multiple high-dimensional sparse signals with the common support set but having different measurement matrices, i.e.,

$$\mathbf{d}_l = \mathbf{\Phi}_l \mathbf{f}_l, \quad 1 \le l \le L, \tag{4.15}$$

where $\mathbf{\Phi}_l \in \mathbb{C}^{D \times F}$, $\forall l$. Therefore, our problem can be regarded as the GMMV problem, which includes the SMV and MMV problems as its special cases. Specifically, if the multiple measurement matrices are identical, our GMMV becomes the conventional MMV, and furthermore if $L = 1$, it reduces to the conventional SMV.

Typically, the MMV has better recovery performance than the SMV, due to the potential diversity from multiple sparse signals [23]. Intuitively, the recovery performance of multiple sparse signals with different measurement matrices, as defined in the GMMV, should be better than that with the common measurement matrix as given in the MMV. This is because further potential diversity can be gained from the different measurement matrices of the GMMV. To prove this, we investigate the uniqueness of the solution to the GMMV problem. First, we introduce the concept of 'spark' and the $\ell_0$-minimization based GMMV problem associated with (4.15).

Definition 4.1 [23] The smallest number of columns of $\mathbf{\Phi}$ which are linearly dependent is the spark of the given matrix $\mathbf{\Phi}$, denoted by $\mathrm{spark}(\mathbf{\Phi})$.

Problem 4.1

$$\min_{\mathbf{f}_l,\, \forall l} \sum_{l=1}^{L} \left\| \mathbf{f}_l \right\|_0^2, \quad \text{s.t.} \quad \mathbf{d}_l = \mathbf{\Phi}_l \mathbf{f}_l,\ \mathrm{supp}\{\mathbf{f}_l\} = \Omega,\ \forall l.$$
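Definition 4.1 can be made tangible with a brute-force search over column subsets. The sketch below is only practical for very small matrices, since the number of subsets grows combinatorially. The function name `spark`, the rank tolerance, and the convention of returning the number of columns plus one when no subset is dependent are assumptions made for illustration.

```python
import itertools
import numpy as np

def spark(A, tol=1e-10):
    """Smallest number of linearly dependent columns of A, found by brute force.
    Returns n_cols + 1 if every subset of columns is independent (assumed convention)."""
    n_cols = A.shape[1]
    for k in range(1, n_cols + 1):
        for cols in itertools.combinations(range(n_cols), k):
            # A set of k columns is dependent exactly when its rank is below k.
            if np.linalg.matrix_rank(A[:, list(cols)], tol=tol) < k:
                return k
    return n_cols + 1

A1 = np.array([[1.0, 0.0, 1.0],
               [0.0, 1.0, 1.0]])     # any two columns independent, all three dependent
A2 = np.array([[1.0, 2.0, 0.0],
               [0.0, 0.0, 1.0]])     # first two columns are parallel
A3 = np.random.default_rng(6).standard_normal((3, 6))   # generic 3 x 6 matrix

print(spark(A1), spark(A2), spark(A3))
```

For a generic $D \times F$ matrix with $D < F$, any $D$ columns are independent while any $D + 1$ are dependent, so the search returns $D + 1$, the largest value a wide matrix can attain.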


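For the above $\ell_0$-minimization based GMMV problem, the following result can be obtained.

Theorem 4.1 For $\mathbf{\Phi}_l$, $1 \le l \le L$, whose elements obey an i.i.d. continuous distribution, there exist full rank matrices $\mathbf{\Psi}_l$ for $2 \le l \le L$ satisfying $(\mathbf{\Phi}_l)_{\Omega} = \mathbf{\Psi}_l (\mathbf{\Phi}_1)_{\Omega}$ if we select $(\mathbf{\Phi}_1)_{\Omega}$ as the bridge, where $\Omega$ is the common support set. Consequently, $\mathbf{f}_l$ for $1 \le l \le L$ will be the unique solution to Problem 4.1 if

$$2S < \mathrm{spark}(\mathbf{\Phi}_1) - 1 + \mathrm{rank}\{\tilde{\mathbf{D}}\}, \tag{4.16}$$

where $\tilde{\mathbf{D}} = \left[ \mathbf{d}_1\ \ \mathbf{\Psi}_2^{-1} \mathbf{d}_2\ \ \cdots\ \ \mathbf{\Psi}_L^{-1} \mathbf{d}_L \right]$.

Proof Consider (4.15) with $\mathbf{\Phi}_l \in \mathbb{C}^{D \times F}$ for $1 \le l \le L$, whose elements follow an i.i.d. continuous distribution. The common support set is $\Omega = \mathrm{supp}\{\mathbf{f}_l\}$ with the sparsity level $|\Omega| = S$. This GMMV can be expressed as

$$\mathbf{d}_l = (\mathbf{\Phi}_l)_{\Omega} (\mathbf{f}_l)_{\Omega} = \mathbf{Z}_l (\mathbf{f}_l)_{\Omega}, \quad 1 \le l \le L. \tag{4.17}$$

The random matrix $\mathbf{Z}_l = (\mathbf{\Phi}_l)_{\Omega} \in \mathbb{C}^{D \times S}$ is a tall matrix, as $D > S$. Clearly, $\mathrm{rank}\{\mathbf{Z}_l\} = S$ with high probability, since the measure of the set $\{\mathbf{Z}_l \in \mathcal{Z} : \mathrm{rank}\{\mathbf{Z}_l\} < S\}$ is zero [24]. If we take $(\mathbf{\Phi}_1)_{\Omega}$ as the bridge, then there exist the full rank matrices $\mathbf{\Psi}_l$, $2 \le l \le L$, satisfying $(\mathbf{\Phi}_l)_{\Omega} = \mathbf{\Psi}_l (\mathbf{\Phi}_1)_{\Omega}$, and thus

$$\mathbf{\Psi}_l^{-1} \mathbf{d}_l = (\mathbf{\Phi}_1)_{\Omega} (\mathbf{f}_l)_{\Omega} = \mathbf{\Phi}_1 \mathbf{f}_l. \tag{4.18}$$

In this way, the GMMV is converted to the 'equivalent' MMV

$$\tilde{\mathbf{D}} = \mathbf{\Phi}_1 \mathbf{F}, \tag{4.19}$$

where $\mathbf{F} = [\mathbf{f}_1\ \mathbf{f}_2\ \cdots\ \mathbf{f}_L]$. Applying the existing result for the MMV given in [25], (4.16) can be directly obtained. ∎

From Theorem 4.1, it is clear that the achievable diversity gain introduced by diversifying measurement matrices and sparse vectors is determined by $\mathrm{rank}\{\tilde{\mathbf{D}}\}$. The larger $\mathrm{rank}\{\tilde{\mathbf{D}}\}$ is, the more reliable the recovery of sparse signals. Hence, compared to the SMV and MMV, more reliable recovery performance can be achieved by the proposed GMMV. For the special case that multiple sparse signals are identical, the MMV reduces to the SMV since $\mathrm{rank}(\mathbf{D}) = 1$, and there is no diversity gain by introducing multiple identical sparse signals. However, the GMMV in this case can still achieve diversity gain, which comes from diversifying measurement matrices.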

4.4.1.3 Pilot Design for CS Based Adaptive CSI Acquisition

According to the above discussions, a measurement matrix whose elements follow an i.i.d. Gaussian distribution satisfies the RIP. Furthermore, diversifying measurement matrices can further improve the recovery performance of sparse signals. This motivates an appropriate design of the pilot signals. Specifically, each element of the pilot signals is given by

$$\left[ \mathbf{S}_p^{[q,G]} \right]_{t,m} = e^{\,j\theta_{t,m,p}}, \quad 1 \le t \le G,\ 1 \le m \le M, \tag{4.20}$$

where $\mathbf{S}_p^{[q,G]} \in \mathbb{C}^{G \times M}$, and each $\theta_{t,m,p}$ has the i.i.d. uniform distribution in $[0, 2\pi)$, namely, the i.i.d. $\mathcal{U}[0, 2\pi)$. Note that the pilot signals for the CS based adaptive CSI acquisition are fixed once they have been designed. Moreover, when designing the pilot signals, the worst case of $G = M$ has to be considered. It is readily seen that the designed pilot signals (4.20) guarantee that the elements of $\mathbf{\Phi}_p^{[q,G]}$ of (4.10), $\forall p$, obey the i.i.d. complex Gaussian distribution with zero mean and unit variance, i.e., the i.i.d. $\mathcal{CN}(0, 1)$. Hence, the proposed pilot signal design is 'optimal', in terms of the reliable compression and recovery of sparse angular domain channels.
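A pilot matrix of the form (4.20) and the resulting sensing matrix can be generated as follows. This is a minimal sketch under assumptions: $\mathbf{A}_B$ is taken as the unitary DFT matrix, $(\mathbf{A}_B^*)^T$ is implemented as its element-wise conjugate, and the helper name `design_pilot` and the sizes $M = 128$, $G = 16$ are illustrative.

```python
import numpy as np

def design_pilot(G, M, rng):
    """Pilot matrix as in (4.20): unit-modulus entries with i.i.d. U[0, 2*pi) phases."""
    return np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(G, M)))

M, G = 128, 16
rng = np.random.default_rng(3)

k = np.arange(M)
A_B = np.exp(-2j * np.pi * np.outer(k, k) / M) / np.sqrt(M)   # unitary DFT (assumed A_B)

S = design_pilot(G, M, rng)
Phi = S @ A_B.conj()            # stands for S (A_B^*)^T under the stated assumption

avg_power = np.mean(np.abs(Phi) ** 2)
print("average |Phi|^2 per entry:", avg_power)
```

Because $\mathbf{A}_B$ is unitary, the total energy of `Phi` equals that of `S`, so the average entry power is exactly one; each entry is a normalized sum of $M$ independent unit-modulus terms, which is what makes the unit-variance Gaussian description plausible for large $M$.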

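4.4.2 Time Slot Overhead for CS Based Adaptive CSI Acquisition

According to Theorem 4.1, for the optimization problem (4.10), $\tilde{\mathbf{D}} = \mathbf{\Phi}_1^{[q,G]} \mathbf{F}$ with $\tilde{\mathbf{D}} = \left[ \mathbf{r}_1^{[q,G]}\ \ \mathbf{\Psi}_2^{-1} \mathbf{r}_2^{[q,G]}\ \ \cdots\ \ \mathbf{\Psi}_P^{-1} \mathbf{r}_P^{[q,G]} \right]$ and $\mathbf{F} = \left[ \bar{\mathbf{h}}_1^{(q)}\ \bar{\mathbf{h}}_2^{(q)}\ \cdots\ \bar{\mathbf{h}}_P^{(q)} \right]$. Since $\left| \mathrm{supp}\{\bar{\mathbf{h}}_p^{(q)}\} \right| = S_a$, it is clear that

$$\mathrm{rank}\{\tilde{\mathbf{D}}\} \le \mathrm{rank}\{\mathbf{F}\} \le S_a. \tag{4.21}$$

Moreover, as $\mathbf{\Phi}_1^{[q,G]} \in \mathbb{C}^{G \times M}$,

$$\mathrm{spark}\left( \mathbf{\Phi}_1^{[q,G]} \right) \in \{2, 3, \ldots, G + 1\}. \tag{4.22}$$

Substituting (4.21) and (4.22) into (4.16) yields $G \ge S_a + 1$. Therefore, the smallest required time slot overhead is $G = S_a + 1$. As discussed in Sect. 4.3.2, an appropriate value of $G$ that ensures reliable CSI acquisition is adaptively determined by Algorithm 4.1. By increasing the number of measurement vectors $P$, the required time slot overhead for reliable CE can be reduced, since more measurement matrices and sparse signals can increase $\mathrm{rank}\{\tilde{\mathbf{D}}\}$.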
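The substitution step can be written out explicitly. The short derivation below takes the most favourable values permitted by (4.21) and (4.22), namely $\mathrm{spark} = G + 1$ and $\mathrm{rank} = S_a$, and sets $S = S_a$ in (4.16); it is offered as a reading of how the bound arises rather than as an additional result.

```latex
\begin{aligned}
2S_a &< \underbrace{(G+1)}_{\mathrm{spark}(\mathbf{\Phi}_1^{[q,G]})} - 1
        + \underbrace{S_a}_{\mathrm{rank}\{\tilde{\mathbf{D}}\}} \\
\Longleftrightarrow\quad S_a &< G \\
\Longleftrightarrow\quad G &\ge S_a + 1 \qquad (G,\ S_a \text{ integers}).
\end{aligned}
```

If the rank of $\tilde{\mathbf{D}}$ is smaller than $S_a$, for instance when the channel vectors at different pilot subcarriers are strongly correlated, the same inequality demands a correspondingly larger $G$.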


4.4.3 Frequency-Domain Placement of Pilot Signals

Like any OFDM channel estimator, the proposed adaptive CE and feedback scheme only estimates the channels at pilot subcarriers. Channels at data subcarriers are usually acquired based on the estimated channels at pilot subcarriers by using off-the-shelf interpolation algorithms [18]. Clearly the frequency-domain placement of pilot signals $\xi$ significantly influences the achievable performance of an interpolation algorithm. Additionally, due to the frequency-domain correlation of wireless channels, the channels of adjacent subcarriers exhibit strong correlation. Hence, dedicating two adjacent subcarriers to the pilot may cause $\tilde{\mathbf{D}}$ to be rank deficient. We adopt the widely used uniformly spaced pilot placement with the spacing equal to the coherence bandwidth [18], which can reduce the correlation between different virtual angular domain channels, so that more diversity gain from the multiple sparse channels can be achieved.
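A uniformly spaced pilot index set can be generated in one line. The sketch below reuses the example parameters $N = 2048$ and $P = N_g = 64$ from Sect. 4.2, so that the spacing is $N_c = N/N_g = 32$ subcarriers; the starting offset of zero is a hypothetical choice that the text does not specify.

```python
import numpy as np

N, N_g = 2048, 64
P = N_g                 # number of pilot subcarriers
spacing = N // P        # N_c = N / N_g = 32 subcarriers between pilots
offset = 0              # hypothetical index of the first pilot subcarrier

xi = offset + spacing * np.arange(P)     # pilot subcarrier index set
print(xi[:5], "...", xi[-1])
```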

4.4.4 Performance Analysis of Proposed DSAMP Algorithm

4.4.4.1 Complexity

The computational complexity of the proposed DSAMP algorithm (Algorithm 4.2) in each iteration mainly depends on the following operations. Signal proxy (line 7): The matrix-vector multiplication involved has the complexity on the order of $O(PMG)$. $\ell_2$-norm operation (lines 8, 10, 13, 14, 16, 18, and 23): The computational complexity is $O(P)$. Identifying or pruning (lines 8 and 10): The cost to locate the largest $T$ entries of a size-$M$ vector is $O(M)$ [23]. LS operation (lines 9 and 11): The LS solution has the computational complexity on the order of $O\left(P(2GT^2 + T^3)\right)$ due to the joint recovery of $P$ sparse signals [26]. Residual computation (line 12): The complexity of computing the residual is $O(PMG)$. Obviously the matrix inversion implemented in Algorithm 4.2 for the LS operation contributes most of the computational complexity. Table 4.1 compares the complexity of the proposed DSAMP algorithm, classical OMP algorithm, SP algorithm [23], and SAMP algorithm, in terms of the number of required complex multiplications in each iteration to estimate one sparse signal. It is clear that the four algorithms have the same order of computational complexity.

Table 4.1 Computational complexity to estimate one sparse signal [28]

Algorithm   Number of complex multiplications in each iteration
OMP         $2GM + M + 2Gi^2 + i^3$
SP          $2GM + G + M + 2S_a + 2(2GS_a^2 + S_a^3)$
SAMP        $2GM + G + M + 3S_a + 2(2Gj^2 + j^3)$
DSAMP       $2GM + G + M + 3S_a + 2(2Gj^2 + j^3)$

Note: $i$ denotes the iteration index, and $j$ denotes the stage index.

4.4.4.2 Stopping Criteria

For the conventional SAMP algorithm, the iterative procedure stops when the residual is less than a given threshold. By contrast, the proposed DSAMP algorithm has two halting criteria, and meeting either of them will trigger the termination of the iterative procedure. Regarding the first halting criterion, when the average energy of the wireless channels at a certain virtual angular coordinate is lower than the noise floor (lines 14 and 23), the iterative procedure stops. When the residual of the current stage becomes larger than that of the previous stage (line 16), the second halting criterion is met and the algorithm also terminates.

Since $S_a \ll M$, once the coordinates accounting for the majority of the channel energy have been acquired, the next iteration will include a virtual angular coordinate that is dominated by the AWGN. The energy of such a new coordinate is usually lower than the noise floor. The first stopping criterion is designed to detect this situation and to terminate the algorithm when an appropriate number of virtual angular domain coordinates have been tracked. As for the second halting criterion, the DSAMP algorithm is similar to the conventional SP algorithm in each stage with the fixed sparsity level, which can guarantee the sparse signal recovery with the exact sparsity level. The residual of the stage with the exact sparsity level is usually smaller than that with the incorrect sparsity level. Therefore, the DSAMP algorithm stops at the stage when the smallest residual is reached, which tends to be the stage associated with the exact sparsity level of the channels in the virtual angular domain.

4.4.5 Performance Bound of CE By omitting q, p, and ξ p in (4.11) for simplicity, the variance of the CE can be expressed as 



2   

T ¯ ∗ T ¯ 2 h − h 2 = E A∗B var h = E h − A B h 2 

2  

¯ ¯ h − h = var = E h¯ . 2

(4.23)

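Consider the CRLB for the estimation problem associated with (4.9) given the true channel $\bar{\mathbf h}$ and the support set $\Omega$. Again for notational simplicity, $q$, $G$, and $p$ in $\mathbf r_p^{[q,G]}$, $\mathbf\Phi_p^{[q,G]}$, and $\mathbf v_p^{[q,G]}$ are omitted. Since the distribution of $\mathbf v$ is $\mathcal{CN}(\mathbf 0, \sigma^2\mathbf I_G)$, the conditional probability density function (PDF) of $\mathbf r$ given $\bar{\mathbf h}_\Omega$ is

$$p_{\mathbf r|\bar{\mathbf h}_\Omega}\left(\mathbf r|\bar{\mathbf h}_\Omega\right) = \frac{1}{(\pi\sigma^2)^{G}}\, e^{-\frac{\left\|\mathbf r - (\mathbf\Phi)_\Omega(\bar{\mathbf h})_\Omega\right\|_2^2}{\sigma^2}}, \tag{4.24}$$

where $\sigma^2$ is the power of the effective noise. The element at the $s_i$th row and $s_j$th column of the Fisher information matrix $\mathbf I(\bar{\mathbf h}_\Omega)$ associated with this estimation problem is

$$\left[\mathbf I(\bar{\mathbf h}_\Omega)\right]_{s_i,s_j} = \frac{1}{\sigma^2}\left[\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega\right]_{s_i,s_j}, \tag{4.25}$$

where $1 \le s_i, s_j \le |\Omega|$. Therefore, we have

$$\mathrm{var}\{\hat{\bar{\mathbf h}}_\Omega\} \ge \mathrm{Tr}\left\{\mathbf I^{-1}(\bar{\mathbf h}_\Omega)\right\} = \sigma^2\,\mathrm{Tr}\left\{\left(\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega\right)^{-1}\right\}. \tag{4.26}$$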

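Let $\lambda_1, \lambda_2, \ldots, \lambda_{S_a}$ be the $S_a$ eigenvalues of the matrix $\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega \in \mathbb C^{S_a\times S_a}$. It is clear that

$$\mathrm{Tr}\left\{\left(\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega\right)^{-1}\right\} = \sum_{i=1}^{S_a}\lambda_i^{-1}, \tag{4.27}$$

which can be calculated after the pilot signals, the geometrical structure of the BS antenna array, and the support set of the channel vectors in the virtual angular domain are given. However, the support set $\Omega$ is 'random' since the channel vectors in practice are random, and the elements of the measurement matrix obey the i.i.d. $\mathcal{CN}(0, 1)$ distribution. Thus we should consider the 'expectation' of the CRLB defined by

$$E\left\{\mathrm{var}\{\hat{\bar{\mathbf h}}_\Omega\}\right\} \ge E\left\{\sigma^2\sum_{i=1}^{S_a}\lambda_i^{-1}\right\}. \tag{4.28}$$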

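For the matrix $\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega$ with the elements of $\mathbf\Phi$ obeying the i.i.d. $\mathcal{CN}(0, 1)$ distribution, its eigenvalues $\{\lambda_i\}_{i=1}^{S_a}$ obey the following joint distribution [27]

$$p_{\tilde\lambda}\left(\lambda_1, \lambda_2, \ldots, \lambda_{S_a}\right) = e^{-\sum_{i=1}^{S_a}\lambda_i}\prod_{i=1}^{S_a}\left(\frac{\lambda_i^{G-S_a}}{(S_a-i)!\,(G-i)!}\prod_{j>i}\left(\lambda_j-\lambda_i\right)^2\right). \tag{4.29}$$

Consequently, the expectation of the CRLB can be written as

$$E\left\{\mathrm{var}\{\hat{\bar{\mathbf h}}_\Omega\}\right\} \ge \int_0^\infty\!\!\cdots\!\int_0^\infty \sigma^2\sum_{i=1}^{S_a}\lambda_i^{-1}\, p_{\tilde\lambda}\left(\lambda_1, \ldots, \lambda_{S_a}\right)\, d\lambda_1\cdots d\lambda_{S_a}. \tag{4.30}$$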


Since the computation of (4.30) can be highly complex, in practice we adopt the performance of the oracle LS estimator as the performance bound in the simulation study.
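As an alternative to evaluating the multiple integral in (4.30), the expectation in (4.28) can be approximated by Monte-Carlo averaging. The following is a minimal sketch under the assumption that the $G \times S_a$ sub-matrix $(\mathbf\Phi)_\Omega$ has i.i.d. $\mathcal{CN}(0,1)$ entries; the helper names (`crandn`, `gram`, `trace_inverse`, `expected_crlb`) and the default parameter values are illustrative and do not come from the text.

```python
import random

def crandn(rng):
    # One CN(0,1) sample: real and imaginary parts each have variance 1/2.
    s = 0.5 ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def gram(phi):
    # phi is a G x Sa matrix stored as a list of rows; returns phi^H phi (Sa x Sa).
    n_rows, n_cols = len(phi), len(phi[0])
    return [[sum(phi[g][i].conjugate() * phi[g][j] for g in range(n_rows))
             for j in range(n_cols)] for i in range(n_cols)]

def trace_inverse(a):
    # Gauss-Jordan elimination on [a | I] with partial pivoting;
    # returns the real part of the trace of a^{-1}.
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        p = m[c][c]
        m[c] = [x / p for x in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return sum(m[i][n + i] for i in range(n)).real

def expected_crlb(sigma2, G, Sa, trials=2000, seed=0):
    # Averages sigma^2 * sum_i 1/lambda_i over random draws of (Phi)_Omega.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        phi = [[crandn(rng) for _ in range(Sa)] for _ in range(G)]
        total += sigma2 * trace_inverse(gram(phi))
    return total / trials
```

The sketch requires $G > S_a$ so that the Gram matrix is invertible with probability one; small $S_a$ keeps the pure-Python elimination fast.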

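4.4.6 Adaptive Pilot Design and Required Time Slot Overhead for Closed-Loop Channel Tracking

For the simplicity of analysis, the true support set $\Omega$ and the sparsity level $S_a$ of the virtual angular domain channels are assumed to have been acquired by the CS based adaptive CSI acquisition. Clearly, if $S_a$ is known, the smallest time slot overhead for CSI acquisition can be reduced to $G = S_a$. With the known $\Omega$, by exploiting the arithmetic-harmonic means inequality [26], (4.27) can be further expressed as

$$\mathrm{Tr}\left\{\left(\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega\right)^{-1}\right\} \ge \frac{S_a^2}{\mathrm{Tr}\left\{\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega\right\}} = \frac{S_a^2}{\sum_{i=1}^{S_a}\lambda_i}, \tag{4.31}$$

where the equality holds if and only if $\lambda_1 = \lambda_2 = \cdots = \lambda_{S_a}$. This indicates that $\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega$ should be a diagonal matrix with identical diagonal elements to approach the lower bound. In particular, for $(\mathbf\Phi)_\Omega$ with $\mathrm{Tr}\left\{\left((\mathbf\Phi)_\Omega\right)^{*}(\mathbf\Phi)_\Omega\right\} = S_a G$,

$$\mathrm{var}\{\hat{\bar{\mathbf h}}_\Omega\} \ge \sigma^2 S_a/G, \tag{4.32}$$

and the lower bound of (4.32) is attained if $(\mathbf\Phi)_\Omega$ is a unitary matrix scaled by the factor $\sqrt G$. This has inspired us to design the pilot signal matrix as $\mathbf S = \left(\left((\mathbf A_B^{*})^{T}\right)_\Omega\right)^{\dagger}\sqrt G\,\mathbf U_{S_a}$, where $\mathbf U_{S_a} \in \mathbb C^{S_a\times S_a}$ is a unitary matrix. With this non-orthogonal pilot matrix, the lower bound of (4.32) is attained, i.e., $\mathrm{var}\{\hat{\bar{\mathbf h}}_\Omega\} = \sigma^2 S_a/G = \sigma^2$.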
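As a brief check of why a scaled unitary matrix attains the bound, assume for illustration that the restricted measurement matrix can be written as $\sqrt G\,\mathbf U$ with $\mathbf U^{*}\mathbf U = \mathbf I_{S_a}$ (the symbol $\mathbf U$ is used here only for this worked step). Substituting into (4.26) and (4.27) gives:

```latex
\begin{align*}
\left(\sqrt{G}\,\mathbf U\right)^{*}\left(\sqrt{G}\,\mathbf U\right)
  &= G\,\mathbf U^{*}\mathbf U = G\,\mathbf I_{S_a}
  \quad\Longrightarrow\quad \lambda_1 = \lambda_2 = \cdots = \lambda_{S_a} = G, \\
\sigma^2 \sum_{i=1}^{S_a} \lambda_i^{-1}
  &= \sigma^2 \sum_{i=1}^{S_a} \frac{1}{G} = \frac{\sigma^2 S_a}{G}.
\end{align*}
```

All eigenvalues are equal, so the equality condition of the arithmetic-harmonic means inequality holds, and choosing $G = S_a$ reduces the value to $\sigma^2$.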

4.4.7 Selection of Thresholds for Algorithms 4.1 and 4.2

4.4.7.1 $p_{\rm th}$ in Algorithm 4.2

Consider the case that, for the stage of $T = S_a + 1$ in Algorithm 4.2, the final estimated support set, denoted as $\Omega_T$, is a proper superset of the true support set $\Omega$, i.e., $\Omega_T \supsetneq \Omega$. This case implies that $\Omega_T$ includes a support index associated with the noise. Define the test statistic as $\rho_2 = \sum_{p=1}^{P}\left\|[\mathbf c_p]_{\tilde l}\right\|_2^2 / P$ with $\tilde l \in \Omega_T$ (lines 13 and 14 in Algorithm 4.2). Two complementary hypotheses for this case are defined as: $H_2^0$, indicating $\tilde l \in \Omega$, and $H_2^1$, indicating $\tilde l \notin \Omega$. Furthermore, denote the PDFs of $\rho_2$ under $H_2^0$ and $H_2^1$, both conditioned on this case, as $g_{\rho_2}(\rho_2|H_2^0)$ and $g_{\rho_2}(\rho_2|H_2^1)$, respectively. By using the ksdensity function of MATLAB, we can obtain the estimated PDFs according to Monte-Carlo simulations, since the closed-form expressions are difficult to obtain.

Figure 4.4a depicts the estimated PDFs $g_{\rho_2}(\rho_2|H_2^0)$ and $g_{\rho_2}(\rho_2|H_2^1)$ with typical values of $S_a$ and $G$, given SNR = 15 dB and P = 64. Figure 4.4b provides the MSE performance of Algorithm 4.2 as the function of $p_{\rm th}$, which indicates that $p_{\rm th} = 0.02$ achieves good MSE performance given typical values of $S_a$ and $G$. Following a similar procedure, suitable values of $p_{\rm th}$ for different SNRs can be obtained.

Fig. 4.4 The selection of threshold $p_{\rm th}$ in Algorithm 4.2 given SNR = 15 dB and P = 64 [28]: a estimated PDFs $g_{\rho_2}(\rho_2|H_2^0)$ and $g_{\rho_2}(\rho_2|H_2^1)$; and b MSE performance of the DSAMP algorithm as the function of $p_{\rm th}$

4.4.7.2 $\varepsilon$ in Algorithm 4.1

Consider the test statistic $\rho_1 = \sum_{p=1}^{P}\left\|\mathbf r_p^{[q,G]} - \mathbf\Phi_p^{[q,G]}\hat{\bar{\mathbf h}}_p^{(q)}\right\|_2^2 / (GP)$ (line 6 in Algorithm 4.1 with iteration index $i$ omitted) and the two complementary hypotheses $H_1^0$ and $H_1^1$, where $H_1^0$ indicates that the support set of $\{\hat{\bar{\mathbf h}}_p^{(q)}\}_{p=1}^{P}$ is correct, and $H_1^1$ is complementary to $H_1^0$. Under $H_1^0$, $\rho_1 = \sum_{p=1}^{P}\left\|\left(\mathbf I - (\mathbf\Phi)_\Omega\left((\mathbf\Phi)_\Omega\right)^{\dagger}\right)\mathbf v_p^{[q,G]}\right\|_2^2 / (GP)$. However, under $H_1^1$, the closed-form expression of $\rho_1$ is difficult to derive. Similar to Fig. 4.4, Fig. 4.5a provides the estimate of the PDF $g_{\rho_1}(\rho_1|H_1^0)$ with typical values of $S_a$ and $G$, given SNR = 15 dB, $G_0 = 10$, and P = 64. According to the Neyman-Pearson criterion [22], an appropriate threshold $\varepsilon$ should minimize the probability of false alarm given the probability of miss. Figure 4.5b depicts the simulated probability of false alarm $\Pr\{H_1^1|H_1^0\}$ and the miss probability $\Pr\{H_1^0|H_1^1\}$ of Algorithm 4.1 as the functions of $\varepsilon$, where $p_{\rm th} = 0.02$ is used in the simulation. The results of Fig. 4.5b indicate that $\varepsilon = 0.03$ minimizes both $\Pr\{H_1^1|H_1^0\}$ and $\Pr\{H_1^0|H_1^1\}$ given typical values of $S_a$ and $G$. Similarly, appropriate values of $\varepsilon$ for different SNRs can be obtained.

Fig. 4.5 The selection of threshold $\varepsilon$ in Algorithm 4.1 given SNR = 15 dB, $G_0 = 10$, and P = 64 [28]: a estimated PDF $g_{\rho_1}(\rho_1|H_1^0)$; and b $\Pr\{H_1^1|H_1^0\}$ and $\Pr\{H_1^0|H_1^1\}$ as the functions of $\varepsilon$

4.5 Simulation Results

A massive MIMO system with a ULA of M = 128 antennas and antenna spacing d = λ/2 was considered. The spatial angle spread varied from 10° to 20° [4, 13], and thus the effective sparsity level in the virtual angular domain $S_a$ was in the range of 8 to 14. In the simulations, $f_c$ = 2 GHz, $B_s$ = 10 MHz, N = 2048, and v = 36 km/h, while the channels in the virtual angular domain exhibited the spatially common sparsity over Q = 5 time blocks. The length of the guard interval was 64, which indicates that the system can combat a maximum delay spread of 6.4 µs [17], and thus we adopted P = 64 [10]. The threshold parameters, $\varepsilon$ in Algorithm 4.1 and $p_{\rm th}$ in Algorithm 4.2, were selected according to Sect. 4.4.7. Specifically, we set $p_{\rm th}$ to 0.06, 0.02, 0.01, 0.008, and 0.005, and $\varepsilon$ to 0.08, 0.03, 0.009, 0.003, and 0.001, respectively, at the SNR of 10 dB, 15 dB, 20 dB, 25 dB, and ≥ 30 dB. The oracle LS estimator and the CRLB were used as the benchmarks for the CS based adaptive CSI acquisition and the following closed-loop channel tracking, respectively. The time slot overhead G employed in the closed-loop channel tracking scheme was set to the estimated sparsity level obtained by the CS based adaptive CSI acquisition stage. The joint OMP (J-OMP) based CSI acquisition scheme [19] was also adopted for comparison.


Fig. 4.6 Performance comparison of different CS algorithms as functions of the sparsity level Sa given G = 30, P = 64 and SNR = 20 dB [28]: a MSE performance, and b computational complexity

Figure 4.6 compares the MSE performance and complexity of four CS algorithms under various sparsity levels $S_a$. In the simulations, P = 64 sparse signals with the length of M = 128 had the common sparsity, the measurement dimension was G = 30, and SNR = 20 dB, while the P measurement matrices were mutually independent with elements obeying the i.i.d. $\mathcal{CN}(0, 1)$ distribution. Note that the conventional OMP and SP algorithms require $S_a$ as a priori information. Figure 4.6a shows that the DSAMP algorithm achieves the best MSE performance and approaches the oracle LS estimator for $S_a \le 14$.³ This is because the DSAMP algorithm jointly estimates P sparse signals by exploiting the common sparsity. Moreover, Fig. 4.6b shows that the complexity of the DSAMP algorithm is slightly higher than those of its counterparts, but all four CS algorithms have the same order of complexity.

³ The DSAMP algorithm suffers from certain performance loss, compared to the oracle LS estimator in the noisy scenario with G ≤ 2Sa . For G = 2Sa , the case of i−1 ∩  = ∅ (line 9) and = i−1 (line 13) may repeatedly appear due to noise, resulting in the failure of the backtracking function of lines 7–12. For G < 2Sa , the case of i−1 ∩  = ∅ can lead to a poor LS estimation (line 9) due to | | > G. Also see [21].

We defined the sparse signal detection probability as the probability of correctly acquiring the support set of the sparse signal (channel). Figure 4.7 compares the detection probabilities as functions of the measurement dimension G achieved by the SMV, MMV, and GMMV in the noiseless scenario. In the simulation, the length of the multiple sparse signals was M = 128 with the common sparsity level $S_a = 8$, and the DSAMP algorithm was employed to recover the sparse signals. In particular, the SMV recovers a single sparse signal from a single measurement matrix, and the MMV jointly recovers P sparse signals with multiple identical measurement matrices, where the elements of the measurement matrix obey the i.i.d. $\mathcal{CN}(0, 1)$ distribution. By contrast, the GMMV recovers P sparse signals with mutually independent measurement matrices in parallel, where the elements of the measurement matrices also obey the i.i.d. $\mathcal{CN}(0, 1)$ distribution. From Fig. 4.7, it is clear that the joint processing of multiple sparse signals with the common support set and diversified measurement matrices significantly enhances the performance of sparse signal recovery. For example, to obtain a detection probability of one with P = 64, the MMV requires G = 17, but the proposed GMMV only needs G = 11, which indicates a reduction of approximately 35% in the required time slot overhead. Even the GMMV with P = 4 outperforms the MMV with P = 64.

Fig. 4.7 Comparison of the sparse signal detection probabilities of the SMV, MMV and the proposed GMMV as functions of G [28]

Figure 4.8 compares the MSE performance of the J-OMP scheme [19] with fixed G, the DSAMP algorithm with fixed G, and the CS based adaptive CSI acquisition scheme (Algorithm 4.1), where $S_a = 8$ was considered. The oracle LS estimator with the known support set of the sparse channel vectors was adopted as the performance bound. From Fig. 4.8, it can be seen that the J-OMP based CSI acquisition scheme performs poorly. By contrast, the proposed DSAMP algorithm is capable of approaching the oracle LS performance bound when $G > 2S_a$. However, there still exists a significant performance gap between the DSAMP algorithm and the oracle LS estimator for $G \le 2S_a$. This is because the unreliable sparse signal

recovery may occur when the time slot overhead G is insufficient, which degrades the MSE performance. Fortunately, the proposed CS based adaptive CSI acquisition scheme can adaptively adjust G to acquire a robust CE. Observe from Fig. 4.8 that the proposed CS based adaptive CSI acquisition scheme approaches the oracle LS performance bound even for $G \le 2S_a$. Note that for Algorithm 4.1, we only plot the MSE associated with $G \le 2S_a$, because Algorithm 4.1 actually determines an appropriate $G \le 2S_a$ adaptively.

Fig. 4.8 MSE performance of different CE and feedback schemes as functions of the time overhead G and SNR [28]

Figure 4.9 depicts the distributions of the time slot overhead G adaptively determined by the CS based adaptive CSI acquisition, given different sparsity levels $S_a$ and SNRs. In Algorithm 4.1, $G_0$ was set to 8. The results of Fig. 4.9 show that the proposed scheme can adaptively determine an appropriate G according to $S_a$. As pointed out in Sect. 4.2.4, to reliably acquire CSI, the required G in conventional schemes can be as large as G = M = 128. By exploiting the spatially common sparsity and temporal correlation of massive MIMO channels, the proposed scheme can effectively estimate the channels associated with hundreds of antennas at the BS with a dramatically reduced time slot overhead. Considering $S_a = 8$ at SNR = 30 dB for example, our scheme only uses a time slot overhead of G ≈ 10 to acquire CSI at the BS, which represents a reduction in the required G by about 92%, compared to conventional schemes. Figure 4.10 plots the distributions of the sparsity level $\hat S_a$ acquired by the proposed CS based adaptive CSI acquisition scheme, under the same settings as Fig. 4.9. The results of Fig. 4.10 show that the proposed scheme can accurately acquire the true sparsity level $S_a$. Note that the acquired $\hat S_a$ may be smaller than $S_a$ at low SNR. This

is because some virtual angular domain coordinates whose channel energy is lower than the noise floor may be discarded by the DSAMP algorithm. Because we set the time slot overhead G to $\hat S_a$ in the closed-loop channel tracking, Fig. 4.10 also provides the probability distributions of the time slot overhead used in the closed-loop channel tracking stage. As expected, the required time slot overhead in this stage is smaller than the time slot overhead actually used in the CS based adaptive CSI acquisition stage, which is confirmed by comparing Fig. 4.10 to Fig. 4.9.

Fig. 4.9 Distributions of adaptively selected time slot overhead by the CS based adaptive CSI acquisition scheme for different sparsity levels and SNRs [28]

Fig. 4.10 Distributions of the acquired sparsity level $\hat S_a$ by the CS based adaptive CSI acquisition scheme (which is then used as the time slot overhead for the proposed closed-loop channel tracking scheme) for different sparsity levels and SNRs [28]

Figure 4.11 compares the MSE and required average time slot overhead $\bar G$ of the CS based adaptive CSI acquisition with those of the closed-loop channel tracking for different $S_a$ at SNR = 20 dB. The initial overhead $G_0 = 10$ was set for the CS based adaptive CSI acquisition. It is clear that, benefiting from the accurately estimated sparsity level information provided by the CS based adaptive CSI acquisition, the closed-loop channel tracking enjoys better MSE performance with a smaller required time slot overhead. For $S_a = 14$, the required $\bar G$ for the CS based adaptive CSI acquisition and the following closed-loop channel tracking are 20.43 and 14.02, respectively. Since the CSI acquired by the CS based adaptive CSI acquisition is utilized to adaptively adjust the pilot signal for enhancing performance, the closed-loop channel tracking approaches the CRLB, as can be seen in Fig. 4.11. Also note that for the CS based adaptive CSI acquisition, the ratio $\bar G/S_a$ increases slightly as $S_a$ increases. Hence the MSE performance of the CS based adaptive CSI acquisition improves slightly as the true sparsity level $S_a$ increases.

Fig. 4.11 MSE performance comparison of the CS based adaptive CSI acquisition stage and closed-loop channel tracking stage for different sparsity levels $S_a$ at SNR = 20 dB, where the required $\bar G$ for each case is indicated [28]

Figure 4.12 provides the MSE performance comparison for different CE and feedback schemes, given $S_a = 8$ and various SNRs. Both the J-OMP based CSI acquisition scheme [19] and the DSAMP algorithm used the fixed G = 15. For the CS based adaptive CSI acquisition scheme, $G_0 = 13$ was considered. The required average time slot overheads for the proposed scheme are also marked in Fig. 4.12. Again, it is clear that the proposed CS based adaptive CSI acquisition stage (Algorithm 4.1), which uses the DSAMP algorithm with fixed G to adaptively determine an appropriate time slot overhead, outperforms the J-OMP based CSI acquisition scheme and the DSAMP algorithm with a reduced time slot overhead requirement. By utilizing the accurately estimated channel sparsity information provided by the CS based adaptive CSI acquisition scheme, the closed-loop channel tracking stage can adaptively adjust the pilot signal to approach the CRLB with a further reduced time slot overhead. Specifically, the proposed scheme can reliably acquire the CSI of this massive MIMO system, approaching the CRLB, with an average time slot overhead $\bar G < 2S_a$.

Fig. 4.12 MSE performance comparison of different CE and feedback schemes for various SNRs and true sparsity level $S_a = 8$, where the required average time slot overhead $\bar G$ for each case is indicated [28]

Figure 4.13 compares the downlink BER performance with ZF precoding, where the precoding is based on the estimated CSI corresponding to Fig. 4.12 under the same setup. In the simulations, the BS simultaneously served 16 users using 16-quadrature amplitude modulation signaling, and the effective noise in CSI acquisition was only introduced in the downlink channel. It can be observed that the proposed CE and feedback scheme outperforms its counterparts, and its BER performance is capable of approaching that of the CRLB.
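The overhead savings quoted in this section (G = 17 versus G = 11 for MMV and GMMV, and G = M = 128 versus G ≈ 10 for conventional and adaptive acquisition) follow from simple arithmetic, sketched below. The helper name `overhead_reduction` is illustrative only.

```python
def overhead_reduction(baseline, proposed):
    """Fractional saving in time slot overhead relative to a baseline."""
    return (baseline - proposed) / baseline

# GMMV (G = 11) versus MMV (G = 17) at P = 64
print(f"{overhead_reduction(17, 11):.1%}")    # prints 35.3%
# Adaptive acquisition (G about 10) versus G = M = 128
print(f"{overhead_reduction(128, 10):.1%}")   # prints 92.2%
```

Both values agree with the approximate figures of 35% and 92% stated in the text.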


Fig. 4.13 Downlink BER performance with ZF precoding, where the CSI at the BS is acquired by different CE and feedback schemes [28]

4.6 Summary

In this chapter, we developed an adaptive CE and feedback scheme for FDD massive MIMO, which achieves robust and accurate CSI acquisition at the BS while dramatically reducing the overhead for CE and feedback. The proposed scheme consists of two stages: the CS based adaptive CSI acquisition and the following closed-loop channel tracking. By exploiting the spatially common sparsity of massive MIMO channels within the system bandwidth, the CS based adaptive CSI acquisition can acquire the high-dimensional CSI from a small number of non-orthogonal pilots. The closed-loop channel tracking, which exploits the spatially common sparsity of massive MIMO channels over multiple consecutive time blocks, can effectively utilize the CSI acquired in the first stage to approach the CRLB. Besides, we generalized the MMV to the GMMV in CS theory and provided the CRLB of the proposed scheme, which guides the design of the non-orthogonal pilots for the different stages of the proposed scheme. Simulation results confirm that our scheme can reliably acquire the CSI of massive MIMO systems, approaching the performance bound with an adaptively determined time slot overhead.


References

1. Lu, L., Li, G.Y., Swindlehurst, A.L., Ashikhmin, A., Zhang, R.: An overview of massive MIMO: benefits and challenges. IEEE J. Sel. Topics Signal Process. 8(5), 742–758 (2014)
2. Rusek, F., Persson, D., Lau, B.K., Larsson, E.G., Marzetta, T.L., Edfors, O., Tufvesson, F.: Scaling up MIMO: opportunities and challenges with very large arrays. IEEE Signal Process. Mag. 30(1), 40–60 (2012)
3. Hoydis, J., Ten Brink, S., Debbah, M.: Massive MIMO in the UL/DL of cellular networks: how many antennas do we need? IEEE J. Sel. Areas Commun. 31(2), 160–171 (2013)
4. Yin, H., Gesbert, D., Filippou, M., Liu, Y.: A coordinated approach to channel estimation in large-scale multiple-antenna systems. IEEE J. Sel. Areas Commun. 31(2), 264–273 (2013)
5. Choi, J., Love, D.J., Bidigare, P.: Downlink training techniques for FDD massive MIMO systems: open-loop and closed-loop training with memory. IEEE J. Sel. Topics Signal Process. 8(5), 802–814 (2014)
6. Dai, L., Wang, Z., Yang, Z.: Spectrally efficient time-frequency training OFDM for mobile large-scale MIMO systems. IEEE J. Sel. Areas Commun. 31(2), 251–263 (2013)
7. Angelosante, D., Biglieri, E., Lops, M.: Sequential estimation of multipath MIMO-OFDM channels. IEEE Trans. Signal Process. 57(8), 3167–3181 (2009)
8. Simko, M., Diniz, P.S., Wang, Q., Rupp, M.: Adaptive pilot-symbol patterns for MIMO OFDM systems. IEEE Trans. Wirel. Commun. 12(9), 4705–4715 (2013)
9. Nam, Y.H., Akimoto, Y., Kim, Y., Lee, M.i., Bhattad, K., Ekpenyong, A.: Evolution of reference signals for LTE-advanced systems. IEEE Commun. Mag. 50(2), 132–138 (2012)
10. Minn, H., Al-Dhahir, N.: Optimal training signals for MIMO OFDM channel estimation. IEEE Trans. Wirel. Commun. 5(5), 1158–1168 (2006)
11. Cheng, P., Chen, Z.: Multidimensional compressive sensing based analog CSI feedback for massive MIMO-OFDM systems. In: 2014 IEEE 80th Vehicular Technology Conference (VTC2014-Fall), pp. 1–6. IEEE (2014)
12. Kuo, P.H., Kung, H., Ting, P.A.: Compressive sensing based channel feedback protocols for spatially-correlated massive antenna arrays. In: 2012 IEEE Wireless Communications and Networking Conference (WCNC), pp. 492–497. IEEE (2012)
13. Nam, J., Adhikary, A., Ahn, J.Y., Caire, G.: Joint spatial division and multiplexing: opportunistic beamforming, user grouping and simplified downlink scheduling. IEEE J. Sel. Topics Signal Process. 8(5), 876–890 (2014)
14. Hu, A., Lv, T., Gao, H., Zhang, Z., Yang, S.: An ESPRIT-based approach for 2-D localization of incoherently distributed sources in massive MIMO systems. IEEE J. Sel. Topics Signal Process. 8(5), 996–1011 (2014)
15. Zhou, Y., Herdin, M., Sayeed, A.M., Bonek, E.: Experimental study of MIMO channel statistics and capacity via the virtual channel representation. Univ. Wisconsin-Madison, Madison, WI, USA, Tech. Rep. 5, 10–15 (2007)
16. Tse, D., Viswanath, P.: Fundamentals of Wireless Communication. Cambridge University Press (2005)
17. Correia, L.M.: Mobile Broadband Multimedia Networks: Techniques, Models and Tools for 4G. Elsevier (2010)
18. Cho, Y.S., Kim, J., Yang, W.Y., Kang, C.G.: MIMO-OFDM Wireless Communications with MATLAB. Wiley (2010)
19. Rao, X., Lau, V.K.: Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems. IEEE Trans. Signal Process. 62(12), 3261–3271 (2014)
20. Rao, X., Lau, V.K., Kong, X.: CSIT estimation and feedback for FDD multi-user massive MIMO systems. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3157–3161. IEEE (2014)
21. Do, T.T., Gan, L., Nguyen, N., Tran, T.D.: Sparsity adaptive matching pursuit algorithm for practical compressed sensing. In: 2008 42nd Asilomar Conference on Signals, Systems and Computers, pp. 581–587. IEEE (2008)
22. Kay, S.M.: Fundamentals of Statistical Signal Processing. PTR Prentice-Hall, Englewood Cliffs, NJ (1993)
23. Duarte, M.F., Eldar, Y.C.: Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011)
24. Billingsley, P.: Probability and Measure. Wiley (2008)
25. Chen, J., Huo, X.: Theoretical results on sparse representations of multiple-measurement vectors. IEEE Trans. Signal Process. 54(12), 4634–4643 (2006)
26. Björck, Å., et al.: Numerical Methods in Matrix Computations, vol. 59. Springer (2015)
27. Couillet, R., Debbah, M.: Random Matrix Methods for Wireless Communications. Cambridge University Press (2011)
28. Gao, Z., Dai, L., Wang, Z., Chen, S.: Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO. IEEE Trans. Signal Process. 63(23), 6169–6183 (2015)

Chapter 5

Compressive Sensing Sparse Channel Estimation in Broadband Millimeter-Wave Massive MIMO Systems

Abstract For hybrid precoding based mmWave massive MIMO systems, the number of RF chains is usually much smaller than the number of antennas for cost-effectiveness, which leads to challenges in channel estimation (CE). At the same time, the frequency-selective fading (FSF) characteristic of practical mmWave channels cannot be ignored. Therefore, this chapter introduces a multi-user uplink CE scheme tailored for mmWave massive MIMO systems over FSF channels. This scheme leverages the structured sparsity of mmWave FSF channels in the angle domain through a distributed compressive sensing (DCS)-based CE approach. Besides, the considered algorithm adopts an adaptive measurement matrix to solve the power leakage problem caused by continuous angles of arrival or departure (AoA/AoD). Numerical results demonstrate the effectiveness of the considered solution.¹

5.1 Introduction

For millimeter-wave (mmWave) massive MIMO with the phase shifter network based hybrid precoding [1–3] or electromagnetic lens based beamspace MIMO [4, 5], the number of RF chains is usually much smaller than that of antennas for reduced hardware cost and power consumption. However, such architectures make CE challenging, since only a limited number of RF chains are available for hundreds of antennas [6]. To solve this problem, this chapter proposes a multi-user UL CE scheme for mmWave massive MIMO systems, where the broad-band FSF channel is converted to multiple parallel narrow-band flat fading channels when OFDM is considered. Specifically, the mmWave channels exhibit obvious angle-domain sparsity due to the much higher path loss for non-line-of-sight (NLOS) paths than that for line-of-sight (LOS) paths [7]. Moreover, this sparsity is almost unchanged within the system bandwidth according to our derivation. By exploiting such angle-domain structured sparsity of mmWave FSF channels, we propose a DCS-based CE scheme, where both the transmit pilot signal and receive CE algorithm are elaborated under the DCS theory for improved performance. By contrast, the conventional scheme in [7] fails to leverage the structured sparsity of channels. Moreover, by using the grid matching pursuit strategy with an adaptive measurement matrix, the proposed algorithm can solve the power leakage problem caused by the continuous AoA/AoD. Simulation results verify the good performance of the proposed scheme.

Notation: the boldface lower and upper-case symbols denote column vectors and matrices, respectively. The Moore-Penrose inversion, transpose, conjugate transpose, integer ceiling, and expectation operators are given by $(\cdot)^{\dagger}$, $(\cdot)^{T}$, $(\cdot)^{*}$, $\lceil\cdot\rceil$, and $E\{\cdot\}$, respectively. $|\Omega|$ is the cardinality of the set $\Omega$. The support set of the vector $\mathbf a$ is denoted by $\mathrm{supp}\{\mathbf a\}$. $\otimes$ is the Kronecker product, and $\mathrm{vect}(\cdot)$ is the vectorization operation according to the columns of the matrix. $[\mathbf a]_i$ denotes the $i$th entry of the vector $\mathbf a$, and $[\mathbf A]_{i,j}$ denotes the $i$th-row and $j$th-column element of the matrix $\mathbf A$.

¹ The work introduced in this chapter is based on the reference [10].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_5

5.2 System Model

We consider a typical mmWave massive MIMO-OFDM system over FSF channels as shown in Fig. 5.1 [8], where the BS employs $N_a^{\rm BS}$ antennas but only $N_{\rm RF}^{\rm BS}$ RF chains with $N_a^{\rm BS} \gg N_{\rm RF}^{\rm BS} = K$ to support K user equipments (UEs), and each UE has $N_a^{\rm UE}$ antennas but only $N_{\rm RF}^{\rm UE}$ RF chain with $N_a^{\rm UE} \gg N_{\rm RF}^{\rm UE} = 1$. The hybrid analog-digital precoding at the BS can be used to realize the spatial multiplexing of multiple data streams with low hardware cost and energy consumption [8].

Fig. 5.1 Illustration of a multi-user broad-band mmWave massive MIMO system [10]

Particularly, the UL FSF channel associated with the $k$th user in the delay domain can be modeled as [8]

$$\mathbf H_k^{d}(\tau) = \sum_{l=0}^{L_k-1}\mathbf H_{l,k}^{d}\,\delta\left(\tau - \tau_{l,k}\right), \tag{5.1}$$

where $L_k$ is the number of multipath components, $\tau_{l,k}$ is the delay of the $l$th path, $\mathbf H_{l,k}^{d} \in \mathbb C^{N_a^{\rm BS}\times N_a^{\rm UE}}$ is given by

$$\mathbf H_{l,k}^{d} = \alpha_{l,k}\,\mathbf a_{\rm BS}\left(d\sin(\theta_{l,k})/\lambda\right)\mathbf a_{\rm UE}^{*}\left(d\sin(\varphi_{l,k})/\lambda\right), \tag{5.2}$$

$\alpha_{l,k}$ is the complex gain of the $l$th path, and $\theta_{l,k} \in [0, 2\pi]$ and $\varphi_{l,k} \in [0, 2\pi]$ are the azimuth AoA/AoD if we consider the typical ULA. For path gains, we consider Rician fading channels consisting of one LOS path (the 0th path) and $L_k - 1$ NLOS paths (the $l$th path for $1 \le l \le L_k - 1$), where path gains follow the mutually independent complex Gaussian distribution with zero means, and $K_{\rm factor}$ denotes the ratio between the power of the LOS path and the power of the NLOS paths. In addition,

$$\begin{aligned}
\mathbf a_{\rm BS}\left(\frac{d\sin(\theta_{l,k})}{\lambda}\right) &= \left[e^{\,j2\pi n_{\rm BS}\, d\sin(\theta_{l,k})/\lambda}\right]^{T}_{n_{\rm BS}\in[0,1,\ldots,N_a^{\rm BS}-1]}, \\
\mathbf a_{\rm UE}\left(\frac{d\sin(\varphi_{l,k})}{\lambda}\right) &= \left[e^{\,j2\pi n_{\rm UE}\, d\sin(\varphi_{l,k})/\lambda}\right]^{T}_{n_{\rm UE}\in[0,1,\ldots,N_a^{\rm UE}-1]}
\end{aligned} \tag{5.3}$$

are steering vectors at the BS and the $k$th user, respectively, where $\lambda$ denotes the wavelength and $d$ is the antenna spacing.
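The steering vectors in (5.3) and the rank-one path matrix in (5.2) can be generated in a few lines. The sketch below uses only the Python standard library; the function names and the default half-wavelength spacing `d_over_lambda=0.5` are assumptions made for illustration, since this section does not fix the antenna spacing.

```python
import cmath
import math

def steering_vector(n_antennas, spatial_freq):
    """ULA response [exp(j*2*pi*n*spatial_freq)] for n = 0..n_antennas-1,
    where spatial_freq stands for d*sin(angle)/lambda as in (5.3)."""
    return [cmath.exp(2j * math.pi * n * spatial_freq) for n in range(n_antennas)]

def path_matrix(alpha, n_bs, n_ue, theta, phi, d_over_lambda=0.5):
    """Single-path contribution alpha * a_BS * a_UE^H, following (5.2).
    d_over_lambda = 0.5 is an assumed default, not a value from the text."""
    a_bs = steering_vector(n_bs, d_over_lambda * math.sin(theta))
    a_ue = steering_vector(n_ue, d_over_lambda * math.sin(phi))
    # Outer product with the conjugate of the UE steering vector.
    return [[alpha * x * y.conjugate() for y in a_ue] for x in a_bs]

# Example: one path seen by an 8-antenna BS and a 4-antenna UE.
H = path_matrix(0.5 + 0.5j, 8, 4, theta=0.3, phi=1.1)
```

Every entry of `H` has the same magnitude as the path gain, and the matrix has rank one, which is the structure that makes the angle-domain representation sparse.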

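5.3 DCS-Based CE Scheme

In this section, we propose a DCS-based CE scheme to jointly estimate the FSF channels.

5.3.1 UL Pilot Training

We consider that the TS used for CE adopt OFDM to combat the FSF channels, where the lengths of the cyclic prefix (CP) and DFT are $L_{\rm CP} > \left(\max\{\tau_{l,k}\}_{l=0,k=1}^{L_k-1,K} - \min\{\tau_{l,k}\}_{l=0,k=1}^{L_k-1,K}\right) f_s$ and $P > L_{\rm CP}$, respectively, where $f_s$ is the sampling rate. At the BS, after the CP removal and DFT operation, the received signal at the $p$th ($1 \le p \le P$) subcarrier of the $t$th OFDM symbol in the frequency domain can be expressed as

$$\mathbf r_p^{(t)} = \left(\mathbf Z_{\rm RF}^{(t)}\mathbf Z_{{\rm BB},p}^{(t)}\right)^{*}\sum_{k=1}^{K}\mathbf H_{p,k}^{f}\,\mathbf F_{{\rm RF},k}^{(t)}\mathbf F_{{\rm BB},p,k}^{(t)}\mathbf s_{p,k}^{(t)} + \mathbf v_p^{(t)}, \tag{5.4}$$

where $\mathbf r_p^{(t)} \in \mathbb C^{N_{\rm RF}^{\rm BS}\times 1}$ is the received signal dedicated to the $p$th pilot subcarrier in the $t$th OFDM symbol, $\mathbf Z_{{\rm BB},p}^{(t)} \in \mathbb C^{N_{\rm RF}^{\rm BS}\times N_{\rm RF}^{\rm BS}}$ is the digital combining matrix, $\mathbf Z_{\rm RF}^{(t)} \in \mathbb C^{N_a^{\rm BS}\times N_{\rm RF}^{\rm BS}}$ is the RF combining matrix, $\mathbf Z_p^{(t)} = \mathbf Z_{\rm RF}^{(t)}\mathbf Z_{{\rm BB},p}^{(t)} \in \mathbb C^{N_a^{\rm BS}\times N_{\rm RF}^{\rm BS}}$ is the composite combining matrix at the BS,

$$\mathbf H_{p,k}^{f} = \sum_{l=0}^{L_k-1}\mathbf H_{l,k}^{d}\, e^{-j2\pi f_s\tau_{l,k}p/P} = \sum_{l=0}^{L_k-1}\alpha_{l,k}\, e^{-\frac{j2\pi f_s\tau_{l,k}p}{P}}\,\mathbf a_{\rm BS}\left(\frac{d\sin(\theta_{l,k})}{\lambda}\right)\mathbf a_{\rm UE}^{*}\left(\frac{d\sin(\varphi_{l,k})}{\lambda}\right) \tag{5.5}$$

denotes the frequency-domain channel matrix associated with the $p$th pilot subcarrier for the $k$th UE, $\mathbf F_{{\rm RF},k}^{(t)} \in \mathbb C^{N_a^{\rm UE}\times N_{\rm RF}^{\rm UE}}$, $\mathbf F_{{\rm BB},p,k}^{(t)} \in \mathbb C^{N_{\rm RF}^{\rm UE}\times N_{\rm RF}^{\rm UE}}$, and $\mathbf s_{p,k}^{(t)} \in \mathbb C^{N_{\rm RF}^{\rm UE}\times 1}$ are the RF precoding matrix, digital precoding matrix, and transmitted TS for the $k$th UE, respectively, $\mathbf f_{p,k}^{(t)} = \mathbf F_{{\rm RF},k}^{(t)}\mathbf F_{{\rm BB},p,k}^{(t)}\mathbf s_{p,k}^{(t)} \in \mathbb C^{N_a^{\rm UE}\times 1}$ is considered as the pilot signal transmitted by the $k$th user, and $\mathbf v_p^{(t)}$ is the AWGN at the BS. Note that the RF precoding/combining is the same for all subcarriers, since the RF phase shifter network can provide a constant phase shift response over a wide frequency range [8].

Since the path loss for NLOS paths is much larger than that for LOS paths in mmWave systems, the mmWave channels exhibit obvious sparsity in the angular domain, which indicates small $L_k$ and large $K_{\rm factor}$ in mmWave systems, e.g., $L_k = 4$ and $K_{\rm factor} = 20$ dB [2]. Hence, we can transform the frequency-domain channel matrix $\mathbf H_{p,k}^{f}$ in (5.5) into the sparse angle-domain channel matrix $\mathbf H_{p,k}^{a}$ as [7]

$$\mathbf H_{p,k}^{a} = \mathbf A_{\rm BS}^{*}\mathbf H_{p,k}^{f}\mathbf A_{\rm UE}, \tag{5.6}$$

where $\mathbf A_{\rm BS} \in \mathbb C^{N_a^{\rm BS}\times N_a^{\rm BS}}$ and $\mathbf A_{\rm UE} \in \mathbb C^{N_a^{\rm UE}\times N_a^{\rm UE}}$ are the DFT matrices obtained by quantizing the virtual angular domain with the resolutions of $1/N_a^{\rm BS}$ at the BS and $1/N_a^{\rm UE}$ at the user, respectively. By vectorizing $\mathbf H_{p,k}^{f}$, we can further obtain

$$\mathbf h_{p,k}^{f} = \mathrm{vect}\left(\mathbf H_{p,k}^{f}\right) = \left((\mathbf A_{\rm UE}^{*})^{T}\otimes\mathbf A_{\rm BS}\right)\mathrm{vect}\left(\mathbf H_{p,k}^{a}\right) = \mathbf A\mathbf h_{p,k}^{a}, \tag{5.7}$$

where $\mathbf A = (\mathbf A_{\rm UE}^{*})^{T}\otimes\mathbf A_{\rm BS}$ and $\mathbf h_{p,k}^{a} = \mathrm{vect}\left(\mathbf H_{p,k}^{a}\right)$. Due to the sparsity of $\mathbf H_{p,k}^{a}$, only a minority of elements of $\mathbf h_{p,k}^{a}$ dominate the majority of the channel energy, and thus we have

$$\left|\Omega_{p,k}\right| = \left|\mathrm{supp}\left\{\mathbf h_{p,k}^{a}\right\}\right| = S_k \ll N_a^{\rm BS}N_a^{\rm UE}, \tag{5.8}$$

where $\Omega_{p,k}$ is the support set, and $S_k$ is the sparsity level in the angular domain. Note that if we consider that the quantized AoA/AoD have the same resolutions as $\mathbf A_{\rm UE}$ and $\mathbf A_{\rm BS}$, we have $S_k = L_k$ [7].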
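The angle-domain sparsity behind (5.6)–(5.8) can be illustrated in one dimension: projecting a steering vector onto a DFT dictionary gives a single nonzero bin when the spatial frequency lies on the DFT grid, and spreads over several bins otherwise. The following sketch assumes a unitary DFT dictionary whose $m$th column is the steering vector at spatial frequency $m/N$ scaled by $1/\sqrt N$; this normalization, the function names, and the 10% significance threshold are illustrative choices, not definitions from the text.

```python
import cmath
import math

def steering_vector(n, f):
    # ULA response at normalized spatial frequency f.
    return [cmath.exp(2j * math.pi * k * f) for k in range(n)]

def angular_spectrum(n, f):
    """Magnitudes of A^H a(f), where A is the assumed unitary n-point DFT
    dictionary with m-th column a(m/n)/sqrt(n)."""
    a = steering_vector(n, f)
    spectrum = []
    for m in range(n):
        col = steering_vector(n, m / n)
        inner = sum(c.conjugate() * x for c, x in zip(col, a))
        spectrum.append(abs(inner) / math.sqrt(n))
    return spectrum

def significant_bins(spectrum, frac=0.1):
    # Indices whose magnitude exceeds frac times the peak (illustrative rule).
    peak = max(spectrum)
    return [m for m, v in enumerate(spectrum) if v > frac * peak]

on_grid = angular_spectrum(16, 3 / 16)      # spatial frequency on the DFT grid
off_grid = angular_spectrum(16, 3.5 / 16)   # halfway between two grid points
print(significant_bins(on_grid))            # prints [3]
print(len(significant_bins(off_grid)))      # several bins: power leakage
```

The on-grid case corresponds to $S_k = L_k$, while the off-grid case shows how a continuous AoA/AoD leaks power into neighboring bins.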

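According to (5.6)–(5.8), (5.4) can be further expressed as

$$\begin{aligned}
\mathbf r_p^{(t)} &= \left(\mathbf Z_p^{(t)}\right)^{*}\sum_{k=1}^{K}\mathbf A_{\rm BS}\mathbf H_{p,k}^{a}\mathbf A_{\rm UE}^{*}\mathbf f_{p,k}^{(t)} + \mathbf v_p^{(t)} \\
&= \left(\mathbf Z_p^{(t)}\right)^{*}\mathbf A_{\rm BS}\bar{\mathbf H}_p^{a}\bar{\mathbf A}_{\rm UE}^{*}\bar{\mathbf f}_p^{(t)} + \mathbf v_p^{(t)} \\
&= \left(\left(\bar{\mathbf A}_{\rm UE}^{*}\bar{\mathbf f}_p^{(t)}\right)^{T}\otimes\left(\mathbf Z_p^{(t)}\right)^{*}\mathbf A_{\rm BS}\right)\mathrm{vect}\left(\bar{\mathbf H}_p^{a}\right) + \mathbf v_p^{(t)} \\
&= \mathbf\Phi_p^{(t)}\bar{\mathbf h}_p^{a} + \mathbf v_p^{(t)},
\end{aligned} \tag{5.9}$$

where

$$\begin{aligned}
\bar{\mathbf H}_p^{a} &= \left[\mathbf H_{p,1}^{a}, \mathbf H_{p,2}^{a}, \ldots, \mathbf H_{p,K}^{a}\right] \in \mathbb C^{N_a^{\rm BS}\times KN_a^{\rm UE}}, \\
\bar{\mathbf A}_{\rm UE}^{*} &= \mathrm{diag}\left\{\mathbf A_{\rm UE}^{*}, \mathbf A_{\rm UE}^{*}, \ldots, \mathbf A_{\rm UE}^{*}\right\} \in \mathbb C^{KN_a^{\rm UE}\times KN_a^{\rm UE}}, \\
\bar{\mathbf f}_p^{(t)} &= \left[(\mathbf f_{p,1}^{(t)})^{T}, (\mathbf f_{p,2}^{(t)})^{T}, \ldots, (\mathbf f_{p,K}^{(t)})^{T}\right]^{T} \in \mathbb C^{KN_a^{\rm UE}\times 1}, \\
\bar{\mathbf h}_p^{a} &= \mathrm{vect}\left(\bar{\mathbf H}_p^{a}\right) \in \mathbb C^{KN_a^{\rm BS}N_a^{\rm UE}\times 1}, \\
\mathbf\Phi_p^{(t)} &= \left(\bar{\mathbf A}_{\rm UE}^{*}\bar{\mathbf f}_p^{(t)}\right)^{T}\otimes\left(\mathbf Z_p^{(t)}\right)^{*}\mathbf A_{\rm BS} \in \mathbb C^{N_{\rm RF}^{\rm BS}\times KN_a^{\rm BS}N_a^{\rm UE}}.
\end{aligned} \tag{5.10}$$

Furthermore, we consider that the mmWave channels remain unchanged in G successive OFDM symbols within the channel coherence time [7]. By jointly using the received pilot signals in G successive OFDM symbols, we can obtain

$$\tilde{\mathbf r}_p = \tilde{\mathbf\Phi}_p\bar{\mathbf h}_p^{a} + \tilde{\mathbf v}_p, \tag{5.11}$$

where $\tilde{\mathbf r}_p = [(\mathbf r_p^{(1)})^{T}, (\mathbf r_p^{(2)})^{T}, \ldots, (\mathbf r_p^{(G)})^{T}]^{T} \in \mathbb C^{GN_{\rm RF}^{\rm BS}\times 1}$ is the aggregate received signal, $\tilde{\mathbf\Phi}_p = [(\mathbf\Phi_p^{(1)})^{T}, (\mathbf\Phi_p^{(2)})^{T}, \ldots, (\mathbf\Phi_p^{(G)})^{T}]^{T} \in \mathbb C^{GN_{\rm RF}^{\rm BS}\times KN_a^{\rm BS}N_a^{\rm UE}}$ is the aggregate measurement matrix, and $\tilde{\mathbf v}_p = [(\mathbf v_p^{(1)})^{T}, (\mathbf v_p^{(2)})^{T}, \ldots, (\mathbf v_p^{(G)})^{T}]^{T}$ is the aggregate AWGN. The system's SNR can be defined as ${\rm SNR} = E\{\|\tilde{\mathbf\Phi}_p\bar{\mathbf h}_p^{a}\|_2^2\}/E\{\|\tilde{\mathbf v}_p\|_2^2\}$ according to (5.11).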

5.3.2 DCS-Based CE

To accurately estimate the channels from (5.11), the value of $G$ required by conventional algorithms, such as the MMSE algorithm, depends heavily on the dimension of $\bar{\mathbf{h}}^a_p$, i.e., $K N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}$. Usually, $G N_{\mathrm{RF}}^{\mathrm{BS}} \geq K N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}$ is required, which makes the required training length $G$ much larger than the channel coherence time [7]. Fortunately, the sparsity of mmWave massive MIMO channels motivates us to leverage CS theory to estimate the channels with much reduced pilot overhead. Moreover, according to (5.5), it can be observed that $\{\mathbf{H}^f_{p,k}\}_{p=1}^{P}$ share the same AoA/AoD, and thus $\{\mathbf{h}^a_{p,k}\}_{p=1}^{P}$ obtained after (5.6) and (5.7) have structured sparsity within the system bandwidth, i.e.,

$$\mathrm{supp}\{\mathbf{h}^a_{1,k}\} = \mathrm{supp}\{\mathbf{h}^a_{2,k}\} = \cdots = \mathrm{supp}\{\mathbf{h}^a_{P,k}\} = \Theta_k.$$

(5.12)


Specifically, given (5.11) and the sparse constraints of (5.8) and (5.12), the channels can be estimated with standard DCS tools. However, due to the continuous AoA/AoD and the limited angle-domain resolution of $\mathbf{A}_{\mathrm{BS}}$ and $\mathbf{A}_{\mathrm{UE}}$, the sparsity of $\bar{\mathbf{h}}^a_p$ may be impaired by the power leakage problem [7], which results in poor CE performance. To this end, we propose a DGMP algorithm, listed in Algorithm 5.1 [10], which consists of an outer loop and an inner loop.

In each iteration of the outer loop (steps 2.1–2.3 and 2.19–2.21), a correlation operation (step 2.1) yields the UE index $\tilde{k}$ (step 2.2) and the adaptive measurement matrix $\bar{\boldsymbol{\Upsilon}}_p$ (step 2.3) associated with the most likely path, and both are passed to the inner loop. From the output of the inner loop, the $\tilde{k}$th UE's transmit/receive steering vectors are acquired (steps 2.19–2.20), and the LOS path gains of the $|\mathcal{K}|$ UEs and the residue $\mathbf{b}_p$ are updated (step 2.21). The outer loop stops when the AoA/AoD and path gains of all $K$ UEs' LOS paths have been estimated.

In the inner loop (steps 2.4–2.18), the AoA/AoD estimate associated with the $\tilde{k}$th UE's LOS path is refined with the grid matching strategy. Specifically, given the inputs $\tilde{k}$ and $\bar{\boldsymbol{\Upsilon}}_p$ from the outer loop, the AoA/AoD indices $n^{\mathrm{BS}}$ and $n^{\mathrm{UE}}$ of the most likely path are acquired (step 2.6), and the corresponding correlation value is recorded as $\beta$ (step 2.5). We then construct the local over-complete measurement matrix $\tilde{\boldsymbol{\Upsilon}}_p$ (steps 2.7–2.11), where the local resolution of the AoA associated with index $n^{\mathrm{BS}}$ and of the AoD associated with index $n^{\mathrm{UE}}$ is increased by $(2J - 1)$ times. A further correlation operation (step 2.12) yields the finer AoA/AoD indices $m^{\mathrm{BS}}$ and $m^{\mathrm{UE}}$ (step 2.13). Finally, $\bar{\boldsymbol{\Upsilon}}_p$ is adaptively updated, where the grid of AoA/AoD candidates is adjusted according to $m^{\mathrm{BS}}$ and $m^{\mathrm{UE}}$ (steps 2.14–2.18). The inner loop stops when $|\beta_{\mathrm{last}} - \beta| < \varepsilon$.
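The index bookkeeping of steps 2.6, 2.13, and 2.19–2.20 can be sketched as follows. This is a minimal illustration that assumes 1-indexed positions and ceiling-based division for splitting a column index into UE-side and BS-side grid indices; the function names and the example sizes are hypothetical and not taken from [10].

```python
def split_index(rho, n_bs):
    """Split a 1-indexed column index rho into (n_ue, n_bs_idx), as in step 2.6.

    Assumes n_ue = ceil(rho / n_bs) and n_bs_idx = rho - (n_ue - 1) * n_bs.
    """
    n_ue = -(-rho // n_bs)              # ceiling division for positive integers
    n_bs_idx = rho - (n_ue - 1) * n_bs
    return n_ue, n_bs_idx

def refined_angle(n, m, j_factor, n_ant):
    """Refined virtual angle (n + (-J + m - 1) / (2J)) / N used in steps 2.14-2.20."""
    return (n + (-j_factor + m - 1) / (2 * j_factor)) / n_ant

N_A_BS, N_A_UE, J = 128, 32, 10         # values taken from the simulation setup

# Every coarse index maps to a unique (n_ue, n_bs_idx) pair and back again.
round_trip_ok = all(
    (split_index(rho, N_A_BS)[0] - 1) * N_A_BS + split_index(rho, N_A_BS)[1] == rho
    for rho in range(1, N_A_UE * N_A_BS + 1)
)

print(split_index(129, N_A_BS))               # first BS index of the second UE-side bin
print(refined_angle(5, J + 1, J, N_A_BS))     # m = J + 1 leaves the coarse grid point unchanged
```

The same splitting applies to the fine index in step 2.13, with the block length 2J − 1 in place of the BS array size.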
With the joint processing of $\tilde{\boldsymbol{\Phi}}_p$ and $\mathbf{b}_p$ for $1 \leq p \leq P$, the DGMP algorithm exploits the structured sparsity for improved performance, as can be seen in steps 2.1, 2.4, and 2.12. Moreover, the adaptive measurement matrix $\bar{\boldsymbol{\Upsilon}}_p$ with the grid matching pursuit strategy can achieve high-resolution estimation of the AoA/AoD. Additionally, the near-LOS mmWave channel property is exploited, where only the $K$ UEs' LOS paths are estimated. Compared to the adaptive CS algorithm [7], which estimates a single sparse narrowband channel from a single received signal, the proposed DGMP algorithm jointly estimates multiple sparse subchannels from multiple received signals, whereby the angle-domain structured sparsity of mmWave FSF channels is exploited for improved performance. Moreover, the grid matching pursuit strategy (steps 2.4–2.18) with adaptive $\bar{\boldsymbol{\Upsilon}}_p$ can solve the power leakage problem caused by the continuous AoA/AoD, which distinguishes it from the classical DCS algorithms [9].

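5.3.3 Pilot Design According to DCS Theory

The measurement matrices $\tilde{\boldsymbol{\Phi}}_p$, $\forall p$, in (5.11) are very important for guaranteeing reliable CE. Usually, we have $G N_{\mathrm{RF}}^{\mathrm{BS}} \ll K N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}$. Since $\tilde{\boldsymbol{\Phi}}_p = [(\boldsymbol{\Phi}_p^{(1)})^T, (\boldsymbol{\Phi}_p^{(2)})^T, \ldots, (\boldsymbol{\Phi}_p^{(G)})^T]^T$, $\boldsymbol{\Phi}_p^{(t)} = (\bar{\mathbf{A}}_{\mathrm{UE}}^* \bar{\mathbf{f}}_p^{(t)})^T \otimes (\mathbf{Z}_p^{(t)})^* \mathbf{A}_{\mathrm{BS}}$, $\bar{\mathbf{A}}_{\mathrm{UE}}^* = \mathrm{diag}\{\mathbf{A}_{\mathrm{UE}}^*, \mathbf{A}_{\mathrm{UE}}^*, \ldots, \mathbf{A}_{\mathrm{UE}}^*\}$,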


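Algorithm 5.1 Proposed DGMP Algorithm

Input: Received signals $\tilde{\mathbf{r}}_p$ and sensing matrices $\tilde{\boldsymbol{\Phi}}_p$ in (5.11) $\forall p$, AoA/AoD resolution factor $J$, and error threshold $\varepsilon$.

Output: The steering vector estimates $\hat{\mathbf{a}}_{\mathrm{BS}}^{k,\mathrm{LOS}}$ and $\hat{\mathbf{a}}_{\mathrm{UE}}^{k,\mathrm{LOS}}$ of the $k$th UE's LOS path, and the estimated set of path gains $\hat{\boldsymbol{\alpha}} \in \mathbb{C}^{1 \times K}$, where $\hat{\alpha}_k$ denotes the gain estimate of the $k$th UE's LOS path.

• Step 1 (Initialization) The residue $\mathbf{b}_p = \tilde{\mathbf{r}}_p$, the iteration index $k = 1$, $[\tilde{\boldsymbol{\Phi}}_p]_{:,j} = [\tilde{\boldsymbol{\Phi}}_p]_{:,j} / \|[\tilde{\boldsymbol{\Phi}}_p]_{:,j}\|_2$ for $1 \leq j \leq K N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}$, $\forall p$, and the matrix $\boldsymbol{\Psi}_p$ and the set $\mathcal{K}$ are set to be empty.

• Step 2 (Estimate steering vectors and gains of $K$ UEs' LOS paths)

for $k \leq K$ do

1. $\rho = \arg\max_{\rho,\ \lceil \rho / (N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}) \rceil \notin \mathcal{K}} \sum_{p=1}^{P} \left\| \left[ (\tilde{\boldsymbol{\Phi}}_p)^* \mathbf{b}_p \right]_{\rho} \right\|_2^2$;
2. $\tilde{k} = \lceil \rho / (N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}) \rceil$, $\mathcal{K} = \mathcal{K} \cup \tilde{k}$;
3. $\bar{\boldsymbol{\Upsilon}}_p = [\tilde{\boldsymbol{\Phi}}_p]_{:,\ (\tilde{k}-1) N_a^{\mathrm{BS}} N_a^{\mathrm{UE}} + 1 \,:\, \tilde{k} N_a^{\mathrm{BS}} N_a^{\mathrm{UE}}}$;

repeat

4. $\rho = \arg\max_{\rho} \sum_{p=1}^{P} \left\| \left[ (\bar{\boldsymbol{\Upsilon}}_p)^* \mathbf{b}_p \right]_{\rho} \right\|_2^2$;
5. $\beta_{\mathrm{last}} = \beta$, $\beta = \sum_{p=1}^{P} \left\| \left[ (\bar{\boldsymbol{\Upsilon}}_p)^* \mathbf{b}_p \right]_{\rho} \right\|_2^2$;
6. $n^{\mathrm{UE}} = \lceil \rho / N_a^{\mathrm{BS}} \rceil$, $n^{\mathrm{BS}} = \rho - (n^{\mathrm{UE}} - 1) N_a^{\mathrm{BS}}$;
7. $\tilde{\mathbf{A}}_{\mathrm{UE}} = \left[ \mathbf{a}_{\mathrm{UE}}\!\left( (n^{\mathrm{UE}} + \tfrac{j_{\mathrm{UE}}}{2J}) / N_a^{\mathrm{UE}} \right) \right]_{j_{\mathrm{UE}} \in [-J, -J+1, \ldots, J]}$;
8. $\tilde{\mathbf{A}}_{\mathrm{BS}} = \left[ \mathbf{a}_{\mathrm{BS}}\!\left( (n^{\mathrm{BS}} + \tfrac{j_{\mathrm{BS}}}{2J}) / N_a^{\mathrm{BS}} \right) \right]_{j_{\mathrm{BS}} \in [-J, -J+1, \ldots, J]}$;
9. $\tilde{\boldsymbol{\Upsilon}}_p^{(t)} = (\tilde{\mathbf{A}}_{\mathrm{UE}}^* \mathbf{f}_{p,\tilde{k}}^{(t)})^T \otimes (\mathbf{Z}_p^{(t)})^* \tilde{\mathbf{A}}_{\mathrm{BS}}$;
10. $\tilde{\boldsymbol{\Upsilon}}_p = [(\tilde{\boldsymbol{\Upsilon}}_p^{(1)})^T, (\tilde{\boldsymbol{\Upsilon}}_p^{(2)})^T, \ldots, (\tilde{\boldsymbol{\Upsilon}}_p^{(G)})^T]^T$;
11. $[\tilde{\boldsymbol{\Upsilon}}_p]_{:,j} = [\tilde{\boldsymbol{\Upsilon}}_p]_{:,j} / \|[\tilde{\boldsymbol{\Upsilon}}_p]_{:,j}\|_2$, $1 \leq j \leq (2J - 1)^2$, $\forall p$;
12. $\eta = \arg\max_{\eta} \sum_{p=1}^{P} \left\| \left[ (\tilde{\boldsymbol{\Upsilon}}_p)^* \mathbf{b}_p \right]_{\eta} \right\|_2^2$;
13. $m^{\mathrm{UE}} = \lceil \eta / (2J - 1) \rceil$, $m^{\mathrm{BS}} = \eta - (m^{\mathrm{UE}} - 1)(2J - 1)$;
14. $\tilde{\mathbf{A}}_{\mathrm{UE}} = \left[ \mathbf{a}_{\mathrm{UE}}\!\left( (n^{\mathrm{UE}} + \tfrac{-J + m^{\mathrm{UE}} - 1}{2J}) / N_a^{\mathrm{UE}} \right) \right]_{n^{\mathrm{UE}} \in [0, 1, \ldots, N_{\mathrm{UE}} - 1]}$;
15. $\tilde{\mathbf{A}}_{\mathrm{BS}} = \left[ \mathbf{a}_{\mathrm{BS}}\!\left( (n^{\mathrm{BS}} + \tfrac{-J + m^{\mathrm{BS}} - 1}{2J}) / N_a^{\mathrm{BS}} \right) \right]_{n^{\mathrm{BS}} \in [0, 1, \ldots, N_{\mathrm{BS}} - 1]}$;
16. $\boldsymbol{\Upsilon}_p^{(t)} = (\tilde{\mathbf{A}}_{\mathrm{UE}}^* \mathbf{f}_{p,\tilde{k}}^{(t)})^T \otimes (\mathbf{Z}_p^{(t)})^* \tilde{\mathbf{A}}_{\mathrm{BS}}$;
17. $\boldsymbol{\Upsilon}_p = [(\boldsymbol{\Upsilon}_p^{(1)})^T, (\boldsymbol{\Upsilon}_p^{(2)})^T, \ldots, (\boldsymbol{\Upsilon}_p^{(G)})^T]^T$;
18. $[\bar{\boldsymbol{\Upsilon}}_p]_{:,j} = [\boldsymbol{\Upsilon}_p]_{:,j} / \|[\boldsymbol{\Upsilon}_p]_{:,j}\|_2$, $1 \leq j \leq N_a^{\mathrm{UE}} N_a^{\mathrm{BS}}$, $\forall p$;

until $|\beta_{\mathrm{last}} - \beta| < \varepsilon$

19. $\hat{\mathbf{a}}_{\mathrm{BS}}^{\tilde{k},\mathrm{LOS}} = \mathbf{a}_{\mathrm{BS}}\!\left( (n^{\mathrm{BS}} + \tfrac{-J + m^{\mathrm{BS}} - 1}{2J}) / N_a^{\mathrm{BS}} \right)$;
20. $\hat{\mathbf{a}}_{\mathrm{UE}}^{\tilde{k},\mathrm{LOS}} = \mathbf{a}_{\mathrm{UE}}\!\left( (n^{\mathrm{UE}} + \tfrac{-J + m^{\mathrm{UE}} - 1}{2J}) / N_a^{\mathrm{UE}} \right)$;
21. $\boldsymbol{\Psi}_p = \left[ \boldsymbol{\Psi}_p, [\boldsymbol{\Upsilon}_p]_{:,\eta} \right]$, $\hat{\boldsymbol{\alpha}}_{\mathcal{K}} = (\boldsymbol{\Psi}_p)^{\dagger} \tilde{\mathbf{r}}_p$, $\mathbf{b}_p = \tilde{\mathbf{r}}_p - \boldsymbol{\Psi}_p \hat{\boldsymbol{\alpha}}_{\mathcal{K}}$;

end for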


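and $\mathbf{A}_{\mathrm{UE}}$, $\mathbf{A}_{\mathrm{BS}}$ are determined by the geometrical structure of the antenna arrays, both $\{\mathbf{f}_{p,k}^{(t)}\}_{p=1,k=1,t=1}^{P,K,G}$ transmitted by the $K$ users and $\{\mathbf{Z}_p^{(t)}\}_{p=1,t=1}^{P,G}$ at the BS should be carefully designed to guarantee robust CE.

According to [9], a measurement matrix whose elements follow an independent identically distributed (i.i.d.) Gaussian distribution can achieve good performance for sparse signal recovery. Furthermore, diversifying the measurement matrices $\tilde{\boldsymbol{\Phi}}_p$, $\forall p$, can further improve the recovery performance of sparse signals according to DCS theory [9]. This motivates us to design the pilot signals for mmWave massive MIMO systems appropriately. Specifically, as discussed above, $\mathbf{Z}_p^{(t)} = \mathbf{Z}_{\mathrm{RF}}^{(t)} \mathbf{Z}_{\mathrm{BB},p}^{(t)}$ and $\mathbf{f}_{p,k}^{(t)} = \mathbf{F}_{\mathrm{RF},k}^{(t)} \mathbf{F}_{\mathrm{BB},p,k}^{(t)} \mathbf{s}_{p,k}^{(t)} = \mathbf{F}_{\mathrm{RF},k}^{(t)} \tilde{\mathbf{s}}_{p,k}^{(t)}$ if we define $\tilde{\mathbf{s}}_{p,k}^{(t)} = \mathbf{F}_{\mathrm{BB},p,k}^{(t)} \mathbf{s}_{p,k}^{(t)}$ ($1 \leq k \leq K$, $1 \leq t \leq G$, $1 \leq p \leq P$). Hence, we propose that each element of the pilot signals is given by

$$\begin{aligned}
\left[ \mathbf{Z}_{\mathrm{RF}}^{(t)} \right]_{i_1, j_1} &= e^{j \phi^1_{i_1, j_1, t}}, \quad 1 \leq i_1 \leq N_a^{\mathrm{BS}},\ 1 \leq j_1 \leq N_{\mathrm{RF}}^{\mathrm{BS}}, \\
\left[ \mathbf{F}_{\mathrm{RF},k}^{(t)} \right]_{i_2, j_2} &= e^{j \phi^2_{i_2, j_2, t, k}}, \quad 1 \leq i_2 \leq N_a^{\mathrm{UE}},\ 1 \leq j_2 \leq N_{\mathrm{RF}}^{\mathrm{UE}}, \\
\left[ \mathbf{Z}_{\mathrm{BB},p}^{(t)} \right]_{i_4, j_4} &= e^{j \phi^4_{i_4, j_4, p, t}}, \quad 1 \leq i_4 \leq N_{\mathrm{RF}}^{\mathrm{BS}},\ 1 \leq j_4 \leq N_{\mathrm{RF}}^{\mathrm{BS}}, \\
\left[ \tilde{\mathbf{s}}_{p,k}^{(t)} \right]_{i_3} &= e^{j \phi^3_{i_3, p, t, k}}, \quad 1 \leq i_3 \leq N_{\mathrm{RF}}^{\mathrm{UE}},
\end{aligned}$$

(5.13)

where $\phi^1_{i_1, j_1, t}$, $\phi^2_{i_2, j_2, t, k}$, $\phi^3_{i_3, p, t, k}$, and $\phi^4_{i_4, j_4, p, t}$ follow the i.i.d. uniform distribution $U[0, 2\pi)$. Note that the elements of the RF precoding/combining matrices should meet the constant modulus property, and different subcarriers share the same RF precoding/combining. It is readily seen that the designed pilot signals guarantee that the elements of $\tilde{\boldsymbol{\Phi}}_p$ obey the i.i.d. complex Gaussian distribution with zero mean. Moreover, $\tilde{\boldsymbol{\Phi}}_p$ with different $p$ are diversified. Hence, the proposed pilot signal design is optimal in terms of the joint recovery of the multiple users' sparse angle-domain channels in the UL.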
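As a rough illustration of the pilot design in (5.13), the sketch below draws every entry as a unit-modulus exponential with a uniformly distributed phase and then forms the combiner and pilot products defined in the text. The array sizes, the fixed seed, and the helper names are assumptions made for this example only.

```python
import cmath
import math
import random

def random_phase_matrix(rows, cols, rng):
    """Matrix whose entries are exp(j*phi) with phi drawn uniformly from the 0..2*pi range."""
    return [[cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(cols)]
            for _ in range(rows)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

rng = random.Random(0)                              # fixed seed, illustrative only
N_A_BS, N_RF_BS, N_A_UE, N_RF_UE = 8, 2, 4, 1       # assumed toy sizes

Z_rf = random_phase_matrix(N_A_BS, N_RF_BS, rng)    # shared by all subcarriers
Z_bb = random_phase_matrix(N_RF_BS, N_RF_BS, rng)   # drawn per subcarrier p
F_rf = random_phase_matrix(N_A_UE, N_RF_UE, rng)    # shared by all subcarriers
s_tilde = random_phase_matrix(N_RF_UE, 1, rng)      # drawn per subcarrier p

Z = matmul(Z_rf, Z_bb)       # hybrid combiner Z_p^(t) = Z_RF Z_BB,p
f = matmul(F_rf, s_tilde)    # pilot vector f_p,k^(t) = F_RF,k s_tilde_p,k

constant_modulus = all(abs(abs(x) - 1.0) < 1e-12 for row in Z_rf + F_rf for x in row)
print(constant_modulus, len(Z), len(Z[0]), len(f))
```

Redrawing the baseband factors for each subcarrier while keeping the RF factors fixed is what diversifies the measurement matrices across subcarriers in this sketch.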

5.4 Simulation Results

In this section, we investigate the performance of the proposed DCS-based CE. In the simulations, the carrier frequency is $f_c = 30$ GHz, $f_s = 0.25$ GHz, the maximum delay spread is $\tau_{\max} = 100$ ns, $L_{\mathrm{CP}} = \tau_{\max} f_s = 25$, $P = 32$, $N_a^{\mathrm{UE}} = 32$, $N_{\mathrm{RF}}^{\mathrm{UE}} = 1$, $N_a^{\mathrm{BS}} = 128$, $N_{\mathrm{RF}}^{\mathrm{BS}} = 4$, $d = \lambda/2$, $K_{\mathrm{factor}} = 20$ dB, $J = 10$, $\varepsilon = 10^{-3}$, $K = 4$, and $L_k = 4$ for $1 \leq k \leq K$. The case with the ideal AoA/AoD known at the BS is used as the performance benchmark for comparison. The adaptive CS-based CE scheme [7] is also adopted for comparison.

Figure 5.2 investigates the downlink spectral efficiency (bit per channel use [bpcu]) obtained with the hybrid analog-digital precoding scheme in [1], where the channels were estimated by the adaptive CS scheme [7] and the proposed DGMP algorithm. The case with ideal AoA/AoD was adopted as the performance bound. From Fig. 5.2, it can be observed that the adaptive CS scheme performs poorly, since it does not exploit the structured sparsity of mmWave massive MIMO channels. In contrast, the proposed DGMP algorithm can approach the performance bound with ideal AoA/AoD when $G \geq 20$. This is because the proposed DCS-based CE scheme can leverage the angle-domain structured sparsity of mmWave FSF channels within the system bandwidth. By contrast, to approach the performance bound, the conventional adaptive CS algorithm requires a larger $G$, e.g., $G > 90$ is required at SNR = 0 dB. Hence, the proposed scheme can substantially reduce the required training overhead for FSF CE compared to its counterpart.

Fig. 5.2 Comparison of spectral efficiency performance of different CE schemes against the training overhead G and SNR [10]

Figure 5.3 compares the downlink BER performance, where 16-QAM is used, and $G$ is set to 40 for the adaptive CS algorithm and 30 for the DGMP algorithm. It can be observed that the proposed CE scheme outperforms its counterpart with reduced training overhead, and its BER performance is very close to the performance bound with ideal AoA/AoD.
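To put the reported overhead in perspective, the short calculation below plugs the simulation parameters into the conventional requirement stated in Sect. 5.3.2, namely that the number of measurements should be at least the number of unknown angle-domain coefficients. It is back-of-the-envelope arithmetic under that assumption, not a result taken from [10].

```python
import math

K, N_A_UE, N_A_BS, N_RF_BS = 4, 32, 128, 4    # simulation parameters from this section
TAU_MAX, F_S = 100e-9, 0.25e9                 # maximum delay spread (s) and f_s (Hz)

unknowns = K * N_A_UE * N_A_BS                # dimension of the stacked angle-domain vector
g_conventional = math.ceil(unknowns / N_RF_BS)  # smallest G with G * N_RF_BS >= unknowns
g_dgmp = 20                                   # G at which DGMP approaches the bound (Fig. 5.2)
l_cp = round(TAU_MAX * F_S)                   # cyclic prefix length

print(unknowns, g_conventional, g_conventional / g_dgmp, l_cp)
```

Under these numbers a non-sparse estimator would need thousands of OFDM symbols, whereas the sparse formulation operates with a few tens.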

5.5 Summary

In this chapter, we propose a DCS-based UL CE scheme for multi-user mmWave massive MIMO with hybrid precoding, which can effectively combat mmWave FSF channels. Specifically, we have designed an efficient pilot scheme and proposed a reliable DGMP algorithm under the framework of DCS theory, whereby the angle-domain structured sparsity of mmWave FSF channels is exploited to reduce the training overhead. Moreover, by using the grid matching pursuit strategy with an adaptive measurement matrix, the proposed algorithm can effectively solve the power leakage problem. Simulation results confirm that our scheme can accurately estimate the FSF channels in mmWave massive MIMO with much lower pilot overhead than the existing scheme.

Fig. 5.3 BER performance comparison of different CE schemes [10]

References

1. Alkhateeb, A., Leus, G., Heath, R.W.: Limited feedback hybrid precoding for multi-user millimeter wave systems. IEEE Trans. Wirel. Commun. 14(11), 6481–6494 (2015)
2. Gao, Z., Dai, L., Mi, D., Wang, Z., Imran, M.A., Shakir, M.Z.: Mmwave massive-MIMO-based wireless backhaul for the 5G ultra-dense network. IEEE Wirel. Commun. 22(5), 13–21 (2015)
3. Gao, X., Dai, L., Han, S., Chih-Lin, I., Heath, R.W.: Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays. IEEE J. Sel. Areas Commun. 34(4), 998–1009 (2016)
4. Brady, J., Behdad, N., Sayeed, A.M.: Beamspace MIMO for millimeter-wave communications: system architecture, modeling, analysis, and measurements. IEEE Trans. Antennas Propag. 61(7), 3814–3827 (2013)
5. Zeng, Y., Zhang, R.: Millimeter wave MIMO with lens antenna array: a new path division multiplexing paradigm. IEEE Trans. Commun. 64(4), 1557–1571 (2016)
6. Han, S., Chih-Lin, I., Xu, Z., Wang, S.: Reference signals design for hybrid analog and digital beamforming. IEEE Commun. Lett. 18(7), 1191–1193 (2014)
7. Alkhateeb, A., El Ayach, O., Leus, G., Heath, R.W.: Channel estimation and hybrid precoding for millimeter wave cellular systems. IEEE J. Sel. Topics Signal Process. 8(5), 831–846 (2014)
8. Alkhateeb, A., Heath, R.W.: Frequency selective hybrid precoding for limited feedback millimeter wave systems. IEEE Trans. Commun. 64(5), 1801–1818 (2016)
9. Eldar, Y.C., Kutyniok, G.: Compressed Sensing: Theory and Applications. Cambridge University Press (2012)
10. Gao, Z., Hu, C., Dai, L., Wang, Z.: Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selective fading channels. IEEE Commun. Lett. 20(6), 1259–1262 (2016)

Chapter 6

Subspace-Based Super-Resolution Sparse Channel Estimation in Millimeter-Wave Massive MIMO Systems

Abstract This chapter introduces super-resolution sparse channel estimation (CE) schemes for both narrowband and wideband mmWave massive MIMO systems with hybrid precoding. Specifically, for the narrowband case, a two-dimensional (2D) unitary estimating signal parameters via rotational invariance techniques (ESPRIT) algorithm is adopted to accurately estimate the angles of arrival/departure (AoAs/AoDs) by exploiting the inherent sparsity of angle-domain channels. Then, the discussion is extended to wideband mmWave hybrid full-dimensional MIMO-OFDM systems, in which the introduced closed-loop sparse CE scheme leverages the channel sparsity in both the angle and delay domains to enhance performance. This scheme includes a downlink and an uplink CE stage, where the multi-dimensional unitary ESPRIT (MDU-ESPRIT) algorithm is used to estimate the AoAs at the user devices (UDs) in the downlink and to estimate the AoDs and the UDs' delays at the base station in the uplink. Furthermore, the channel parameters acquired at the two stages are paired by a maximum likelihood method, and the path gains are then estimated using the least-squares approach. The spectrum estimation techniques in hybrid MIMO ensure super-resolution estimates of the AoAs/AoDs and delays with low training overhead. Finally, the superiority of the considered schemes over state-of-the-art approaches is verified by numerical experiments.¹

¹ The work introduced in this chapter is based on references [38], [39].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_6

6.1 Introduction

Millimeter-wave (mmWave) massive MIMO has been widely regarded as one of the most important technologies for next-generation wireless communications due to the large under-utilized bandwidth in the mmWave frequency band [1]. For mmWave MIMO systems, a large number of antennas with a small form factor can be deployed at the BS and the mobile station (MS) to achieve a large array gain that combats the severe path loss of mmWave channels. However, for the large antenna arrays in mmWave MIMO, conventional full digital precoding can be unaffordable, since each antenna requires an associated expensive RF chain and power-hungry ADCs [1, 2]. To reduce the hardware cost and power consumption while still achieving spatial multiplexing, hybrid precoding with a much smaller number of RF chains than antennas has been proposed. Hybrid precoding consists of digital precoding at baseband and analog precoding at the RF front end, which can be exploited to realize both beamforming and spatial multiplexing [3, 4]. Nonetheless, for such a hybrid MIMO system, it is challenging to estimate the high-dimensional mmWave channel from the low-dimensional effective measurements observed through the limited number of RF chains, since the training overhead for CE can be excessively high [5]. Moreover, the low signal-to-noise ratio (SNR) before beamforming can further degrade the performance of CSI acquisition [6].

To this end, this chapter first proposes a 2D unitary ESPRIT based CE algorithm for mmWave hybrid massive MIMO systems. By designing the training signals at both the BS and the MS, we can obtain, with low pilot overhead, a low-dimensional effective channel that has the shift-invariance of the array response. Then, with the aid of the proposed CE algorithm, we jointly obtain super-resolution estimates of the AoAs and AoDs by exploiting the shift-invariance of the array response preserved in this low-dimensional effective channel matrix. Moreover, the path gains are estimated by applying the LS estimator. Finally, the high-dimensional mmWave MIMO channel is reconstructed according to the acquired AoAs, AoDs, and the corresponding path gains.

Furthermore, we develop a closed-loop sparse CE scheme for multi-user wideband mmWave FD-MIMO systems. The proposed closed-loop solution includes the downlink CE stage followed by the UL CE stage, as illustrated in Fig. 6.7, where the channel reciprocity in TDD based systems is exploited [7–9]. In TDD based systems, downlink AoAs (AoDs) are UL AoDs (AoAs).
At the downlink CE stage, the horizontal/vertical AoAs of the sparse MPCs are first estimated at each UD, and they are fed back to the BS with limited quantization accuracy through the feedback link. At this stage, we design a common random transmit precoding matrix at the BS to transmit the training signals for omnidirectional channel sounding, and we design the receive combining matrix at each UD to visualize the high-dimensional hybrid array as a low-dimensional digital array, which facilitates the use of the MDU-ESPRIT algorithm to estimate the channel parameters. Similarly, at the UL CE stage, the horizontal/vertical AoDs and delays associated with different UDs are successively estimated by using the MDU-ESPRIT algorithm at the BS. Owing to the channel reciprocity, the AoAs estimated at the UD side can be utilized as a priori information to design the multi-beam transmit precoding matrix, which improves the received SNR for UL CE. An ML approach is adopted at the BS to pair the channel parameters acquired at these two stages and, consequently, the associated path gains can readily be obtained using the LS estimator. Finally, the mmWave channel associated with each UD can be separately reconstructed based on the dominant channel parameters estimated above. Simulation results show that the proposed super-resolution CE schemes achieve better performance than conventional schemes with a reduced pilot overhead.

Notation: The boldface lower and upper-case symbols denote column vectors and matrices, respectively. $(\cdot)^*$, $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{-1}$, $(\cdot)^{\dagger}$, $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ denote the conjugate, transpose, Hermitian transpose, matrix inversion, Moore-Penrose inversion operators, integer ceiling and integer floor operators, respectively. $\|\mathbf{a}\|_1$ and $\|\mathbf{a}\|_2$ are the $\ell_1$-norm and $\ell_2$-norm of $\mathbf{a}$, respectively, while $\|\mathbf{A}\|_F$ is the Frobenius norm of $\mathbf{A}$, and $|\mathcal{Q}|_c$ is the cardinality of the set $\mathcal{Q}$. The Kronecker and Khatri-Rao product operations are denoted by $\otimes$ and $\odot$, respectively. $\mathbf{I}_n$ denotes the $n \times n$ identity matrix and $\mathbf{O}_{m \times n}$ is the null matrix of size $m \times n$, while $\mathbf{1}_n$ ($\mathbf{0}_n$) denotes the vector of size $n$ with all the elements being 1 (0). $\mathbf{U}_n$ denotes a unitary matrix with size $n \times n$, and $\mathbf{J}_n$ denotes an exchange matrix with size $n \times n$ that reverses the order of rows of $\mathbf{I}_n$. $\mathrm{diag}(\mathbf{a})$ is the diagonal matrix with the elements of $\mathbf{a}$ at its diagonal entries, $\mathrm{vdiag}(\mathbf{A})$ denotes the vector consisting of the main diagonal elements of $\mathbf{A}$, and $\mathrm{Bdiag}([\mathbf{A}_1 \cdots \mathbf{A}_n])$ denotes the block diagonal matrix with $\mathbf{A}_1, \ldots, \mathbf{A}_n$ as its block diagonal entries. The expectation and determinant operators are denoted by $E(\cdot)$ and $\det(\cdot)$, respectively. The modulo operation $\mathrm{mod}(m, n)$ returns the remainder of dividing $m$ by $n$, and $\mathrm{mod}(\mathcal{Q}, n)$ returns the set containing $\mathrm{mod}(m, n)$ $\forall m \in \mathcal{Q}$ of the ordered set $\mathcal{Q}$. The operator $\mathrm{find}(\mathbf{a} \neq 0)$ returns the set containing the indices of the nonzero elements of $\mathbf{a}$, and $\mathrm{mat}(\mathbf{a}; m, n)$ converts the vector $\mathbf{a}$ of size $mn$ into the matrix of size $m \times n$ by successively selecting every $m$ elements of $\mathbf{a}$ as its columns. The operator $\mathrm{vec}(\mathbf{A})$ stacks the columns of $\mathbf{A}$ on top of one another, $[\mathbf{A}]_{m,n}$ denotes the $m$th-row and $n$th-column element of $\mathbf{A}$, and $\mathbf{a}_{[m:n]}$ is the vector consisting of the $m$th to $n$th elements of $\mathbf{a}$, while $\mathbf{A}_{[:,m:n]}$ is the sub-matrix containing the $m$th to $n$th columns of $\mathbf{A}$. $\mathbf{A}_{[\mathcal{Q},:]}$ denotes the sub-matrix containing the rows of $\mathbf{A}$ indexed in the ordered set $\mathcal{Q}$, and $\mathbf{A}_{[\mathcal{Q},i]}$ is the $i$th column of $\mathbf{A}_{[\mathcal{Q},:]}$. Finally, $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ denote the real part and imaginary part of the corresponding arguments, respectively.

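6.2 Subspace-Based Super-Resolution Sparse Channel Estimation in Narrowband Millimeter-Wave Massive MIMO Systems

6.2.1 System Model

We consider a typical mmWave massive MIMO UL system with hybrid precoding, as shown in Fig. 6.1b, where the MS and BS are equipped with $N_{\mathrm{MS}}$ and $N_{\mathrm{BS}}$ antennas but only $N_{\mathrm{RF}}^{\mathrm{MS}}$ and $N_{\mathrm{RF}}^{\mathrm{BS}}$ RF chains, respectively [1, 3, 4, 10–14]. $N_S$ independent data streams are employed by the MS and BS, such that $N_S \leq N_{\mathrm{RF}}^{\mathrm{MS}} \leq N_{\mathrm{MS}}$ and $N_S \leq N_{\mathrm{RF}}^{\mathrm{BS}} \leq N_{\mathrm{BS}}$. In the UL transmission, the received signal $\mathbf{y} \in \mathbb{C}^{N_S \times 1}$ at the BS can be expressed as

$$\mathbf{y} = \mathbf{W}^H \mathbf{H} \mathbf{F} \mathbf{s} + \mathbf{W}^H \mathbf{n},$$

(6.1)

where $\mathbf{W} = \mathbf{W}_{\mathrm{RF}} \mathbf{W}_{\mathrm{BB}} \in \mathbb{C}^{N_{\mathrm{BS}} \times N_S}$ is the hybrid combiner, $\mathbf{W}_{\mathrm{RF}} \in \mathbb{C}^{N_{\mathrm{BS}} \times N_{\mathrm{RF}}^{\mathrm{BS}}}$ and $\mathbf{W}_{\mathrm{BB}} \in \mathbb{C}^{N_{\mathrm{RF}}^{\mathrm{BS}} \times N_S}$ denote the analog and digital combiners, respectively, and $\mathbf{H} \in \mathbb{C}^{N_{\mathrm{BS}} \times N_{\mathrm{MS}}}$ is the UL channel matrix. $\mathbf{F} = \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}} \in \mathbb{C}^{N_{\mathrm{MS}} \times N_S}$ is the hybrid precoder, where $\mathbf{F}_{\mathrm{RF}} \in \mathbb{C}^{N_{\mathrm{MS}} \times N_{\mathrm{RF}}^{\mathrm{MS}}}$ and $\mathbf{F}_{\mathrm{BB}} \in \mathbb{C}^{N_{\mathrm{RF}}^{\mathrm{MS}} \times N_S}$ denote the analog and digital precoders, respectively. Note that every entry of $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}$ should satisfy the constant modulus constraint, i.e. $|[\mathbf{F}_{\mathrm{RF}}]_{m,n}| = 1/\sqrt{N_{\mathrm{MS}}}$ and $|[\mathbf{W}_{\mathrm{RF}}]_{m,n}| = 1/\sqrt{N_{\mathrm{BS}}}$ for the $(m, n)$th elements of $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}$, respectively, since both $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}$ are realized by analog RF phase shifters. To guarantee the total transmit power constraint, the digital precoder $\mathbf{F}_{\mathrm{BB}}$ is further normalized as $\|\mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}}\|_F^2 = N_{\mathrm{RF}}^{\mathrm{MS}}$. $\mathbf{s} \in \mathbb{C}^{N_S \times 1}$ denotes the transmitted baseband signal from the MS, and $\mathbf{n} \in \mathbb{C}^{N_{\mathrm{BS}} \times 1}$ is the complex AWGN following the distribution $\mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I})$ at the BS.

Fig. 6.1 Block diagram of transceiver for a typical mmWave massive MIMO system [38]: a full digital precoding, b hybrid analog-digital precoding

Due to the severe path loss for non-line-of-sight (NLOS) paths and thus the limited number of significant scatterers, the geometric mmWave channel with sparse MPCs is adopted [1, 4, 10, 11, 13–17]. We consider that only $L$ dominant paths corresponding to $L$ different scatterers contribute to the channel matrix $\mathbf{H}$, which can be written as

$$\mathbf{H} = \sqrt{\frac{N_{\mathrm{BS}} N_{\mathrm{MS}}}{L}} \sum_{l=1}^{L} \alpha_l \mathbf{a}_{\mathrm{BS}}(\theta_l) \mathbf{a}_{\mathrm{MS}}^H(\varphi_l),$$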

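where $\alpha_l \sim \mathcal{CN}(0, \sigma_\alpha^2)$ is the complex gain of the $l$th path, and $\theta_l$ and $\varphi_l$ are the azimuth angles of the AoA and AoD pair of the $l$th path, respectively. Here, a typical ULA is considered at both the BS and the MS [1, 10, 11, 13–16], so the steering vectors $\mathbf{a}_{\mathrm{BS}}(\theta_l)$ and $\mathbf{a}_{\mathrm{MS}}(\varphi_l)$ associated with the $l$th path can be respectively written as

$$\begin{aligned}
\mathbf{a}_{\mathrm{BS}}(\theta_l) &= \frac{1}{\sqrt{N_{\mathrm{BS}}}} \left[ 1, e^{j 2\pi \Delta \sin(\theta_l)}, \ldots, e^{j 2\pi \Delta (N_{\mathrm{BS}} - 1) \sin(\theta_l)} \right]^T, \\
\mathbf{a}_{\mathrm{MS}}(\varphi_l) &= \frac{1}{\sqrt{N_{\mathrm{MS}}}} \left[ 1, e^{j 2\pi \Delta \sin(\varphi_l)}, \ldots, e^{j 2\pi \Delta (N_{\mathrm{MS}} - 1) \sin(\varphi_l)} \right]^T,
\end{aligned}$$

(6.3)

where $\Delta = d/\lambda$ denotes the normalized spacing of adjacent antennas, $\lambda$ is the wavelength, and $d$ is the spacing of adjacent antennas. Furthermore, the mmWave channel matrix $\mathbf{H}$ can be rewritten in a more compact form as

$$\mathbf{H} = \mathbf{A}_{\mathrm{BS}} \mathbf{D} \mathbf{A}_{\mathrm{MS}}^H,$$

(6.4)

where $\mathbf{A}_{\mathrm{BS}} = [\mathbf{a}_{\mathrm{BS}}(\theta_1), \ldots, \mathbf{a}_{\mathrm{BS}}(\theta_L)] \in \mathbb{C}^{N_{\mathrm{BS}} \times L}$, $\mathbf{A}_{\mathrm{MS}} = [\mathbf{a}_{\mathrm{MS}}(\varphi_1), \ldots, \mathbf{a}_{\mathrm{MS}}(\varphi_L)] \in \mathbb{C}^{N_{\mathrm{MS}} \times L}$, and $\mathbf{D} = \sqrt{N_{\mathrm{BS}} N_{\mathrm{MS}} / L}\, \mathrm{diag}(\boldsymbol{\alpha})$ is a diagonal matrix with $\boldsymbol{\alpha} = [\alpha_1, \ldots, \alpha_L]^T$.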
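The equivalence of the path-sum form (6.2) and the compact form (6.4), together with the shift-invariance that ESPRIT relies on, can be checked numerically. The sketch below is an illustration in plain Python; the antenna counts, path gains, angles, half-wavelength spacing, and function names are all arbitrary assumptions.

```python
import cmath
import math

def steering(n, angle, delta=0.5):
    """ULA steering vector of (6.3) with normalized antenna spacing delta = d / lambda."""
    return [cmath.exp(2j * math.pi * delta * i * math.sin(angle)) / math.sqrt(n)
            for i in range(n)]

def channel_sum(n_bs, n_ms, gains, aoas, aods):
    """Path-sum form (6.2)."""
    scale = math.sqrt(n_bs * n_ms / len(gains))
    h = [[0j] * n_ms for _ in range(n_bs)]
    for g, theta, phi in zip(gains, aoas, aods):
        a_bs, a_ms = steering(n_bs, theta), steering(n_ms, phi)
        for r in range(n_bs):
            for c in range(n_ms):
                h[r][c] += scale * g * a_bs[r] * a_ms[c].conjugate()
    return h

def channel_compact(n_bs, n_ms, gains, aoas, aods):
    """Compact form (6.4): H = A_BS D A_MS^H with D = sqrt(N_BS N_MS / L) diag(alpha)."""
    scale = math.sqrt(n_bs * n_ms / len(gains))
    a_bs = [steering(n_bs, t) for t in aoas]      # a_bs[l] is the l-th column of A_BS
    a_ms = [steering(n_ms, p) for p in aods]      # a_ms[l] is the l-th column of A_MS
    return [[sum(a_bs[l][r] * scale * gains[l] * a_ms[l][c].conjugate()
                 for l in range(len(gains)))
             for c in range(n_ms)] for r in range(n_bs)]

gains, aoas, aods = [1.0 + 0.5j, -0.3 + 0.2j], [0.3, -0.7], [1.0, 0.2]
H1 = channel_sum(4, 3, gains, aoas, aods)
H2 = channel_compact(4, 3, gains, aoas, aods)
form_gap = max(abs(H1[r][c] - H2[r][c]) for r in range(4) for c in range(3))

# Shift-invariance for a single path: adjacent rows differ by a fixed phase factor.
Hs = channel_sum(4, 3, [1.0 + 0j], [0.3], [1.0])
row_ratio = Hs[1][0] / Hs[0][0]
expected_ratio = cmath.exp(2j * math.pi * 0.5 * math.sin(0.3))
print(form_gap, abs(row_ratio - expected_ratio))
```

The constant ratio between adjacent rows (and, analogously, adjacent columns) is the structure that the training design in the next subsection aims to preserve.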

6.2.2 Proposed 2D Unitary ESPRIT Based Super-Resolution Channel Estimation Scheme

In this section, we propose a 2D unitary ESPRIT based super-resolution CE scheme for mmWave massive MIMO with hybrid precoding. Specifically, the UL training signals are first designed to estimate, with low pilot overhead, a low-dimensional effective channel that has the same shift-invariance of array response as the high-dimensional mmWave MIMO channel matrix. Then the super-resolution estimates of the AoAs and AoDs are jointly obtained by exploiting the 2D unitary ESPRIT based CE algorithm. Moreover, the path gains are estimated by applying the LS estimator. The high-dimensional mmWave MIMO channel is then reconstructed according to the acquired AoAs, AoDs, and path gains. Finally, the computational complexity of the proposed CE scheme is compared with that of its conventional counterparts.

6.2.2.1 Design of Training Signals for UL Channel Estimation

To estimate the high-dimensional mmWave MIMO channel, we first estimate the AoAs and AoDs with high accuracy. Specifically, we design the UL training signals, consisting of the analog RF part and the digital baseband part at both the BS and the MS, so that a low-dimensional effective channel having the same shift-invariance of array response as the high-dimensional mmWave channel can be acquired. We begin by considering the UL CE in multiple time slots, where the received signal $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_{T_{\mathrm{MS}}}] \in \mathbb{C}^{N_S \times T_{\mathrm{MS}}}$ in $T_{\mathrm{MS}}$ time slots, or one time block, can be expressed as

$$\mathbf{Y} = \mathbf{W}^H \mathbf{H} \mathbf{F} \mathbf{S} + \mathbf{W}^H \mathbf{N},$$

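(6.5)

where $\mathbf{S} = [\mathbf{s}_1, \ldots, \mathbf{s}_{T_{\mathrm{MS}}}] \in \mathbb{C}^{N_S \times T_{\mathrm{MS}}}$ denotes the transmitted pilot signal block, $\mathbf{N} = [\mathbf{n}_1, \ldots, \mathbf{n}_{T_{\mathrm{MS}}}] \in \mathbb{C}^{N_{\mathrm{BS}} \times T_{\mathrm{MS}}}$ is the AWGN in $T_{\mathrm{MS}}$ time slots, and we consider that the channel matrix remains unchanged during the CE stage. Furthermore, to improve the CE performance, we jointly exploit $N_b^T N_b^R$ time blocks, so that the aggregated received signal $\tilde{\mathbf{Y}} \in \mathbb{C}^{N_b^R N_S \times N_b^T T_{\mathrm{MS}}}$ in $N_b^T N_b^R$ time blocks can be expressed as

$$\tilde{\mathbf{Y}} = \begin{bmatrix} \mathbf{Y}_{1,1} & \cdots & \mathbf{Y}_{1,N_b^T} \\ \vdots & \ddots & \vdots \\ \mathbf{Y}_{N_b^R,1} & \cdots & \mathbf{Y}_{N_b^R,N_b^T} \end{bmatrix} = \tilde{\mathbf{W}}^H \mathbf{H} \tilde{\mathbf{F}} \bar{\mathbf{S}} + \bar{\mathbf{W}}^H \tilde{\mathbf{N}},$$

(6.6)

where $\mathbf{Y}_{i,j} \in \mathbb{C}^{N_S \times T_{\mathrm{MS}}}$, for $i = 1, \ldots, N_b^R$ and $j = 1, \ldots, N_b^T$, is the received signal in the $((i-1) N_b^T + j)$th time block,

$$\begin{aligned}
\tilde{\mathbf{W}} &= \left[ \mathbf{W}_1, \ldots, \mathbf{W}_{N_b^R} \right] \in \mathbb{C}^{N_{\mathrm{BS}} \times N_b^R N_S}, \\
\tilde{\mathbf{F}} &= \left[ \mathbf{F}_1, \ldots, \mathbf{F}_{N_b^T} \right] \in \mathbb{C}^{N_{\mathrm{MS}} \times N_b^T N_S},
\end{aligned}$$

(6.7)

are the aggregated hybrid combiner and precoder, respectively, and they will be designed later. $\bar{\mathbf{S}} = \mathrm{diag}[\mathbf{S}, \ldots, \mathbf{S}] \in \mathbb{C}^{N_b^T N_S \times N_b^T T_{\mathrm{MS}}}$ is the aggregated pilot signal transmitted by the MS with $N_b^T$ identical pilot signal blocks $\mathbf{S}$ on the block diagonal, $\bar{\mathbf{W}} = \mathrm{diag}[\tilde{\mathbf{W}}] \in \mathbb{C}^{N_b^R N_{\mathrm{BS}} \times N_b^R N_S}$, and $\tilde{\mathbf{N}} \in \mathbb{C}^{N_b^R N_{\mathrm{BS}} \times N_b^T T_{\mathrm{MS}}}$ is the aggregated AWGN matrix. Thus, the total pilot overhead required for CE is $T = T_{\mathrm{MS}} N_b^R N_b^T$.

As discussed before, for mmWave MIMO with hybrid precoding, each baseband observation contains the signals from different antennas due to the RF phase shift network (PSN). Hence, directly using the conventional ESPRIT algorithms can be difficult, since the shift-invariance of array response in these baseband observations is destroyed. To solve this problem, we design the aggregated precoder $\tilde{\mathbf{F}}$ and combiner $\tilde{\mathbf{W}}$ so that the shift-invariance of array response in the baseband observations can be preserved. Particularly, we consider the aggregated precoder $\tilde{\mathbf{F}}$ and combiner $\tilde{\mathbf{W}}$ with the following forms, i.e.

$$\begin{aligned}
\tilde{\mathbf{F}} &= \alpha_f \begin{bmatrix} \mathbf{I}_{N_b^T N_S} \\ \mathbf{O}_{(N_{\mathrm{MS}} - N_b^T N_S) \times N_b^T N_S} \end{bmatrix}, \\
\tilde{\mathbf{W}} &= \alpha_w \begin{bmatrix} \mathbf{I}_{N_b^R N_S} \\ \mathbf{O}_{(N_{\mathrm{BS}} - N_b^R N_S) \times N_b^R N_S} \end{bmatrix},
\end{aligned}$$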

(6.8)

where $\alpha_f$ and $\alpha_w$ are the scale factors for $\tilde{\mathbf{F}}$ and $\tilde{\mathbf{W}}$, respectively, which guarantee the constraints of constant modulus and power. As a result, $\tilde{\mathbf{W}}^H \mathbf{H} \tilde{\mathbf{F}}$ can be considered as the low-dimensional effective channel matrix $\bar{\mathbf{H}} \in \mathbb{C}^{N_b^R N_S \times N_b^T N_S}$ by substituting (6.8), i.e.

$$\bar{\mathbf{H}} = \tilde{\mathbf{W}}^H \mathbf{H} \tilde{\mathbf{F}} = \alpha_w \alpha_f \begin{bmatrix} H_{1,1} & \cdots & H_{1,N_b^T N_S} \\ \vdots & \ddots & \vdots \\ H_{N_b^R N_S,1} & \cdots & H_{N_b^R N_S,N_b^T N_S} \end{bmatrix},$$

(6.9)

where $H_{m,n}$ represents the $(m, n)$th element of $\mathbf{H}$. From (6.9), we can observe that the elements of the low-dimensional effective channel matrix $\bar{\mathbf{H}}$ come from the elements of the high-dimensional channel matrix $\mathbf{H}$ in its first $N_b^R N_S$ rows and first $N_b^T N_S$ columns. Hence, $\bar{\mathbf{H}}$ and $\mathbf{H}$ share the same shift-invariance of array response. In this way, we can use ESPRIT algorithms to acquire the super-resolution estimates of the AoAs and AoDs from the low-dimensional effective channel matrix $\bar{\mathbf{H}}$ instead of the original high-dimensional mmWave MIMO channel matrix $\mathbf{H}$, which reduces the pilot overhead.

Clearly, to facilitate the usage of ESPRIT algorithms, according to (6.7), the analog and digital precoders $\{\mathbf{F}_{\mathrm{RF},j}\}_{j=1}^{N_b^T}$ and $\{\mathbf{F}_{\mathrm{BB},j}\}_{j=1}^{N_b^T}$ as well as the analog and digital combiners $\{\mathbf{W}_{\mathrm{RF},i}\}_{i=1}^{N_b^R}$ and $\{\mathbf{W}_{\mathrm{BB},i}\}_{i=1}^{N_b^R}$ should be well designed to guarantee (6.8). For clarity in what follows, we neglect the constraints of constant modulus of the analog phase shifter network and of total transmit power (namely, $\alpha_f$ and $\alpha_w$) for the precoder and combiner here. To be specific, a unitary matrix $\mathbf{U}_{N_{\mathrm{RF}}^{\mathrm{MS}}} = [\mathbf{u}_1, \ldots, \mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{MS}}}] \in \mathbb{C}^{N_{\mathrm{RF}}^{\mathrm{MS}} \times N_{\mathrm{RF}}^{\mathrm{MS}}}$, such as a DFT matrix, is considered as the set of the UL training signals. Its columns are mutually orthogonal, i.e. $\mathbf{u}_m^H \mathbf{u}_m = N_{\mathrm{RF}}^{\mathrm{MS}}$ for $m = 1, \ldots, N_{\mathrm{RF}}^{\mathrm{MS}}$, while $\mathbf{u}_m^H \mathbf{u}_n = 0$ for $m \neq n$. Furthermore, we consider that the digital precoder $\mathbf{F}_{\mathrm{BB},j}$, for $j = 1, \ldots, N_b^T$, comes from the first $N_S$ columns of $\mathbf{U}_{N_{\mathrm{RF}}^{\mathrm{MS}}}$, i.e.

$$\mathbf{F}_{\mathrm{BB},j} = \left[ \mathbf{u}_1, \ldots, \mathbf{u}_{N_S} \right],$$

(6.10)

where $N_S = N_{\mathrm{RF}}^{\mathrm{MS}} - 1$ is considered. For the analog precoder $\mathbf{F}_{\mathrm{RF},j} \in \mathbb{C}^{N_{\mathrm{MS}} \times N_{\mathrm{RF}}^{\mathrm{MS}}}$, we consider the following expression

$$\mathbf{F}_{\mathrm{RF},j} = \left[ \mathbf{F}^1_{\mathrm{RF},j}, \mathbf{F}_{\mathrm{BB},j}, \mathbf{F}^2_{\mathrm{RF},j} \right]^H,$$

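(6.11)

where $\mathbf{F}^1_{\mathrm{RF},j} = [\mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{MS}}}, \ldots, \mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{MS}}}]$ and $\mathbf{F}^2_{\mathrm{RF},j} = [\mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{MS}}}, \ldots, \mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{MS}}}]$ are composed of $(j-1) N_S$ and $N_{\mathrm{MS}} - j N_S$ identical copies of $\mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{MS}}}$, respectively. Based on the digital and analog precoders $\mathbf{F}_{\mathrm{BB},j}$ and $\mathbf{F}_{\mathrm{RF},j}$ designed in (6.10) and (6.11), we have $\mathbf{F}_j = \mathbf{F}_{\mathrm{RF},j} \mathbf{F}_{\mathrm{BB},j}$. Similarly, the digital combiner $\mathbf{W}_{\mathrm{BB},i}$, for $i = 1, \ldots, N_b^R$, comes from the first $N_S$ columns of $\mathbf{U}_{N_{\mathrm{RF}}^{\mathrm{BS}}}$, i.e. $\mathbf{W}_{\mathrm{BB},i} = [\mathbf{u}_1, \ldots, \mathbf{u}_{N_S}]$. The analog combiner is $\mathbf{W}_{\mathrm{RF},i} = [\mathbf{W}^1_{\mathrm{RF},i}, \mathbf{W}_{\mathrm{BB},i}, \mathbf{W}^2_{\mathrm{RF},i}]^H \in \mathbb{C}^{N_{\mathrm{BS}} \times N_{\mathrm{RF}}^{\mathrm{BS}}}$, where $\mathbf{W}^1_{\mathrm{RF},i} = [\mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{BS}}}, \ldots, \mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{BS}}}] \in \mathbb{C}^{N_{\mathrm{RF}}^{\mathrm{BS}} \times (i-1) N_S}$ and $\mathbf{W}^2_{\mathrm{RF},i} = [\mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{BS}}}, \ldots, \mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{BS}}}] \in \mathbb{C}^{N_{\mathrm{RF}}^{\mathrm{BS}} \times (N_{\mathrm{BS}} - i N_S)}$ are composed of $(i-1) N_S$ and $N_{\mathrm{BS}} - i N_S$ identical copies of $\mathbf{u}_{N_{\mathrm{RF}}^{\mathrm{BS}}}$, respectively. Then, we have $\mathbf{W}_i = \mathbf{W}_{\mathrm{RF},i} \mathbf{W}_{\mathrm{BB},i}$. According to (6.7), the $N_b^T$ precoding matrices $\{\mathbf{F}_j\}_{j=1}^{N_b^T}$ and the $N_b^R$ combining matrices $\{\mathbf{W}_i\}_{i=1}^{N_b^R}$ constitute the aggregated precoder $\tilde{\mathbf{F}}$ and the aggregated combiner $\tilde{\mathbf{W}}$, respectively. Finally, we have

$$\begin{aligned}
\tilde{\mathbf{F}} &= \left[ \mathbf{F}_1, \ldots, \mathbf{F}_{N_b^T} \right] = \alpha_f \begin{bmatrix} \mathbf{I}_{N_b^T N_S} \\ \mathbf{O}_{(N_{\mathrm{MS}} - N_b^T N_S) \times N_b^T N_S} \end{bmatrix}, \\
\tilde{\mathbf{W}} &= \left[ \mathbf{W}_1, \ldots, \mathbf{W}_{N_b^R} \right] = \alpha_w \begin{bmatrix} \mathbf{I}_{N_b^R N_S} \\ \mathbf{O}_{(N_{\mathrm{BS}} - N_b^R N_S) \times N_b^R N_S} \end{bmatrix}.
\end{aligned}$$

(6.12)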
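The construction in (6.10)–(6.12) can be verified numerically on a toy example. The sketch below assumes an unnormalized DFT matrix as the training set, so that each column has squared norm equal to the number of RF chains, and ignores the constant-modulus and power scaling as the text does; under these assumptions the scale factor of the aggregated precoder equals the number of RF chains. The sizes and function names are illustrative.

```python
import cmath

def dft_columns(n):
    """Columns u_1..u_n of an unnormalized n x n DFT matrix (u_m^H u_m = n)."""
    return [[cmath.exp(-2j * cmath.pi * r * m / n) for r in range(n)] for m in range(n)]

def inner(u, v):
    """Inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def block_precoder(j, n_ant, n_s, cols):
    """F_j = F_RF,j F_BB,j for the 1-indexed block j, following (6.10) and (6.11)."""
    last = cols[-1]
    # Columns of [F^1_RF,j, F_BB,j, F^2_RF,j]; row r of F_RF,j is stacked[r]^H.
    stacked = [last] * ((j - 1) * n_s) + cols[:n_s] + [last] * (n_ant - j * n_s)
    return [[inner(stacked[r], cols[s]) for s in range(n_s)] for r in range(n_ant)]

def aggregated_precoder(n_blocks, n_ant, n_s, cols):
    """Horizontally stack F_1, ..., F_{n_blocks} as in (6.7)."""
    blocks = [block_precoder(j, n_ant, n_s, cols) for j in range(1, n_blocks + 1)]
    return [[x for b in blocks for x in b[r]] for r in range(n_ant)]

N_RF, N_ANT, N_BLOCKS = 3, 8, 2          # assumed toy sizes
N_S = N_RF - 1                           # N_S = N_RF - 1, as in the text
U = dft_columns(N_RF)
F_tilde = aggregated_precoder(N_BLOCKS, N_ANT, N_S, U)

# Expected structure: N_RF * [I; O], i.e. an identity block on top of zeros.
structure_gap = max(
    abs(F_tilde[r][c] - (N_RF if r == c else 0))
    for r in range(N_ANT) for c in range(N_BLOCKS * N_S)
)
print(structure_gap)
```

The padding columns contribute zeros because the last DFT column is orthogonal to the first N_S columns, which is why the text chooses N_S one smaller than the number of RF chains.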

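As a consequence, (6.6) can be further written as

$$\begin{aligned}
\tilde{\mathbf{Y}} &= \alpha_w \alpha_f \begin{bmatrix} \mathbf{I}_{N_b^R N_S} \\ \mathbf{O}_{(N_{\mathrm{BS}} - N_b^R N_S) \times N_b^R N_S} \end{bmatrix}^H \mathbf{H} \begin{bmatrix} \mathbf{I}_{N_b^T N_S} \\ \mathbf{O}_{(N_{\mathrm{MS}} - N_b^T N_S) \times N_b^T N_S} \end{bmatrix} \bar{\mathbf{S}} + \bar{\mathbf{W}}^H \tilde{\mathbf{N}} \\
&= \bar{\mathbf{H}} \bar{\mathbf{S}} + \bar{\mathbf{W}}^H \tilde{\mathbf{N}}.
\end{aligned}$$

(6.13)

To accurately acquire the low-dimensional effective channel matrix $\bar{\mathbf{H}}$, which has the same shift-invariance of array response as the high-dimensional channel matrix $\mathbf{H}$, from the aggregated received signal $\tilde{\mathbf{Y}}$, we can use the LS estimator to obtain its estimate as $\hat{\bar{\mathbf{H}}} = \tilde{\mathbf{Y}} \bar{\mathbf{S}}^{\dagger} = \tilde{\mathbf{Y}} \bar{\mathbf{S}}^H (\bar{\mathbf{S}} \bar{\mathbf{S}}^H)^{-1}$. For convenience, here we consider the transmit pilot signal block $\mathbf{S}$ to be a unitary matrix, which has the perfect autocorrelation property (i.e. $\mathbf{S} \mathbf{S}^H = N_S \mathbf{I}_{N_S}$ with $T_{\mathrm{MS}} = N_S$). In this way, the estimate of the low-dimensional effective channel matrix can be written as $\hat{\bar{\mathbf{H}}} = \tilde{\mathbf{Y}} \bar{\mathbf{S}}^H / N_S$.

Based on the above design, we can acquire $\bar{\mathbf{H}}$ preserving the shift-invariance of array response. This motivates us to exploit the ESPRIT algorithm to estimate the AoAs/AoDs jointly, which will be illustrated in the following subsection.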
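The simplified LS estimate for pilots with perfect autocorrelation can be illustrated in the noiseless case. The sketch below assumes an unnormalized DFT block as the pilot S, two transmit blocks, and an arbitrary effective channel; all names and sizes are hypothetical choices for this example.

```python
import cmath

def dft_pilot(n):
    """n x n pilot block S with S S^H = n I (unnormalized DFT, an assumed choice)."""
    return [[cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)] for r in range(n)]

def block_diag(block, copies):
    """Block-diagonal matrix with `copies` identical blocks on the diagonal."""
    rows, cols = len(block), len(block[0])
    out = [[0j] * (cols * copies) for _ in range(rows * copies)]
    for b in range(copies):
        for r in range(rows):
            for c in range(cols):
                out[b * rows + r][b * cols + c] = block[r][c]
    return out

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def herm(a):
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]

N_S, N_BT, N_BR = 2, 2, 1                      # assumed toy sizes, T_MS = N_S
S_bar = block_diag(dft_pilot(N_S), N_BT)       # aggregated pilot, block diagonal

# Arbitrary effective channel of size (N_BR * N_S) x (N_BT * N_S).
H_bar = [[complex(r + 1, c - 1) for c in range(N_BT * N_S)] for r in range(N_BR * N_S)]

Y = matmul(H_bar, S_bar)                       # noiseless aggregated observation
H_hat = [[x / N_S for x in row] for row in matmul(Y, herm(S_bar))]

ls_error = max(abs(H_hat[r][c] - H_bar[r][c])
               for r in range(len(H_bar)) for c in range(len(H_bar[0])))
print(ls_error)
```

With noise present the same expression applies, and the estimate then carries the noise term scaled by the pilot energy.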

6.2.2.2 2D Unitary ESPRIT Based Channel Estimation Algorithm

To jointly obtain the super-resolution estimates of AoAs and AoDs, we propose a 2D unitary ESPRIT based CE algorithm at the receiver, which includes the following main steps and is summarized in Algorithm 6.1 [38]. Note that for the low-dimensional effective channel matrix $\bar{\mathbf{H}}$ in (6.9), we consider it to have the size of $N_R\times N_T$ for convenience.

Construct Hankel Matrix and Extend Data: To alleviate the influence of coherent signals caused by multiple AoAs or AoDs close to each other, we consider the spatial smoothing and the forward-backward averaging techniques in [18]. By leveraging these two techniques, we can take full advantage of the obtained data and acquire robust AoA and AoD estimates, mitigating the performance loss due to rank deficiency of the data matrix when multiple AoAs or AoDs are close to each other. Specifically, we introduce integers $m_1$ and $m_2$ as the stacking parameters, where $2\le m_1\le N_T$ and $1\le m_2\le N_R-1$. Meanwhile, for $1\le i\le m_2$ and $1\le j\le m_1$, we define the left/right-shifted matrix as $\bar{\mathbf{H}}^{(i,j)}\in\mathbb{C}^{(N_R-m_2+1)\times(N_T-m_1+1)}$, which is a submatrix of $\bar{\mathbf{H}}$, and it can be written as

$$
\bar{\mathbf{H}}^{(i,j)}=\begin{bmatrix}\bar{H}_{i,j}&\cdots&\bar{H}_{i,N_T-m_1+j}\\ \vdots&\ddots&\vdots\\ \bar{H}_{N_R-m_2+i,j}&\cdots&\bar{H}_{N_R-m_2+i,N_T-m_1+j}\end{bmatrix}, \tag{6.14}
$$

where $\bar{H}_{i,j}$ represents the $(i,j)$th element of $\bar{\mathbf{H}}$ in (6.9). Furthermore, we can construct a Hankel matrix $\mathcal{H}\in\mathbb{C}^{m_1(N_R-m_2+1)\times m_2(N_T-m_1+1)}$ as

$$
\mathcal{H}=\begin{bmatrix}\bar{\mathbf{H}}^{(1,1)}&\cdots&\bar{\mathbf{H}}^{(m_2,1)}\\ \vdots&\ddots&\vdots\\ \bar{\mathbf{H}}^{(1,m_1)}&\cdots&\bar{\mathbf{H}}^{(m_2,m_1)}\end{bmatrix}. \tag{6.15}
$$

According to [18], the extended data matrix $\mathcal{H}_e\in\mathbb{C}^{m_1(N_R-m_2+1)\times 2m_2(N_T-m_1+1)}$ can be written as

$$
\mathcal{H}_e=\begin{bmatrix}\mathcal{H}&\mathbf{J}_{m_1(N_R-m_2+1)}\mathcal{H}^{*}\end{bmatrix}. \tag{6.16}
$$
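The construction in (6.14) to (6.16) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names and the toy sizes are assumptions made here, and $\mathbf{J}$ is taken to be the exchange matrix with ones on its anti-diagonal.

```python
import numpy as np

def shifted_submatrix(H_bar, i, j, m1, m2):
    """Shifted submatrix H_bar^(i,j) of (6.14); i in 1..m2 and j in 1..m1 are 1-based."""
    NR, NT = H_bar.shape
    return H_bar[i - 1:NR - m2 + i, j - 1:NT - m1 + j]

def stacked_matrix(H_bar, m1, m2):
    """Block matrix of (6.15): block row j = 1..m1, block column i = 1..m2."""
    return np.block([[shifted_submatrix(H_bar, i, j, m1, m2)
                      for i in range(1, m2 + 1)]
                     for j in range(1, m1 + 1)])

def extended_matrix(H_cal):
    """Forward-backward extension [H, J H^*] of (6.16)."""
    J = np.fliplr(np.eye(H_cal.shape[0]))  # exchange matrix (assumed meaning of J)
    return np.hstack([H_cal, J @ H_cal.conj()])

# Toy sizes (hypothetical, chosen only to keep the example small).
rng = np.random.default_rng(0)
NR, NT, m1, m2 = 6, 5, 3, 2
H_bar = rng.standard_normal((NR, NT)) + 1j * rng.standard_normal((NR, NT))
H_cal = stacked_matrix(H_bar, m1, m2)
H_e = extended_matrix(H_cal)
print(H_cal.shape, H_e.shape)
```

The printed shapes should agree with the dimensions $m_1(N_R-m_2+1)\times m_2(N_T-m_1+1)$ and $m_1(N_R-m_2+1)\times 2m_2(N_T-m_1+1)$ stated above.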

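Real Processing: To reduce the computational complexity in the following steps, $\mathcal{H}_e$ is further transformed into a real matrix by left-multiplying and right-multiplying it with the transform matrices $\mathbf{T}_L$ and $\mathbf{T}_R$, respectively, so that the corresponding eigenvalues are real [19, 20]. Particularly, this manipulation can be expressed as $\mathcal{H}_{e,R}=\mathbf{T}_L\mathcal{H}_e\mathbf{T}_R\in\mathbb{R}^{m_1(N_R-m_2+1)\times 2m_2(N_T-m_1+1)}$, and the transformation matrices $\mathbf{T}_L$ and $\mathbf{T}_R$ can be respectively expressed as

$$
\mathbf{T}_L=\mathbf{Q}_{m_1}^H\otimes\mathbf{Q}_{N_R-m_2+1}^H,\qquad
\mathbf{T}_R=\begin{bmatrix}\mathbf{I}_{m_2(N_T-m_1+1)}&j\mathbf{I}_{m_2(N_T-m_1+1)}\\ \mathbf{I}_{m_2(N_T-m_1+1)}&-j\mathbf{I}_{m_2(N_T-m_1+1)}\end{bmatrix}, \tag{6.17}
$$

where $\mathbf{Q}$ is a particular left $\mathbf{J}$-real matrix according to [19], satisfying $\mathbf{J}\mathbf{Q}^{*}=\mathbf{Q}$, and it has the sparse and unitary properties, defined as

$$
\mathbf{Q}_{2n}=\frac{1}{\sqrt{2}}\begin{bmatrix}\mathbf{I}_n&j\mathbf{I}_n\\ \mathbf{J}_n&-j\mathbf{J}_n\end{bmatrix},\qquad
\mathbf{Q}_{2n+1}=\frac{1}{\sqrt{2}}\begin{bmatrix}\mathbf{I}_n&\mathbf{0}_n&j\mathbf{I}_n\\ \mathbf{0}_n^T&\sqrt{2}&\mathbf{0}_n^T\\ \mathbf{J}_n&\mathbf{0}_n&-j\mathbf{J}_n\end{bmatrix}. \tag{6.18}
$$

Algorithm 6.1 2D Unitary ESPRIT Based CE Algorithm
Input: The low-dimensional effective channel matrix $\bar{\mathbf{H}}$, the stacking parameters $m_1$ and $m_2$, and the number of paths $L$;
Output: The estimated AoAs $\{\hat{\theta}_l\}_{l=1}^{L}$ and AoDs $\{\hat{\varphi}_l\}_{l=1}^{L}$ of the channel.
1: Construct the Hankel matrix $\mathcal{H}$ as shown in (6.15);
2: Obtain the extended matrix $\mathcal{H}_e$ as shown in (6.16);
3: Implement the real processing to achieve $\mathcal{H}_{e,R}$ in the real domain, expressed as $\mathcal{H}_{e,R}=\mathbf{T}_L\mathcal{H}_e\mathbf{T}_R$, where $\mathbf{T}_L$ and $\mathbf{T}_R$ are shown in (6.17);
4: Let $\mathcal{H}_{e,R}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$ and take the first $L$ columns of the left singular matrix $\mathbf{U}$, denoted as $\widetilde{\mathbf{U}}_{e,R}$;
5: Diagonalize to jointly estimate the AoAs/AoDs pairs $\hat{\theta}_l$ and $\hat{\varphi}_l$ according to (6.21) from the EVD of matrix $\boldsymbol{\Psi}$ in (6.23), where $\boldsymbol{\Psi}=\mathbf{T}\boldsymbol{\Lambda}\mathbf{T}^{-1}$ with $\boldsymbol{\Phi}={\rm Re}\{\boldsymbol{\Lambda}\}$ and $\boldsymbol{\Theta}={\rm Im}\{\boldsymbol{\Lambda}\}$.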
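Steps 3 to 5 of Algorithm 6.1 (real processing, rank reduction, and joint diagonalization, the latter two being detailed in the paragraphs that follow) can be sketched as below. This is a minimal noise-free illustration under stated assumptions, not the authors' implementation: the input `X` is assumed to be any data matrix whose column space is spanned by $\mathbf{a}_m(\mu_\varphi)\otimes\mathbf{a}_n(\mu_\theta)$ with $\mathbf{a}_M(\mu)=[1,e^{j\mu},\ldots,e^{j(M-1)\mu}]^T$ (which plays the role of $\mathcal{H}$ with $m=m_1$ and $n=N_R-m_2+1$), $\mathbf{J}$ is taken to be the exchange matrix, and all function names and numerical values are hypothetical. The returned spatial frequencies correspond to $2\pi\Delta\sin(\cdot)$ of the angles.

```python
import numpy as np

def exchange(n):
    """Exchange matrix J_n with ones on the anti-diagonal."""
    return np.fliplr(np.eye(n))

def left_J_real(n_total):
    """Sparse unitary left J-real matrix Q of (6.18); requires n_total >= 2."""
    n, odd = divmod(n_total, 2)
    I, J = np.eye(n), exchange(n)
    if not odd:
        Q = np.block([[I, 1j * I], [J, -1j * J]])
    else:
        z = np.zeros((n, 1))
        Q = np.block([[I, z, 1j * I],
                      [z.T, np.array([[np.sqrt(2)]]), z.T],
                      [J, z, -1j * J]])
    return Q / np.sqrt(2)

def selection_pair(M):
    """Real and imaginary parts of Q_{M-1}^H [0 I_{M-1}] Q_M."""
    J2 = np.hstack([np.zeros((M - 1, 1)), np.eye(M - 1)])
    K = left_J_real(M - 1).conj().T @ J2 @ left_J_real(M)
    return K.real, K.imag

def unitary_esprit_2d(X, m, n, L):
    """Return paired spatial frequencies (mu_theta, mu_phi) from X (m*n rows)."""
    # Extension as in (6.16) and real processing as in (6.17).
    Xe = np.hstack([X, exchange(m * n) @ X.conj()])
    T_L = np.kron(left_J_real(m), left_J_real(n)).conj().T
    I = np.eye(X.shape[1])
    T_R = np.block([[I, 1j * I], [I, -1j * I]])
    Xr = T_L @ Xe @ T_R
    assert np.allclose(Xr.imag, 0.0, atol=1e-9)  # the transformed matrix is real
    # Rank reduction: dominant L-dimensional left singular subspace.
    U, _, _ = np.linalg.svd(Xr.real, full_matrices=False)
    Us = U[:, :L]
    # Joint diagonalization via one complex EVD.
    Kt_re, Kt_im = selection_pair(n)
    Kp_re, Kp_im = selection_pair(m)
    E_t_re, E_t_im = np.kron(np.eye(m), Kt_re), np.kron(np.eye(m), Kt_im)
    E_p_re, E_p_im = np.kron(Kp_re, np.eye(n)), np.kron(Kp_im, np.eye(n))
    Ups_t = np.linalg.pinv(E_t_re @ Us) @ (E_t_im @ Us)
    Ups_p = np.linalg.pinv(E_p_re @ Us) @ (E_p_im @ Us)
    lam = np.linalg.eigvals(Ups_p + 1j * Ups_t)
    # Eigenvalues are tan(mu/2): real part for phi, imaginary part for theta.
    return 2 * np.arctan(lam.imag), 2 * np.arctan(lam.real)

def steering(mu, M):
    """M x L matrix whose columns are [1, e^{j mu}, ..., e^{j (M-1) mu}]^T."""
    return np.exp(1j * np.outer(np.arange(M), mu))

# Noise-free toy example with hypothetical spatial frequencies.
rng = np.random.default_rng(0)
m, n, L, N = 4, 5, 3, 8
mu_theta = np.array([-1.0, 0.3, 1.2])
mu_phi = np.array([0.8, -0.5, 0.1])
A = np.stack([np.kron(steering(mu_phi, m)[:, l], steering(mu_theta, n)[:, l])
              for l in range(L)], axis=1)
S = rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))
est_theta, est_phi = unitary_esprit_2d(A @ S, m, n, L)
order = np.argsort(est_theta)
print(est_theta[order], est_phi[order])
```

Because both spatial frequencies are read from the same complex eigenvalue, the pairs come out already matched; the sort at the end is only for display.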

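Rank Reduction: In the absence of noise, $\mathcal{H}_{e,R}$ has an effective rank of only $L\ll\min\{m_1(N_R-m_2+1),\,2m_2(N_T-m_1+1)\}$. However, in the presence of noise, this low-rank property of $\mathcal{H}_{e,R}$ is destroyed. To mitigate the noise, the singular value decomposition (SVD) of $\mathcal{H}_{e,R}$, i.e. $\mathcal{H}_{e,R}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$, will be used to distinguish the signal subspace and the noise subspace. In order to extract the information of AoAs and AoDs in the real matrix $\mathcal{H}_{e,R}$, we subsequently take the first $L$ columns of the left singular matrix $\mathbf{U}$, denoted as $\widetilde{\mathbf{U}}_{e,R}\in\mathbb{R}^{m_1(N_R-m_2+1)\times L}$, to approximate the dominant $L$-dimensional column span of $\mathcal{H}_{e,R}$.

Joint Diagonalization: According to [19, 20], for a certain non-singular matrix $\mathbf{T}\in\mathbb{R}^{L\times L}$, we can obtain

$$
\begin{aligned}
\mathbf{E}_{\theta,R}\widetilde{\mathbf{U}}_{e,R}\mathbf{T}\boldsymbol{\Theta}&=\mathbf{E}_{\theta,I}\widetilde{\mathbf{U}}_{e,R}\mathbf{T},\\
\mathbf{E}_{\varphi,R}\widetilde{\mathbf{U}}_{e,R}\mathbf{T}\boldsymbol{\Phi}&=\mathbf{E}_{\varphi,I}\widetilde{\mathbf{U}}_{e,R}\mathbf{T},
\end{aligned} \tag{6.19}
$$

where $\mathbf{E}_{\theta,R}={\rm Re}\{\mathbf{E}_{\theta}\}$, $\mathbf{E}_{\theta,I}={\rm Im}\{\mathbf{E}_{\theta}\}$, $\mathbf{E}_{\varphi,R}={\rm Re}\{\mathbf{E}_{\varphi}\}$, and $\mathbf{E}_{\varphi,I}={\rm Im}\{\mathbf{E}_{\varphi}\}$ with

$$
\begin{aligned}
\mathbf{E}_{\theta}&=\mathbf{I}_{m_1}\otimes\left(\mathbf{Q}_{N_R-m_2}^H\begin{bmatrix}\mathbf{0}&\mathbf{I}_{N_R-m_2}\end{bmatrix}\mathbf{Q}_{N_R-m_2+1}\right),\\
\mathbf{E}_{\varphi}&=\left(\mathbf{Q}_{m_1-1}^H\begin{bmatrix}\mathbf{0}&\mathbf{I}_{m_1-1}\end{bmatrix}\mathbf{Q}_{m_1}\right)\otimes\mathbf{I}_{N_R-m_2+1},
\end{aligned}
$$

respectively. In (6.19), $\boldsymbol{\Theta}\in\mathbb{R}^{L\times L}$ and $\boldsymbol{\Phi}\in\mathbb{R}^{L\times L}$ are diagonal matrices, and they can be expressed as

$$
\boldsymbol{\Theta}={\rm diag}(\tilde{\theta}_1,\ldots,\tilde{\theta}_L),\qquad
\boldsymbol{\Phi}={\rm diag}(\tilde{\varphi}_1,\ldots,\tilde{\varphi}_L), \tag{6.20}
$$

where

$$
\tilde{\theta}_l=\tan\left(\pi\Delta\sin(\hat{\theta}_l)\right),\qquad
\tilde{\varphi}_l=\tan\left(\pi\Delta\sin(\hat{\varphi}_l)\right), \tag{6.21}
$$

for $l=1,\ldots,L$, respectively. Since $\mathbf{T}$ is an invertible square matrix, (6.19) can be further written as

$$
\begin{aligned}
\mathbf{T}\boldsymbol{\Theta}\mathbf{T}^{-1}&=\left(\mathbf{E}_{\theta,R}\widetilde{\mathbf{U}}_{e,R}\right)^{\dagger}\mathbf{E}_{\theta,I}\widetilde{\mathbf{U}}_{e,R},\\
\mathbf{T}\boldsymbol{\Phi}\mathbf{T}^{-1}&=\left(\mathbf{E}_{\varphi,R}\widetilde{\mathbf{U}}_{e,R}\right)^{\dagger}\mathbf{E}_{\varphi,I}\widetilde{\mathbf{U}}_{e,R}.
\end{aligned} \tag{6.22}
$$

According to (6.22), $\mathbf{T}\boldsymbol{\Theta}\mathbf{T}^{-1}$ and $\mathbf{T}\boldsymbol{\Phi}\mathbf{T}^{-1}$ can be jointly diagonalized, which can be expressed as

$$
\boldsymbol{\Psi}=\mathbf{T}\boldsymbol{\Phi}\mathbf{T}^{-1}+j\,\mathbf{T}\boldsymbol{\Theta}\mathbf{T}^{-1}
=\left(\mathbf{E}_{\varphi,R}\widetilde{\mathbf{U}}_{e,R}\right)^{\dagger}\mathbf{E}_{\varphi,I}\widetilde{\mathbf{U}}_{e,R}+j\left(\mathbf{E}_{\theta,R}\widetilde{\mathbf{U}}_{e,R}\right)^{\dagger}\mathbf{E}_{\theta,I}\widetilde{\mathbf{U}}_{e,R}. \tag{6.23}
$$

Since $\left(\mathbf{E}_{\theta,R}\widetilde{\mathbf{U}}_{e,R}\right)^{\dagger}\mathbf{E}_{\theta,I}\widetilde{\mathbf{U}}_{e,R}$ and $\left(\mathbf{E}_{\varphi,R}\widetilde{\mathbf{U}}_{e,R}\right)^{\dagger}\mathbf{E}_{\varphi,I}\widetilde{\mathbf{U}}_{e,R}$ have the same eigenvectors, i.e. the column vectors of $\mathbf{T}$, the values $\tilde{\theta}_l$ and $\tilde{\varphi}_l$ corresponding to the same eigenvector of $\boldsymbol{\Psi}$, for $l=1,\ldots,L$, are associated with the same path. That is to say, the estimated AoAs/AoDs $\hat{\theta}_l$ and $\hat{\varphi}_l$ can be naturally paired by exploiting a complex eigenvalue decomposition (EVD), given by $\boldsymbol{\Psi}=\mathbf{T}\boldsymbol{\Lambda}\mathbf{T}^{-1}$, where $\boldsymbol{\Lambda}=\boldsymbol{\Phi}+j\boldsymbol{\Theta}$, $\boldsymbol{\Phi}={\rm Re}\{\boldsymbol{\Lambda}\}$, and $\boldsymbol{\Theta}={\rm Im}\{\boldsymbol{\Lambda}\}$ in (6.23). Finally, according to (6.20) and (6.21), we can obtain the paired super-resolution estimates of AoAs and AoDs, i.e. $\{\hat{\theta}_l\}_{l=1}^{L}$ and $\{\hat{\varphi}_l\}_{l=1}^{L}$.

6.2.2.3 Reconstruct High-Dimensional mmWave MIMO Channel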

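In this subsection, the high-dimensional mmWave MIMO channel will be reconstructed according to the obtained AoAs $\{\hat{\theta}_l\}_{l=1}^{L}$ and AoDs $\{\hat{\varphi}_l\}_{l=1}^{L}$ in (6.21). First of all, we reconstruct the matrices $\hat{\mathbf{A}}_{\rm BS}$ and $\hat{\mathbf{A}}_{\rm MS}$ according to the steering vectors $\mathbf{a}_{\rm BS}(\hat{\theta}_l)$ and $\mathbf{a}_{\rm MS}(\hat{\varphi}_l)$ in (6.3). Then, based on (6.4) and (6.9), we have the expression $\bar{\mathbf{H}}=\widetilde{\mathbf{W}}^H\hat{\mathbf{A}}_{\rm BS}\mathbf{D}\hat{\mathbf{A}}_{\rm MS}^H\widetilde{\mathbf{F}}$ with $\mathbf{D}={\rm diag}(\mathbf{d})$, where $\mathbf{d}=\sqrt{N_{\rm BS}N_{\rm MS}/L}\,[\alpha_1,\ldots,\alpha_L]^T$.

To acquire the associated path gains, we vectorize the low-dimensional effective channel matrix $\bar{\mathbf{H}}$ as

$$
\bar{\mathbf{h}}=\left(\left(\hat{\mathbf{A}}_{\rm MS}^H\widetilde{\mathbf{F}}\right)^T\odot\left(\widetilde{\mathbf{W}}^H\hat{\mathbf{A}}_{\rm BS}\right)\right)\mathbf{d}=\mathbf{Z}\mathbf{d}, \tag{6.24}
$$

where $\bar{\mathbf{h}}={\rm vec}(\bar{\mathbf{H}})$, $\mathbf{Z}=\left(\hat{\mathbf{A}}_{\rm MS}^H\widetilde{\mathbf{F}}\right)^T\odot\left(\widetilde{\mathbf{W}}^H\hat{\mathbf{A}}_{\rm BS}\right)$, and we use the identity ${\rm vec}(\mathbf{A}\mathbf{B}\mathbf{C})=\left(\mathbf{C}^T\odot\mathbf{A}\right)\mathbf{b}$ with $\mathbf{B}={\rm diag}(\mathbf{b})$, in which $\odot$ is the Khatri-Rao (column-wise Kronecker) product. Using the LS estimator, we can obtain the LS solution $\hat{\mathbf{d}}$, i.e.

$$
\hat{\mathbf{d}}=\arg\min_{\mathbf{d}}\left\|\bar{\mathbf{h}}-\mathbf{Z}\mathbf{d}\right\|_2=\mathbf{Z}^{\dagger}\bar{\mathbf{h}}=\left(\mathbf{Z}^H\mathbf{Z}\right)^{-1}\mathbf{Z}^H\bar{\mathbf{h}}. \tag{6.25}
$$

Finally, according to the obtained steering vector matrices $\hat{\mathbf{A}}_{\rm BS}$ and $\hat{\mathbf{A}}_{\rm MS}$ and the path gains $\hat{\mathbf{d}}$ above, we can reconstruct the high-dimensional mmWave MIMO channel $\hat{\mathbf{H}}$ as $\hat{\mathbf{H}}=\hat{\mathbf{A}}_{\rm BS}\,{\rm diag}(\hat{\mathbf{d}})\,\hat{\mathbf{A}}_{\rm MS}^H$.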
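The vectorization identity behind (6.24) and the LS solution (6.25) can be checked numerically. In the sketch below, the matrices `WA` and `AF` are random stand-ins for $\widetilde{\mathbf{W}}^H\hat{\mathbf{A}}_{\rm BS}$ and $\hat{\mathbf{A}}_{\rm MS}^H\widetilde{\mathbf{F}}$, and the sizes are hypothetical; only the algebra is meant to be illustrated.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (p x L) and Y (q x L)."""
    return np.stack([np.kron(X[:, l], Y[:, l]) for l in range(X.shape[1])], axis=1)

rng = np.random.default_rng(1)

def crandn(*shape):
    """Complex Gaussian array (illustrative helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

NR, NT, L = 6, 5, 3
WA = crandn(NR, L)      # stands in for W^H A_BS
AF = crandn(L, NT)      # stands in for A_MS^H F
d_true = crandn(L)      # path gains to be recovered

H_bar = WA @ np.diag(d_true) @ AF
h_bar = H_bar.reshape(-1, order="F")             # vec(.) stacks the columns
Z = khatri_rao(AF.T, WA)                         # (C^T kr A) with C = AF, A = WA
d_hat = np.linalg.lstsq(Z, h_bar, rcond=None)[0]  # LS solution Z^+ h_bar
print(np.allclose(d_hat, d_true))
```

In the noise-free case the LS solution recovers the gains exactly as long as $\mathbf{Z}$ has full column rank; with noise, the same call returns the minimizer of (6.25).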

6.2.2.4 Analysis of Computational Complexity

From Algorithm 6.1, it can be observed that the main computational complexity comes from the SVD in Step 4 as well as the matrix inversion and EVD in Step 5, while the computational complexity of the remaining operations, such as the basic matrix multiplications, is negligible. Specifically, for Step 4, the computational complexity of the partial SVD taking the first $L$ columns of the left singular matrix $\mathbf{U}$ is on the order of $\mathcal{O}\left(m_1(N_R-m_2+1)L^2\right)$ [21]. For Step 5, the computational complexity of the matrix inversion operations for both the real and imaginary parts in (6.22) and of the following EVD is $\mathcal{O}\left(L^3\right)$ [21], which is small since the number of dominant paths $L$ is small due to the limited number of scatterers over mmWave channels. Hence, the main computational cost of Algorithm 6.1 lies in Step 4, where the computational complexity is $\mathcal{O}\left(m_1(N_R-m_2+1)L^2\right)$.

In this subsection, we also consider the ACS-based CE scheme [11] and the OMP-based CE scheme [13] for comparison. For the ACS-based CE scheme, the computational complexity is $\mathcal{O}\left(2LN_{\rm BS}^3\log_K(G_{\rm ACS}/L)\right)$ [21], where $K$ is the number of beamforming vectors in each stage, and $G_{\rm ACS}$ is the number of uniform grid points. For the OMP-based CE scheme, the main computational costs lie in the correlation operation and the matrix inversion operation. Hence, its computational complexity is $\mathcal{O}\left(N_T^{\rm Beam}N_R^{\rm Beam}G_{\rm OMP}^2+|\mathcal{I}_t|^4\right)$ [13, 21], where $N_T^{\rm Beam}$ and $N_R^{\rm Beam}$ are the numbers of transmit and receive pilot beam patterns at the MS and BS, respectively, $G_{\rm OMP}$ is the size of the quantized grids of virtual AoAs/AoDs, and $|\mathcal{I}_t|$ is the cardinality of the index set $\mathcal{I}_t$ (here $|\mathcal{I}_t|$ is equal to the number of iterations for CE). It should be pointed out that the matrix inversion operation, with the computational complexity of $\mathcal{O}\left(|\mathcal{I}_t|^4\right)$, will dominate the computational complexity of the OMP-based CE scheme when the number of iterations $|\mathcal{I}_t|$ becomes very large, hundreds for instance.

Based on the analysis above, it can be observed that the computational complexity of the proposed scheme is proportional to $N_R$, namely the number of rows of the low-dimensional effective channel matrix $\bar{\mathbf{H}}$. By contrast, those of the ACS-based CE scheme and the OMP-based CE scheme are proportional to $N_{\rm BS}^3$ and $|\mathcal{I}_t|^4$ (or $G_{\rm OMP}^2$), respectively. Hence the computational complexity of the 2D unitary ESPRIT based CE scheme is lower than that of the ACS-based CE scheme and the OMP-based CE scheme. In Sect. 6.2.3, the computational complexity of the three schemes will be further compared in the specific simulations.
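As a quick arithmetic check, the dominant-term expressions for the proposed and ACS-based schemes can be evaluated with the parameter values listed in Sect. 6.2.3. The sketch below assumes $N_R=N_b^RN_S$ (the number of rows of $\bar{\mathbf{H}}$ implied by (6.12) and (6.13)) and $L=5$; the variable names are chosen here for illustration.

```python
import math

# Parameter values quoted in Sect. 6.2.3.
N_BS, N_S = 64, 3
Nb_R = 10
m1 = m2 = 13
L, K, G_ACS = 5, 4, 320

N_R = Nb_R * N_S  # assumed row count of the effective channel matrix

# log_K(G_ACS / L), computed through log2 so that powers of two are exact.
log_K = math.log2(G_ACS / L) / math.log2(K)

c_proposed = m1 * (N_R - m2 + 1) * L**2   # O(m1 (N_R - m2 + 1) L^2)
c_acs = 2 * L * N_BS**3 * log_K           # O(2 L N_BS^3 log_K(G_ACS / L))

print(c_proposed, c_acs, c_proposed / c_acs)
```

Under these assumptions the two counts come out at 5850 and roughly $7.9\times10^{6}$, in line with the orders of magnitude quoted in Sect. 6.2.3.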

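6.2.3 Simulation Results

In this section, we will investigate the performance of the proposed 2D unitary ESPRIT based CE scheme by comparing it with the ACS-based CE scheme [11] and the OMP-based CE scheme [13]. We consider the following simulation parameters. Specifically, $N_{\rm BS}=N_{\rm MS}=64$, $N_{\rm RF}^{\rm BS}=N_{\rm RF}^{\rm MS}=N_{\rm RF}=4$, $T_{\rm MS}=N_S=3$, $N_b^R=N_b^T=10$, $m_1=m_2=13$, $\Delta=1/2$ (namely, $d=\lambda/2$), $\sigma_\alpha^2=1$, and $\{\theta_l\}_{l=1}^{L}$ and $\{\varphi_l\}_{l=1}^{L}$ follow the uniform distribution $\mathcal{U}\left[-\frac{\pi}{3},\frac{\pi}{3}\right]$. The metrics for performance evaluation include the normalized mean square error (NMSE), defined as

$$
{\rm NMSE}=10\log_{10}\left(\mathbb{E}\left[\left\|\mathbf{H}-\hat{\mathbf{H}}\right\|_F^2\Big/\left\|\mathbf{H}\right\|_F^2\right]\right), \tag{6.26}
$$

and the average spectral efficiency (ASE), defined as

$$
{\rm ASE}=\log_2\det\left(\mathbf{I}_{N_{\rm RF}}+\frac{1}{N_{\rm RF}}\mathbf{R}_n^{-1}\mathbf{W}_{\rm opt}^H\mathbf{H}\mathbf{F}_{\rm opt}\mathbf{F}_{\rm opt}^H\mathbf{H}^H\mathbf{W}_{\rm opt}\right), \tag{6.27}
$$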

where $\mathbf{R}_n=\sigma_n^2\mathbf{W}_{\rm opt}^H\mathbf{W}_{\rm opt}$, and $\mathbf{F}_{\rm opt}$ and $\mathbf{W}_{\rm opt}$ are the optimal precoder and combiner consisting of the first $N_{\rm RF}$ columns of $\hat{\mathbf{V}}$ and $\hat{\mathbf{U}}$, respectively [13, 14], and the given $\hat{\mathbf{U}}$ and $\hat{\mathbf{V}}$ are the left and right singular matrices of $\hat{\mathbf{H}}$, i.e. $\hat{\mathbf{H}}=\hat{\mathbf{U}}\hat{\boldsymbol{\Sigma}}\hat{\mathbf{V}}^H$. Additionally, the BER performance with the optimal precoder $\mathbf{F}_{\rm opt}$ and combiner $\mathbf{W}_{\rm opt}$ is also investigated, where the number of data streams used in downlink transmission is $N_S=N_{\rm RF}$ and the modulation mode is 16-QAM. Note that we assume the MS has the full CSI estimated at the BS, without considering the specific feedback mechanism here.

We begin by discussing the pilot overhead required by the proposed 2D unitary ESPRIT based CE scheme, the ACS-based CE scheme [11], and the OMP-based CE scheme [13]. We take the number of paths $L=5$ as an example. The corresponding pilot overheads are $T_{\rm Proposed}=T_{\rm MS}N_b^RN_b^T=300$ for our proposed scheme, $T_{\rm ACS}=KL^2(KL/N_{\rm RF})\log_K(G_{\rm ACS}/L)=1500$ with $K=4$ and $G_{\rm ACS}=320$ for the ACS-based scheme, and $T_{\rm OMP}=N_T^{\rm Beam}N_R^{\rm Beam}/N_{\rm RF}=576$ with $N_T^{\rm Beam}=N_R^{\rm Beam}=48$ and $G_{\rm OMP}=150$ for the OMP-based scheme, respectively. Obviously, the pilot overhead required for our proposed scheme is the smallest.

Furthermore, the computational complexity of the three schemes is compared with the specific simulation parameters. According to the discussion in Sect. 6.2.2.4 and the simulation parameters used in Sect. 6.2.3, the computational complexity of the proposed scheme is $C_{\rm Proposed}=\mathcal{O}(5850)$, while those of the ACS-based and the OMP-based CE schemes are $C_{\rm ACS}=\mathcal{O}\left(7.8\times10^{6}\right)$ and $C_{\rm OMP}=\mathcal{O}\left(5.28\times10^{7}\right)$ (here we consider $|\mathcal{I}_t|=50$ in simulations), respectively. Therefore, we have $C_{\rm Proposed}/C_{\rm ACS}=7.5\times10^{-4}$ and $C_{\rm Proposed}/C_{\rm OMP}=1.1\times10^{-4}$. Clearly, the proposed scheme has by far the lowest complexity.

Figure 6.2a and b compare the NMSE performance of the proposed 2D unitary ESPRIT based CE scheme with the ACS-based and the OMP-based CE schemes against different signal-to-noise ratios (SNRs) with $L=5$ and $L=10$, respectively. Additionally, the pilot overhead required for the different schemes is also provided as $T_{\rm ACS}=1500$ (with $L=5$) and $T_{\rm ACS}=3000$ (with $L=10$) for the ACS-based scheme, $T_{\rm OMP}=576$ for the OMP-based scheme, and $T_{\rm Proposed}=300$ for the proposed scheme. From Fig. 6.2a, it can be observed that the proposed scheme outperforms the other two schemes significantly in terms of NMSE with a much reduced pilot overhead. Moreover, the performance gap between the proposed scheme and its counterparts becomes larger when the SNR increases. In particular, the NMSE performance of the proposed scheme is more than 10 dB and 5 dB better than that of the ACS-based and OMP-based schemes, respectively. This is because the proposed CE scheme can acquire the super-resolution estimates of the AoAs and AoDs with high accuracy. By contrast, the ACS-based and the OMP-based CE schemes suffer from an obvious performance floor when the SNR becomes large.
This is because for the ACS-based CE scheme, the estimation resolution of the AoAs and AoDs is limited by the size of the codebook and the resolution of the quantized grids. It should be pointed out that the ACS-based CE scheme cannot effectively distinguish multiple AoAs or AoDs close to each other, and will work poorly when the number of paths becomes large.2 From Fig. 6.2b, we can observe that when $L=10$ at SNR = 20 dB, the NMSE performance of the ACS-based CE scheme is just around −5 dB. For the OMP-based CE scheme, which quantizes the continuously distributed AoAs and AoDs onto discretized grids, the NMSE performance exhibits a floor effect at high SNR due to the limited estimation resolution of the AoAs and AoDs. It is also worth pointing out again that our proposed scheme requires a much reduced pilot overhead compared with the two other CE schemes. This means that, while achieving the better NMSE performance, the proposed scheme can reduce the required pilot overhead by 80% and 48%, respectively, compared with the ACS-based and the OMP-based CE schemes.

Figure 6.3 compares the NMSE performance of different CE schemes versus SNRs, where $T_{\rm ACS}=312$, $T_{\rm OMP}=256$, $T_{\rm Proposed}=243$, $N_{\rm RF}=4$, $N_{\rm RF}=8$, $N_{\rm RF}=16$, and $L=5$ are considered. From Fig. 6.3, we can observe that the NMSE performance of the proposed scheme improves considerably when the number of RF chains increases from $N_{\rm RF}=4$ to $N_{\rm RF}=16$. This is because a larger number of RF chains $N_{\rm RF}$ can enlarge the effective observation dimension and thus improve the NMSE performance. By contrast, when $N_{\rm RF}$ increases, the NMSE performance of both the ACS-based and the OMP-based CE schemes improves only slightly due to the floor effect. Particularly, if SNR = 10 dB is considered, it can be observed that for the proposed scheme, the ACS-based scheme, and the OMP-based scheme, the performance improvements are 22 dB, 15 dB, and 5 dB, respectively, when $N_{\rm RF}$ increases from 4 to 16. It is worth pointing out that the NMSE performance improvement of the OMP-based CE scheme becomes negligible as $N_{\rm RF}$ becomes large, since the accuracy of CE heavily depends on the resolution of the quantized grids $G_{\rm OMP}$ rather than the number of RF chains $N_{\rm RF}$. This phenomenon further confirms the fact that quantizing the continuously distributed AoAs and AoDs for CE will lead to an inevitable quantization error and thus a non-negligible performance loss.

2 In simulations, we use the MATLAB codes provided by the authors in http://www.aalkhateeb.net/publications.html?i=1.

Fig. 6.2 NMSE performance comparison of different CE schemes versus SNRs [38]: a number of paths L = 5, b number of paths L = 10

Fig. 6.3 NMSE performance comparison of different schemes versus SNRs, where NRF = 4, NRF = 8 and NRF = 16 are considered, respectively [38]

Figure 6.4 compares the NMSE performance of different schemes with the fixed pilot overhead against different numbers of paths, where $T_{\rm ACS}=375$, $T_{\rm OMP}=256$, $T_{\rm Proposed}=243$. It shows that the NMSE performance of all schemes degrades when the number of paths increases. Figure 6.4 further confirms the better performance of the proposed scheme than the ACS-based and OMP-based schemes for different numbers of paths with different SNRs.

Figure 6.5 compares the ASE performance of different CE schemes against different SNRs, where $T_{\rm ACS}=375$, $T_{\rm OMP}=256$, $T_{\rm Proposed}=243$, and both $L=5$ and $L=10$ are considered. In Fig. 6.5, the ASE with the perfect CSI known at the BS and MS is considered as the performance upper bound. It can be observed from Fig.
6.5a that our proposed scheme is superior to the ACS-based and OMP-based CE schemes, and its performance approaches the optimal performance when the SNR is larger than −5 dB. The ASE performance of the ACS-based CE scheme is rather poor for $L=10$, since it cannot effectively estimate the AoAs and AoDs when the number of paths becomes large. It is worth pointing out that although the performance gap between the proposed scheme and the OMP-based scheme becomes smaller as the SNR increases, the computational complexity of the proposed scheme is much smaller than that of the OMP-based scheme.

Figure 6.6 compares the BER performance of the three schemes versus SNRs. In Fig. 6.6, we consider $T_{\rm ACS}=375$, $T_{\rm OMP}=256$, $T_{\rm Proposed}=243$, the number of data streams used in downlink transmission is $N_S=N_{\rm RF}$, the modulation type is 16-QAM, and the optimal precoder and combiner are adopted by the BS and MS. Obviously, the BER performance of our proposed scheme is better than that of the ACS-based and OMP-based schemes. The OMP-based scheme suffers from an obvious BER performance floor when the SNR is larger than 10 dB for both $L=5$ and $L=10$, while the ACS-based scheme cannot work when $L=10$. Additionally, it is worth pointing out that although a larger number of paths leads to worse NMSE performance for both the proposed scheme and the OMP-based scheme, their BER performance improves when $L$ increases from 5 to 10. This is because a larger number of paths can provide higher spatial diversity gains, and thus can improve the BER performance.

Fig. 6.4 NMSE performance comparison of different schemes with the fixed pilot overhead against different numbers of paths, where SNRs are 0 dB, 10 dB, and 20 dB, respectively [38]

Fig. 6.5 Comparison of ASE performance among different CE schemes versus SNRs, where the optimal performance with the perfect CSI known at both BS and MS are considered as the upper bound [38]: a number of paths L = 5, b number of paths L = 10

Fig. 6.6 BER performance comparison of different schemes versus SNRs, where NS = NRF , 16QAM, and the optimal precoder and combiner are considered [38]: a number of paths L = 5, b number of paths L = 10

6.3 Subspace-Based Super-Resolution Sparse Channel Estimation in Wideband Millimeter-Wave Massive MIMO Systems

This section extends the discussion to wideband mmWave MIMO-OFDM systems and proposes a closed-loop super-resolution sparse CE scheme. The proposed scheme includes the downlink CE stage followed by the uplink CE stage, as illustrated in Fig. 6.7, and the frame structure of our solution is further depicted in Fig. 6.8.

6.3.1 Downlink CE Stage

Consider the mmWave FD-MIMO system with hybrid beamforming, where the BS and $Q$ UDs are all equipped with UPAs, and OFDM with $K$ subcarriers is adopted, while $N_s^d$ independent signal streams are transmitted on each subcarrier [22]. The BS (UD) employs $N_{\rm BS}=N_{\rm BS}^hN_{\rm BS}^v$ ($N_{\rm UD}=N_{\rm UD}^hN_{\rm UD}^v$) antennas and $N_{\rm BS}^{\rm RF}\ll N_{\rm BS}$ ($N_{\rm UD}^{\rm RF}\ll N_{\rm UD}$) RF chains, where $N_{\rm BS}^h$ ($N_{\rm UD}^h$) and $N_{\rm BS}^v$ ($N_{\rm UD}^v$) are the numbers of antennas in horizontal and vertical directions at the BS (UD), respectively.

Fig. 6.7 Procedure of the proposed closed-loop sparse CE solution, where the limited feedback is realized via the low-frequency control link [39]

Fig. 6.8 Frame structure of the proposed closed-loop sparse CE solution [39]

6.3.1.1 Downlink CE Signal Model

The downlink CE stage lasts $N_d$ time slots and each time slot contains $N_o^d$ OFDM symbols. The signal $\mathbf{y}_q[k,i,m]\in\mathbb{C}^{N_s^d}$ received by the $q$th UD at the $k$th subcarrier of the $i$th OFDM symbol in the $m$th time slot can be expressed as

$$
\mathbf{y}_q[k,i,m]=\mathbf{W}_{d,q}^H[k,m]\mathbf{H}_q[k]\mathbf{F}_d[k,m]\mathbf{s}[k,i,m]+\mathbf{W}_{d,q}^H[k,m]\mathbf{n}_q[k,i,m], \tag{6.28}
$$

for $1\le q\le Q$, $0\le k\le K-1$, $1\le i\le N_o^d$ and $1\le m\le N_d$. In (6.28), the UD's receive combining matrix is $\mathbf{W}_{d,q}[k,m]=\mathbf{W}_{{\rm RF},q}[m]\mathbf{W}_{{\rm BB},q}[k,m]\in\mathbb{C}^{N_{\rm UD}\times N_s^d}$, in which $\mathbf{W}_{{\rm RF},q}[m]\in\mathbb{C}^{N_{\rm UD}\times N_{\rm UD}^{\rm RF}}$ and $\mathbf{W}_{{\rm BB},q}[k,m]\in\mathbb{C}^{N_{\rm UD}^{\rm RF}\times N_s^d}$ are the analog and digital receive combining matrices, and the BS's transmit precoding matrix is $\mathbf{F}_d[k,m]=\mathbf{F}_{{\rm RF},d}[m]\mathbf{F}_{{\rm BB},d}[k,m]\in\mathbb{C}^{N_{\rm BS}\times N_s^d}$, in which $\mathbf{F}_{{\rm RF},d}[m]\in\mathbb{C}^{N_{\rm BS}\times N_{\rm BS}^{\rm RF}}$ and $\mathbf{F}_{{\rm BB},d}[k,m]\in\mathbb{C}^{N_{\rm BS}^{\rm RF}\times N_s^d}$ are the analog and digital transmit precoding matrices, respectively, while $\mathbf{H}_q[k]\in\mathbb{C}^{N_{\rm UD}\times N_{\rm BS}}$ is the corresponding downlink channel matrix, $\mathbf{s}[k,i,m]\in\mathbb{C}^{N_s^d}$ is the training signal with $\mathbb{E}\left[\mathbf{s}[k,i,m]\mathbf{s}^H[k,i,m]\right]=\frac{1}{N_s^d}\mathbf{I}_{N_s^d}$, and $\mathbf{n}_q[k,i,m]\in\mathbb{C}^{N_{\rm UD}}$ is the complex AWGN vector with the covariance matrix $\sigma_n^2\mathbf{I}_{N_{\rm UD}}$, i.e., $\mathbf{n}_q[k,i,m]\sim\mathcal{CN}\left(\mathbf{0}_{N_{\rm UD}},\sigma_n^2\mathbf{I}_{N_{\rm UD}}\right)$. Due to the constant modulus of the PSN, $\left[\mathbf{F}_{{\rm RF},d}[m]\right]_{j_1,j_2}=\frac{1}{\sqrt{N_{\rm BS}}}e^{j\vartheta_{1,j_1,j_2}}$ and $\left[\mathbf{W}_{{\rm RF},q}[m]\right]_{j_1,j_2}=\frac{1}{\sqrt{N_{\rm UD}}}e^{j\vartheta_{2,j_1,j_2}}$ with $\vartheta_{1,j_1,j_2},\vartheta_{2,j_1,j_2}\in\mathcal{A}$, and $\mathcal{A}$ is the quantized phase set of the PSN with the resolution $N_q^{\rm ps}$, given by

$$
\mathcal{A}=\left\{-\pi,\;-\pi+\frac{2\pi}{2^{N_q^{\rm ps}}},\;-\pi+2\cdot\frac{2\pi}{2^{N_q^{\rm ps}}},\;\ldots,\;\pi-\frac{2\pi}{2^{N_q^{\rm ps}}}\right\}. \tag{6.29}
$$
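The quantized phase set (6.29) and a constant-modulus analog matrix with phases drawn from it can be generated as follows. This is a small sketch; the function names and the example sizes are assumptions made for illustration.

```python
import numpy as np

def phase_set(n_bits):
    """Quantized phase set A of (6.29) for a PSN with n_bits of resolution."""
    step = 2 * np.pi / 2**n_bits
    return -np.pi + step * np.arange(2**n_bits)

def random_analog_matrix(rows, cols, n_bits, rng):
    """Constant-modulus matrix with entries exp(j*phase)/sqrt(rows), phases drawn from A."""
    phases = rng.choice(phase_set(n_bits), size=(rows, cols))
    return np.exp(1j * phases) / np.sqrt(rows)

rng = np.random.default_rng(0)
A_2bit = phase_set(2)                          # four phases from -pi up to pi/2
F_RF = random_analog_matrix(64, 4, 2, rng)     # e.g. N_BS = 64 antennas, 4 RF chains
print(A_2bit, np.abs(F_RF).min(), np.abs(F_RF).max())
```

Every entry of `F_RF` has modulus $1/\sqrt{N_{\rm BS}}$, so the only freedom left to the analog stage is the choice of phase from $\mathcal{A}$.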

Also, $\left\|\mathbf{F}_d[k,m]\right\|_F^2\le N_{\rm BS}^{\rm RF}$ to guarantee the constraint on the total transmit power [23]. Here at the CE stage, some elegant solutions [24, 25] can be used to acquire the robust synchronization of burst training signals without the knowledge of noise/interference power even at low SNR.

Due to the obviously resolvable delay spread for each MPC caused by the large bandwidth, according to the typical mmWave channel model [22, 26–31], the downlink delay-domain continuous channel matrix $\mathbf{H}_q(\tau)\in\mathbb{C}^{N_{\rm UD}\times N_{\rm BS}}$ with $L_q$ MPCs can be expressed as

$$
\mathbf{H}_q(\tau)=\beta_q\sum_{l=1}^{L_q}\mathbf{H}_{q,l}\,p\left(\tau-\tau_{q,l}\right), \tag{6.30}
$$

where $\beta_q=\sqrt{N_{\rm UD}N_{\rm BS}/L_q}$ is the normalization factor, $\tau_{q,l}$ is the delay of the $l$th MPC, and $p(\tau)$ denotes the equivalent PSF, while the complex gain matrix $\mathbf{H}_{q,l}\in\mathbb{C}^{N_{\rm UD}\times N_{\rm BS}}$ is given by

$$
\mathbf{H}_{q,l}=\alpha_{q,l}\,\mathbf{a}_{\rm UD}\left(\mu_{q,l}^{\rm UD},\nu_{q,l}^{\rm UD}\right)\mathbf{a}_{\rm BS}^H\left(\mu_{q,l}^{\rm BS},\nu_{q,l}^{\rm BS}\right), \tag{6.31}
$$

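where $\alpha_{q,l}\sim\mathcal{CN}(0,\sigma_\alpha^2)$ is the associated complex path gain, and $\mu_{q,l}^{\rm UD}=\pi\sin\left(\theta_{q,l}^{\rm UD}\right)\cos\left(\varphi_{q,l}^{\rm UD}\right)$ ($\mu_{q,l}^{\rm BS}=\pi\sin\left(\theta_{q,l}^{\rm BS}\right)\cos\left(\varphi_{q,l}^{\rm BS}\right)$) and $\nu_{q,l}^{\rm UD}=\pi\sin\left(\varphi_{q,l}^{\rm UD}\right)$ ($\nu_{q,l}^{\rm BS}=\pi\sin\left(\varphi_{q,l}^{\rm BS}\right)$) denote the horizontally and vertically spatial frequencies with half-wavelength antenna spacing at the $q$th UD (the BS), respectively. Here, $\theta_{q,l}^{\rm UD}$ ($\theta_{q,l}^{\rm BS}$) and $\varphi_{q,l}^{\rm UD}$ ($\varphi_{q,l}^{\rm BS}$) are the downlink horizontal and vertical AoAs (AoDs) of the $l$th MPC associated with the UPA, respectively. The array response vector at the UD is given by $\mathbf{a}_{\rm UD}\left(\mu_{q,l}^{\rm UD},\nu_{q,l}^{\rm UD}\right)=\mathbf{a}_v\left(\nu_{q,l}^{\rm UD}\right)\otimes\mathbf{a}_h\left(\mu_{q,l}^{\rm UD}\right)\in\mathbb{C}^{N_{\rm UD}}$ [19, 32, 33], in which

$$
\mathbf{a}_h\left(\mu_{q,l}^{\rm UD}\right)=\frac{1}{\sqrt{N_{\rm UD}^h}}\left[1\;\; e^{j\mu_{q,l}^{\rm UD}}\;\cdots\; e^{j\left(N_{\rm UD}^h-1\right)\mu_{q,l}^{\rm UD}}\right]^T, \tag{6.32}
$$

$$
\mathbf{a}_v\left(\nu_{q,l}^{\rm UD}\right)=\frac{1}{\sqrt{N_{\rm UD}^v}}\left[1\;\; e^{j\nu_{q,l}^{\rm UD}}\;\cdots\; e^{j\left(N_{\rm UD}^v-1\right)\nu_{q,l}^{\rm UD}}\right]^T \tag{6.33}
$$

are the steering vectors associated with the horizontal and vertical directions, respectively. Similarly, the array response vector at the BS is given by $\mathbf{a}_{\rm BS}\left(\mu_{q,l}^{\rm BS},\nu_{q,l}^{\rm BS}\right)=\mathbf{a}_v\left(\nu_{q,l}^{\rm BS}\right)\otimes\mathbf{a}_h\left(\mu_{q,l}^{\rm BS}\right)\in\mathbb{C}^{N_{\rm BS}}$, where the horizontal and vertical direction steering vectors $\mathbf{a}_h\left(\mu_{q,l}^{\rm BS}\right)\in\mathbb{C}^{N_{\rm BS}^h}$ and $\mathbf{a}_v\left(\nu_{q,l}^{\rm BS}\right)\in\mathbb{C}^{N_{\rm BS}^v}$ are given respectively by substituting $\mu_{q,l}^{\rm UD}$ and $N_{\rm UD}^h$ with $\mu_{q,l}^{\rm BS}$ and $N_{\rm BS}^h$ in (6.32), as well as by substituting $\nu_{q,l}^{\rm UD}$ and $N_{\rm UD}^v$ with $\nu_{q,l}^{\rm BS}$ and $N_{\rm BS}^v$ in (6.33).

The frequency-domain channel matrix $\mathbf{H}_q[k]$ at the $k$th subcarrier can then be expressed as

$$
\mathbf{H}_q[k]=\beta_q\sum_{l=1}^{L_q}\mathbf{H}_{q,l}\,e^{-j\frac{2\pi kf_s\tau_{q,l}}{K}}
=\beta_q\sum_{l=1}^{L_q}\alpha_{q,l}\,\mathbf{a}_{\rm UD}\left(\mu_{q,l}^{\rm UD},\nu_{q,l}^{\rm UD}\right)\mathbf{a}_{\rm BS}^H\left(\mu_{q,l}^{\rm BS},\nu_{q,l}^{\rm BS}\right)e^{-j\frac{2\pi kf_s\tau_{q,l}}{K}}, \tag{6.34}
$$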

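where $f_s=1/T_s$ denotes the system bandwidth, and $T_s$ is the sampling period. The derivation of the first equation in (6.34) is shown in the Appendix. Observe that $\mathbf{H}_q[k]$ does not depend on the PSF, and it exhibits sparsity in the delay domain due to the small $L_q$ but large normalized delay spread. Recall that the existing CS-based solutions of [28, 30] have to estimate the effective delay-domain CIRs that include the PSF, and this PSF will destroy the delay-domain sparsity of mmWave channels when the order of the PSF is large. $\mathbf{H}_q[k]$ in (6.34) can be rewritten as

$$
\mathbf{H}_q[k]=\mathbf{A}_{{\rm UD},q}\mathbf{D}_q[k]\mathbf{A}_{{\rm BS},q}^H, \tag{6.35}
$$

where $\mathbf{D}_q[k]={\rm diag}\left(\mathbf{d}_q[k]\right)\in\mathbb{C}^{L_q\times L_q}$ is the diagonal matrix in which $\mathbf{d}_q[k]={\rm diag}\left(\boldsymbol{\alpha}_q\right)\boldsymbol{\tau}_q[k]$ with $\boldsymbol{\alpha}_q=\beta_q\left[\alpha_{q,1}\cdots\alpha_{q,L_q}\right]^T$ and $\boldsymbol{\tau}_q[k]=\left[e^{-j2\pi kf_s\tau_{q,1}/K}\cdots e^{-j2\pi kf_s\tau_{q,L_q}/K}\right]^T$, and the array response matrix associated with the AoAs of the $q$th UD, $\mathbf{A}_{{\rm UD},q}\in\mathbb{C}^{N_{\rm UD}\times L_q}$, can be expressed as $\mathbf{A}_{{\rm UD},q}=\mathbf{A}_{{\rm UD},q}^{\nu}\odot\mathbf{A}_{{\rm UD},q}^{\mu}$, in which $\mathbf{A}_{{\rm UD},q}^{\mu}=\left[\mathbf{a}_h(\mu_{q,1}^{\rm UD})\cdots\mathbf{a}_h(\mu_{q,L_q}^{\rm UD})\right]\in\mathbb{C}^{N_{\rm UD}^h\times L_q}$ and $\mathbf{A}_{{\rm UD},q}^{\nu}=\left[\mathbf{a}_v(\nu_{q,1}^{\rm UD})\cdots\mathbf{a}_v(\nu_{q,L_q}^{\rm UD})\right]\in\mathbb{C}^{N_{\rm UD}^v\times L_q}$ are the steering matrices corresponding to the horizontally and vertically spatial frequencies, respectively, while $\mathbf{A}_{{\rm BS},q}=\mathbf{A}_{{\rm BS},q}^{\nu}\odot\mathbf{A}_{{\rm BS},q}^{\mu}\in\mathbb{C}^{N_{\rm BS}\times L_q}$ is the array response matrix associated with the AoDs, in which the steering matrices $\mathbf{A}_{{\rm BS},q}^{\mu}\in\mathbb{C}^{N_{\rm BS}^h\times L_q}$ and $\mathbf{A}_{{\rm BS},q}^{\nu}\in\mathbb{C}^{N_{\rm BS}^v\times L_q}$ have the similar form as $\mathbf{A}_{{\rm UD},q}^{\mu}$ and $\mathbf{A}_{{\rm UD},q}^{\nu}$, respectively.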
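The equivalence between the path-sum form (6.34) and the matrix form (6.35) can be verified numerically for a single UD. The sketch below uses small hypothetical array sizes, a normalized bandwidth $f_s=1$ (so that delays are expressed in sampling periods), and randomly drawn angles, delays, and gains; none of these values come from the text.

```python
import numpy as np

def ula(mu, N):
    """Steering vector of the form (6.32)/(6.33) for spatial frequency mu."""
    return np.exp(1j * mu * np.arange(N)) / np.sqrt(N)

def upa(mu, nu, Nh, Nv):
    """UPA response a_v(nu) kron a_h(mu)."""
    return np.kron(ula(nu, Nv), ula(mu, Nh))

rng = np.random.default_rng(2)
Nh_UD, Nv_UD, Nh_BS, Nv_BS = 2, 2, 4, 3          # hypothetical array sizes
N_UD, N_BS = Nh_UD * Nv_UD, Nh_BS * Nv_BS
L, K, fs = 3, 16, 1.0                            # fs normalized (assumption)
alpha = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
tau = rng.uniform(0.0, 8.0, L)                   # delays in sampling periods
theta_UD, phi_UD, theta_BS, phi_BS = rng.uniform(-np.pi / 2, np.pi / 2, (4, L))
mu_UD, nu_UD = np.pi * np.sin(theta_UD) * np.cos(phi_UD), np.pi * np.sin(phi_UD)
mu_BS, nu_BS = np.pi * np.sin(theta_BS) * np.cos(phi_BS), np.pi * np.sin(phi_BS)
beta = np.sqrt(N_UD * N_BS / L)

# Array response matrices: column l is the UPA response of path l.
A_UD = np.stack([upa(mu_UD[l], nu_UD[l], Nh_UD, Nv_UD) for l in range(L)], axis=1)
A_BS = np.stack([upa(mu_BS[l], nu_BS[l], Nh_BS, Nv_BS) for l in range(L)], axis=1)

def H_sum(k):
    """Path-sum form of (6.34)."""
    phase = np.exp(-2j * np.pi * k * fs * tau / K)
    return beta * sum(alpha[l] * phase[l] * np.outer(A_UD[:, l], A_BS[:, l].conj())
                      for l in range(L))

def H_matrix(k):
    """Matrix form A_UD D[k] A_BS^H of (6.35)."""
    d_k = beta * alpha * np.exp(-2j * np.pi * k * fs * tau / K)
    return A_UD @ np.diag(d_k) @ A_BS.conj().T

print(all(np.allclose(H_sum(k), H_matrix(k)) for k in range(K)))
```

Only the diagonal matrix $\mathbf{D}[k]$ changes with the subcarrier index, while both array response matrices are shared across subcarriers.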

6.3.1.2 Obtain Horizontal/Vertical AoAs at UD

The downlink CE corresponds to Step 1 to Step 4 of Fig. 6.7, where the horizontal and vertical AoAs are estimated. We first assume that the training signal $\mathbf{s}[i,m]$ is independent of subcarriers, and its $j_1$th element can be designed as $\left[\mathbf{s}[i,m]\right]_{j_1}=\sqrt{\frac{1}{N_s^d}}e^{j2\pi\phi_{j_1}}$ with $\phi_{j_1}$ randomly and uniformly selected from the interval $[0,1]$, i.e., $\phi_{j_1}\sim\mathcal{U}[0,1]$. Second, a predefined frequency-domain scrambling code $\mathbf{x}_d\in\mathbb{C}^{K}$ with its $k$th element being $x_d[k]$ for $0\le k\le K-1$ can be introduced to effectively avoid the high peak-to-average power ratio (PAPR) resulting from the same training signal being used at all subcarriers.3 Then, we can obtain the scrambled training signal $\mathbf{s}[k,i,m]$ at the $k$th subcarrier, i.e., $\mathbf{s}[k,i,m]=x_d[k]\mathbf{s}[i,m]$. The signals received at the UD will be first descrambled by multiplying with the conjugate of the scrambling code $\mathbf{x}_d^{*}$, which indicates that the scrambling code $\mathbf{x}_d$ does not affect the subsequent signal processing. Moreover, the same digital transmit precoding/receive combining matrices are adopted at every subcarrier, i.e., $\mathbf{F}_{{\rm BB},d}[k,m]=\mathbf{F}_{{\rm BB},d}[m]$ and $\mathbf{W}_{{\rm BB},q}[k,m]=\mathbf{W}_{{\rm BB},q}[m]$, for $0\le k\le K-1$. The number of independent signal streams associated with each subcarrier in each OFDM symbol is $N_s^d\le N_{\rm UD}^{\rm RF}$. We can visualize a low-dimensional digital $M_{\rm UD}^h\times M_{\rm UD}^v$ sub-UPA, in which $M_{\rm UD}^h$ and $M_{\rm UD}^v$ are the numbers of antennas in horizontal and vertical directions, from the high-dimensional hybrid analog/digital $N_{\rm UD}^h\times N_{\rm UD}^v$ UPA. Given $N_{\rm UD}^{\rm sub}=M_{\rm UD}^hM_{\rm UD}^v$, the BS only requires $N_d=\left\lceil N_{\rm UD}^{\rm sub}/N_s^d\right\rceil$ time slots to broadcast training signals, with each time slot containing $N_o^d$ OFDM symbols. The choice of $M_{\rm UD}^h$, $M_{\rm UD}^v$ and $N_o^d$ trades off estimation accuracy with training overhead,4 because larger $M_{\rm UD}^h$, $M_{\rm UD}^v$ and $N_o^d$ lead to better estimation accuracy but higher training overhead, and vice versa.

Since the signals received by all UDs have the same form, we can focus on the $q$th UD, and the user index $q$ can be omitted from $\mathbf{y}_q[k,i,m]$, $\mathbf{W}_{d,q}[m]$, $\mathbf{H}_q[k]$, $\mathbf{n}_q[k,i,m]$, $\mathbf{A}_{{\rm UD},q}$, $\mathbf{D}_q[k]$, $\mathbf{A}_{{\rm BS},q}$ and other relevant variables for clarity. By collecting the received signals of (6.28) associated with the $k$th subcarrier over all the $N_o^d$ OFDM symbols of the $m$th time slot into the signal matrix $\mathbf{Y}_m[k]=\left[\mathbf{y}[k,1,m]\cdots\mathbf{y}[k,N_o^d,m]\right]\in\mathbb{C}^{N_s^d\times N_o^d}$, we have

$$
\mathbf{Y}_m[k]=x_d^{*}[k]\mathbf{W}_d^H[m]\mathbf{H}[k]\mathbf{F}_d[m]\mathbf{S}_d[k,m]+\mathbf{W}_d^H[m]\mathbf{N}_m[k], \tag{6.36}
$$

where $\mathbf{S}_d[k,m]=\left[\mathbf{s}[k,1,m]\cdots\mathbf{s}[k,N_o^d,m]\right]=x_d[k]\mathbf{S}_d[m]$ with $\mathbf{S}_d[m]=\left[\mathbf{s}[1,m]\cdots\mathbf{s}[N_o^d,m]\right]\in\mathbb{C}^{N_s^d\times N_o^d}$, and $\mathbf{N}_m[k]=\left[\mathbf{n}[k,1,m]\cdots\mathbf{n}[k,N_o^d,m]\right]\in\mathbb{C}^{N_{\rm UD}\times N_o^d}$. Since the BS transmits the common random signal $\mathbf{F}_d[m]\mathbf{S}_d[m]$, the transmit precoding matrix $\mathbf{F}_d[m]=\mathbf{F}_{{\rm RF},d}[m]\mathbf{F}_{{\rm BB},d}[m]$ should be a random matrix. This is achieved by designing $\mathbf{F}_{{\rm RF},d}[m]$ as $\left[\mathbf{F}_{{\rm RF},d}[m]\right]_{j_1,j_2}=\frac{1}{\sqrt{N_{\rm BS}}}e^{j\vartheta_{3,j_1,j_2}}$ with $\vartheta_{3,j_1,j_2}$ randomly and uniformly selected from $\mathcal{A}$, and designing $\mathbf{F}_{{\rm BB},d}[m]$ as $\left[\mathbf{F}_{{\rm BB},d}[m]\right]_{j_1,j_2}=e^{j2\pi a_{j_1,j_2}}$ with $a_{j_1,j_2}\sim\mathcal{U}[0,1]$. The BS can use the same transmit precoding matrix $\mathbf{F}_d=\mathbf{F}_d[m]$ to send the same sounding signal $\mathbf{S}_d=\mathbf{S}_d[m]$ for every time slot. By stacking the received signal matrices $\mathbf{Y}_m[k]$ of (6.36) over the $N_d$ time slots into $\widetilde{\mathbf{Y}}_d[k]=\left[\mathbf{Y}_1^T[k]\cdots\mathbf{Y}_{N_d}^T[k]\right]^T\in\mathbb{C}^{N_dN_s^d\times N_o^d}$, we have

$$
\widetilde{\mathbf{Y}}_d[k]=\widetilde{\mathbf{W}}_d^H\mathbf{A}_{\rm UD}\mathbf{D}[k]\mathbf{A}_{\rm BS}^H\mathbf{F}_d\mathbf{S}_d+{\rm Bdiag}\left(\check{\mathbf{W}}_d\right)\widetilde{\mathbf{N}}_d[k], \tag{6.37}
$$

where $\widetilde{\mathbf{W}}_d=\left[\mathbf{W}_d[1]\cdots\mathbf{W}_d[N_d]\right]\in\mathbb{C}^{N_{\rm UD}\times N_dN_s^d}$ aggregates the downlink receive combining matrices used in the $N_d$ time slots, and ${\rm Bdiag}\left(\check{\mathbf{W}}_d\right)={\rm Bdiag}\left(\left[\mathbf{W}_d^H[1]\cdots\mathbf{W}_d^H[N_d]\right]\right)\in\mathbb{C}^{N_dN_s^d\times N_dN_{\rm UD}}$, while $\widetilde{\mathbf{N}}_d[k]=\left[\mathbf{N}_1^T[k]\cdots\mathbf{N}_{N_d}^T[k]\right]^T\in\mathbb{C}^{N_dN_{\rm UD}\times N_o^d}$ is the corresponding noise matrix.

Multiplying $\widetilde{\mathbf{Y}}_d[k]$ with $\mathbf{J}_d=\left[\mathbf{I}_{N_{\rm UD}^{\rm sub}}\;\;\mathbf{O}_{N_{\rm UD}^{\rm sub}\times\left(N_dN_s^d-N_{\rm UD}^{\rm sub}\right)}\right]$ and aggregating the resulting signals over all the $K$ subcarriers lead to the signal matrix $\bar{\mathbf{Y}}_d=\left[\mathbf{J}_d\widetilde{\mathbf{Y}}_d[0]\;\;\mathbf{J}_d\widetilde{\mathbf{Y}}_d[1]\cdots\mathbf{J}_d\widetilde{\mathbf{Y}}_d[K-1]\right]\in\mathbb{C}^{N_{\rm UD}^{\rm sub}\times KN_o^d}$ as

$$
\bar{\mathbf{Y}}_d=\bar{\mathbf{A}}_{\rm UD}\bar{\mathbf{S}}_d+\bar{\mathbf{N}}_d, \tag{6.38}
$$

where $\bar{\mathbf{N}}_d=\mathbf{J}_d{\rm Bdiag}\left(\check{\mathbf{W}}_d\right)\left[\widetilde{\mathbf{N}}_d[0]\;\;\widetilde{\mathbf{N}}_d[1]\cdots\widetilde{\mathbf{N}}_d[K-1]\right]$, $\bar{\mathbf{A}}_{\rm UD}=\mathbf{J}_d\widetilde{\mathbf{W}}_d^H\mathbf{A}_{\rm UD}$, and $\bar{\mathbf{S}}_d=\left[\bar{\mathbf{S}}_d[0]\;\;\bar{\mathbf{S}}_d[1]\cdots\bar{\mathbf{S}}_d[K-1]\right]$ with $\bar{\mathbf{S}}_d[k]=\mathbf{D}[k]\mathbf{A}_{\rm BS}^H\mathbf{F}_d\mathbf{S}_d$. Observe from (6.37) and (6.38) that we cannot directly apply powerful spectrum estimation techniques [19, 34] to estimate the horizontal/vertical AoAs from $\bar{\mathbf{Y}}_d$, since the shift-invariance structure of the array response matrix $\mathbf{A}_{\rm UD}$ does not hold in the hybrid receive array [18, 35]. We propose to visualize the high-dimensional hybrid array as a low-dimensional digital array by designing an appropriate aggregated receive combining matrix $\widetilde{\mathbf{W}}_d$ so that the shift-invariance structure of the array response can be reconstructed and therefore super-resolution CE based on spectrum estimation techniques can be harnessed.

3 Each element in the predefined scrambling code $\mathbf{x}_d$ should satisfy $x_d^{*}[k]x_d[k]=1$, $0\le k\le K-1$. To achieve the low PAPR of training signals, we can adopt the constant-modulus Zadoff-Chu sequence as the scrambling code $\mathbf{x}_d$.

4 In this chapter, the training overhead is defined as the number of OFDM symbols required at the CE stage. In terms of the downlink CE stage, the training duration is $N_dN_o^d$ OFDM symbols.

6.3.1.3

Design Receive Combining Matrix at UD

RF Without loss of generality, we consider Nsd = NUD − 1 independent signal streams.   RF RF NUD ×NUD RF = u 1 · · · u RF to design the digFirst, we utilize a unitary matrix U NUD NUD ∈ C

6.3 Subspace-Based Super-Resolution Sparse Channel Estimation in Wideband …

119

ital receive combining matrix W BB [m] ∈ C NUD ×Ns of the mth time slot’s receive combining matrix W d [m] = W RF [m]W BB [m] for 1 ≤ m ≤ Nd . Specifically, W BB [m] = RF NUD ×NUD RF , U NUD [:,1:Nsd ] . To design the analog receive combining matrix W RF [m] ∈ C d NUD ×Nd Ns we construct the matrix d ∈ R as RF

 d =

d



v I MUD +1 ⊗ B

h v h v O ( NUD −NUD +1))×MUD +1) ( MUD ( MUD   I Nsd Nd , × h v O ( MUD +1)−Nsd Nd )×Nsd Nd ( MUD

(6.39)

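where $B = [I_{M_{\mathrm{UD}}^h} \;\; O_{M_{\mathrm{UD}}^h \times (N_{\mathrm{UD}}^h - M_{\mathrm{UD}}^h)}]^T \in \mathbb{R}^{N_{\mathrm{UD}}^h \times M_{\mathrm{UD}}^h}$. Then we take the sub-matrix $\Xi_{d,m}^{\mathrm{sub}} = \Xi_{d[:,(m-1)N_s^d+1:mN_s^d]} \in \mathbb{R}^{N_{\mathrm{UD}} \times N_s^d}$ and define $\bar{\xi}_{d,m} = \mathrm{vec}(\Xi_{d,m}^{\mathrm{sub}})$ to construct the ordered index set $\mathcal{D}_m = \mathrm{find}(\bar{\xi}_{d,m} \neq 0)$ with $|\mathcal{D}_m|_c = N_s^d$. Next we perform the modulo operation on $\mathcal{D}_m$ with $N_{\mathrm{UD}}$ to get the ordered index set $\mathcal{I}_m = \mathrm{mod}(\mathcal{D}_m, N_{\mathrm{UD}})$ with $|\mathcal{I}_m|_c = N_s^d$. The rows of $W_{\mathrm{RF}}[m]$ whose indices correspond to $\mathcal{I}_m$ are determined by $W_{\mathrm{BB}}[m]$ as $W_{\mathrm{RF}}[m]_{[\mathcal{I}_m,:]} = W_{\mathrm{BB}}^H[m]$, while the remaining rows of $W_{\mathrm{RF}}[m]$ consist of the $(N_{\mathrm{UD}} - N_s^d)$ identical $u_{N_{\mathrm{UD}}^{\mathrm{RF}}}^H$. The phase value of an arbitrary element in the designed $W_{\mathrm{RF}}[m]$, denoted by $\vartheta_d$, is then quantized to $\vartheta \in \mathcal{A}$ by minimizing the Euclidean distance according to $\arg\min_{\vartheta \in \mathcal{A}} \|\vartheta_d - \vartheta\|_2$. Thus, the $m$th receive combining matrix can be obtained as $W_d[m] = W_{\mathrm{RF}}[m] W_{\mathrm{BB}}[m]$ for $1 \le m \le N_d$.

The proposed design for $\widetilde{W}_d$ is summarized in Algorithm 6.2 [39]. Since the number of RF chains is usually a power of 2, we can adopt the Hadamard matrix for $N_q^{\mathrm{ps}} \ge 1$ or the DFT matrix for $N_q^{\mathrm{ps}} \ge 2$ to construct $U_{N_{\mathrm{UD}}^{\mathrm{RF}}}$.⁵ Clearly, our design can be used for the PSN with arbitrary $N_q^{\mathrm{ps}}$, even the extremely low-resolution PSN with $N_q^{\mathrm{ps}} = 1$. With the designed $\widetilde{W}_d$, we have

$$\bar{A}_{\mathrm{UD}} = J_d \widetilde{W}_d^H A_{\mathrm{UD}} = J_d \widetilde{W}_d^H \left( A_{\mathrm{UD}}^{\nu} \odot A_{\mathrm{UD}}^{\mu} \right) = \bar{A}_{\mathrm{UD}}^{\nu} \odot \bar{A}_{\mathrm{UD}}^{\mu}. \tag{6.40}$$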

Clearly, $\bar{A}_{\mathrm{UD}}^{\mu} \in \mathbb{C}^{M_{\mathrm{UD}}^h \times L}$ ($\bar{A}_{\mathrm{UD}}^{\nu} \in \mathbb{C}^{M_{\mathrm{UD}}^v \times L}$) is the matrix containing the first $M_{\mathrm{UD}}^h$ ($M_{\mathrm{UD}}^v$) rows of $A_{\mathrm{UD}}^{\mu}$ ($A_{\mathrm{UD}}^{\nu}$). Thus, $\bar{A}_{\mathrm{UD}}$ maintains the double shift-invariance structure of the original array response matrix $A_{\mathrm{UD}}$ for both horizontal and vertical AoAs [35], and the designed $\widetilde{W}_d$ can be used to visualize the high-dimensional hybrid analog/digital array as a low-dimensional digital array. Therefore, we can utilize the MDU-ESPRIT algorithm detailed in Sect. 6.3.3 to obtain the super-resolution estimates of horizontal/vertical AoAs at UD. Since the ESPRIT-type algorithms [18, 19, 34, 35] require the knowledge of the number of MPCs, we next turn to the task of acquiring the number of MPCs at the receiver, i.e., Step 2 of Fig. 6.7.

5 Note that the phase value of every entry of the quantized Hadamard or DFT matrices still belongs to the set $\mathcal{A}$, and therefore we ensure the columns of the selected $U_{N_{\mathrm{UD}}^{\mathrm{RF}}}$ to be mutually orthogonal.


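Algorithm 6.2 Proposed Receive Combining Matrix Design
Input: $N_d$, $N_s^d$, $N_{\mathrm{UD}}^{\mathrm{RF}}$, $N_{\mathrm{UD}}^h$, $N_{\mathrm{UD}}^v$, $M_{\mathrm{UD}}^h$, $M_{\mathrm{UD}}^v$
Output: $\widetilde{W}_d$
1: Generate unitary matrix $U_{N_{\mathrm{UD}}^{\mathrm{RF}}} = [u_1 \cdots u_{N_{\mathrm{UD}}^{\mathrm{RF}}}]$
2: Construct index matrix $\Xi_d$ of (6.39)
3: for $m = 1, 2, \ldots, N_d$ do
4:   $W_{\mathrm{BB}}[m] = U_{N_{\mathrm{UD}}^{\mathrm{RF}}[:,1:N_s^d]}$, and initialize $W_{\mathrm{RF}}[m] = 1_{N_{\mathrm{UD}}} \otimes u_{N_{\mathrm{UD}}^{\mathrm{RF}}}^H$
5:   Extract $\Xi_{d,m}^{\mathrm{sub}} = \Xi_{d[:,(m-1)N_s^d+1:mN_s^d]}$, and obtain $\bar{\xi}_{d,m} = \mathrm{vec}(\Xi_{d,m}^{\mathrm{sub}})$
6:   Obtain ordered index set $\mathcal{I}_m = \mathrm{mod}(\mathrm{find}(\bar{\xi}_{d,m} \neq 0), N_{\mathrm{UD}})$
7:   Replace $W_{\mathrm{RF}}[m]_{[\mathcal{I}_m,:]} \leftarrow W_{\mathrm{BB}}^H[m]$
8:   Quantize phase values of $W_{\mathrm{RF}}[m]$ based on $\mathcal{A}$
9:   $W_d[m] = W_{\mathrm{RF}}[m] W_{\mathrm{BB}}[m]$
10: end for
11: return $\widetilde{W}_d = [W_d[1] \cdots W_d[N_d]]$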
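The phase quantization in step 8 of Algorithm 6.2 can be illustrated with a minimal NumPy sketch. It assumes that the phase set $\mathcal{A}$ is a uniform grid of $2^{N_q^{\mathrm{ps}}}$ points on the unit circle (the chapter defines $\mathcal{A}$ elsewhere, so this grid is an assumption), and the helper names `quantize_phases` and `hadamard` are hypothetical. The sketch also checks the observation of footnote 5 that a Hadamard matrix survives 1-bit quantization and a 4 × 4 DFT matrix survives 2-bit quantization.

```python
import numpy as np

def quantize_phases(W, n_bits):
    """Snap the phase of every entry of W to the nearest point of a uniform grid.

    Assumption: the phase set A is {2*pi*n / 2**n_bits}, n = 0..2**n_bits - 1.
    Entries are returned with unit modulus; any power scaling is omitted.
    """
    step = 2 * np.pi / 2 ** n_bits
    phases = np.round(np.angle(W) / step) * step  # nearest grid phase (wraps on the circle)
    return np.exp(1j * phases)

def hadamard(n):
    """Sylvester-type Hadamard matrix; n must be a power of 2."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# A 4 x 4 Hadamard matrix only contains the phases 0 and pi.
U_hadamard = quantize_phases(hadamard(4), n_bits=1)
# A 4 x 4 DFT matrix only contains multiples of pi/2.
idx = np.arange(4)
U_dft = quantize_phases(np.exp(-2j * np.pi * np.outer(idx, idx) / 4), n_bits=2)
```

Because both matrices are unchanged by the quantizer at these resolutions, their columns stay mutually orthogonal, which is the property the design relies on.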

6.3.1.4 EVD-Based Estimate for Number of MPCs

In OFDM systems, the channels of multiple adjacent subcarriers within the coherence bandwidth are highly correlated. If the maximum delay spread is $\tau_{\max} = N_c T_s$ with $N_c$ delay taps, the channel coherence bandwidth is $B_c \approx 1/\tau_{\max} = f_s/N_c$. Then we can jointly use the measurements of $P \le B_c/\Delta f = K/N_c$ adjacent subcarriers to estimate the number of MPCs, where $\Delta f = f_s/K$ is the subcarrier's bandwidth. Specifically, by dividing the $K$ signal matrices $\{J_d \widetilde{Y}_d[k]\}_{k=0}^{K-1}$ into $N_P = K/P$ groups, we can obtain the $n_p$th measurement matrix $\check{Y}_d[n_p] \in \mathbb{C}^{N_{\mathrm{UD}}^{\mathrm{sub}} \times N_o^d}$ as the average of the measurements in the $n_p$th group

$$\check{Y}_d[n_p] = \frac{1}{P} \sum_{k=(n_p-1)P}^{n_p P - 1} J_d \widetilde{Y}_d[k], \quad 1 \le n_p \le N_P. \tag{6.41}$$

The $N_P$ average measurements are collected as $\check{Y}_d = [\check{Y}_d[1] \cdots \check{Y}_d[N_P]] \in \mathbb{C}^{N_{\mathrm{UD}}^{\mathrm{sub}} \times N_o^d N_P}$, and the covariance matrix of $\check{Y}_d$ is $R_d = \frac{1}{N_o^d N_P} \check{Y}_d \check{Y}_d^H$. According to the EVD, we obtain $R_d = [U_s \; U_n] \, \mathrm{diag}(\lambda_d) \, [U_s \; U_n]^H$, where $\lambda_d = [\lambda_1 \cdots \lambda_L \; \lambda_{L+1} \cdots \lambda_{N_{\mathrm{UD}}^{\mathrm{sub}}}]^T = [\lambda_s^T \; \lambda_n^T]^T$ is the eigenvalue vector with the eigenvalues arranged in descending order, $U_s$ and $U_n$ are the eigenvector matrices corresponding to the signal and noise subspaces, respectively, while $\lambda_s = [\lambda_1 \cdots \lambda_L]^T$ and $\lambda_n = [\lambda_{L+1} \cdots \lambda_{N_{\mathrm{UD}}^{\mathrm{sub}}}]^T$ are the eigenvalue vectors related to $U_s$ and $U_n$, respectively. The number of MPCs $L$ is the dimension of $\lambda_s$.

To obtain an accurate estimate of $L$, we first construct $\widetilde{\lambda} = [\lambda_s^T \; 0_{N_{\mathrm{UD}}^{\mathrm{sub}} - L}^T]^T \in \mathbb{C}^{N_{\mathrm{UD}}^{\mathrm{sub}}}$. The optimal estimate of $\widetilde{\lambda}$ can be acquired by solving the following optimization problem

$$\widetilde{\lambda}^{\star} = \arg\min_{\widetilde{\lambda} \succeq 0_{N_{\mathrm{UD}}^{\mathrm{sub}}}} \frac{1}{2} \left\| \widetilde{\lambda} - \lambda_d \right\|_2^2 + \varepsilon \left\| \widetilde{\lambda} \right\|_1, \tag{6.42}$$

where $\varepsilon$ is the threshold parameter related to the AWGN power, which is determined experimentally. Clearly, the solution to the optimization problem (6.42) is [36]

$$\widetilde{\lambda}_i^{\star} = \begin{cases} \lambda_i - \varepsilon, & \lambda_i \ge \varepsilon, \\ 0, & \lambda_i < \varepsilon, \end{cases} \tag{6.43}$$

where $\widetilde{\lambda}_i^{\star}$ is the $i$th element of $\widetilde{\lambda}^{\star}$. From the estimate $\widetilde{\lambda}^{\star}$, we obtain the estimate of the number of MPCs, denoted by $\hat{L}$, which is the input to the MDU-ESPRIT algorithm for estimating $\hat{L}$ pairs of horizontal and vertical AoAs. The resulting angle estimates $\{\hat{\theta}_l^{\mathrm{UD}}, \hat{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$ are quantized as $\{\bar{\theta}_l^{\mathrm{UD}}, \bar{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$ with $N_q^{\mathrm{ang}}$ bits in $[-\pi/2, \pi/2]$. Finally, only the few bits of the quantized angle estimates are fed back to the BS through the low-frequency control link with limited resource [2]. Thus, since very little data needs to be transmitted via the feedback link, the feedback overhead at the AoAs feedback stage can be ignored in our proposed closed-loop sparse CE scheme.⁶
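The EVD-based estimate of the number of MPCs in (6.41) to (6.43) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `average_groups` and `estimate_num_paths` are hypothetical, the input is assumed to already hold the products $J_d \widetilde{Y}_d[k]$, and $\hat{L}$ is taken as the number of nonzero soft-thresholded eigenvalues, which is one natural reading of the text rather than a rule stated in it.

```python
import numpy as np

def average_groups(Y, P):
    """Group-average per-subcarrier measurements as in (6.41).

    Y has shape (K, N_sub, N_o) and holds J_d * Y_d[k] for k = 0..K-1.
    Returns the collected matrix of shape (N_sub, N_o * K // P).
    """
    K = Y.shape[0]
    groups = Y.reshape(K // P, P, *Y.shape[1:]).mean(axis=1)  # (N_P, N_sub, N_o)
    return np.concatenate(list(groups), axis=1)

def estimate_num_paths(Y_check, eps):
    """Soft-threshold the covariance eigenvalues, following (6.42)-(6.43)."""
    R = Y_check @ Y_check.conj().T / Y_check.shape[1]   # sample covariance R_d
    lam = np.linalg.eigvalsh(R)[::-1]                    # real eigenvalues, descending
    lam_soft = np.maximum(lam - eps, 0.0)                # closed-form solution (6.43)
    L_hat = int(np.count_nonzero(lam_soft))              # assumed rule: count survivors
    return L_hat, lam_soft
```

In use, `eps` would be tuned to the noise level, as the chapter does empirically for each SNR in Sect. 6.3.5.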

6.3.2 UL Channel Estimation Stage

6.3.2.1 Obtain Horizontal/Vertical AoDs and Delays at BS

At the UL CE stage, the BS jointly estimates the horizontal/vertical AoDs and delays for each UD. Due to the channel reciprocity in TDD systems [7–9], the UL channel matrix for the $q$th UD is given by $H^T[k] = A_{\mathrm{BS}}^{*} D[k] A_{\mathrm{UD}}^T \in \mathbb{C}^{N_{\mathrm{BS}} \times N_{\mathrm{UD}}}$, where again the user index $q$ is omitted. We employ $N_s^u = N_{\mathrm{BS}}^{\mathrm{RF}} - 1$ independent signal streams, and a low-dimensional digital $M_{\mathrm{BS}}^h \times M_{\mathrm{BS}}^v$ sub-UPA with $M_{\mathrm{BS}}^h$ and $M_{\mathrm{BS}}^v$ antennas in horizontal and vertical directions is visualized from the high-dimensional hybrid analog/digital $N_{\mathrm{BS}}^h \times N_{\mathrm{BS}}^v$ UPA at the BS. Each UD requires $N_u = \lceil N_{\mathrm{BS}}^{\mathrm{sub}} / N_s^u \rceil$ time slots with $N_{\mathrm{BS}}^{\mathrm{sub}} = M_{\mathrm{BS}}^h M_{\mathrm{BS}}^v$ to transmit the training signals, and each time slot consists of $N_o^u$ OFDM symbols. Hence, the UL CE for $Q$ UDs has a training overhead of $Q N_u N_o^u$, and the total training overhead of the proposed closed-loop sparse CE scheme is $T_{\mathrm{CE}} = N_d N_o^d + Q N_u N_o^u$. Similar to (6.37), after the frequency-domain scrambling/descrambling operation, the signal matrix $\widetilde{Y}_u[k] \in \mathbb{C}^{N_u N_s^u \times N_o^u}$ received by the BS at the $k$th subcarrier and over the $N_u$ time slots can be expressed as

$$\widetilde{Y}_u[k] = \widetilde{W}_u^H A_{\mathrm{BS}}^{*} D[k] A_{\mathrm{UD}}^T F_u S_u + \mathrm{Bdiag}(\bar{W}_u) \widetilde{N}_u[k], \tag{6.44}$$

6 In the open-loop CE schemes [28, 29], the support sets and channel gains for every subcarrier estimated at the receiver also need to be fed back to the transmitter to perform the subsequent signal processing such as beamforming design or channel equalization [1, 5]. Compared with these schemes, our proposed closed-loop CE scheme only feeds back the dominant channel parameters estimated by the BS and UD to each other, and thus its feedback overhead is almost negligible.


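where $\widetilde{W}_u = [W_u[1] \cdots W_u[N_u]] \in \mathbb{C}^{N_{\mathrm{BS}} \times N_u N_s^u}$ with $W_u[m] \in \mathbb{C}^{N_{\mathrm{BS}} \times N_s^u}$ being the UL receive combining matrix used in the $m$th time slot for $1 \le m \le N_u$, $\bar{W}_u = [W_u^H[1] \cdots W_u^H[N_u]]$, and $F_u \in \mathbb{C}^{N_{\mathrm{UD}} \times N_s^u}$ is the multi-beam transmit precoding matrix at UD, while $S_u \in \mathbb{C}^{N_s^u \times N_o^u}$ is the UL training signal matrix, and $\widetilde{N}_u[k]$ is the UL noise matrix. $\widetilde{Y}_u[k]$ is multiplied by $J_u = [I_{N_{\mathrm{BS}}^{\mathrm{sub}}} \;\; O_{N_{\mathrm{BS}}^{\mathrm{sub}} \times (N_s^u N_u - N_{\mathrm{BS}}^{\mathrm{sub}})}] \in \mathbb{R}^{N_{\mathrm{BS}}^{\mathrm{sub}} \times N_s^u N_u}$ and the result is converted into the vector $\widetilde{y}_u[k] = \mathrm{vec}\big((J_u \widetilde{Y}_u[k])^T\big)$, i.e.,

$$\widetilde{y}_u[k] = \left( \bar{A}_{\mathrm{BS}} \odot \left( A_{\mathrm{UD}}^T F_u S_u \right)^T \right) \mathrm{diag}(\alpha) \, \tau[k] + \widetilde{n}_u[k], \tag{6.45}$$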

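where we have used the identity $\mathrm{vec}(ABC) = (C^T \odot A) b$ with $B = \mathrm{diag}(b)$ [21], $\bar{A}_{\mathrm{BS}} = J_u \widetilde{W}_u^H A_{\mathrm{BS}}^{*}$, and $\widetilde{n}_u[k]$ is the corresponding noise vector. Furthermore, by collecting $\widetilde{y}_u[k] \in \mathbb{C}^{N_{\mathrm{BS}}^{\mathrm{sub}} N_o^u}$ for $0 \le k \le K-1$, we obtain the aggregated signal matrix $\widetilde{Y}_u = [\widetilde{y}_u[0] \;\; \widetilde{y}_u[1] \cdots \widetilde{y}_u[K-1]] \in \mathbb{C}^{N_{\mathrm{BS}}^{\mathrm{sub}} N_o^u \times K}$ given by

$$\widetilde{Y}_u = \left( \bar{A}_{\mathrm{BS}} \odot \left( A_{\mathrm{UD}}^T F_u S_u \right)^T \right) \mathrm{diag}(\alpha) A_{\tau}^T + \widetilde{N}_u, \tag{6.46}$$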

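where $A_{\tau} = [\tau[0] \;\; \tau[1] \cdots \tau[K-1]]^T \in \mathbb{C}^{K \times L}$, and $\widetilde{N}_u$ is the aggregated noise matrix. Recalling $\tau[k] = [e^{-j2\pi k f_s \tau_1/K} \cdots e^{-j2\pi k f_s \tau_L/K}]^T$, we have $A_{\tau} = [a_{\tau}(\mu_1^{\tau}) \cdots a_{\tau}(\mu_L^{\tau})]$, in which $a_{\tau}(\mu_l^{\tau}) = [1 \;\; e^{j\mu_l^{\tau}} \cdots e^{j(K-1)\mu_l^{\tau}}]^T \in \mathbb{C}^{K}$ with $\mu_l^{\tau} = -2\pi f_s \tau_l / K$. Observe that $A_{\tau}$ can be considered as the steering matrix associated with the delays $\{\tau_l\}_{l=1}^{L}$. Taking the vectorization of $\widetilde{Y}_u$, i.e., $\check{y}_u = \mathrm{vec}(\widetilde{Y}_u)$, leads to

$$\check{y}_u = \left( A_{\tau} \odot \bar{A}_{\mathrm{BS}} \odot \left( A_{\mathrm{UD}}^T F_u S_u \right)^T \right) \alpha + \check{n}_u, \tag{6.47}$$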

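where we have used the identity $A \odot (B \odot C) = (A \odot B) \odot C$ [21], and $\check{n}_u = \mathrm{vec}(\widetilde{N}_u)$. We further reshape $\check{y}_u \in \mathbb{C}^{K N_{\mathrm{BS}}^{\mathrm{sub}} N_o^u}$ as the matrix $\check{Y}_u = \mathrm{mat}(\check{y}_u; N_o^u, K N_{\mathrm{BS}}^{\mathrm{sub}}) \in \mathbb{C}^{N_o^u \times K N_{\mathrm{BS}}^{\mathrm{sub}}}$:

$$\check{Y}_u = \left( A_{\mathrm{UD}}^T F_u S_u \right)^T \mathrm{diag}(\alpha) \left( A_{\tau} \odot \bar{A}_{\mathrm{BS}} \right)^T + \check{N}_u, \tag{6.48}$$

where $\check{N}_u = \mathrm{mat}(\check{n}_u; N_o^u, K N_{\mathrm{BS}}^{\mathrm{sub}})$. Hence, $\bar{Y}_u = \check{Y}_u^T \in \mathbb{C}^{K N_{\mathrm{BS}}^{\mathrm{sub}} \times N_o^u}$ can be written as

$$\bar{Y}_u = \left( A_{\tau} \odot \bar{A}_{\mathrm{BS}} \right) \mathrm{diag}(\alpha) \left( A_{\mathrm{UD}}^T F_u S_u \right) + \check{N}_u^T. \tag{6.49}$$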
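The two Khatri-Rao identities used in the derivation of (6.45) to (6.49) are easy to check numerically. The sketch below is only a sanity check with random matrices; `khatri_rao` and `vec` are small helper functions written for this illustration, with $\odot$ implemented as the column-wise Kronecker product and `vec` as column-major stacking.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: column l equals kron(A[:, l], B[:, l])."""
    return np.einsum('il,jl->ijl', A, B).reshape(A.shape[0] * B.shape[0], -1)

def vec(X):
    """Column-major vectorization of a matrix."""
    return X.flatten(order='F')

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
C = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
b = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# vec(A diag(b) C) = (C^T kr A) b, the identity behind (6.45) and (6.47).
lhs = vec(A @ np.diag(b) @ C)
rhs = khatri_rao(C.T, A) @ b
identity_holds = np.allclose(lhs, rhs)

# Associativity A kr (B kr C) = (A kr B) kr C, used to obtain (6.47).
B2 = rng.standard_normal((2, 3))
C2 = rng.standard_normal((6, 3))
assoc_holds = np.allclose(khatri_rao(A, khatri_rao(B2, C2)),
                          khatri_rao(khatri_rao(A, B2), C2))
```

Both flags evaluate to `True`, which confirms that the ordering of the Khatri-Rao factors in (6.45) to (6.49) follows from vectorizing the transposed signal matrix.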

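From $\bar{A}_{\mathrm{BS}} = J_u \widetilde{W}_u^H A_{\mathrm{BS}}^{*}$, we observe that $\widetilde{W}_u$ may destroy the shift-invariance structure of $A_{\mathrm{BS}}$. Similar to downlink CE, we can design $\widetilde{W}_u$ using Algorithm 6.2 by replacing the input parameters $N_d$, $N_s^d$, $N_{\mathrm{UD}}^{\mathrm{RF}}$, $N_{\mathrm{UD}}^h$, $N_{\mathrm{UD}}^v$, $M_{\mathrm{UD}}^h$ and $M_{\mathrm{UD}}^v$ for UD with $N_u$, $N_s^u$, $N_{\mathrm{BS}}^{\mathrm{RF}}$, $N_{\mathrm{BS}}^h$, $N_{\mathrm{BS}}^v$, $M_{\mathrm{BS}}^h$ and $M_{\mathrm{BS}}^v$ for BS. By substituting the designed $\widetilde{W}_u$ into (6.49), we obtain

$$\bar{Y}_u = A_{\tau \mathrm{BS}} \bar{S}_u + \check{N}_u^T, \tag{6.50}$$

where $\bar{S}_u = \mathrm{diag}(\alpha) A_{\mathrm{UD}}^T F_u S_u$, and $A_{\tau \mathrm{BS}} \in \mathbb{C}^{K M_{\mathrm{BS}}^h M_{\mathrm{BS}}^v \times L}$ is given by

$$A_{\tau \mathrm{BS}} = A_{\tau} \odot \left( J_u \widetilde{W}_u^H \left( (A_{\mathrm{BS}}^{\nu})^{*} \odot (A_{\mathrm{BS}}^{\mu})^{*} \right) \right) = A_{\tau} \odot \bar{A}_{\mathrm{BS}}^{\nu} \odot \bar{A}_{\mathrm{BS}}^{\mu}. \tag{6.51}$$

In (6.51), $\bar{A}_{\mathrm{BS}}^{\mu} \in \mathbb{C}^{M_{\mathrm{BS}}^h \times L}$ ($\bar{A}_{\mathrm{BS}}^{\nu} \in \mathbb{C}^{M_{\mathrm{BS}}^v \times L}$) is the sub-matrix consisting of the first $M_{\mathrm{BS}}^h$ ($M_{\mathrm{BS}}^v$) rows of $(A_{\mathrm{BS}}^{\mu})^{*}$ ($(A_{\mathrm{BS}}^{\nu})^{*}$). In this way, $\widetilde{W}_u$ visualizes the high-dimensional hybrid analog/digital array as a low-dimensional digital array, and $A_{\tau \mathrm{BS}}$ holds the triple shift-invariance structure for horizontal/vertical AoDs and delays [35]. Therefore, we can apply the MDU-ESPRIT algorithm to obtain the super-resolution estimates of horizontal/vertical AoDs and delays, $\{\hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l\}_{l=1}^{\hat{L}}$.

Fig. 6.9 Comparison of beam patterns, where $N_{\mathrm{BS}}^{\mathrm{RF}} = N_{\mathrm{UD}}^{\mathrm{RF}} = 4$ and the AoAs of the 5 MPCs are known to the UD [39]: a random transmit precoding matrix with 8 × 8 UPA, b multi-beam transmit precoding matrix with 8 × 8 UPA, and c multi-beam transmit precoding matrix with 16 × 16 UPA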

6.3.2.2 Design Multi-beam Transmit Precoding Matrix at UD

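We design the transmit precoding matrix $F_u = F_{\mathrm{RF},u} F_{\mathrm{BB},u}$ at UD by exploiting the estimate $\{\hat{\theta}_l^{\mathrm{UD}}, \hat{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$ obtained at the downlink CE stage, so that the UD with limited transmit power can transmit directional multi-beam signals for improving the received SNR at the BS.

We first consider the analog transmit precoding matrix $F_{\mathrm{RF},u} \in \mathbb{C}^{N_{\mathrm{UD}} \times N_{\mathrm{UD}}^{\mathrm{RF}}}$. The estimate $\hat{A}_{\mathrm{UD}}$ of the array response matrix $A_{\mathrm{UD}}$ can be calculated given the estimated AoAs $\{\hat{\theta}_l^{\mathrm{UD}}, \hat{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$. To fully exploit the acquired $\{\hat{\theta}_l^{\mathrm{UD}}, \hat{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$, the multi-beam transmit precoding matrix should align its $\hat{L}$ beams with the $\hat{L}$ estimated AoAs. Specifically, the phase shifters of the PSN at UD can be divided into the $\hat{L}$ groups as equally as possible, depending on $\hat{L}$ and $N_{\mathrm{UD}}^{\mathrm{RF}}$.

Case I: $\hat{L} > N_{\mathrm{UD}}^{\mathrm{RF}}$. This is the case that the number of beams transmitted by UD is larger than that of RF chains. For the UD equipped with the hybrid array with the fully-connected PSN, there are $N_{\mathrm{PS}} = N_{\mathrm{UD}} N_{\mathrm{UD}}^{\mathrm{RF}}$ phase shifters. Let the number of phase shifters assigned to the $l$th group be $n_{\mathrm{ps},l}$ with $1 \le l \le \hat{L}$. We introduce the $\hat{L}$-dimensional vector $v_{\mathrm{ps}}$ as

$$v_{\mathrm{ps}} = [n_{\mathrm{ps},1} \cdots n_{\mathrm{ps},\hat{L}}]^T = n_{\mathrm{ps}} 1_{\hat{L}} + [1_{n_{\mathrm{re}}}^T \;\; 0_{\hat{L}-n_{\mathrm{re}}}^T]^T, \tag{6.52}$$

where $n_{\mathrm{ps}} = \lfloor N_{\mathrm{PS}} / \hat{L} \rfloor$ and $n_{\mathrm{re}} = \mathrm{mod}(N_{\mathrm{PS}}, \hat{L})$. By defining the index vector $p = [1 \; 2 \cdots N_{\mathrm{PS}}]^T$, the ordered index set $\mathcal{P}_l$ of the $l$th group can be obtained as $\mathcal{P}_l = p_{[\sum_{i=1}^{l-1} n_{\mathrm{ps},i}+1 \,:\, \sum_{i=1}^{l} n_{\mathrm{ps},i}]}$ with $|\mathcal{P}_l|_c = n_{\mathrm{ps},l}$. Then, we can define the vector $f = [f_1^T \cdots f_{\hat{L}}^T]^T \in \mathbb{C}^{N_{\mathrm{PS}}}$, where $f_l = \hat{A}^{*}_{\mathrm{UD}[\mathrm{mod}(\mathcal{P}_l, N_{\mathrm{UD}}), l]} \in \mathbb{C}^{n_{\mathrm{ps},l}}$ for $1 \le l \le \hat{L}$, to obtain $F_{\mathrm{RF},u} = \mathrm{mat}(f; N_{\mathrm{UD}}, N_{\mathrm{UD}}^{\mathrm{RF}})$.

Case II: $\hat{L} \le N_{\mathrm{UD}}^{\mathrm{RF}}$ and $N_{\mathrm{UD}}^{\mathrm{RF}}$ can be divided exactly by $\hat{L}$. The $N_{\mathrm{UD}}^{\mathrm{RF}}$ RF chains can be equally allocated to the $\hat{L}$ groups, and we can choose $F_{\mathrm{RF},u} = 1_{N_{\mathrm{rep}}}^T \otimes \hat{A}_{\mathrm{UD}}^{*}$, with $N_{\mathrm{rep}} = N_{\mathrm{UD}}^{\mathrm{RF}} / \hat{L}$.

Case III: $\hat{L} < N_{\mathrm{UD}}^{\mathrm{RF}}$ and $N_{\mathrm{UD}}^{\mathrm{RF}}$ cannot be divided exactly by $\hat{L}$. In this case, we have $F_{\mathrm{RF},u} = [F_{\mathrm{RF},u}^{1} \;\; F_{\mathrm{RF},u}^{2}]$, where $F_{\mathrm{RF},u}^{1} \in \mathbb{C}^{N_{\mathrm{UD}} \times \hat{L} N_{\mathrm{rep}}}$ with $N_{\mathrm{rep}} = \lfloor N_{\mathrm{UD}}^{\mathrm{RF}} / \hat{L} \rfloor$ and $F_{\mathrm{RF},u}^{2} \in \mathbb{C}^{N_{\mathrm{UD}} \times N_{\mathrm{UD,re}}^{\mathrm{RF}}}$ with $N_{\mathrm{UD,re}}^{\mathrm{RF}} = \mathrm{mod}(N_{\mathrm{UD}}^{\mathrm{RF}}, \hat{L})$. We can design $F_{\mathrm{RF},u}^{1} = 1_{N_{\mathrm{rep}}}^T \otimes \hat{A}_{\mathrm{UD}}^{*}$ similar to Case II, and we can choose $F_{\mathrm{RF},u}^{2} = \mathrm{mat}([\widetilde{f}_1^T \cdots \widetilde{f}_{\hat{L}}^T]^T; N_{\mathrm{UD}}, N_{\mathrm{UD,re}}^{\mathrm{RF}})$ similar to Case I, where $\widetilde{f}_l$ can be acquired by using $N_{\mathrm{UD,re}}^{\mathrm{RF}}$ instead of $N_{\mathrm{UD}}^{\mathrm{RF}}$ in Case I.

Due to the limited resolution of the PSN, the phase value of every element in the designed $F_{\mathrm{RF},u}$ is quantized to the nearest value in the phase set $\mathcal{A}$. As for the digital transmit precoding matrix $F_{\mathrm{BB},u} \in \mathbb{C}^{N_{\mathrm{UD}}^{\mathrm{RF}} \times N_s^u}$, we can design its element as $[F_{\mathrm{BB},u}]_{j_1,j_2} = e^{j2\pi b_{j_1,j_2}}$ with $b_{j_1,j_2} \sim \mathcal{U}[0, 1]$. Finally, we obtain the multi-beam transmit precoding matrix $F_u = F_{\mathrm{RF},u} F_{\mathrm{BB},u}$.

To intuitively compare $F_d$ of the BS designed at the downlink CE stage and $F_u$ of the UD designed at the UL CE stage, we provide the comparison of beam patterns in Fig. 6.9, where we have $N_{\mathrm{BS}}^{\mathrm{RF}} = N_{\mathrm{UD}}^{\mathrm{RF}} = 4$ RF chains at the BS and each UD, and the AoAs of the 5 MPCs are known to the UD. Specifically, Fig. 6.9a depicts the beam pattern of the random transmit precoding matrix $F_d$ for the 8 × 8 UPA, while Fig. 6.9b and c plot the beam patterns of the multi-beam transmit precoding matrices $F_u$ for the 8 × 8 and 16 × 16 UPAs, respectively.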
Compared with the beam pattern of Fig. 6.9a, the beam pattern in Fig. 6.9b has 5 mainlobes aligned with the directions of the AoAs of the 5 MPCs, which can significantly improve the SNR at receiver. Moreover, by comparing Fig. 6.9b with Fig. 6.9c, it can be observed that the sidelobes of the multi-beam signals are further suppressed when the array size increases. In a nutshell, the proposed multi-beam transmit precoding matrix design enables the UD with limited transmit power to form the directional signals with multiple beams aligned with the estimated AoAs of the MPCs for improving the UL CE performance.
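The near-equal split of phase shifters in Case I, Eq. (6.52), amounts to an integer division with the remainder spread over the first groups. The sketch below illustrates only this bookkeeping step; the function name `phase_shifter_groups` is hypothetical, and the mapping from the index sets to entries of $\hat{A}_{\mathrm{UD}}^{*}$ is left out.

```python
import numpy as np

def phase_shifter_groups(n_ud, n_rf, n_beams):
    """Split the N_PS = n_ud * n_rf phase shifters into n_beams groups, as in (6.52).

    Returns the group sizes v_ps and the ordered 1-based index sets P_l.
    """
    n_ps_total = n_ud * n_rf
    n_ps, n_re = divmod(n_ps_total, n_beams)   # floor(N_PS / L) and mod(N_PS, L)
    sizes = np.full(n_beams, n_ps)
    sizes[:n_re] += 1                           # first n_re groups get one extra shifter
    bounds = np.concatenate(([0], np.cumsum(sizes)))
    p = np.arange(1, n_ps_total + 1)            # index vector p = [1, 2, ..., N_PS]
    index_sets = [p[bounds[l]:bounds[l + 1]] for l in range(n_beams)]
    return sizes, index_sets

# Example: a 12 x 12 UPA (144 antennas), 4 RF chains and 5 beams (Case I).
sizes, index_sets = phase_shifter_groups(n_ud=144, n_rf=4, n_beams=5)
# sizes -> [116, 115, 115, 115, 115]
```

With 576 phase shifters and 5 beams, one group receives 116 shifters and the other four receive 115, so every shifter is used exactly once.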


6.3.3 MDU-ESPRIT Algorithm

Because the aggregated receive combining matrix $\widetilde{W}_d$ ($\widetilde{W}_u$) designed at the UD (BS) reconstructs the double (triple) shift-invariance structure of the array response, spectrum estimation techniques can be utilized to estimate the channel parameters. We consider the $R$-dimensional ($R$-D) unitary ESPRIT algorithm with $R \ge 2$. Without loss of generality, we define a general signal transmission model for the channel consisting of $L$ MPCs and $R$ sets of spatial frequencies as

$$Y = AS + N, \tag{6.53}$$

where $Y \in \mathbb{C}^{M \times N}$ is the received data matrix aggregated over $N$ snapshots, $M = \prod_{r=1}^{R} M_r$, with $M_r$ being the dimension of the parameter vector associated with the $r$th spatial frequency for $1 \le r \le R$, and $S \in \mathbb{C}^{L \times N}$ and $N \in \mathbb{C}^{M \times N}$ are the transmit signal and noise matrices, respectively, while the array response matrix $A \in \mathbb{C}^{M \times L}$ is given by

$$A = A_{\mu^R} \odot \cdots \odot A_{\mu^2} \odot A_{\mu^1} = \left[ a(\mu_1^1, \mu_1^2, \ldots, \mu_1^R) \cdots a(\mu_L^1, \mu_L^2, \ldots, \mu_L^R) \right]. \tag{6.54}$$

In (6.54), $A_{\mu^r} = [a(\mu_1^r) \cdots a(\mu_L^r)] \in \mathbb{C}^{M_r \times L}$ is the steering matrix related to the $r$th set of spatial frequencies $\{\mu_l^r\}_{l=1}^{L}$, with $a(\mu_l^r) = [1 \;\; e^{j\mu_l^r} \cdots e^{j(M_r-1)\mu_l^r}]^T \in \mathbb{C}^{M_r}$ being the $l$th steering vector, while the array response vector $a(\mu_l^1, \mu_l^2, \ldots, \mu_l^R) \in \mathbb{C}^{M}$ related to the $l$th MPC is given by

$$a(\mu_l^1, \mu_l^2, \ldots, \mu_l^R) = a(\mu_l^R) \otimes \cdots \otimes a(\mu_l^2) \otimes a(\mu_l^1). \tag{6.55}$$

The MDU-ESPRIT algorithm, which acquires the super-resolution estimates of the $R$ sets of spatial frequencies from (6.53), denoted by $\{\hat{\mu}_l^r\}_{l=1}^{L}$ for $1 \le r \le R$, consists of the following five steps.

R-D Spatial Smoothing Preprocessing (SSPP): In order to take into account the insufficient measurement dimension $N$ caused by the limited training overhead, we exploit the spatial smoothing technique [18] to preprocess the original data matrix $Y$ of (6.53). This preprocessing can mitigate the influence of other coherent signals and avoid the rank deficiency of the covariance matrix of $S$ to enhance robustness. Specifically, we first define the $R$ spatial smoothing parameters $\{G_r\}_{r=1}^{R}$ with $1 \le G_r \le M_r$, and obtain the sub-dimensions $\{M_r^{\mathrm{sub}}\}_{r=1}^{R}$ corresponding to $\{M_r\}_{r=1}^{R}$ as $M_r^{\mathrm{sub}} = M_r - G_r + 1$ for $1 \le r \le R$. Thus the size of the total sub-dimension is $M_{\mathrm{sub}} = \prod_{r=1}^{R} M_r^{\mathrm{sub}}$. To obtain the $R$-D selection matrix, we next define the $g_r$th '1-D' selection matrix as $J^{(g_r)} = [O_{M_r^{\mathrm{sub}} \times (g_r - 1)} \;\; I_{M_r^{\mathrm{sub}}} \;\; O_{M_r^{\mathrm{sub}} \times (G_r - g_r)}] \in \mathbb{R}^{M_r^{\mathrm{sub}} \times M_r}$ for $1 \le g_r \le G_r$. Then, we can obtain $G = \prod_{r=1}^{R} G_r$ '$R$-D' selection matrices, with the $(g_1, g_2, \ldots, g_R)$th '$R$-D' selection matrix given by $J_{g_1,g_2,\ldots,g_R} = J^{(g_R)} \otimes \cdots \otimes J^{(g_2)} \otimes J^{(g_1)} \in \mathbb{R}^{M_{\mathrm{sub}} \times M}$. By applying these $R$-D selection matrices to $Y$, the smoothed complex-valued data matrix $\bar{Y} \in \mathbb{C}^{M_{\mathrm{sub}} \times NG}$ is obtained as


$$\bar{Y} = \left[ J_{1,1,\ldots,1,1} Y \;\cdots\; J_{1,1,\ldots,1,G_R} Y \;\; J_{1,1,\ldots,2,1} Y \;\; J_{1,1,\ldots,2,2} Y \;\cdots\; J_{G_1,G_2,\ldots,G_{R-1},G_R} Y \right]. \tag{6.56}$$
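The $R$-D spatial smoothing of (6.56) can be sketched with Kronecker products of the 1-D selection matrices. The helper names `selection_1d` and `spatial_smoothing` are hypothetical, and the rows of the input are assumed to be ordered as in (6.55), i.e., with the first dimension varying fastest.

```python
import itertools
import numpy as np

def selection_1d(m, g_total, g):
    """1-D selection matrix J^(g) = [O, I, O] of size (m - g_total + 1) x m (g is 1-based)."""
    m_sub = m - g_total + 1
    J = np.zeros((m_sub, m))
    J[:, g - 1:g - 1 + m_sub] = np.eye(m_sub)
    return J

def spatial_smoothing(Y, dims, smooth):
    """Stack J_{g_1,...,g_R} Y side by side in the block order of (6.56).

    dims = (M_1, ..., M_R), smooth = (G_1, ..., G_R); Y has prod(dims) rows.
    """
    blocks = []
    # itertools.product varies the last index g_R fastest, matching (6.56).
    for g in itertools.product(*(range(1, G + 1) for G in smooth)):
        J = np.ones((1, 1))
        for M_r, G_r, g_r in zip(dims, smooth, g):
            J = np.kron(selection_1d(M_r, G_r, g_r), J)  # J^(g_R) x ... x J^(g_1)
        blocks.append(J @ Y)
    return np.hstack(blocks)
```

For example, with `dims=(4, 3)` and `smooth=(2, 2)`, an input with 12 rows and 5 snapshots becomes a smoothed matrix with 6 rows and 20 columns, trading array aperture for additional snapshots.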

Real-Valued Processing (RVP): To reduce the computational complexity, the forward backward averaging technique [19] is utilized to transform $\bar{Y}$ into the real-valued matrix $\bar{Y}_{\mathrm{re}} \in \mathbb{R}^{M_{\mathrm{sub}} \times 2NG}$

$$\bar{Y}_{\mathrm{re}} = Q_{M_{\mathrm{sub}}}^H \left[ \bar{Y} \;\; \Pi_{M_{\mathrm{sub}}} \bar{Y}^{*} \Pi_{NG} \right] Q_{2NG}, \tag{6.57}$$

where $\Pi_n$ is the exchange matrix of size $n \times n$ that reverses the row order of $I_n$, and $Q_n \in \mathbb{C}^{n \times n}$ is a sparse unitary matrix satisfying $\Pi_n Q_n^{*} = Q_n$.

Signal Subspace Approximation (SSA): To extract the information of spatial frequencies from the real-valued matrix $\bar{Y}_{\mathrm{re}}$, we introduce the transform steering matrix $K$ which satisfies [19]

$$\mathrm{Re}\left\{ Q_{m_r}^H \widetilde{J}_r Q_{M_{\mathrm{sub}}} \right\} K \Omega_r = \mathrm{Im}\left\{ Q_{m_r}^H \widetilde{J}_r Q_{M_{\mathrm{sub}}} \right\} K, \tag{6.58}$$

where $1 \le r \le R$, $m_r = M_{\mathrm{sub}} (M_r^{\mathrm{sub}} - 1) / M_r^{\mathrm{sub}}$, and $\Omega_r = \mathrm{diag}\big( [\tan(\mu_1^r/2) \cdots \tan(\mu_L^r/2)]^T \big)$ is the real-valued diagonal matrix involving the desired spatial frequencies $\{\mu_l^r\}_{l=1}^{L}$, while $\widetilde{J}_r = I_{\prod_{i=r+1}^{R} M_i^{\mathrm{sub}}} \otimes \widetilde{J}^{(r)} \otimes I_{\prod_{i=1}^{r-1} M_i^{\mathrm{sub}}} \in \mathbb{R}^{m_r \times M_{\mathrm{sub}}}$, with $\widetilde{J}^{(r)} = [0_{M_r^{\mathrm{sub}}-1} \;\; I_{M_r^{\mathrm{sub}}-1}]$. Note that $K$ is related to the approximate signal subspace matrix $E_s \in \mathbb{R}^{M_{\mathrm{sub}} \times L}$ corresponding to the underlying signal subspace. Specifically, since the columns of $K$ and $E_s$ span the same $L$-dimensional signal subspace [18, 35], $K = E_s T$, where $T \in \mathbb{R}^{L \times L}$ is a non-singular matrix. To determine $E_s$, we take the left singular vectors corresponding to the largest $L$ singular values of $\bar{Y}_{\mathrm{re}}$ as $E_s$. Specifically, from the real-valued partial singular value decomposition (SVD) of $\bar{Y}_{\mathrm{re}} = U_{\mathrm{re}} \Sigma_{\mathrm{re}} V_{\mathrm{re}}^H$, we have $E_s = U_{\mathrm{re}[:,1:L]}$.

Shift-Invariance Equation Solving (SIES): Based on the acquired approximate signal subspace $E_s$, we use $K = E_s T$ in (6.58) to obtain the $R$ shift-invariance equations

$$\mathrm{Re}\left\{ Q_{m_r}^H \widetilde{J}_r Q_{M_{\mathrm{sub}}} \right\} E_s \Psi_r = \mathrm{Im}\left\{ Q_{m_r}^H \widetilde{J}_r Q_{M_{\mathrm{sub}}} \right\} E_s, \tag{6.59}$$

where $\Psi_r = T \Omega_r T^{-1} \in \mathbb{R}^{L \times L}$ for $1 \le r \le R$. To estimate the diagonal matrices $\{\Omega_r\}_{r=1}^{R}$, we first obtain the estimates of the $R$ real-valued matrices $\{\Psi_r\}_{r=1}^{R}$, denoted as $\{\hat{\Psi}_r\}_{r=1}^{R}$, by applying the LS or total least squares (TLS) estimator to solve the $R$ shift-invariance equations of (6.59).

R-D Joint Diagonalization (JD): From the estimated $\hat{\Psi}_r = T \hat{\Omega}_r T^{-1}$ with $\hat{\Omega}_r$ denoting the estimate of $\Omega_r$ for $1 \le r \le R$, we exploit the following $R$-D joint diagonalization to obtain the paired estimates of the spatial frequencies $\{\hat{\mu}_l^r\}_{l=1}^{L}$ from $\{\hat{\Psi}_r\}_{r=1}^{R}$. Specifically, we consider the two cases of $R = 2$ and $R \ge 3$. For $R = 2$, namely, the 2-D case, since $\hat{\Psi}_1$ and $\hat{\Psi}_2$ share the same eigenvector matrix $T$, we can calculate the EVD of the complex-valued matrix $\hat{\Psi} = \hat{\Psi}_1 + j\hat{\Psi}_2$ to obtain $\hat{\Omega}_1$ and $\hat{\Omega}_2$, specifically, $\hat{\Psi} = T \hat{\Omega} T^{-1}$ with $\hat{\Omega} = \hat{\Omega}_1 + j\hat{\Omega}_2$, and $\hat{\Omega}_1 = \mathrm{Re}\{\hat{\Omega}\}$ and $\hat{\Omega}_2 = \mathrm{Im}\{\hat{\Omega}\}$. For $R \ge 3$, the noise-corrupted matrices $\{\hat{\Psi}_r\}_{r=1}^{R}$ do not always exactly share the same $T$. Hence, we exploit the simultaneous Schur decomposition (SSD) algorithm [34], which is developed from the real Schur decomposition [21] for multi-parameter estimation and pairing. By utilizing the SSD algorithm, we obtain the $R$ approximate upper-triangular matrices $\{\Gamma_r\}_{r=1}^{R}$ so that $\{\hat{\Omega}_r\}_{r=1}^{R}$ are acquired as the main diagonal elements of $\{\Gamma_r\}_{r=1}^{R}$, i.e., $\hat{\Omega}_r = \mathrm{diag}(\mathrm{vdiag}(\Gamma_r))$, for $1 \le r \le R$. Finally, the $R$ paired super-resolution estimates of the spatial frequencies $\{\hat{\mu}_l^r\}_{l=1}^{L}$ can be calculated from $\hat{\Omega}_r$ as $\hat{\mu}_l^r = 2 \arctan\big( [\hat{\Omega}_r]_{l,l} \big)$ for $1 \le l \le L$ and $1 \le r \le R$.

This MDU-ESPRIT algorithm is summarized in Algorithm 6.3 [39]. At the downlink CE stage, we estimate $\{\hat{\mu}_l^{\mathrm{UD}}, \hat{\nu}_l^{\mathrm{UD}}\}_{l=1}^{L}$ based on $\bar{Y}_d$ of (6.38) by applying the 2-D ($R = 2$) unitary ESPRIT algorithm. Furthermore, based on $\bar{Y}_u$ of (6.50), $\{\hat{\mu}_l^{\mathrm{BS}}, \hat{\nu}_l^{\mathrm{BS}}, \hat{\mu}_l^{\tau}\}_{l=1}^{L}$ are estimated using the 3-D ($R = 3$) unitary ESPRIT algorithm at the UL CE stage. Hence, the spatial smoothing parameters for Algorithm 6.3 in the 2-D and 3-D cases are $\{G_1^d, G_2^d\}$ and $\{G_1^u, G_2^u, G_3^u\}$, respectively. The corresponding total sub-dimension sizes at the UD and BS are $M_{\mathrm{sub}}^{\mathrm{UD}} = (M_{\mathrm{UD}}^h - G_1^d + 1)(M_{\mathrm{UD}}^v - G_2^d + 1)$ and $M_{\mathrm{sub}}^{\mathrm{BS}} = (M_{\mathrm{BS}}^h - G_1^u + 1)(M_{\mathrm{BS}}^v - G_2^u + 1)(K - G_3^u + 1)$, respectively.
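The automatic pairing in the 2-D joint diagonalization step can be illustrated numerically. The sketch below assumes noise-free matrices that share exactly the same eigenvector matrix, and the function name `pair_2d` is hypothetical. It forms the complex sum of the two shift-invariance matrices, takes its eigenvalues, and maps the real and imaginary parts back to spatial frequencies with $2\arctan(\cdot)$.

```python
import numpy as np

def pair_2d(psi1, psi2):
    """Paired spatial-frequency estimates from the EVD of psi1 + j*psi2 (2-D case)."""
    eigvals = np.linalg.eigvals(psi1 + 1j * psi2)  # diagonal of Omega_1 + j*Omega_2
    mu_hat = 2 * np.arctan(eigvals.real)           # first set of spatial frequencies
    nu_hat = 2 * np.arctan(eigvals.imag)           # second set, paired by construction
    return mu_hat, nu_hat

# Synthetic check with L = 3 paths and a random non-singular T (illustrative values).
rng = np.random.default_rng(0)
mu = np.array([0.4, -1.1, 2.0])
nu = np.array([-0.7, 0.9, 0.3])
T = rng.standard_normal((3, 3))
T_inv = np.linalg.inv(T)
psi1 = T @ np.diag(np.tan(mu / 2)) @ T_inv
psi2 = T @ np.diag(np.tan(nu / 2)) @ T_inv
mu_hat, nu_hat = pair_2d(psi1, psi2)
```

The eigenvalues come back in arbitrary order, but each real part stays attached to its own imaginary part, so no separate pairing search is needed for $R = 2$.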
Algorithm 6.3 MDU-ESPRIT Algorithm
Input: Data matrix $Y$, number of MPCs $L$, sub-dimensions $\{M_r\}_{r=1}^{R}$, spatial smoothing parameters $\{G_r\}_{r=1}^{R}$
Output: Super-resolution estimates of spatial frequencies $\{\hat{\mu}_l^r\}_{l=1}^{L}$, $1 \le r \le R$
1: Obtain smoothed data matrix $\bar{Y}$ (6.56) using R-D spatial smoothing preprocessing
2: Obtain real-valued data matrix $\bar{Y}_{\mathrm{re}}$ (6.57) using forward backward averaging
3: Determine approximate signal subspace matrix $E_s$ through SVD
4: Solve shift-invariance equations (6.59) to obtain $R$ real-valued matrices $\{\hat{\Psi}_r\}_{r=1}^{R}$
5: Perform R-D joint diagonalization to estimate diagonal matrices $\{\hat{\Omega}_r\}_{r=1}^{R}$: i) $R = 2$, calculate EVD of $\hat{\Psi} = \hat{\Psi}_1 + j\hat{\Psi}_2 = T \hat{\Omega} T^{-1}$ to obtain $\hat{\Omega}_1 = \mathrm{Re}\{\hat{\Omega}\}$ and $\hat{\Omega}_2 = \mathrm{Im}\{\hat{\Omega}\}$; ii) $R \ge 3$, obtain $R$ diagonal matrices $\{\hat{\Omega}_r\}_{r=1}^{R}$ via SSD algorithm
6: Extract $R$ paired $\{\hat{\mu}_l^r\}_{l=1}^{L}$ from $\{\hat{\Omega}_r\}_{r=1}^{R}$
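Step 2 of Algorithm 6.3 can be sketched as follows. The construction of $Q_n$ below is the standard sparse left-$\Pi$-real unitary matrix from the unitary ESPRIT literature, which is assumed here since the chapter does not spell it out; the function names `unitary_q` and `real_valued_transform` are hypothetical.

```python
import numpy as np

def unitary_q(n):
    """Sparse unitary matrix Q_n satisfying Pi_n conj(Q_n) = Q_n (assumed standard form)."""
    h = n // 2
    I = np.eye(h)
    Pi = np.fliplr(I)
    if n % 2 == 0:
        Q = np.block([[I, 1j * I], [Pi, -1j * Pi]])
    else:
        z = np.zeros((h, 1))
        Q = np.block([[I, z, 1j * I],
                      [z.T, np.array([[np.sqrt(2)]]), z.T],
                      [Pi, z, -1j * Pi]])
    return Q / np.sqrt(2)

def real_valued_transform(Y):
    """Forward backward averaging of (6.57): returns the real matrix Q^H [Y, Pi Y* Pi] Q."""
    M, N = Y.shape
    Pi_M = np.fliplr(np.eye(M))
    Pi_N = np.fliplr(np.eye(N))
    Z = np.hstack([Y, Pi_M @ Y.conj() @ Pi_N])    # centro-Hermitian extension, M x 2N
    T = unitary_q(M).conj().T @ Z @ unitary_q(2 * N)
    return T.real                                  # imaginary part vanishes up to round-off
```

The extended matrix is centro-Hermitian, so the two-sided transform by left-$\Pi$-real matrices yields a real result and the later SVD can run in real arithmetic.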

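6.3.4 ML Pairing and Path Gains Estimation

At the downlink CE stage, the BS obtains the estimated horizontal/vertical AoAs $\{\bar{\theta}_l^{\mathrm{UD}}, \bar{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$ fed back by the UD. It then estimates the horizontal/vertical AoDs and delays $\{\hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l\}_{l=1}^{\hat{L}}$ at the UL CE stage. Since $\{\bar{\theta}_l^{\mathrm{UD}}, \bar{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$ and $\{\hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l\}_{l=1}^{\hat{L}}$ are acquired at the two different ends of the channel at two different stages, it is necessary to pair them. Furthermore, the path gain vector $\alpha$ needs to be estimated. We propose to apply an ML approach to pair the channel parameters and to estimate the path gains, which corresponds to Step 7 of Fig. 6.7 at the BS.

Specifically, according to (6.51), we construct the equivalent steering matrix associated with $\{\hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l\}_{l=1}^{\hat{L}}$ as $\hat{A}_{\tau \mathrm{BS}}$, where $\{\hat{\tau}_l\}_{l=1}^{\hat{L}}$ are arranged in ascending order. Based on the estimated horizontal/vertical AoAs fed back to the BS, $\{\bar{\theta}_l^{\mathrm{UD}}, \bar{\varphi}_l^{\mathrm{UD}}\}_{l=1}^{\hat{L}}$, we can reconstruct the estimated multi-beam transmit precoding matrix, denoted by $\hat{F}_u$, similar to the construction of $F_u$ given in Sect. 6.3.2.2. Clearly, there are a total of $J_c = \hat{L}!$ possible ordered combinations or pairs $\{\bar{\theta}_{l_j}^{\mathrm{UD}}, \bar{\varphi}_{l_j}^{\mathrm{UD}}\}_{l_j=1}^{\hat{L}}$ that can pair with $\hat{A}_{\tau \mathrm{BS}}$ or $\{\hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l\}_{l=1}^{\hat{L}}$, where $j \in \mathcal{J}$ and the size of the ordered set $\mathcal{J}$ is $|\mathcal{J}|_c = J_c$. For each $\{\bar{\theta}_{l_j}^{\mathrm{UD}}, \bar{\varphi}_{l_j}^{\mathrm{UD}}\}_{l_j=1}^{\hat{L}}$, we can establish the corresponding array response matrix $A_{\mathrm{UD}}$, which is denoted as $\hat{A}_{\mathrm{UD},j}$. Thus, for each pair of the AoDs and AoAs, we have $\hat{A}_{\tau \mathrm{BS}}$, $\hat{A}_{\mathrm{UD},j}$ and $\hat{F}_u$. Substituting them into (6.47) yields $\check{y}_u = \hat{A}_j \alpha_j + \check{n}_u$, where $\hat{A}_j = \hat{A}_{\tau \mathrm{BS}} \odot \big( \hat{A}_{\mathrm{UD},j}^T \hat{F}_u S_u \big)^T$, while $\alpha_j$ is the path gain vector corresponding to the $j$th pair of the AoDs and AoAs with $j \in \mathcal{J}$. The LS estimate of $\alpha_j$ is readily given as

$$\hat{\alpha}_j = \left( \hat{A}_j^H \hat{A}_j \right)^{-1} \hat{A}_j^H \check{y}_u. \tag{6.60}$$

From the estimate $\hat{\alpha}_j$, we can estimate $\check{y}_u$ according to $\hat{\check{y}}_{u,j} = \hat{A}_j \hat{\alpha}_j$ with the residual $\| \check{y}_u - \hat{\check{y}}_{u,j} \|_2^2$. We can then find the optimal pair index $j^{\star}$ by solving the following optimization problem

$$j^{\star} = \arg\min_{j \in \mathcal{J}} \left\| \check{y}_u - \hat{\check{y}}_{u,j} \right\|_2^2. \tag{6.61}$$

Hence, the optimal estimate of the path gains is given by $\hat{\alpha} = \hat{\alpha}_{j^{\star}} = \beta [\hat{\alpha}_1 \cdots \hat{\alpha}_{\hat{L}}]^T$ and we have the optimal ordered mmWave channel parameter estimate $\{\bar{\theta}_l^{\mathrm{UD}}, \bar{\varphi}_l^{\mathrm{UD}}, \hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l, \hat{\alpha}_l\}_{l=1}^{\hat{L}}$.

By substituting $\{\bar{\theta}_l^{\mathrm{UD}}, \bar{\varphi}_l^{\mathrm{UD}}, \hat{\theta}_l^{\mathrm{BS}}, \hat{\varphi}_l^{\mathrm{BS}}, \hat{\tau}_l, \hat{\alpha}_l\}_{l=1}^{\hat{L}}$ into (6.34), we obtain the optimally estimated frequency-domain channel matrix $\hat{H}[k]$ at the $k$th subcarrier as

$$\hat{H}[k] = \beta \sum_{l=1}^{\hat{L}} \hat{\alpha}_l \, a_{\mathrm{UD}}\big( \bar{\mu}_l^{\mathrm{UD}}, \bar{\nu}_l^{\mathrm{UD}} \big) \, a_{\mathrm{BS}}^H\big( \hat{\mu}_l^{\mathrm{BS}}, \hat{\nu}_l^{\mathrm{BS}} \big) \, e^{-j2\pi k f_s \hat{\tau}_l / K}, \tag{6.62}$$

where $\bar{\mu}_l^{\mathrm{UD}} = \pi \sin(\bar{\theta}_l^{\mathrm{UD}}) \cos(\bar{\varphi}_l^{\mathrm{UD}})$ and $\bar{\nu}_l^{\mathrm{UD}} = \pi \sin(\bar{\varphi}_l^{\mathrm{UD}})$, while $\hat{\mu}_l^{\mathrm{BS}} = \pi \sin(\hat{\theta}_l^{\mathrm{BS}}) \cos(\hat{\varphi}_l^{\mathrm{BS}})$ and $\hat{\nu}_l^{\mathrm{BS}} = \pi \sin(\hat{\varphi}_l^{\mathrm{BS}})$.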
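The exhaustive pairing search of (6.60) and (6.61) can be sketched as a loop over all column orderings. This is a simplified illustration, not the chapter's full procedure: the matrix `B_cols` stands in for $(\hat{A}_{\mathrm{UD}}^T \hat{F}_u S_u)^T$ with one column per candidate AoA pair, reordering the AoAs is modeled as permuting its columns, and the names `khatri_rao` and `ml_pairing` are hypothetical.

```python
import itertools
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: column l equals kron(A[:, l], B[:, l])."""
    return np.einsum('il,jl->ijl', A, B).reshape(A.shape[0] * B.shape[0], -1)

def ml_pairing(y, A_tau_bs, B_cols):
    """Try every ordering of the AoA-side columns and keep the best LS fit.

    y: observation vector; A_tau_bs: (M, L) steering matrix with fixed column order;
    B_cols: (N_o, L) AoA-dependent factor whose column order is unknown.
    Returns (best residual, best permutation, LS path gains).
    """
    best = None
    for perm in itertools.permutations(range(B_cols.shape[1])):
        A_j = khatri_rao(A_tau_bs, B_cols[:, list(perm)])
        alpha_j, *_ = np.linalg.lstsq(A_j, y, rcond=None)   # LS estimate (6.60)
        resid = np.linalg.norm(y - A_j @ alpha_j) ** 2       # residual in (6.61)
        if best is None or resid < best[0]:
            best = (resid, perm, alpha_j)
    return best
```

The search visits $\hat{L}!$ candidates, which is affordable only because mmWave channels have few paths; this matches the $\hat{L}!$ factor listed for Step 7 in Table 6.1.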


6.3.5 Performance Evaluation

An extensive simulation investigation is carried out to evaluate the CE performance and computational complexity of the proposed closed-loop CE scheme. In simulations, the carrier frequency is $f_c = 30$ GHz with the bandwidth $f_s = 200$ MHz, the numbers of RF chains at BS and UD are both 4, i.e., $N_{\mathrm{BS}}^{\mathrm{RF}} = N_{\mathrm{UD}}^{\mathrm{RF}} = 4$, and the numbers of horizontal and vertical antennas at BS and UD are both 12, i.e., $N_{\mathrm{BS}}^h = N_{\mathrm{BS}}^v = N_{\mathrm{UD}}^h = N_{\mathrm{UD}}^v = 12$, and the quantization accuracy of the PSN is defined by $N_q^{\mathrm{ps}} = 3$ bits, while the feedback quantization accuracy for AoAs is specified by $N_q^{\mathrm{ang}} = 10$ bits. Without loss of generality, the case of a single UD, $Q = 1$, is considered. From Fig. 6.8, it is clear that for the generic case of $Q > 1$, the UL training overhead becomes $Q N_u N_o^u$ instead of $N_u N_o^u$ for the case of $Q = 1$. The channel model is simulated as follows. Each of the path gains $\{\alpha_l\}_{l=1}^{L}$ is generated according to $\mathcal{CN}(0, 1)$, while the other channel parameters $\{\tau_l, \theta_l^{\mathrm{UD}}, \theta_l^{\mathrm{BS}}, \varphi_l^{\mathrm{UD}}, \varphi_l^{\mathrm{BS}}\}_{l=1}^{L}$ all follow uniform distributions, specifically, $\tau_l \sim \mathcal{U}[0, \tau_{\max}]$ and $\theta_l^{\mathrm{UD}}, \theta_l^{\mathrm{BS}}, \varphi_l^{\mathrm{UD}}, \varphi_l^{\mathrm{BS}} \sim \mathcal{U}[-\pi/3, \pi/3]$ for the $l$th MPC. The maximum multipath delay is set to $\tau_{\max} = 16 T_s$, i.e., $N_c = 16$. The number of subcarriers is set to $K = 128$ with the length of CP being 32, and perfect frame synchronization is assumed. In our proposed solution, the sizes of the low-dimensional digital sub-arrays visualized from the high-dimensional hybrid arrays are set to $M_{\mathrm{BS}}^h = M_{\mathrm{BS}}^v = M_{\mathrm{UD}}^h = M_{\mathrm{UD}}^v = 8$. Hence, the numbers of downlink and UL training time slots are $N_d = 22$ and $N_u = 22$, respectively, given the number of downlink independent signal streams $N_s^d = N_{\mathrm{UD}}^{\mathrm{RF}} - 1 = 3$ and the number of UL independent signal streams $N_s^u = N_{\mathrm{BS}}^{\mathrm{RF}} - 1 = 3$.

Additionally, $P = 8$ adjacent subcarriers are jointly employed to estimate the number of MPCs, with the threshold parameter $\varepsilon$ empirically set to 1.54, 0.50, 0.16, 0.05, 0.016, and 0.005, respectively, at the SNR of −15 dB, −10 dB, −5 dB, 0 dB, 5 dB, and 10 dB. The spatial smoothing parameters used for Algorithm 6.3 are $G_1^d = G_2^d = G_1^u = G_2^u = 2$ and $G_3^u = K/2$. The downlink and UL SNRs are both defined as $\rho \sigma_{\alpha}^2 / \sigma_n^2$, where $\rho$ and $\sigma_n^2$ are the transmit power and receiver noise variance, respectively. The state-of-the-art OMP-based frequency-domain scheme [28]⁷ and the SW-OMP-based scheme [29] are adopted as two benchmarks. In order to be consistent with [28, 29], their digital transmit precoding/receive combining matrices are taken as the identity matrix, while the design of the analog counterparts is similar to the construction of $F_{\mathrm{RF},d}[m]$ given in Sect. 6.3.1.2. The sizes of the quantized angle-domain grids associated with horizontal/vertical AoAs/AoDs, denoted by $G_{\mathrm{BS}}^h$, $G_{\mathrm{BS}}^v$, $G_{\mathrm{UD}}^h$ and $G_{\mathrm{UD}}^v$, are set to twice the numbers of antennas in the horizontal and vertical directions of the UPA, respectively, according to [28, 29], i.e., $G_{\mathrm{BS}} = G_{\mathrm{BS}}^h \times G_{\mathrm{BS}}^v = 2N_{\mathrm{BS}}^h \times 2N_{\mathrm{BS}}^v = 24 \times 24$ and $G_{\mathrm{UD}} = G_{\mathrm{UD}}^h \times G_{\mathrm{UD}}^v = 2N_{\mathrm{UD}}^h \times 2N_{\mathrm{UD}}^v = 24 \times 24$. Furthermore, all CE schemes adopt the same training overhead, which is equal to the required number of training frames [28, 29], to ensure the fairness of comparison.

7 The redundant dictionary of the OMP-based time-domain method in [28] is generated by the quantized grids in the delay and angle domains, which imposes unaffordable computational complexity and storage requirements. Hence, we only consider the frequency-domain scheme in simulations.

6.3.5.1 Channel Estimation Performance Evaluation

First the CE performance is evaluated using the normalized mean square error (NMSE) metric given by

$$\mathrm{NMSE} = \mathbb{E}\left( \frac{\sum_{k=0}^{K-1} \left\| H[k] - \hat{H}[k] \right\|_F^2}{\sum_{k=0}^{K-1} \left\| H[k] \right\|_F^2} \right). \tag{6.63}$$
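The NMSE metric of (6.63) translates directly into a few lines of NumPy. In this sketch the expectation is approximated by an average over Monte Carlo trials, which is an assumption about how the simulations are run; the function names `nmse` and `nmse_db` are hypothetical.

```python
import numpy as np

def nmse(H_true, H_est):
    """Per-realization ratio inside the expectation of (6.63).

    H_true and H_est have shape (K, N_UD, N_BS): one channel matrix per subcarrier.
    """
    err = np.sum(np.abs(H_true - H_est) ** 2)   # sum over k of squared Frobenius errors
    ref = np.sum(np.abs(H_true) ** 2)           # sum over k of squared Frobenius norms
    return err / ref

def nmse_db(trials):
    """Average the ratio over (H_true, H_est) pairs and express it in dB."""
    return 10 * np.log10(np.mean([nmse(h, h_hat) for h, h_hat in trials]))
```

A perfect estimate gives a ratio of 0, and an all-zero estimate gives a ratio of 1, i.e., 0 dB.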

Figure 6.10 compares the NMSE performance of the proposed closed-loop scheme with those of the OMP and SW-OMP based schemes for different SNRs, given the numbers of MPCs $L = 3$ and $L = 5$. For our closed-loop scheme, the numbers of OFDM symbols in each downlink time slot and UL time slot are $N_o^d = N_o^u = 3$. Therefore, the training overhead of our closed-loop scheme is $T_{\mathrm{CE}} = N_d N_o^d + N_u N_o^u = 132$. In Fig. 6.10, the NMSE curve labeled as 'Proposed Close-Loop' is our proposed closed-loop CE scheme, which also estimates the number of MPCs $L$, while the curve labeled with 'Benchmark of Closed-Loop' is the closed-loop scheme given perfect knowledge of $L$, which provides a lower bound on the NMSE. It can be seen that the CE accuracy achieved by our closed-loop scheme with no knowledge of $L$ is very close to this lower bound, which demonstrates the super-resolution accuracy of our solution. Additionally, our closed-loop CE scheme adopting the random transmit precoding matrix $F_u$ is also illustrated in Fig. 6.10, where it is observed to suffer from around 5 dB and 3 dB performance losses in the cases of $L = 3$ and 5, respectively. This clearly demonstrates the effectiveness of the proposed multi-beam transmit precoding matrix design, which fully exploits the estimated horizontal/vertical AoAs to optimize the received SNR for improving CE performance. Furthermore, the results of Fig. 6.10 show that our proposed closed-loop CE scheme dramatically outperforms the two CS-based schemes in terms of CE accuracy. In particular, the OMP and SW-OMP based schemes seem to suffer from an NMSE floor at high SNR. By adopting larger discrete angle-domain grids to achieve a larger quantized CS dictionary, the performance of these CS-based schemes can be improved [28, 29] at the expense of significantly increased computational complexity, which becomes unaffordable for FD-MIMO systems with massive antenna arrays.
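The training overhead figures quoted in this section follow from the slot counts $N_d$ and $N_u$ and the formula $T_{\mathrm{CE}} = N_d N_o^d + Q N_u N_o^u$. The short sketch below reproduces the arithmetic; it assumes that the downlink slot count is $N_d = \lceil N_{\mathrm{UD}}^{\mathrm{sub}} / N_s^d \rceil$ by analogy with the UL definition of $N_u$, and that the same sub-array size is used at both ends, as in the simulation setup. The function name `training_overhead` is hypothetical.

```python
import math

def training_overhead(m_h, m_v, n_rf_ud, n_rf_bs, n_o_dl, n_o_ul, num_users=1):
    """Return (N_d, N_u, T_CE) for an m_h x m_v sub-array at both BS and UD."""
    n_sub = m_h * m_v                              # N_sub = M^h * M^v
    n_d = math.ceil(n_sub / (n_rf_ud - 1))         # N_s^d = N_UD^RF - 1 streams
    n_u = math.ceil(n_sub / (n_rf_bs - 1))         # N_s^u = N_BS^RF - 1 streams
    t_ce = n_d * n_o_dl + num_users * n_u * n_o_ul
    return n_d, n_u, t_ce

# Simulation setting: 8 x 8 sub-arrays, 4 RF chains per side, 3 OFDM symbols per slot.
print(training_overhead(8, 8, 4, 4, 3, 3))   # (22, 22, 132)
```

Changing the number of OFDM symbols per slot to 2 or 4 gives overheads of 88 and 176, the two values used in the later comparisons.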
Figure 6.11 compares the NMSE performance of different CE schemes against different SNRs, given two training overheads with the same number of MPCs L = 4. TCE = 88 in Fig. 6.11a and TCE = 176 in Fig. 6.11b correspond to choosing Nod = Nou = 2 and Nod = Nou = 4 in our scheme, respectively. From Fig. 6.11, similar conclusions to those observed for Fig. 6.10 can be obtained. In particular, it can be seen that our closed-loop CE scheme considerably outperforms the two CS-based schemes. Figure 6.12 compares the NMSE performance of different CE schemes versus the number of MPCs L given two SNR values of 0 and 10 dB as well as two training overheads of TCE = 88 and TCE = 176. In simulations, we adopted the same parameter settings as in Fig. 6.11 except the number of MPCs. From Fig. 6.12, the good performance of the proposed multi-beam transmit precoding matrix design and

Fig. 6.10 NMSE performance comparison of different CE schemes versus SNR with the same training overhead $T_{\mathrm{CE}} = 132$ [39]: (a) number of MPCs L = 3; (b) L = 5

Fig. 6.11 NMSE performance comparison of different CE schemes versus SNR with the same number of MPCs L = 4 [39]: (a) training overhead $T_{\mathrm{CE}} = 88$; (b) $T_{\mathrm{CE}} = 176$

Fig. 6.12 NMSE performance comparison of different CE schemes versus the number of MPCs L with SNR = 0 dB and 10 dB [39]: (a) training overhead $T_{\mathrm{CE}} = 88$; (b) $T_{\mathrm{CE}} = 176$

Fig. 6.13 ASE performance comparison of different CE schemes versus SNR [39]: (a) number of MPCs L = 3; (b) L = 5

Table 6.1 Computational complexity of the closed-loop scheme [39] (columns: Operation, Complexity; rows: Steps 1(b) and 5(b); Step 2; Step 3 with R = 2, comprising RVP, SSA, 2-D SSPP, SIES and 2-D JD; Step 6 with R = 3, comprising RVP, SSA, 3-D SSPP, SIES and 3-D JD; Step 7; Step 8)

EVD-based approach for estimating the number of MPCs is evident. Again, the proposed closed-loop scheme significantly outperforms the two CS-based schemes. Observe that the performance gain of our scheme over the two other schemes increases for sparser mmWave channels, i.e., channels with fewer MPCs. Moreover, although the performance gap between the proposed solution and the CS-based methods gradually narrows at low SNRs as the number of MPCs L increases, the proposed scheme still achieves a considerable performance gain over the CS-based schemes at high SNRs. Hence, the proposed scheme is well suited to sparse mmWave channels, and the sparser the channel, the larger the gain. Next we consider the ASE performance metric [37], defined as

$$\mathrm{ASE} = \frac{1}{K}\sum_{k=0}^{K-1} \log_2 \det\!\left( \mathbf{I}_{N_s} + \frac{1}{N_s}\, \mathbf{R}_n^{-1}[k]\, \mathbf{W}_c^{\mathrm{H}}[k]\, \mathbf{H}[k]\, \mathbf{F}_p[k]\, \mathbf{F}_p^{\mathrm{H}}[k]\, \mathbf{H}^{\mathrm{H}}[k]\, \mathbf{W}_c[k] \right), \qquad (6.64)$$

where $\mathbf{R}_n[k] = \sigma_n^2 \mathbf{W}_c^{\mathrm{H}}[k]\mathbf{W}_c[k]$, while $\mathbf{F}_p[k] = \mathbf{F}_{\mathrm{RF},p}\mathbf{F}_{\mathrm{BB},p}[k]$ and $\mathbf{W}_c[k] = \mathbf{W}_{\mathrm{RF},c}\mathbf{W}_{\mathrm{BB},c}[k]$ are the transmit precoding and receive combining matrices used during data transmission, respectively, and $N_s$ is the number of transmitted data streams. The principal component analysis (PCA)-based hybrid beamforming scheme proposed in [37] is used to evaluate the ASE performance, where the precoders and combiners are designed from the estimated channels. In addition, the spectral efficiency of the PCA-based hybrid beamforming scheme with perfect CSI known to both the BS and UD is adopted as the performance upper bound. Here, the same simulation parameters used in Fig. 6.10 are considered, and the number of transmitted data streams is $N_s = 2$ in (6.64).
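To make the metric in (6.64) concrete, the following is a minimal sketch of how it could be evaluated numerically. It uses plain Python lists of complex numbers; the helper names (`matmul`, `herm`, `solve_and_det`, `ase`) and the toy identity channel are illustrative assumptions, not part of the original simulation setup, and the PCA-based beamformer design of [37] is not reproduced here.

```python
import math

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def herm(A):
    """Conjugate transpose."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def solve_and_det(A, B):
    """Return (A^{-1} B, det A) via Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(map(complex, A[i])) + list(map(complex, B[i])) for i in range(n)]
    det = 1 + 0j
    for p in range(n):
        piv = max(range(p, n), key=lambda r: abs(M[r][p]))
        if piv != p:
            M[p], M[piv] = M[piv], M[p]
            det = -det                      # a row swap flips the sign
        pivot = M[p][p]
        det *= pivot
        M[p] = [v / pivot for v in M[p]]
        for r in range(n):
            if r != p:
                f = M[r][p]
                M[r] = [vr - f * vp for vr, vp in zip(M[r], M[p])]
    return [row[n:] for row in M], det

def ase(H_list, F_list, W_list, sigma2, n_s):
    """Average of log2 det(I + R_n^{-1} W^H H F F^H H^H W / N_s) over subcarriers."""
    total = 0.0
    for H, F, W in zip(H_list, F_list, W_list):
        Wh = herm(W)
        G = matmul(matmul(Wh, H), F)        # effective channel W^H H F
        signal = matmul(G, herm(G))         # W^H H F F^H H^H W
        Rn = [[sigma2 * v for v in row] for row in matmul(Wh, W)]
        X, _ = solve_and_det(Rn, signal)    # R_n^{-1} times the signal term
        A = [[(1 if i == j else 0) + X[i][j] / n_s for j in range(n_s)]
             for i in range(n_s)]
        _, det = solve_and_det(A, A)
        total += math.log2(det.real)        # det is real and positive here
    return total / len(H_list)

# Toy check (hypothetical values): one subcarrier, identity channel,
# identity precoder/combiner, unit noise power, N_s = 2.
eye2 = [[1, 0], [0, 1]]
value = ase([eye2], [eye2], [eye2], sigma2=1.0, n_s=2)
```

In this toy case the argument of the determinant is $1.5\,\mathbf{I}_2$, so the function should return $2\log_2 1.5$ bit/s/Hz.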

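Table 6.2 Computational complexity of two CS-based CE schemes [39]

| Operation | OMP-based scheme [28] | SW-OMP-based scheme [29] |
| Measurement matrix | $O\big(K T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} N_{\mathrm{BS}} N_{\mathrm{UD}} G_{\mathrm{BS}} G_{\mathrm{UD}}\big)$ | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} N_{\mathrm{BS}} N_{\mathrm{UD}} G_{\mathrm{BS}} G_{\mathrm{UD}}\big)$ |
| Whitening | NA | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} \big((T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}})^2 + K + G_{\mathrm{BS}} G_{\mathrm{UD}}\big)\big)$ |
| Correlation | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} G_{\mathrm{BS}} G_{\mathrm{UD}} \sum_{k=1}^{K} I_k\big)$ | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} G_{\mathrm{BS}} G_{\mathrm{UD}} K I\big)$ |
| Project subspace | $O\big(\sum_{k=1}^{K} \big[\tfrac{1}{4} I_k^2 (I_k + 1)^2 + \tfrac{1}{3} I_k (I_k + 1)(2 I_k + 1) T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}}\big]\big)$ | $O\big(\tfrac{1}{4} I^2 (I + 1)^2 + \tfrac{1}{3} K I (I + 1)(2I + 1) T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}}\big)$ |
| Update residual | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} \sum_{k=1}^{K} \tfrac{I_k}{2}(I_k + 1)\big)$ | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} \tfrac{K I (I + 1)}{2}\big)$ |
| Compute MSE | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} \sum_{k=1}^{K} I_k\big)$ | $O\big(T_{\mathrm{CE}} N_{\mathrm{BS}}^{\mathrm{RF}} K^2 I\big)$ |
| Reestablishment | $O\big(N_{\mathrm{BS}} N_{\mathrm{UD}} \sum_{k=1}^{K} I_k\big)$ | $O\big(N_{\mathrm{BS}} N_{\mathrm{UD}} K I\big)$ |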

Figure 6.13 compares the ASE performance of different CE schemes versus SNR. It can be observed from Fig. 6.13 that the ASE obtained using the CSI estimated by the proposed scheme closely matches the performance upper bound obtained using perfect CSI at both the BS and UD. It can also be seen that the ASE gain achieved by the proposed scheme over the two CS-based schemes is 0.1 bit/s/Hz at high SNRs, and that this gain is clearly larger at low SNRs. Note that the ASE performance of the OMP based scheme is particularly poor when SNR ≤ 0 dB.

6.3.5.2 Computational Complexity Evaluation

The computational complexity analysis of our closed-loop CE scheme is detailed in Table 6.1, where the notation O(N) stands for ‘on the order of N’. The computational requirements of Step 1(a) and Step 5(a) are omitted, since they are much smaller than those of the other steps. Clearly, Step 4 does not involve computation. It can be seen that the computational requirements are dominated by Step 6 (corresponding to Algorithm 6.3 with R = 3) and Step 7. Also observe that the complexity of the CE scheme increases quickly as $\hat{L}$ increases, since the computational complexities of Step 6 and Step 7 are proportional to $\hat{L}^4$ and $\hat{L}!$, respectively. The computational complexity of the two CS-based CE schemes is given in Table 6.2 for comparison, where the numbers of iterations for the OMP algorithm at

Fig. 6.14 Computational complexity comparison of different CE schemes given the training overhead $T_{\mathrm{CE}} = 88$ [39]: (a) sizes of the UPAs at the BS and UD are both 12 × 12; (b) number of MPCs L = 4

the kth subcarrier and the SW-OMP algorithm are denoted by $I_k$ and $I$, respectively. Note that the values of $I_k$ differ from subcarrier to subcarrier. It can be seen that the computational complexity of these two CS-based schemes increases rapidly as the quantized grid sizes $G_{\mathrm{BS}}$ and $G_{\mathrm{UD}}$ increase. Also, the complexity of the OMP scheme is around K times that of the SW-OMP scheme, because the K subchannels at the K subcarriers are estimated independently in the OMP scheme but jointly in the SW-OMP scheme. Due to the power leakage caused by the mismatch between the discrete CS angle-domain dictionary and the continuously distributed AoAs/AoDs of the channels, the number of effective MPCs represented in the redundant CS dictionary is usually greater than L. Hence, the value of $I_k$ in the OMP scheme and the value of $I$ in the SW-OMP scheme are not fixed, and they are usually greater than L. Therefore, we can use $I = I_k = L$ to provide lower bounds on the computational complexity of the two CS-based schemes. Figure 6.14 compares the computational complexity of our closed-loop CE scheme with those of the two CS-based schemes given the training overhead $T_{\mathrm{CE}} = 88$, corresponding to $N_o^d = N_o^u = 2$ in our scheme. From Fig. 6.14a, we observe that the computational complexity of the proposed CE solution increases slightly as the number of MPCs increases. Most strikingly, however, given a UPA size of 12 × 12, the complexity of our solution is at least 3 orders of magnitude lower than that of the SW-OMP scheme and at least 5 orders of magnitude lower than that of the OMP scheme. The results of Fig. 6.14b indicate that, given L = 4, the complexity of our solution is almost immune to the size of the UPAs at the BS and UD. By contrast, the complexity of the two CS-based schemes increases considerably as the number of antennas increases. Again, the complexity of our solution is several orders of magnitude lower than that of the other two schemes.
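The "around K times" relation between the two CS-based schemes can be illustrated with a rough operation count. The sketch below keeps only the measurement-matrix and correlation rows of Table 6.2 (as read from the flattened table), sets $I = I_k = L$ as in the lower bound discussed above, and ignores the SW-OMP whitening term. All numerical values other than $T_{\mathrm{CE}} = 88$, the 12 × 12 UPAs and L = 4 are hypothetical choices for illustration.

```python
def leading_costs(K, t_ce, n_rf, n_bs, n_ud, g_bs, g_ud, iters):
    """Measurement-matrix plus correlation terms of Table 6.2 with I = I_k = iters."""
    dictionary = t_ce * n_rf * n_bs * n_ud * g_bs * g_ud   # one measurement-matrix build
    correlation = t_ce * n_rf * g_bs * g_ud * K * iters    # same form for both schemes
    omp = K * dictionary + correlation      # measurement-matrix term carries a factor K
    sw_omp = dictionary + correlation       # measurement-matrix term has no factor K
    return omp, sw_omp

K = 64                      # hypothetical number of subcarriers
omp_cost, sw_omp_cost = leading_costs(
    K=K, t_ce=88, n_rf=4,   # n_rf is a hypothetical number of RF chains
    n_bs=144, n_ud=144,     # 12 x 12 UPAs at the BS and UD
    g_bs=288, g_ud=288,     # hypothetical grid sizes (twice the antenna count)
    iters=4)                # I = I_k = L = 4
ratio = omp_cost / sw_omp_cost
```

Because the measurement-matrix term dominates whenever $N_{\mathrm{BS}} N_{\mathrm{UD}} \gg K L$, the ratio stays just below K for these values.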
It should also be reiterated that, to mitigate the power leakage, the CS-based schemes adopt a high-dimensional redundant dictionary, which results in unaffordable storage requirements when the number of antennas is large. Clearly, for FD-MIMO systems with a massive number of antennas, the proposed closed-loop scheme offers a considerable advantage over the CS-based schemes in terms of both computational complexity and storage requirements.

Table 6.3 Comparison of advantages and disadvantages of different CE schemes [39]

The advantages and disadvantages of our proposed solution and the two other CS-based CE schemes are given in Table 6.3, where the training/feedback overhead, storage requirements, computational complexity and received SNR at CE are compared.

6.4 Summary

In this chapter, we discuss super-resolution CE for mmWave massive MIMO with hybrid precoding. First, we propose a 2D unitary ESPRIT based super-resolution CE scheme to jointly obtain super-resolution estimates of the AoAs and AoDs with high accuracy. Specifically, by designing efficient UL training signals at both the BS and MS, we can use a much reduced pilot overhead to acquire the low-dimensional effective channel, which has the same shift-invariance of array response as the high-dimensional mmWave MIMO channel to be estimated. Then, by exploiting the 2D unitary ESPRIT based CE algorithm, the super-resolution estimates of the AoAs and AoDs can be jointly acquired from the low-dimensional effective channel. Moreover, the associated path gains can be acquired by using the LS estimator. Finally, the high-dimensional mmWave MIMO channel can be reconstructed according to the obtained AoAs, AoDs, and path gains. Simulation results verify that the proposed CE scheme achieves better CE performance with lower pilot overhead and computational complexity than the conventional schemes.

Second, we propose a closed-loop sparse CE scheme for multi-user wideband mmWave FD-MIMO systems with hybrid beamforming. By exploiting the sparsity of mmWave channels in both the angle and delay domains and by viewing high-dimensional hybrid arrays as low-dimensional digital arrays, the proposed scheme is capable of obtaining super-resolution estimates of the horizontal/vertical AoDs/AoAs and delays based on the MDU-ESPRIT algorithm. Specifically, at the downlink CE stage, we design the common random transmit precoding matrix at the BS and the receive combining matrix at each UD to estimate the horizontal/vertical AoAs of the sparse MPCs. At the UL CE stage, based on the designed receive combining matrix at the BS, we estimate the horizontal/vertical AoDs and delays. Furthermore, the AoAs estimated at each UD are utilized to design the multi-beam transmit precoding matrix for further enhancing CE performance. We also propose an ML approach at the BS to pair the channel parameters acquired at the two stages and to optimally estimate the path gains. Simulation results demonstrate that the proposed closed-loop CE scheme offers considerable advantages over state-of-the-art CS-based CE schemes, providing significantly more accurate CSI estimates while imposing dramatically lower computational complexity and storage requirements.

References

1. Heath, R.W., Gonzalez-Prelcic, N., Rangan, S., Roh, W., Sayeed, A.M.: An overview of signal processing techniques for millimeter wave MIMO systems. IEEE J. Sel. Topics Signal Process. 10(3), 436–453 (2016)
2. Gao, X., Dai, L., Han, S., Chih-Lin, I., Heath, R.W.: Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays. IEEE J. Sel. Areas Commun. 34(4), 998–1009 (2016)
3. Alkhateeb, A., Mo, J., Gonzalez-Prelcic, N., Heath, R.W.: MIMO precoding and combining solutions for millimeter-wave systems. IEEE Commun. Mag. 52(12), 122–131 (2014)
4. Méndez-Rial, R., Rusu, C., González-Prelcic, N., Alkhateeb, A., Heath, R.W.: Hybrid MIMO architectures for millimeter wave communications: phase shifters or switches? IEEE Access 4, 247–267 (2016)
5. Gao, Z., Dai, L., Mi, D., Wang, Z., Imran, M.A., Shakir, M.Z.: mmWave massive-MIMO-based wireless backhaul for the 5G ultra-dense network. IEEE Wirel. Commun. 22(5), 13–21 (2015)
6. Zhang, J., Huang, Y., Shi, Q., Wang, J., Yang, L.: Codebook design for beam alignment in millimeter wave communication systems. IEEE Trans. Commun. 65(11), 4980–4995 (2017)
7. Zhao, L., Ng, D.W.K., Yuan, J.: Multi-user precoding and channel estimation for hybrid millimeter wave systems. IEEE J. Sel. Areas Commun. 35(7), 1576–1590 (2017)
8. Gao, Z., Dai, L., Han, S., Chih-Lin, I., Wang, Z., Hanzo, L.: Compressive sensing techniques for next-generation wireless communications. IEEE Wirel. Commun. 25(3), 144–153 (2018)
9. Ma, J., Zhang, S., Li, H., Gao, F., Jin, S.: Sparse Bayesian learning for the time-varying massive MIMO channels: acquisition and tracking. IEEE Trans. Commun. 67(3), 1925–1938 (2018)
10. Zhu, D., Choi, J., Heath, R.W.: Auxiliary beam pair enabled AOD and AOA estimation in closed-loop large-scale millimeter-wave MIMO systems. IEEE Trans. Wirel. Commun. 16(7), 4770–4785 (2017)
11. Sun, S., Rappaport, T.S.: Millimeter wave MIMO channel estimation based on adaptive compressed sensing. In: 2017 IEEE International Conference on Communications Workshops (ICC Workshops), pp. 47–53. IEEE (2017)
12. Alkhateeb, A., El Ayach, O., Leus, G., Heath, R.W.: Channel estimation and hybrid precoding for millimeter wave cellular systems. IEEE J. Sel. Topics Signal Process. 8(5), 831–846 (2014)
13. Lee, J., Gil, G.T., Lee, Y.H.: Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications. IEEE Trans. Commun. 64(6), 2370–2386 (2016)
14. Ghauch, H., Kim, T., Bengtsson, M., Skoglund, M.: Subspace estimation and decomposition for large millimeter-wave MIMO systems. IEEE J. Sel. Topics Signal Process. 10(3), 528–542 (2016)
15. Yang, L., Zeng, Y., Zhang, R.: Efficient channel estimation for millimeter wave MIMO with limited RF chains. In: 2016 IEEE International Conference on Communications (ICC), pp. 1–6. IEEE (2016)
16. Huang, C., Liu, L., Yuen, C., Sun, S.: A LSE and sparse message passing-based channel estimation for mmWave MIMO systems. In: 2016 IEEE Globecom Workshops (GC Wkshps), pp. 1–6. IEEE (2016)
17. Kokshoorn, M., Chen, H., Wang, P., Li, Y., Vucetic, B.: Millimeter wave MIMO channel estimation using overlapped beam patterns and rate adaptation. IEEE Trans. Signal Process. 65(3), 601–616 (2016)
18. Van der Veen, A.J., Vanderveen, M.C., Paulraj, A.: Joint angle and delay estimation using shift-invariance techniques. IEEE Trans. Signal Process. 46(2), 405–418 (1998)
19. Zoltowski, M.D., Haardt, M., Mathews, C.P.: Closed-form 2-D angle estimation with rectangular arrays in element space or beamspace via unitary ESPRIT. IEEE Trans. Signal Process. 44(2), 316–328 (1996)
20. Miao, H., Juntti, M., Yu, K.: 2-D unitary ESPRIT based joint AOA and AOD estimation for MIMO system. In: 2006 IEEE 17th International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1–5. IEEE (2006)
21. Golub, G.H., Van Loan, C.F.: Matrix Computations. JHU Press (2013)
22. Gao, Z., Hu, C., Dai, L., Wang, Z.: Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selective fading channels. IEEE Commun. Lett. 20(6), 1259–1262 (2016)
23. Liu, A., Lau, V.K.: Impact of CSI knowledge on the codebook-based hybrid beamforming in massive MIMO. IEEE Trans. Signal Process. 64(24), 6545–6556 (2016)
24. Xiao, Z., Xia, X.G., Bai, L.: Achieving antenna and multipath diversities in GLRT-based burst packet detection. IEEE Trans. Signal Process. 63(7), 1832–1845 (2015)
25. Xiao, Z., Zhang, C., Jin, D., Ge, N.: GLRT approach for robust burst packet acquisition in wireless communications. IEEE Trans. Wirel. Commun. 12(3), 1127–1137 (2013)
26. Zhou, Z., Fang, J., Yang, L., Li, H., Chen, Z., Blum, R.S.: Low-rank tensor decomposition-aided channel estimation for millimeter wave MIMO-OFDM systems. IEEE J. Sel. Areas Commun. 35(7), 1524–1538 (2017)
27. Dong, Y., Chen, C., Yi, N., Lu, G., Jin, Y.: Channel estimation using low-resolution PSS for wideband mmWave systems. In: 2017 IEEE 85th Vehicular Technology Conference (VTC Spring), pp. 1–5. IEEE (2017)
28. Venugopal, K., Alkhateeb, A., Prelcic, N.G., Heath, R.W.: Channel estimation for hybrid architecture-based wideband millimeter wave systems. IEEE J. Sel. Areas Commun. 35(9), 1996–2009 (2017)
29. Rodríguez-Fernández, J., González-Prelcic, N., Venugopal, K., Heath, R.W.: Frequency-domain compressive channel estimation for frequency-selective hybrid millimeter wave MIMO systems. IEEE Trans. Wirel. Commun. 17(5), 2946–2960 (2018)
30. Ma, X., Yang, F., Liu, S., Song, J., Han, Z.: Design and optimization on training sequence for mmWave communications: a new approach for sparse channel estimation in massive MIMO. IEEE J. Sel. Areas Commun. 35(7), 1486–1497 (2017)
31. Xiao, Z., Xia, P., Xia, X.G.: Enabling UAV cellular with millimeter-wave communication: potentials and approaches. IEEE Commun. Mag. 54(5), 66–73 (2016)
32. Hu, C., Dai, L., Mir, T., Gao, Z., Fang, J.: Super-resolution channel estimation for mmWave massive MIMO with hybrid precoding. IEEE Trans. Veh. Technol. 67(9), 8954–8958 (2018)
33. Mao, J., Gao, Z., Wu, Y., Alouini, M.S.: Over-sampling codebook-based hybrid minimum sum-mean-square-error precoding for millimeter-wave 3D-MIMO. IEEE Wirel. Commun. Lett. 7(6), 938–941 (2018)
34. Haardt, M., Nossek, J.A.: Simultaneous Schur decomposition of several nonsymmetric matrices to achieve automatic pairing in multidimensional harmonic retrieval problems. IEEE Trans. Signal Process. 46(1), 161–169 (1998)
35. Vanderveen, M.C., Van der Veen, A.J., Paulraj, A.: Estimation of multipath parameters in wireless communications. IEEE Trans. Signal Process. 46(3), 682–690 (1998)
36. Donoho, D.L.: De-noising by soft-thresholding. IEEE Trans. Inf. Theory 41(3), 613–627 (1995)
37. Sun, Y., Gao, Z., Wang, H., Wu, D.: Wideband hybrid precoding for next-generation backhaul/fronthaul based on mmWave FD-MIMO. In: 2018 IEEE Globecom Workshops (GC Wkshps), pp. 1–6. IEEE (2018)
38. Liao, A., Gao, Z., Wu, Y., Wang, H., Alouini, M.S.: 2D unitary ESPRIT based super-resolution channel estimation for millimeter-wave massive MIMO with hybrid precoding. IEEE Access 5, 24747–24757 (2017)
39. Liao, A., Gao, Z., Wang, H., Chen, S., Alouini, M.S., Yin, H.: Closed-loop sparse channel estimation for wideband millimeter-wave full-dimensional MIMO systems. IEEE Trans. Commun. 67(12), 8329–8345 (2019)

Chapter 7

Compressive Sensing Single-User Signal Detection in Massive MIMO Systems with Spatial Modulation

Abstract In this chapter, a low-complexity signal detector based on structured CS is introduced to improve the signal detection performance in massive SM-MIMO systems. The goal is to address the high complexity of the optimal maximum likelihood (ML) detector in massive SM-MIMO, while also avoiding the performance loss associated with state-of-the-art low-complexity detectors designed for small-scale SM-MIMO. The adopted signal detector leverages the structured sparsity of multiple SM signals. We start by introducing a grouped transmission scheme at the transmitter, where multiple SM signals in several consecutive time slots are grouped to carry a common spatial constellation symbol. This grouping introduces the desired structured sparsity. At the receiver, a structured subspace pursuit (SSP) algorithm is introduced to jointly detect multiple SM signals by leveraging the structured sparsity. Additionally, this chapter introduces SM signal interleaving to permute the SM signals in the same transmission group. This allows channel diversity to be exploited to further improve the signal detection performance. Theoretical analysis is used to quantify the gain from SM signal interleaving, and simulation results verify the near-optimal performance of the considered scheme.

The work introduced in this chapter is based on the reference [14].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_7

7.1 Introduction

SM-MIMO exploits the pattern of one or several simultaneously active antennas out of all available TAs to transmit extra information [1, 2]. Compared with small-scale SM-MIMO, which only offers a limited gain in spectrum efficiency, massive SM-MIMO has recently been proposed by integrating SM-MIMO with massive MIMO working at 3–6 GHz to achieve higher spectrum efficiency [1]. In massive SM-MIMO systems, the BS uses a large number of low-cost antennas for higher spectrum efficiency but only one or several power-hungry transmit RF chains for power saving, while the user can compactly employ multiple receive diversity antennas with low correlation [2]. Since the power consumption and hardware cost largely depend on the number of simultaneously active transmit RF chains (especially the power amplifier), massive SM-MIMO outperforms traditional MIMO schemes in terms of higher spectrum efficiency, reduced power consumption, lower hardware cost, etc. In practice, SM can be adopted in conventional massive MIMO systems as an energy-efficient transmission mode. Meanwhile, massive SM-MIMO can also be considered as an independent scheme to reduce both the power consumption and hardware cost.

For massive SM-MIMO, due to the small number of receive antennas at the user and the massive number of antennas at the BS, signal detection is a challenging large-scale underdetermined problem. When the number of TAs becomes large, the optimal ML signal detector suffers from prohibitively high complexity [3]. Nevertheless, due to the limited number of RF chains, SM signals are inherently sparse, and this sparsity can be exploited via CS theory [4] for improved signal detection performance. So far, CS has been widely used in wireless communications [5–8], and CS-based signal detectors have been proposed for underdetermined small-scale SM-MIMO [7, 8]. However, their BER performance still shows a significant gap compared with that of the optimal ML detector, especially in massive SM-MIMO with large $N_t$, $N_r$, and $N_r \ll N_t$.

This chapter proposes a near-optimal SCS-based signal detector with low complexity for massive SM-MIMO. Specifically, we first propose a grouped transmission scheme at the BS, where multiple successive SM signals are grouped to carry a common spatial constellation symbol so as to introduce structured sparsity. Accordingly, we propose an SSP algorithm at the user to detect multiple SM signals, whereby their structured sparsity is leveraged for improved signal detection performance. Moreover, SM signal interleaving is proposed to permute the SM signals in the same transmission group, so that channel diversity can be exploited. Theoretical analysis and simulation results verify that the proposed SCS-based signal detector outperforms existing CS-based signal detectors.

7.2 System Model

In SM-MIMO systems, the transmitter has $N_t$ TAs but $N_a < N_t$ transmit RF chains, and the receiver has $N_r$ receive antennas. Each SM signal consists of two symbols: the spatial constellation symbol, obtained by mapping $\lfloor \log_2 \binom{N_t}{N_a} \rfloor$ bits to a pattern of $N_a$ active antennas out of $N_t$ TAs, and $N_a$ independent signal constellation symbols coming from the M-ary signal constellation set (e.g., QAM), as shown in Fig. 7.1. Hence, each SM signal carries $N_a \log_2 M + \lfloor \log_2 \binom{N_t}{N_a} \rfloor$ bits.
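The bit count above can be checked with a short calculation. The sketch below assumes that M is a power of two and that the spatial bits are floored to an integer, as in the expression above; the function name `sm_bits` and the second parameter set are illustrative only.

```python
import math

def sm_bits(n_t, n_a, m):
    """Return (spatial bits, signal bits) carried by one SM signal."""
    # floor(log2(C(n_t, n_a))) computed exactly with integer arithmetic
    spatial = math.comb(n_t, n_a).bit_length() - 1
    # n_a independent M-ary symbols, with M assumed to be a power of two
    signal = n_a * (m.bit_length() - 1)
    return spatial, signal

# Configuration of Fig. 7.1: N_t = 4, N_a = 1, QPSK (M = 4)
fig_7_1 = sm_bits(4, 1, 4)
# A larger, hypothetical configuration: N_t = 64, N_a = 2, 16-QAM
larger = sm_bits(64, 2, 16)
```

For the Fig. 7.1 setting this gives 2 spatial bits plus 2 signal bits per SM signal. For the hypothetical second setting, $\binom{64}{2} = 2016$ patterns exist but only $2^{10} = 1024$ of them are used, giving 10 spatial bits plus 8 signal bits.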

Fig. 7.1 Spatial constellation symbol and signal constellation symbol in SM-MIMO systems, where $N_t = 4$, $N_a = 1$, and quadrature phase shift keying (QPSK) are considered as an example [14]

At the receiver, the received signal $\mathbf{y} \in \mathbb{C}^{N_r \times 1}$ can be expressed as $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}$, where $\mathbf{x} \in \mathbb{C}^{N_t \times 1}$ is the SM signal transmitted by the transmitter, $\mathbf{w} \in \mathbb{C}^{N_r \times 1}$ is the AWGN vector with independent and identically distributed (i.i.d.) entries following the circularly symmetric complex Gaussian distribution $\mathcal{CN}(0, \sigma_w^2)$, $\mathbf{H} = \mathbf{R}_r^{1/2} \tilde{\mathbf{H}} \mathbf{R}_t^{1/2} \in \mathbb{C}^{N_r \times N_t}$ is the correlated flat Rayleigh-fading MIMO channel, the entries of $\tilde{\mathbf{H}}$ follow the i.i.d. distribution $\mathcal{CN}(0, 1)$, and $\mathbf{R}_r$ and $\mathbf{R}_t$ are the receiver and transmitter correlation matrices, respectively [9]. The correlation matrix $\mathbf{R}$ is given by $r_{ij} = r^{|i-j|}$, where $r_{ij}$ is the element in the ith row and jth column of $\mathbf{R}$, and $r$ is the correlation coefficient of neighboring antennas. It should be pointed out that $\mathbf{H}$ should be known by the receiver and can be acquired by CE [9].

To achieve both high spectrum efficiency and energy efficiency, massive SM-MIMO, which employs massive low-cost antennas but few power-hungry transmit RF chains at the BS to serve a user with a comparatively small number of receive antennas, has recently been proposed [1]. However, its signal detection is a challenging large-scale underdetermined problem, since $N_t$ and $N_r$ can be large with $N_r \ll N_t$, e.g., $N_t = 64$ and $N_r = 16$ are considered in [1].

For $\mathbf{x}$, the spatial constellation symbol of $\lfloor \log_2 \binom{N_t}{N_a} \rfloor$ bits is mapped into the spatial constellation set $\mathbb{A}$, where the pattern of $N_a$ active antennas selected from $N_t$ TAs is regarded as the spatial constellation symbol. Hence there are $|\mathbb{A}| = 2^{\lfloor \log_2 \binom{N_t}{N_a} \rfloor}$ kinds of patterns of active antennas, i.e., $\mathrm{supp}\{\mathbf{x}\} \in \mathbb{A}$. Meanwhile, the signal constellation symbol of the ith active antenna, denoted as $x^{(i)}$ for $1 \le i \le N_a$, is mapped into the M-ary signal constellation set $\mathbb{B}$. Therefore, signal detection in SM-MIMO can be formulated as an $M^{N_a} 2^{\lfloor \log_2 \binom{N_t}{N_a} \rfloor}$-hypothesis detection problem. Clearly, the optimal signal detector for this problem is the ML signal detector, which can be expressed as [1]

$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg \min_{\mathrm{supp}(\mathbf{x}) \in \mathbb{A},\; x^{(i)} \in \mathbb{B},\; 1 \le i \le N_a} \left\| \mathbf{y} - \mathbf{H}\mathbf{x} \right\|_2 . \qquad (7.1)$$
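The exhaustive search in (7.1) can be sketched directly for a toy system. The example below assumes the Fig. 7.1 setting ($N_t = 4$, $N_a = 1$, QPSK) with $N_r = 2$; the channel entries are hypothetical, and taking the first $|\mathbb{A}|$ antenna patterns in lexicographic order is an assumption, since the mapping rule is not specified here.

```python
import itertools
import math

def spatial_set(n_t, n_a):
    """First 2**floor(log2 C(n_t, n_a)) antenna patterns (ordering is an assumption)."""
    patterns = list(itertools.combinations(range(n_t), n_a))
    size = 1 << (len(patterns).bit_length() - 1)
    return patterns[:size]

def ml_detect(y, H, n_a, constellation):
    """Exhaustive search of (7.1); returns (support, symbols, hypotheses tested)."""
    n_t = len(H[0])
    best, tested = None, 0
    for support in spatial_set(n_t, n_a):
        for syms in itertools.product(constellation, repeat=n_a):
            # ||y - Hx||^2, using only the active columns of H
            dist = sum(abs(y_i - sum(row[j] * s for j, s in zip(support, syms))) ** 2
                       for y_i, row in zip(y, H))
            tested += 1
            if best is None or dist < best[0]:
                best = (dist, support, syms)
    return best[1], best[2], tested

qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
H = [[1, 0, 1, 1],
     [0, 1, 1, -1]]                           # hypothetical 2 x 4 channel
y = [qpsk[0] * H[0][2], qpsk[0] * H[1][2]]    # antenna 2 active, noise-free
support, symbols, tested = ml_detect(y, H, 1, qpsk)
```

Even in this small case the detector tests $M^{N_a} |\mathbb{A}| = 4 \times 4 = 16$ hypotheses, and the count grows exponentially with $N_a$.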

However, the computational complexity of the optimal ML signal detector is $O\big(M^{N_a} 2^{\lfloor \log_2 \binom{N_t}{N_a} \rfloor}\big)$, which can be unrealistic when $N_t$, $N_a$, and/or M become large. To reduce the complexity, an SV-based signal detector has been proposed [3], but it only considers the case of $N_a = 1$. An LMMSE-based signal detector with a complexity of $O(2 N_r N_t^2 + N_t^3)$ [1] and an SD-based signal detector with a complexity of $O(\max\{N_t^3, N_r N_t^2, N_r^2 N_t\})$ [10] have been proposed for well-determined or overdetermined SM-MIMO with $N_r \ge N_t$. However, for underdetermined SM-MIMO systems with $N_r < N_t$, these detectors suffer from a significant performance loss [7]. Since only $N_a$ TAs are active in each time slot for power saving and low hardware cost, there are only $N_a < N_t$ nonzero entries in $\mathbf{x}$, and thus the SM signal is inherently sparse. By exploiting such sparsity, CS-based signal detectors have been proposed for SM [6–8]. Reference [6] proposed a spatial modulation matching pursuit (SMMP) algorithm to detect multi-user SM signals in UL massive SM-MIMO systems. In [7, 8], CS-based signal detectors are proposed for underdetermined single-user SM-MIMO systems with $N_r < N_t$ in the downlink. The normalized compressive sensing (NCS) detector in [8], with a complexity of $O(2 N_r N_a^2 + N_a^3)$, first normalizes the MIMO channels and then uses the OMP algorithm to detect signals. Reference [7] developed a BPDN algorithm, with a complexity of $O(N_t^3)$, from the classical BP algorithm to detect SM signals. However, both the NCS and BPDN detectors are based on the framework of conventional CS theory, and such CS-based signal detectors still suffer from a significant performance gap compared with the optimal ML detector when $N_t / N_r$ becomes large, especially in massive SM-MIMO systems with $N_r \ll N_t$ [7].

7.3 SCS-Based Signal Detector for Massive SM-MIMO

In this section, an SCS-based signal detector is proposed for downlink single-user massive SM-MIMO, as illustrated in Fig. 7.2.

Fig. 7.2 Illustration of the proposed SCS-based signal detector [14], where $N_t = 4$, $N_r = 2$, $N_a = 1$, G = 2, and QPSK are considered. Note that the white dotted block in the MIMO channels denotes deep channel fading

7.3.1 Transmitter Design

We assume that the signal constellation symbols in the proposed scheme are mutually independent. Moreover, for the proposed grouped transmission scheme, every G consecutive SM signals are considered as a group, and SM signals in the same transmission group share the same spatial constellation symbol, i.e.,

$$\mathrm{supp}\big(\mathbf{x}^{(1)}\big) = \mathrm{supp}\big(\mathbf{x}^{(2)}\big) = \cdots = \mathrm{supp}\big(\mathbf{x}^{(G)}\big), \qquad (7.2)$$

where $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(G)}$ are the SM signals in G consecutive time slots. Due to the common spatial constellation symbol they convey, $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(G)}$ in the same transmission group share the same support set and thus exhibit structured sparsity. It is clear that introducing such structured sparsity reduces the effective information bits carried by the spatial constellation symbols. However, as will be demonstrated in our simulations, such structured sparsity allows more reliable signal detection and can even improve the BER performance of the whole system without reducing the total bits per channel use (bpcu). On the other hand, due to the temporal channel correlation, channels in several consecutive time slots can be considered to be quasi-static, i.e., $\mathbf{H}^{(1)} = \mathbf{H}^{(2)} = \cdots = \mathbf{H}^{(G)}$, where $\mathbf{H}^{(t)}$ for $1 \le t \le G$ is the channel associated with the tth SM signal in the

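Algorithm 7.1 The SSP Algorithm
Input: Received signals $\mathbf{y}^{(t)}$, channel matrices $\bar{\mathbf{H}}^{(t)}$, and the number of active antennas $N_a$, where $1 \le t \le G$.
Output: Estimated SM signals $\hat{\mathbf{x}}^{(t)}$ for $1 \le t \le G$.
1: $\Omega^0 = \emptyset$;
2: $\mathbf{r}^{(t)} = \mathbf{y}^{(t)}$, $\forall t$;
3: $k = 1$;
4: while $k \le N_a$ do
5:   $\mathbf{a}^{(t)} = \big(\bar{\mathbf{H}}^{(t)}\big)^{*} \mathbf{r}^{(t)}$, $\forall t$;
6:   $\Gamma = \arg\max_{\tilde{\Gamma}} \Big\{ \sum_{t=1}^{G} \big\| \mathbf{a}^{(t)}_{\tilde{\Gamma}} \big\|_2^2,\ \tilde{\Gamma} \in \mathbb{A},\ |\tilde{\Gamma}| = \min\{2N_a, N_r\} \text{ if } k = 1 \text{ or } |\tilde{\Gamma}| = \min\{N_a, N_r - N_a\} \text{ if } k > 1 \Big\}$;
7:   $\Xi = \Omega^{k-1} \cup \Gamma$;
8:   $\mathbf{b}^{(t)}_{\Xi} = \big(\bar{\mathbf{H}}^{(t)}_{\Xi}\big)^{\dagger} \mathbf{y}^{(t)}$, $\forall t$;
9:   $\Omega^{k} = \arg\max_{\tilde{\Omega}} \Big\{ \sum_{t=1}^{G} \big\| \mathbf{b}^{(t)}_{\tilde{\Omega}} \big\|_2^2,\ \tilde{\Omega} \in \mathbb{A} \text{ and } |\tilde{\Omega}| = N_a \Big\}$;
10:  $\mathbf{c}^{(t)}_{\Omega^{k}} = \big(\bar{\mathbf{H}}^{(t)}_{\Omega^{k}}\big)^{\dagger} \mathbf{y}^{(t)}$, $\forall t$;
11:  $\mathbf{r}^{(t)} = \mathbf{y}^{(t)} - \bar{\mathbf{H}}^{(t)} \mathbf{c}^{(t)}$, $\forall t$;
12:  $k = k + 1$;
13: end while
14: $\hat{\mathbf{x}}^{(t)} = \mathbf{c}^{(t)}$, $\forall t$;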

group. This implies that if the channels used for SM fall into deep fading, such deep fading usually remains unchanged during the G time slots, and the corresponding signal detection performance will be poor. To solve this issue, we further propose SM signal interleaving at the transmitter. Specifically, after the original SM signals $\mathbf{x}^{(t)}$'s are generated, the actually transmitted signals are given by $\boldsymbol{\Pi}^{(t)} \mathbf{x}^{(t)}$'s, where each column and each row of $\boldsymbol{\Pi}^{(t)} \in \mathbb{C}^{N_t \times N_t}$ has only one non-zero element, with the value of one, so that $\boldsymbol{\Pi}^{(t)}$ permutes the entries of $\mathbf{x}^{(t)}$. We consider that the $\boldsymbol{\Pi}^{(t)}$'s for $1 \le t \le G$ differ from one time slot to another, and that they are predefined and known by both the transmitter and receiver. In this way, the active antennas vary across the time slots of the same transmission group even though the $\mathbf{x}^{(t)}$'s share the common spatial constellation symbol. Hence, channel diversity can be appropriately exploited to improve the signal detection at the receiver. This diversity gain is further discussed in Sect. 7.4.2.
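The grouped transmission and interleaving steps described above can be sketched as follows. This is an illustration under stated assumptions: permutations are represented as index lists rather than matrices, a cyclic shift by t is used as a hypothetical choice of the predefined interleaving pattern, and the channel entries are made up.

```python
import math

def sm_group(support, symbols_per_slot, n_t):
    """Build G SM signals sharing one support (grouped transmission, Eq. (7.2))."""
    group = []
    for syms in symbols_per_slot:
        x = [0j] * n_t
        for idx, s in zip(support, syms):
            x[idx] = s
        group.append(x)
    return group

def interleave(x, perm):
    """Apply the permutation: entry i of x is sent from antenna perm[i]."""
    out = [0j] * len(x)
    for i, v in enumerate(x):
        out[perm[i]] = v
    return out

def effective_channel(H, perm):
    """Channel seen by the original x: column i is column perm[i] of H."""
    return [[row[perm[i]] for i in range(len(perm))] for row in H]

def apply_channel(H, x):
    return [sum(h * v for h, v in zip(row, x)) for row in H]

n_t, G = 4, 2
qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
group = sm_group((2,), [(qpsk[0],), (qpsk[1],)], n_t)              # common support {2}
perms = [[(i + t) % n_t for i in range(n_t)] for t in range(G)]    # hypothetical pattern
sent = [interleave(x, p) for x, p in zip(group, perms)]
active = [[i for i, v in enumerate(s) if v != 0] for s in sent]

H = [[1, 0, 1, 1], [0, 1, 1, -1]]                                  # quasi-static toy channel
direct = [apply_channel(H, s) for s in sent]
via_effective = [apply_channel(effective_channel(H, p), x)
                 for x, p in zip(group, perms)]
```

Although both signals in the group share the support {2}, they leave from physical antennas 2 and 3, so they experience different channel columns. The receiver can fold the known permutation into the channel, which is why `direct` and `via_effective` agree.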

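7.3.2 SCS-Based Signal Detector at the Receiver

At the receiver, the received signal in the tth time slot is

$$\mathbf{y}^{(t)} = \mathbf{H}^{(t)} \boldsymbol{\Pi}^{(t)} \mathbf{x}^{(t)} + \mathbf{w}^{(t)} = \bar{\mathbf{H}}^{(t)} \mathbf{x}^{(t)} + \mathbf{w}^{(t)}, \qquad (7.3)$$

where $\bar{\mathbf{H}}^{(t)} = \mathbf{H}^{(t)} \boldsymbol{\Pi}^{(t)}$ represents the deinterleaving processing.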

From (7.3), we observe that the $\mathbf{x}^{(t)}$'s share the structured sparsity, but they have different non-zero values. According to SCS theory, the structured sparsity of the $\mathbf{x}^{(t)}$'s can be exploited to improve the signal detection performance compared with conventional CS-based signal detectors [4]. Under the framework of SCS theory, the solution to (7.3) can be obtained by solving the following optimization problem:

$$\min_{\mathrm{supp}(\mathbf{x}^{(t)}) \in \mathbb{A}} \left( \sum_{t=1}^{G} \big\| \mathbf{x}^{(t)} \big\|_p^q \right)^{1/q}, \quad \text{s.t.}\ \ \mathbf{y}^{(t)} = \bar{\mathbf{H}}^{(t)} \mathbf{x}^{(t)},\ \ \mathrm{supp}\big(\mathbf{x}^{(t)}\big) = \mathrm{supp}\big(\mathbf{x}^{(1)}\big),\ \forall t. \qquad (7.4)$$

In this chapter, based on the classical SP algorithm [4], we propose an SSP algorithm that utilizes the structured sparsity to solve the optimization problem (7.4) in a greedy way, where $p = 0$ and $q = 2$ are advocated [4]. The proposed SSP algorithm is described in Algorithm 7.1 [14]. Specifically, Lines 1–3 perform the initialization. In the $k$th iteration, Line 5 correlates the MIMO channels with the residual of the previous iteration; Line 6 obtains the potential true indices according to Line 5; Line 7 merges the indices estimated in Lines 8–9 of the previous iteration with the indices estimated in Line 6 of the current iteration; after the LS in Line 8, Line 9 removes wrong indices and selects the $N_a$ most likely indices; Line 10 estimates the SM signals according to the support set of the $k$th iteration; and Line 11 computes the residual. The iteration stops when $k > N_a$.

Compared with the classical SP algorithm, which reconstructs only one sparse signal from one received signal, the proposed SSP algorithm can jointly recover multiple sparse signals that share the structured sparsity but have different measurement matrices, so that the structured sparsity of multiple sparse signals can be leveraged for improved signal detection performance. Therefore, the classical SP algorithm can be regarded as a special case of the proposed SSP algorithm with $G = 1$; more details are discussed in Sect. 7.4.1. Another difference is that, in Lines 6 and 9 of Algorithm 7.1, the selected support set should belong to the predefined spatial constellation set $\mathbb{A}$ for enhanced signal detection performance, whereas the classical SP algorithm and existing CS-based signal detectors do not exploit this a priori information about the expected support set [7, 8]. By using the proposed SSP algorithm, we acquire the estimate of the spatial constellation symbol from the supports $\mathrm{supp}(\hat{\mathbf{x}}^{(t)})$, together with a rough estimate of the signal constellation symbols. By searching for the minimum Euclidean distance between the rough estimate of the signal constellation symbols and the legitimate constellation symbols, we finally obtain the estimated signal constellation symbols.
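The final minimum-Euclidean-distance step can be illustrated with a minimal sketch. It assumes a unit-energy 8-PSK constellation and a made-up rough estimate; the function names `psk_constellation` and `nearest_symbol` are hypothetical and do not come from Algorithm 7.1.

```python
import cmath
import math


def psk_constellation(order):
    """Unit-energy PSK points exp(j*2*pi*i/order) for i = 0..order-1."""
    return [cmath.exp(2j * math.pi * i / order) for i in range(order)]


def nearest_symbol(rough, constellation):
    """Index of the legitimate symbol with minimum Euclidean distance to `rough`."""
    return min(range(len(constellation)),
               key=lambda i: abs(rough - constellation[i]))


points = psk_constellation(8)
# Hypothetical rough estimate: an attenuated, perturbed copy of symbol 3.
rough_estimate = 0.9 * points[3] + (0.05 - 0.08j)
detected = nearest_symbol(rough_estimate, points)
print(detected)
```

The search is exhaustive over the constellation, which is inexpensive because it is carried out per active antenna and only after the support has been fixed.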

7.4 Performance Analysis

In this section, we provide the performance analysis.


7.4.1 Superiority of SCS-Based Signal Detectors

Typically, existing CS-based signal detectors utilize one received SV to recover one sparse SM SV, which is a typical SMV problem, i.e., $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}$. If multiple sparse signals share a common support set and an identical measurement matrix, i.e., $[\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots, \mathbf{y}^{(G)}] = \mathbf{H}[\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(G)}] + \mathbf{w}$, the reconstruction of the $\mathbf{x}^{(t)}$'s from the $\mathbf{y}^{(t)}$'s for $1 \le t \le G$ can be considered as the multiple measurement vectors (MMV) problem in SCS theory [4]. SCS theory has proven that, with the same size of the measurement vector, the recovery performance of SCS algorithms is superior to that of conventional CS algorithms [4]. This implies that, with the same number of receive antennas $N_r$, the proposed SCS-based signal detector can outperform conventional CS-based signal detectors.

Compared with the conventional MMV problem, our formulated problem (7.4) recovers multiple sparse signals that share a common support set but have different measurement matrices. Hence both the conventional SMV problem and the MMV problem can be considered as special cases of our problem: if the measurement matrices are identical for all $t$, (7.4) becomes the conventional MMV problem, and if furthermore $G = 1$, it reduces to the SMV problem. Therefore, our formulated problem can be regarded as a GMMV problem.

7.4.2 Benefits from SM Signal Interleaving

We discuss the performance gain from the SM signal interleaving by comparing the detection probability of the proposed SSP algorithm with and without SM signal interleaving. Here, we consider a simplified scenario with $N_a = 1$ and uncorrelated Rayleigh-fading MIMO channels. Let $m$ be the index of the active antenna, and for any given $l$, let the $\mathbf{H}_l^{(t)}$'s for $1 \le t \le G$ be mutually independent, where $1 \le m, l \le N_t$. Based on these assumptions, the received signal is given by $\mathbf{y}^{(t)} = \alpha^{(t)}\mathbf{H}_m^{(t)} + \mathbf{w}^{(t)}$ for $1 \le t \le G$, where $\alpha^{(t)} \in \mathbb{B}$ denotes the signal constellation symbol carried by the active antenna in the $t$th time slot. To identify the active antenna, the proposed SSP algorithm relies on the correlation operation in Line 5 of Algorithm 7.1, i.e.,

$$C_l = \sum_{t=1}^{G} \left| \left(\mathbf{y}^{(t)}\right)^{*} \mathbf{H}_l^{(t)} \right|^2 = \sum_{t=1}^{G} \left| \left(\alpha^{(t)}\mathbf{H}_m^{(t)} + \mathbf{w}^{(t)}\right)^{*} \mathbf{H}_l^{(t)} \right|^2 = \sum_{t=1}^{G} \left| F_{m,l}^{(t)} \right|^2, \qquad (7.5)$$

where $F_{m,l}^{(t)} = \left(\alpha^{(t)}\mathbf{H}_m^{(t)} + \mathbf{w}^{(t)}\right)^{*}\mathbf{H}_l^{(t)}$ for $1 \le l \le N_t$. Due to the large $N_r$ in practice, we have $\mathrm{Re}\{F_{m,m}^{(t)}\} \sim \mathcal{N}(\mu_1, \sigma_1^2)$ with $\mu_1 = 0$ and $\sigma_1^2 = \frac{(N_r^2 + N_r)\sigma_s^2}{2 - \delta(M=2)} + \frac{N_r\sigma_w^2}{2}$, and $\mathrm{Im}\{F_{m,m}^{(t)}\} \sim \mathcal{N}(\mu_2, \sigma_2^2)$ with $\mu_2 = 0$ and $\sigma_2^2 = \frac{(1 - \delta(M=2))(N_r^2 + N_r)\sigma_s^2}{2} + \frac{N_r\sigma_w^2}{2}$, according to the central limit theorem [11]. Similarly, for $l \neq m$, both $\mathrm{Re}\{F_{m,l}^{(t)}\}$ and $\mathrm{Im}\{F_{m,l}^{(t)}\}$ follow the distribution $\mathcal{N}(\mu_3, \sigma_3^2)$ with $\mu_3 = 0$ and $\sigma_3^2 = \frac{N_r\sigma_s^2}{2} + \frac{N_r\sigma_w^2}{2}$. The associated proof is provided in the Appendix of [12]. Note that $\sigma_s^2 = \mathrm{Tr}\{E\{\mathbf{x}^{(t)}(\mathbf{x}^{(t)})^{T}\}\}$, and that $\mathrm{Re}\{F_{m,l}^{(t)}\}$ and $\mathrm{Im}\{F_{m,l}^{(t)}\}$, $\forall l$, are mutually independent.

Moreover, we have $C_m \sim \sigma_2^2\chi_G^2 + \sigma_1^2\chi_G^2$ and $C_l \sim \sigma_3^2\chi_{2G}^2$ with $l \neq m$, where $\chi_n^2$ is the central chi-squared distribution with $n$ degrees of freedom [11]. Since Algorithm 7.1 has only one iteration for $N_a = 1$, and the candidate support sets in that iteration have cardinality 2, we consider $P_{\mathrm{GMMV}}(C_m - C_{l[2]} > 0 \mid l \neq m)$ as the correct active antenna detection probability, where $C_{l[1]} > C_{l[2]} > \cdots > C_{l[N_t - N_a]}$ with $l \neq m$ are order statistics. The PDFs of $C_m$ and $C_l$ with $l \neq m$ are denoted by $f_1(x)$ and $f_2(x)$, respectively. The PDF of $C_{l[2]}$ with $l \neq m$ is $f_{2[2]}(x) = \frac{(N_t - N_a)!}{(N_t - N_a - 2)!}\left(F_2(x)\right)^{N_t - N_a - 2}\left(1 - F_2(x)\right) f_2(x)$, where $F_2(x)$ is the cumulative distribution function associated with $f_2(x)$. In this way, we have

$$P_{\mathrm{GMMV}}\left(C_m - C_{l[2]} > 0 \mid l \neq m\right) = \int_{0}^{\infty}\int_{-\infty}^{\infty} f_1(x)\, f_{2[2]}(x - z)\, dx\, dz. \qquad (7.6)$$
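The correlation metric in (7.5) can be sketched for the simplified $N_a = 1$ case as follows. This is only an illustration under stated assumptions: the tiny noiseless channel matrices are made up, the helper names are hypothetical, and the LS and pruning steps of Algorithm 7.1 are not reproduced.

```python
def correlation_metrics(received, channels):
    """C_l = sum_t |(y^(t))^* H_l^(t)|^2 for every transmit antenna l.

    received: list of G received vectors y^(t) (lists of complex numbers).
    channels: list of G channel matrices H^(t), each given as Nr rows
              of Nt complex entries (H_l^(t) is column l).
    """
    num_tx = len(channels[0][0])
    metrics = [0.0] * num_tx
    for y, h in zip(received, channels):
        for l in range(num_tx):
            inner = sum(y_r.conjugate() * row[l] for y_r, row in zip(y, h))
            metrics[l] += abs(inner) ** 2
    return metrics


def detect_active_antenna(received, channels):
    """Pick the antenna index with the largest accumulated correlation."""
    metrics = correlation_metrics(received, channels)
    return max(range(len(metrics)), key=metrics.__getitem__)


# Hypothetical toy setup: G = 2 slots, Nr = 2, Nt = 3, different H^(t) per slot.
channels = [
    [[1 + 0j, 0j, 0.5 + 0j],
     [0j, 1 + 0j, 0.5 + 0j]],
    [[0j, 1 + 0j, 0.5 + 0j],
     [1 + 0j, 0j, -0.5 + 0j]],
]
active, symbols = 1, [1j, -1 + 0j]  # same active antenna, different symbols
received = [[s * row[active] for row in h] for s, h in zip(symbols, channels)]

metrics = correlation_metrics(received, channels)
print(metrics, detect_active_antenna(received, channels))
```

Because the support is common to all $G$ slots, the per-slot correlations add up at the true index, while the contributions at wrong indices depend on the slot-specific channel columns.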

For the conventional MMV problem with identical channel matrices, similar to the previous analysis, we have $C_m \sim G\sigma_2^2\chi_1^2 + G\sigma_1^2\chi_1^2$ and $C_l \sim G\sigma_3^2\chi_2^2$ with $l \neq m$. Similarly, we can also get $P_{\mathrm{MMV}}(C_m - C_{l[2]} > 0 \mid l \neq m)$. To intuitively compare the signal detection probability, we compare $P_{\mathrm{MMV}}(C_m - C_l > 0 \mid l \neq m)$ and $P_{\mathrm{GMMV}}(C_m - C_l > 0 \mid l \neq m)$ when $\sigma_s^2/\sigma_w^2 \to \infty$ and $G$ is sufficiently large. In this case, $C_m - C_l$ can be approximated by the Gaussian distribution $\mathcal{N}(\mu_4, \sigma_4^2)$ with $\mu_4 = G\left(\mu_1^2 + \mu_2^2 - 2\mu_3^2 + \sigma_1^2 + \sigma_2^2 - 2\sigma_3^2\right)$ and $\sigma_4^2 = G\sum_{i=1}^{3}\left(2\sigma_i^4 + 4\mu_i^2\sigma_i^2\right)$. In this way, we obtain $P_{\mathrm{GMMV}}(C_m - C_l > 0 \mid l \neq m) \approx Q(-\mu_4/\sigma_4)$, where the Q-function is the tail probability of the standard normal distribution [11]. By contrast, for the conventional MMV case, we obtain $P_{\mathrm{MMV}}(C_m - C_l > 0 \mid l \neq m) \approx Q(-\mu_4/(\sqrt{G}\sigma_4))$. Clearly, $P_{\mathrm{GMMV}}$ is larger than $P_{\mathrm{MMV}}$ because $\mu_4 > 0$, $G > 1$, and $Q(-x)$ increases with $x$, which implies that an appropriate SM signal interleaving leads to improved signal detection performance. To make the $\mathbf{H}_l^{(t)}$'s, $\forall l$, mutually independent, we apply a pseudo-random permutation matrix in each time slot. In Sect. 7.5, simulation results confirm the good channel diversity gain from interleaving, whose performance gain approaches that of the case of mutually independent channel matrices in the same group.
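The ordering of the two Gaussian approximations above can be checked numerically. The sketch below evaluates $Q(-\mu_4/\sigma_4)$ and $Q(-\mu_4/(\sqrt{G}\sigma_4))$ for made-up values of $\mu_4$, $\sigma_4$, and $G$; these numbers are purely illustrative and are not taken from the chapter.

```python
import math


def q_function(x):
    """Tail probability of the standard normal distribution, Q(x) = P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def detection_probabilities(mu4, sigma4, group_size):
    """Return the approximations (P_GMMV, P_MMV) from Sect. 7.4.2."""
    p_gmmv = q_function(-mu4 / sigma4)
    p_mmv = q_function(-mu4 / (math.sqrt(group_size) * sigma4))
    return p_gmmv, p_mmv


# Hypothetical values chosen only to illustrate the ordering.
p_gmmv, p_mmv = detection_probabilities(mu4=2.0, sigma4=1.0, group_size=4)
print(round(p_gmmv, 4), round(p_mmv, 4))
```

For any $\mu_4 > 0$ and $G > 1$, the argument $-\mu_4/\sigma_4$ is more negative than $-\mu_4/(\sqrt{G}\sigma_4)$, so the first probability is the larger one.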

Fig. 7.3 Comparison of the simulated and analytical SCSER of the SCS-based signal detector in different cases over uncorrelated Rayleigh-fading MIMO channels, where $N_t = 64$, $N_r = 16$, $N_a = 1$, and 8-phase shift keying (PSK) are considered [14]

7.4.3 Computational Complexity

The optimal ML signal detector has a complexity of $O\left(M^{N_a} 2^{\lfloor \log_2 \binom{N_t}{N_a} \rfloor}\right)$, which is high for large $N_a$, $N_t$, and/or $M$. The conventional signal detectors [1, 7, 10] have a complexity of $O(N_t^3)$, which is still high in massive SM-MIMO systems with large $N_t$. By contrast, for the proposed signal detector, the main computational burden comes from the LS step with a complexity of $O(G(2N_rN_a^2 + N_a^3))$ [4], or equivalently $O(2N_rN_a^2 + N_a^3)$ per SM signal in each time slot. This indicates that the proposed SCS-based signal detector has the same order of complexity as the CS-based signal detector [8].
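To get a feel for the complexity orders quoted in Sect. 7.4.3, the following sketch plugs the simulation parameters of Sect. 7.5 into the three expressions. Big-O orders are not operation counts, so the numbers are only indicative, and the function names are hypothetical.

```python
import math


def ml_search_size(num_tx, num_active, mod_order):
    """M^Na * 2^floor(log2(C(Nt, Na))) candidate SM signals per time slot."""
    # bit_length() - 1 is an exact integer floor(log2(.)).
    spatial_bits = math.comb(num_tx, num_active).bit_length() - 1
    return mod_order ** num_active * 2 ** spatial_bits


def ls_order(num_rx, num_active):
    """Dominant LS term 2*Nr*Na^2 + Na^3 per SM signal."""
    return 2 * num_rx * num_active ** 2 + num_active ** 3


def cubic_order(num_tx):
    """Nt^3 term quoted for the conventional detectors."""
    return num_tx ** 3


# Nt = 65, Nr = 16, Na = 2, 8-PSK (parameters used for Fig. 7.6).
print(ml_search_size(65, 2, 8), cubic_order(65), ls_order(16, 2))
```

With these parameters the ML search set has 131072 candidates per slot and $N_t^3$ is 274625, while the LS term is 136.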

7.5 Simulation Results

A simulation study was carried out to compare the performance of the proposed SCS-based signal detector with that of the conventional LMMSE-based signal detector [1] and the CS-based signal detector [7]. The performance of the optimal ML detector [13] is also provided as the benchmark for comparison.

Figure 7.3 compares the simulated and analytical spatial constellation symbol error rate (SCSER) of the SCS-based signal detector in different cases over uncorrelated Rayleigh-fading MIMO channels, where $N_t = 64$, $N_r = 16$, $N_a = 1$, and 8-PSK are considered. For the GMMV case, “i.i.d.” denotes the case in which the effective channel matrices equal the channel matrices $\mathbf{H}^{(t)}$, $\forall t$, and the $\mathbf{H}^{(t)}$'s are independently generated, while “interleaving” denotes the case in which $\mathbf{H}^{(1)} = \mathbf{H}^{(2)} = \cdots = \mathbf{H}^{(G)}$ and the effective channel matrix in the $t$th time slot is $\mathbf{H}^{(t)}$ right-multiplied by a slot-specific permutation matrix. Clearly, the analytical SCSER derived in Sect. 7.4.2 closely matches the simulation results. In addition, the proposed SCS-based signal detector outperforms the conventional CS-based signal detector, since the structured sparsity of multiple sparse SM signals is exploited. Moreover, since the channel diversity can also be exploited, the SCS-based signal detector with mutually independent channel matrices is superior to that with identical channel matrices by more than 4 dB at an SCSER of $10^{-3}$. Finally, the performance of the SCS-based signal detector with SM signal interleaving approaches that with mutually independent channel matrices, which indicates that the proposed SM signal interleaving can fully exploit the channel diversity.

Figure 7.4 provides the SCSER comparison of different signal detectors over correlated Rayleigh-fading MIMO channels, where the channel correlation coefficients at the transmitter and receiver are $r_t = r_r = 0.4$ [9], and $N_t = 64$, $N_r = 16$, $N_a = 1$, and 8-PSK are considered. The conventional LMMSE-based signal detector works poorly due to $N_r \ll N_t$. The SCS-based signal detector with interleaving outperforms both the conventional CS-based signal detector and the SCS-based signal detector without interleaving. Moreover, its performance is similar to that with mutually independent channel matrices (i.e., the $\mathbf{H}^{(t)}$'s are independently generated and used without interleaving), which indicates the good channel diversity gain from interleaving even in correlated MIMO channels.

Fig. 7.4 SCSER of different signal detectors over correlated Rayleigh-fading MIMO channels, where $r_t = r_r = 0.4$, $N_t = 64$, $N_r = 16$, $N_a = 1$, and 8-PSK are considered [14]

Figure 7.5 provides the BER performance comparison of the existing CS-based signal detector and the proposed SCS-based signal detector with interleaving over correlated Rayleigh-fading MIMO channels with $r_t = r_r = 0.4$ and $N_r = 16$. The existing scheme adopts two transmission modes: (1) $N_t = 64$, $N_a = 1$, BPSK with 7 bpcu and (2) $N_t = 65$, $N_a = 2$, no signal constellation symbol with 11 bpcu. In contrast, the SCS-based signal detector with $N_t = 65$, $N_a = 2$, and $G = 2$ adopts QPSK and 8-PSK, respectively, and the corresponding data rates are 9.5 bpcu and 11.5 bpcu. From Fig. 7.5, it can be observed that the proposed SCS-based signal detector achieves better BER performance than the conventional CS-based signal detector, even at a higher data rate.

Fig. 7.5 BER comparison between the traditional CS-based signal detector and the proposed SCS-based signal detector over correlated Rayleigh-fading MIMO channels, where $r_t = r_r = 0.4$ and $N_r = 16$ are considered [14]

Figure 7.6 compares the performance of the proposed SCS-based signal detector with interleaving and the optimal ML signal detector, where $r_t = r_r = 0.4$, $N_t = 65$, $N_r = 16$, $N_a = 2$, and 8-PSK are considered. We find that, as $G$ increases, the BER performance gap between the SCS-based signal detector and the optimal ML signal detector becomes smaller. When $G \ge 2$, the SCS-based signal detector approaches the optimal ML signal detector with a small performance loss. For example, at a BER of $10^{-4}$, the performance gap between the SCS-based signal detector with $G = 3$ and the optimal ML detector is less than 0.2 dB. Thus, the near-optimal performance of the proposed SCS-based signal detector is verified.
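The data rates quoted for Fig. 7.5 can be reproduced with a short calculation. The rate expression used below, $\lfloor \log_2 \binom{N_t}{N_a} \rfloor / G + N_a \log_2 M$, is an assumption inferred from the quoted numbers (the spatial constellation symbol is shared by the $G$ slots of a group); it is not stated in this section, and the function name is hypothetical.

```python
import math


def sm_rate_bpcu(num_tx, num_active, mod_order, group_size=1):
    """Assumed SM data rate in bits per channel use (bpcu).

    Spatial bits are shared by `group_size` time slots; every active antenna
    carries one symbol of an M-ary constellation (mod_order = 1 means that no
    signal constellation symbol is transmitted).
    """
    spatial_bits = math.comb(num_tx, num_active).bit_length() - 1
    signal_bits = num_active * math.log2(mod_order)
    return spatial_bits / group_size + signal_bits


rates = [
    sm_rate_bpcu(64, 1, 2),                 # existing scheme, mode 1
    sm_rate_bpcu(65, 2, 1),                 # existing scheme, mode 2
    sm_rate_bpcu(65, 2, 4, group_size=2),   # SCS-based detector, QPSK
    sm_rate_bpcu(65, 2, 8, group_size=2),   # SCS-based detector, 8-PSK
]
print(rates)
```

Under this assumption the four configurations yield 7, 11, 9.5, and 11.5 bpcu, matching the values stated for Fig. 7.5.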


Fig. 7.6 BER performance comparison between the proposed SCS-based signal detector and the optimal ML signal detector, where rt = rr = 0.4, Nt = 65, Nr = 16, Na = 2, and 8-PSK are considered [14]

7.6 Summary

This chapter proposes a near-optimal SCS-based signal detector with low complexity for massive SM-MIMO. First, the grouped transmission scheme introduces the desired structured sparsity of multiple SM signals in the same transmission group for improved signal detection performance. Second, the SSP algorithm jointly detects multiple SM signals with low complexity. Third, by using SM signal interleaving, we can fully exploit the channel diversity to further improve the signal detection performance, and the gain from SM signal interleaving approaches that of the ideal case of mutually independent channel matrices in the same transmission group. In addition, we quantify the gain from SM signal interleaving. Simulation results confirm the near-optimal performance of the proposed scheme.

References

1. Di Renzo, M., Haas, H., Ghrayeb, A., Sugiura, S., Hanzo, L.: Spatial modulation for generalized MIMO: challenges, opportunities, and implementation. Proc. IEEE 102(1), 56–103 (2013)
2. Yang, P., Di Renzo, M., Xiao, Y., Li, S., Hanzo, L.: Design guidelines for spatial modulation. IEEE Commun. Surv. Tutor. 17(1), 6–26 (2014)
3. Zheng, J.: Signal vector based list detection for spatial modulation. IEEE Wirel. Commun. Lett. 1(4), 265–267 (2012)
4. Duarte, M.F., Eldar, Y.C.: Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011)
5. Shim, B., Kwon, S., Song, B.: Sparse detection with integer constraint using multipath matching pursuit. IEEE Commun. Lett. 18(10), 1851–1854 (2014)
6. Garcia-Rodriguez, A., Masouros, C.: Low-complexity compressive sensing detection for spatial modulation in large-scale multiple access channels. IEEE Trans. Commun. 63(7), 2565–2579 (2015)
7. Liu, W., Wang, N., Jin, M., Xu, H.: Denoising detection for the generalized spatial modulation system using sparse property. IEEE Commun. Lett. 18(1), 22–25 (2013)
8. Yu, C.M., Hsieh, S.H., Liang, H.W., Lu, C.S., Chung, W.H., Kuo, S.Y., Pei, S.C.: Compressed sensing detector design for space shift keying in MIMO systems. IEEE Commun. Lett. 16(10), 1556–1559 (2012)
9. Wu, X., Claussen, H., Di Renzo, M., Haas, H.: Channel estimation for spatial modulation. IEEE Trans. Commun. 62(12), 4362–4372 (2014)
10. Cal-Braz, J.A., Sampaio-Neto, R.: Low-complexity sphere decoding detector for generalized spatial modulation systems. IEEE Commun. Lett. 18(6), 949–952 (2014)
11. Steven, M.K.: Fundamentals of Statistical Signal Processing, vol. 10, p. 151045. PTR Prentice-Hall, Englewood Cliffs, NJ (1993)
12. Gao, Z., Dai, L., Qi, C., Yuen, C., Wang, Z.: Near-optimal signal detector based on structured compressive sensing for massive SM-MIMO. arXiv: http://arxiv.org/abs/1601.07701 (2016)
13. Legnain, R.M., Hafez, R.H., Marsland, I.D., Legnain, A.M.: A novel spatial modulation using MIMO spatial multiplexing. In: 2013 1st International Conference on Communications, Signal Processing, and Their Applications (ICCSPA), pp. 1–4. IEEE (2013)
14. Gao, Z., Dai, L., Qi, C., Yuen, C., Wang, Z.: Near-optimal signal detector based on structured compressive sensing for massive SM-MIMO. IEEE Trans. Veh. Technol. 66(2), 1860–1865 (2017)

Chapter 8

Compressive Sensing Multi-User Detection in Massive MIMO Systems with Spatial Modulation

Abstract In this chapter, we introduce an LS-SM-MIMO scheme for the multi-user uplink (UL), where each user, equipped with multiple AEs and a single RF chain, invokes SM to increase the throughput. By using hundreds of AEs and a limited number of RF chains, i.e., adopting the hybrid MIMO architecture, the BS can efficiently serve multiple users while also reducing power consumption. However, due to the large number of AEs of multiple users and the limited number of RF chains at the BS, the multi-user signal detection problem here is a challenging high-dimensional under-determined problem. To address this challenge, we design a joint SM transmission scheme at the transmitter and an SCS-based MUD at the receiver. Furthermore, we use the cyclic-prefix single-carrier (CPSC) technique to combat the multipath channels. Additionally, receive AE selection is adopted to improve the performance over correlated Rayleigh-fading MIMO channels. By exploiting the intrinsically sparse features of the UL signals, the considered SCS-based MUD can reliably detect the UL signals with low complexity. Simulation results verify the advantages of the considered SCS-based MUD over its counterparts.¹

¹ The work introduced in this chapter is based on the reference [15].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_8

8.1 Introduction

As a critical technique in fifth-generation (5G) systems, LS-MIMO employing hundreds of AEs at the BS is capable of improving the spectral efficiency by orders of magnitude, but it suffers from non-negligible power consumption and hardware cost, because every AE usually requires its own RF chain [1]. By using a reduced number of RF chains, the emerging SM-MIMO activates only part of the available AEs to transmit extra information in the spatial domain, and it has attracted much attention due to its high energy efficiency and reduced hardware cost [1]. However, conventional SM-MIMO is usually considered in the downlink of small-scale MIMO systems, and therefore its achievable capacity is limited. Individually, both technologies have their own advantages and drawbacks; by combining them effectively, one can envision a win-win situation. SM-MIMO is attractive for LS-MIMO systems, since the reduced number of RF chains required by SM-MIMO can reduce the power consumption and hardware cost of conventional LS-MIMO systems. Moreover, the hundreds of AEs used in LS-MIMO can improve the system throughput of SM-MIMO. This complementarity makes LS-MIMO and SM-MIMO naturally compatible.

In this chapter, we propose an LS-SM-MIMO scheme that amalgamates the compelling benefits of both LS-MIMO and SM-MIMO for the 5G UL over FSF channels. In the proposed scheme, each UL-user, equipped with multiple AEs but only a single RF chain, invokes SM for increasing the UL-throughput, and the CPSC transmission scheme is adopted to combat the multipath channels [2]. At the BS, hundreds of AEs but only dozens of RF chains are employed to simultaneously serve multiple users, and a direct AE selection scheme is used to improve the system performance over correlated Rayleigh-fading MIMO channels at the BS [3]. The proposed scheme can be adopted in conventional LS-MIMO as a specific UL-transmission mode for reducing the power consumption, or alternatively, for energy- and cost-efficient LS-SM-MIMO, where the joint benefits of efficient AE selection [3], transmit precoding [4], and CE [5] can be readily exploited. To sum up, the proposed scheme inherits the advantages of LS-MIMO and SM-MIMO, while reducing the power consumption and hardware cost.

A challenging problem in the proposed UL LS-SM-MIMO scheme is how to realize a reliable MUD with low complexity. The optimal ML signal detector suffers from excessive complexity. Conventional sphere decoding detectors cannot be readily used in multi-user scenarios and may still exhibit high complexity for LS-SM-MIMO [6]. Existing low-complexity linear signal detectors, e.g., the MMSE-based signal detector, perform well for conventional LS-MIMO systems [7].
However, they are unsuitable for the proposed LS-SM-MIMO UL-transmission, since the large number of transmit AEs of the UL-users and the reduced number of receive RF chains at the BS make the UL multi-user signal detection a large-scale under-determined/rank-deficient problem. The authors of [8–10] proposed CS-based signal detectors to solve the under-determined signal detection problem in SM-MIMO systems, but they only considered single-user small-scale SM-MIMO systems in the downlink.

Against this background, we exploit the specific signal structure in the proposed multi-user LS-SM-MIMO UL-transmission, where each user only activates a single AE in each time slot. Hence the SM signal of each UL-user is sparse with a sparsity level of one, and the aggregate SM signal consisting of multiple UL-users' SM signals of a CPSC block exhibits a certain distributed sparsity, which can be beneficially exploited for improving the signal detection performance at the BS. Moreover, we propose a joint SM transmission scheme for the UL-users in conjunction with an appropriate SCS-based MUD at the BS. The proposed SCS-based MUD is specifically tailored to leverage the inherently distributed sparsity of the aggregate SM signal and the group sparsity of multiple aggregate SM signals, which results from the joint SM transmission scheme, for reliable signal detection performance. Our simulation results demonstrate that the proposed SCS-based MUD is capable of outperforming the conventional detectors even with higher UL-throughput.

Notation: Lower-case and upper-case boldface letters denote vectors and matrices, respectively, while $(\cdot)^T$, $(\cdot)^*$, $(\cdot)^\dagger$, and $\lfloor \cdot \rfloor$ denote the transpose, conjugate transpose, Moore-Penrose matrix inversion, and integer floor operators, respectively. The $l_0$ and $l_2$ norm operations are given by $\|\cdot\|_0$ and $\|\cdot\|_2$, respectively. The support set of the vector $\mathbf{x}$ is denoted by $\mathrm{supp}\{\mathbf{x}\}$, and $x_i$ denotes the $i$th entry of the vector $\mathbf{x}$. Additionally, $\mathbf{x}|_{\Gamma}$ denotes the entries of $\mathbf{x}$ defined in the set $\Gamma$, $\boldsymbol{\Phi}|_{\Gamma}$ denotes the sub-matrix whose columns comprise the columns of $\boldsymbol{\Phi}$ that are defined in $\Gamma$, and $\boldsymbol{\Phi}\rangle_{\Gamma}$ denotes the sub-matrix whose rows comprise the rows of $\boldsymbol{\Phi}$ that are defined in $\Gamma$. The expectation operator is given by $E\{\cdot\}$. Finally, $\mathrm{mod}(x, y) = x - \lfloor x/y \rfloor y$ if $y \neq 0$ and $x - \lfloor x/y \rfloor y \neq 0$, while $\mathrm{mod}(x, y) = y$ if $y \neq 0$ and $x - \lfloor x/y \rfloor y = 0$.
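The modified modulo operator in the Notation returns values in $1, \ldots, y$ rather than $0, \ldots, y - 1$, which suits the 1-based time-slot indices of a CPSC block. A minimal sketch follows; the function name `cyclic_mod` is hypothetical, and integer arguments are assumed.

```python
def cyclic_mod(x, y):
    """mod(x, y) as defined in the Notation, for integers x and y != 0.

    Equals x - floor(x / y) * y when that remainder is nonzero, and y otherwise.
    """
    if y == 0:
        raise ValueError("mod(x, y) is only defined for y != 0")
    remainder = x - (x // y) * y
    return remainder if remainder != 0 else y


# Slot indices mod(q - p, Q) seen at slot q = 1 for P = 3 taps and Q = 8:
# the cyclic prefix makes the indices wrap around to the end of the block.
block_len, num_taps, q = 8, 3, 1
indices = [cyclic_mod(q - p, block_len) for p in range(num_taps)]
print(indices)
```

The wrapped indices show which earlier SM signals of the same block contribute to the first received slot after CP removal.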

8.2 System Model

We first introduce the proposed LS-SM-MIMO scheme and then focus our attention on the UL-transmission with an emphasis on the multi-user signal detection.

8.2.1 Multi-User Spatial Modulation Scheme for Massive MIMO Systems

As shown in Fig. 8.1, we consider the proposed LS-SM-MIMO from both the BS side and the user side. For conventional LS-MIMO, the number of AEs employed by the BS is equal to the number of its RF chains [7]. However, the BS in LS-SM-MIMO, as shown in Fig. 8.1, is equipped with a much smaller number of RF chains $M_{RF}$ than the total number of AEs $M$, i.e., we have $M_{RF} \ll M$. Conventional LS-MIMO systems typically assume single-antenna users [7]. By contrast, in the proposed scheme, each user is equipped with $n_t > 1$ AEs but only a single RF chain, and SM is adopted for the UL-transmission, where only one of the available AEs is activated for data transmission. It has been shown that the main power consumption and hardware cost of cellular networks come from the radio access network [11]. Hence, using a reduced number of expensive RF chains compared to the total number of AEs at the BS can substantially reduce both the power consumption and the hardware cost for the operators. Meanwhile, it is feasible to incorporate several AEs and a single RF chain in the handsets. The resultant increased degrees of freedom in the spatial domain may then be exploited for improving the UL-throughput. The proposed scheme can be considered as an optional UL-transmission mode in conventional LS-MIMO systems, where AE selection schemes may be adopted for selecting the most suitable $M_{RF}$ AEs at the BS to receive UL SM signals [3]. Alternatively, it can also be used for the UL of LS-SM-MIMO, when advantageously combining transmit precoding, receive AE selection, and CE [3–5].

Fig. 8.1 In the proposed UL LS-SM-MIMO, the BS is equipped with $M$ AEs and $M_{RF}$ RF chains to simultaneously serve $K$ users, where $M \gg M_{RF} > K$, and each user is equipped with $n_t > 1$ AEs and one RF chain [15]. By exploiting the improved degree of freedom in the spatial domain, multiple users can simultaneously exploit SM for improving the UL-throughput

8.2.2 Uplink Transmission

We first consider the generation of SM signals at the users. The SM signal $\mathbf{x}_k = \mathbf{e}_k s_k$ transmitted by the $k$th user in a time slot consists of two parts: the spatial constellation symbol $\mathbf{e}_k \in \mathbb{C}^{n_t}$ and the signal constellation symbol $s_k \in \mathbb{C}$. $\mathbf{e}_k$ is generated by mapping $\log_2(n_t)$ bits to the index of the active AE, and typically the user terminal employs $n_t = 2^p$ AEs, where $p$ is a positive integer. Because each user employs only a single RF chain, only the entry of $\mathbf{e}_k$ associated with the active AE equals one, and the remaining entries of $\mathbf{e}_k$ are zeros, i.e., we have

$$\mathrm{supp}(\mathbf{e}_k) \in \mathbb{A}, \quad \|\mathbf{e}_k\|_0 = 1, \quad \|\mathbf{e}_k\|_2 = 1, \qquad (8.1)$$


where $\mathbb{A} = \{1, 2, \ldots, n_t\}$ is the spatial constellation symbol set. The signal constellation symbol comes from an $L$-ary modulation, i.e., $s_k \in \mathbb{L}$, where $\mathbb{L}$ is the signal constellation symbol set (e.g., 64QAM) of size $L$. Hence, each UL-user's SM signal carries $\log_2(L) + \log_2(n_t)$ bits per channel use (bpcu), and the overall UL-throughput is $K\left(\log_2(L) + \log_2(n_t)\right)$ bpcu.

The users rely on the CPSC scheme for transmitting their SM signals [2]. Explicitly, each CPSC block consists of a CP of length $P - 1$ and the associated data block of length $Q$. Hence the length of each CPSC block is $Q + P - 1$, where the CP is capable of counteracting a dispersive multipath channel imposing dispersion over $P$ samples. The data block is the concatenation of $Q$ successive SM signals. At the receiver, due to the reduced number of RF chains at the BS, only $M_{RF}$ receive AEs can be exploited to receive signals, where existing receive AE selection schemes can be adopted to preselect the $M_{RF}$ receive AEs for achieving an improved signal detection performance [3]. Since the BS can serve $K$ users simultaneously, after the removal of the CP, the received signal $\mathbf{y}_q \in \mathbb{C}^{M_{RF}}$ for $1 \le q \le Q$ of the $q$th time slot of a specific CPSC block can be expressed as

$$\mathbf{y}_q = \sum_{k=1}^{K} \mathbf{y}_{k,q} + \mathbf{w}_q = \sum_{p=0}^{P-1}\sum_{k=1}^{K} \mathbf{H}_{k,p}\rangle_{\Theta}\, \mathbf{x}_{k,\mathrm{mod}(q-p,Q)} + \mathbf{w}_q = \sum_{p=0}^{P-1}\sum_{k=1}^{K} \tilde{\mathbf{H}}_{k,p}\, \mathbf{x}_{k,\mathrm{mod}(q-p,Q)} + \mathbf{w}_q, \qquad (8.2)$$

where $\mathbf{H}_{k,p} \in \mathbb{C}^{M \times n_t}$ is the $k$th user's MIMO channel matrix for the $p$th multipath component, $\tilde{\mathbf{H}}_{k,p} = \mathbf{H}_{k,p}\rangle_{\Theta} \in \mathbb{C}^{M_{RF} \times n_t}$ comprises the rows of $\mathbf{H}_{k,p}$ indexed by the set $\Theta$, the set $\Theta$ is determined by the AE selection scheme used, the $M_{RF}$ elements of $\Theta$ are uniquely selected from the set $\{1, 2, \ldots, M\}$, $\mathbf{x}_{k,q}$ has one nonzero entry, and $\mathbf{w}_q \in \mathbb{C}^{M_{RF}}$ is the AWGN vector with entries obeying the independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian distribution with zero mean and a variance of $\sigma_w^2/2$ per dimension, denoted by $\mathcal{CN}(0, \sigma_w^2)$. The channel is modeled as $\mathbf{H}_{k,p} = \mathbf{R}_{BS}^{1/2}\bar{\mathbf{H}}_{k,p}\mathbf{R}_{US}^{1/2}$, where the entries of $\bar{\mathbf{H}}_{k,p}$ obey the i.i.d. $\mathcal{CN}(0, 1)$ distribution, and $\mathbf{R}_{US}$ with the correlation coefficient $\rho_{US}$ and $\mathbf{R}_{BS}$ with the correlation coefficient $\rho_{BS}$ are the correlation matrices at the users and the BS, respectively. The element in the $m$th row and $n$th column of $\mathbf{R}_{BS}$ ($\mathbf{R}_{US}$) is $\rho_{BS}^{|m-n|}$ ($\rho_{US}^{|m-n|}$). For correlated Rayleigh-fading MIMO channels, the specific $\Theta$, i.e., the receive AE selection scheme, has an impact on the attainable system performance. In this chapter, the direct AE selection scheme is used for maximizing the minimum geometric distance between any pair of the selected AEs [7]. For a ULA, $\Theta = \{\varphi + m_{RF}\lfloor M/M_{RF} \rfloor\}_{m_{RF}=0}^{M_{RF}-1}$ with $1 \le \varphi \le \lfloor M/M_{RF} \rfloor - 1$. Then (8.2) can be further expressed as


$$\mathbf{y}_q = \sum_{p=0}^{P-1} \tilde{\mathbf{H}}_p\, \mathbf{x}_{\mathrm{mod}(q-p,Q)} + \mathbf{w}_q, \qquad (8.3)$$

by defining $\tilde{\mathbf{H}}_p = \left[\tilde{\mathbf{H}}_{1,p}\ \tilde{\mathbf{H}}_{2,p}\ \cdots\ \tilde{\mathbf{H}}_{K,p}\right] \in \mathbb{C}^{M_{RF} \times (n_t K)}$ and $\mathbf{x}_q = \left[\mathbf{x}_{1,q}^T\ \mathbf{x}_{2,q}^T\ \cdots\ \mathbf{x}_{K,q}^T\right]^T \in \mathbb{C}^{(n_t K)}$. By considering the $Q$ SM signals of a specific CPSC block, we arrive at

$$\mathbf{y} = \tilde{\mathbf{H}}\mathbf{x} + \mathbf{w}, \qquad (8.4)$$

where $\mathbf{y} = \left[\mathbf{y}_1^T\ \mathbf{y}_2^T\ \cdots\ \mathbf{y}_Q^T\right]^T \in \mathbb{C}^{(M_{RF} Q)}$, the aggregate SM signal $\mathbf{x} = \left[\mathbf{x}_1^T\ \mathbf{x}_2^T\ \cdots\ \mathbf{x}_Q^T\right]^T \in \mathbb{C}^{(K n_t Q)}$, $\mathbf{w} = \left[\mathbf{w}_1^T\ \mathbf{w}_2^T\ \cdots\ \mathbf{w}_Q^T\right]^T$, and $\tilde{\mathbf{H}}$ is the block-circulant matrix

$$\tilde{\mathbf{H}} = \begin{bmatrix}
\tilde{\mathbf{H}}_0 & \mathbf{0} & \cdots & \mathbf{0} & \tilde{\mathbf{H}}_{P-1} & \cdots & \tilde{\mathbf{H}}_2 & \tilde{\mathbf{H}}_1 \\
\tilde{\mathbf{H}}_1 & \tilde{\mathbf{H}}_0 & \mathbf{0} & \cdots & \mathbf{0} & \tilde{\mathbf{H}}_{P-1} & \cdots & \tilde{\mathbf{H}}_2 \\
\vdots & \ddots & \ddots & \ddots & & \ddots & \ddots & \vdots \\
\tilde{\mathbf{H}}_{P-1} & \cdots & \tilde{\mathbf{H}}_1 & \tilde{\mathbf{H}}_0 & \mathbf{0} & \cdots & \cdots & \mathbf{0} \\
\mathbf{0} & \tilde{\mathbf{H}}_{P-1} & \cdots & \tilde{\mathbf{H}}_1 & \tilde{\mathbf{H}}_0 & \mathbf{0} & \cdots & \mathbf{0} \\
\vdots & \ddots & \ddots & & \ddots & \ddots & \ddots & \vdots \\
\mathbf{0} & \cdots & \mathbf{0} & \tilde{\mathbf{H}}_{P-1} & \cdots & \cdots & \tilde{\mathbf{H}}_1 & \tilde{\mathbf{H}}_0
\end{bmatrix}. \qquad (8.5)$$

The SNR at the receiver is defined by $\mathrm{SNR} = E\{\|\tilde{\mathbf{H}}\mathbf{x}\|_2^2\}/E\{\|\mathbf{w}\|_2^2\}$. The optimal signal detector for (8.4) relies on the ML algorithm:

$$\min_{\hat{\mathbf{x}}} \left\|\mathbf{y} - \tilde{\mathbf{H}}\hat{\mathbf{x}}\right\|_2 = \min_{\{\hat{\mathbf{x}}_{k,q}\}_{k=1,q=1}^{K,Q}} \left\|\mathbf{y} - \tilde{\mathbf{H}}\hat{\mathbf{x}}\right\|_2, \quad \text{s.t. } \mathrm{supp}(\hat{\mathbf{x}}_{k,q}) \in \mathbb{A},\ \hat{\mathbf{x}}_{k,q}|_{\mathrm{supp}(\hat{\mathbf{x}}_{k,q})} \in \mathbb{L},\ 1 \le k \le K,\ 1 \le q \le Q, \qquad (8.6)$$

whose complexity increases exponentially with the number of users, since the size of the search set for the ML detector is $(n_t \cdot L)^{KQ}$. This excessive complexity can be unaffordable in practice. To reduce the complexity, near-optimal sphere decoding detectors have been proposed [6], but their complexity may still remain high, particularly for systems supporting large $K$, $Q$, $n_t$, and $L$ [8]. In conventional LS-MIMO systems, low-complexity linear signal detectors (e.g., the MMSE-based signal detector) have been shown to be near-optimal, since $M = M_{RF} \gg K$ and $n_t = 1$ make the multi-user signal detection an over-determined problem [7]. However, in the proposed scheme, we have $M_{RF} < K n_t$. Hence the multi-user signal detection problem (8.6) represents a large-scale under-determined problem. Consequently, the conventional linear signal detectors perform poorly in the proposed LS-SM-MIMO [8]. By exploiting the sparsity of SM signals, the authors of [8–10] have proposed the concept of CS-based signal detectors for the downlink of small-scale SM-MIMO operating in a single-user scenario. However, these signal detectors are unsuitable for the proposed multi-user scenarios.

Observe from (8.1) that $\mathbf{x}_{k,q}$ is a sparse signal having a sparsity level of one. Hence the aggregate SM signal $\mathbf{x}$, which consists of multiple users' SM signals in $Q$ time slots, exhibits distributed sparsity with a sparsity level of $KQ$. This property of $\mathbf{x}$ inspires us to exploit SCS theory for the multi-user signal detection [12]. To further improve the signal detection performance and to increase the system's throughput, we propose a joint SM transmission scheme and an SCS-based MUD, which will be detailed in the next section.
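The structure of the aggregate SM signal and the size of the ML search set can be made concrete with a toy example. The sketch below assumes $K = 2$ users, $Q = 2$ time slots, $n_t = 4$ AEs, and QPSK-like symbols; all helper names and numerical values are hypothetical, and active-AE indices are 0-based in the code.

```python
def sm_signal(active_index, symbol, num_tx):
    """x_k = e_k * s_k: a length-n_t vector with a single nonzero entry."""
    signal = [0j] * num_tx
    signal[active_index] = complex(symbol)
    return signal


def aggregate_signal(slots):
    """Stack x_{k,q} over the users k, then over the time slots q."""
    return [entry for slot in slots for user in slot for entry in user]


def support(vector):
    """Indices of the nonzero entries."""
    return [i for i, entry in enumerate(vector) if entry != 0]


def ml_candidates(num_users, block_len, num_tx, mod_order):
    """Size of the ML search set, (n_t * L)^(K * Q)."""
    return (num_tx * mod_order) ** (num_users * block_len)


num_users, block_len, num_tx, mod_order = 2, 2, 4, 4
slots = [
    [sm_signal(1, 1 + 1j, num_tx), sm_signal(3, -1 + 1j, num_tx)],   # q = 1
    [sm_signal(0, 1 - 1j, num_tx), sm_signal(2, -1 - 1j, num_tx)],   # q = 2
]
x = aggregate_signal(slots)
print(len(x), support(x), ml_candidates(num_users, block_len, num_tx, mod_order))
```

Exactly one nonzero entry appears in every length-$n_t$ segment of the aggregate vector, which is the distributed sparsity with level $KQ$ described above, while the ML search set already contains 65536 candidates for this very small configuration.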

8.3 Multi-User Detection for Massive MIMO Systems with Spatial Modulation

To solve the multi-user signal detection of our UL LS-SM-MIMO system, we first propose a joint SM transmission scheme to be employed at the users. Accordingly, an SCS-based low-complexity MUD is developed at the BS, whereby the distributed sparsity of the aggregate SM signal and the group sparsity of multiple aggregate SM signals are exploited. Moreover, the computational complexity of the proposed SCS-based MUD is discussed.

8.3.1 Transmitter Design at the Users

For the $k$th user in the $q$th time slot, every $J$ successive CPSC blocks are considered as a group and they share the same spatial constellation symbol, namely,

$$\mathrm{supp}\left(\mathbf{x}_{k,q}^{(1)}\right) = \mathrm{supp}\left(\mathbf{x}_{k,q}^{(2)}\right) = \cdots = \mathrm{supp}\left(\mathbf{x}_{k,q}^{(J)}\right), \quad 1 \le k \le K,\ 1 \le q \le Q, \qquad (8.7)$$

where we introduce the superscript $(j)$ to denote the $j$th CPSC block, and $J$ is typically small, e.g., $J = 2$. In CS theory, the specific signal structure in which $\mathbf{x}_{k,q}^{(1)}, \mathbf{x}_{k,q}^{(2)}, \ldots, \mathbf{x}_{k,q}^{(J)}$ share a common support is often referred to as group sparsity. Similarly, the aggregate SM signals consisting of the $K$ users' SM signals also exhibit group sparsity, i.e.,

$$\mathrm{supp}\left(\mathbf{x}^{(1)}\right) = \mathrm{supp}\left(\mathbf{x}^{(2)}\right) = \cdots = \mathrm{supp}\left(\mathbf{x}^{(J)}\right). \qquad (8.8)$$


Although exhibiting group sparsity may slightly reduce the information carried by the spatial constellation symbols, it is also capable of reducing the number of the RF chains required according to the SCS theory, whilst simultaneously improving the total BER of the entire system even with higher UL-throughput. This conclusion will be confirmed by our simulation results.

8.3.2 SCS-Based MUD at the BS

According to (8.4), the received signals at the BS in the same group can be expressed as

$$\mathbf{y}^{(j)} = \tilde{\mathbf{H}}^{(j)}\mathbf{x}^{(j)} + \mathbf{w}^{(j)}, \quad 1 \le j \le J, \qquad (8.9)$$

where $\mathbf{y}^{(j)}$ denotes the received signal in the $j$th CPSC block, while $\tilde{\mathbf{H}}^{(j)}$ and $\mathbf{w}^{(j)}$ are the effective MIMO channel matrix and the AWGN vector, respectively. The intrinsically distributed sparsity of $\mathbf{x}^{(j)}$ and the under-determined nature of (8.9) inspire us to solve the signal detection problem based on CS theory, which can efficiently acquire the sparse solutions to under-determined linear systems. Moreover, the $J$ different aggregate SM signals in (8.9) can be jointly exploited for improving the signal detection performance due to the group sparsity of $\{\mathbf{x}^{(j)}\}_{j=1}^{J}$. Thus, by considering both the distributed sparsity and the group sparsity of the aggregate SM signals, the multi-user signal detection at the BS can be formulated as the following optimization problem:

$$\min_{\{\hat{\mathbf{x}}^{(j)}\}_{j=1}^{J}} \sum_{j=1}^{J} \left\|\mathbf{y}^{(j)} - \tilde{\mathbf{H}}^{(j)}\hat{\mathbf{x}}^{(j)}\right\|_2^2 = \min_{\{\hat{\mathbf{x}}_{k,q}^{(j)}\}_{j=1,k=1,q=1}^{J,K,Q}} \sum_{j=1}^{J} \left\|\mathbf{y}^{(j)} - \tilde{\mathbf{H}}^{(j)}\hat{\mathbf{x}}^{(j)}\right\|_2^2, \quad \text{s.t. } \left\|\hat{\mathbf{x}}_{k,q}^{(j)}\right\|_0 = 1,\ 1 \le j \le J,\ 1 \le q \le Q,\ 1 \le k \le K. \qquad (8.10)$$

Our proposed SCS-based MUD solves the optimization problem (8.10) with the aid of two steps. In the first step, we estimate the spatial constellation symbols, i.e., the indices of K users’ active AEs in J successive CPSC blocks. In the second step, we infer the legitimate signal constellation symbols of the K users in J CPSC blocks.

8.3.2.1 Step 1. Estimation of Spatial Constellation Symbols

We propose a group subspace pursuit (GSP) algorithm, developed from the classical SP algorithm of [13], to acquire the sparse solution to the large-scale under-determined problem (8.10), where both the a priori sparse information (i.e., $\|\mathbf{x}_{k,q}^{(j)}\|_0 = 1$) and the group sparsity of $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(J)}$ are exploited for improving the multi-user signal detection performance. The proposed GSP algorithm is described in Algorithm 8.1 [15], which estimates the SM signals $\{\hat{\mathbf{x}}_{k,q}^{(j)}\}_{k=1,j=1,q=1}^{K,J,Q}$. Hence, the estimated spatial constellation symbols are $\{\mathrm{supp}\{\hat{\mathbf{x}}_{k,q}^{(j)}\}\}_{k=1,j=1,q=1}^{K,J,Q}$.

Compared to the classical SP algorithm, the proposed GSP algorithm exploits the distributed sparsity and group sparsity of $\{\mathbf{x}^{(j)}\}_{j=1}^{J}$. More explicitly, $\mathbf{x}^{(j)} \in \mathbb{C}^{K Q n_t}$ consists of the $KQ$ low-dimensional sparse vectors $\mathbf{x}_{k,q}^{(j)} \in \mathbb{C}^{n_t}$, where each $\mathbf{x}_{k,q}^{(j)}$ has the known sparsity level of one, and the aggregate SM signals $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(J)}$ exhibit group sparsity. Specifically, the differences between the proposed GSP algorithm and the classical SP algorithm lie in the following two aspects: (1) the identification of the support set, including the preliminary and final support set steps shown in Algorithm 8.1; and (2) the joint processing of $\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots, \mathbf{y}^{(J)}$. First, consider the support selection, taking the preliminary support set step as an example. The classical SP algorithm selects the support set associated with the $KQ$ largest values of the global correlation result $(\tilde{\mathbf{H}}^{(j)})^{*} \mathbf{r}^{(j)}$. By contrast, the proposed GSP algorithm selects the support set associated with the largest value from the local correlation result in each $(\tilde{\mathbf{H}}_{k,q}^{(j)})^{*} \mathbf{r}^{(j)}$. In this way, the distributed sparsity of the aggregate SM signal can be exploited for improved signal detection performance. Second, compared with the classical SP algorithm, the proposed GSP algorithm jointly exploits the $J$ correlated signals having group sparsity, which further improves the signal detection performance. It should be noted that even for the special case of $J = 1$, i.e., without using the joint SM transmission scheme, the proposed GSP algorithm still achieves a better signal detection performance than the classical SP algorithm when handling the aggregate SM signal, since the inherently distributed sparsity of the aggregate SM signal is leveraged.

Algorithm 8.1 The GSP Algorithm
Input: Noisy received signals $\mathbf{y}^{(j)}$ and effective channel matrices $\tilde{\mathbf{H}}^{(j)}$ for $1 \le j \le J$.
Output: Estimated $\hat{\mathbf{x}}^{(j)} = [(\hat{\mathbf{x}}_1^{(j)})^T, (\hat{\mathbf{x}}_2^{(j)})^T, \ldots, (\hat{\mathbf{x}}_Q^{(j)})^T]^T$, where $\hat{\mathbf{x}}_q^{(j)} = [(\hat{\mathbf{x}}_{1,q}^{(j)})^T, (\hat{\mathbf{x}}_{2,q}^{(j)})^T, \ldots, (\hat{\mathbf{x}}_{K,q}^{(j)})^T]^T$ for $1 \le q \le Q$.
1: $\mathbf{r}^{(j)} = \mathbf{y}^{(j)}$ for $1 \le j \le J$; {Initialization}
2: $\Omega^{0} = \emptyset$; {Empty support set}
3: $t = 1$; {Iteration index}
4: repeat
5: $\mathbf{a}_{k,q}^{(j)} = (\tilde{\mathbf{H}}_{k,q}^{(j)})^{*} \mathbf{r}^{(j)}$ for $1 \le k \le K$, $1 \le q \le Q$, and $1 \le j \le J$; {Correlation}
6: $\tau_{k,q} = \arg\max_{\tilde{\tau}_{k,q}} \sum_{j=1}^{J} \| (\mathbf{a}_{k,q}^{(j)})_{\tilde{\tau}_{k,q}} \|_2^2$ for $1 \le k \le K$, $1 \le q \le Q$; {Identify support}
7: $\Gamma = \{\tau_{k,q} + (k - 1 + K(q - 1)) n_t\}_{k=1,q=1}^{K,Q}$; {Preliminary support set}
8: $(\mathbf{b}^{(j)})_{\Omega^{t-1} \cup \Gamma} = (\tilde{\mathbf{H}}^{(j)}_{\Omega^{t-1} \cup \Gamma})^{\dagger} \mathbf{y}^{(j)}$ for $1 \le j \le J$; {Least squares}
9: $\omega_{k,q} = \arg\max_{\tilde{\omega}_{k,q}} \sum_{j=1}^{J} \| (\mathbf{b}_{k,q}^{(j)})_{\tilde{\omega}_{k,q}} \|_2^2$ for $1 \le k \le K$, $1 \le q \le Q$; {Pruning support set}
10: $\Omega^{t} = \{\omega_{k,q} + (k - 1 + K(q - 1)) n_t\}_{k=1,q=1}^{K,Q}$; {Final support set}
11: $(\mathbf{c}^{(j)})_{\Omega^{t}} = (\tilde{\mathbf{H}}^{(j)}_{\Omega^{t}})^{\dagger} \mathbf{y}^{(j)}$ for $1 \le j \le J$; {Least squares}
12: $\mathbf{r}^{(j)} = \mathbf{y}^{(j)} - \tilde{\mathbf{H}}^{(j)} \mathbf{c}^{(j)}$ for $1 \le j \le J$; {Compute residual}
13: $t = t + 1$; {Update iteration index}
14: until $\Omega^{t} = \Omega^{t-1}$ or $t \ge Q$
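The two distinguishing features of GSP, namely one selected index per sub-vector and correlation energy summed over the $J$ blocks, can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `G` stands for the number of sub-vectors ($KQ$ in the text), the channel matrices are random Gaussian stand-ins for the effective channels, and the loop stops at a hypothetical `max_iter` instead of the $t \ge Q$ rule.

```python
import numpy as np

def gsp(Y, H_list, G, n_t, max_iter=10):
    """Group subspace pursuit sketch: one active index in each of G groups.

    Y: list of J measurement vectors; H_list: list of J (M x G*n_t) matrices.
    """
    blocks = np.arange(G * n_t).reshape(G, n_t)   # global indices per group
    rows = np.arange(G)
    R = [y.copy() for y in Y]
    support = None
    for _ in range(max_iter):
        # Correlate, then sum the energy over the J blocks (group sparsity).
        corr = sum(np.abs(H.conj().T @ r) ** 2 for H, r in zip(H_list, R))
        prelim = blocks[rows, corr[blocks].argmax(axis=1)]  # one per group
        cand = prelim if support is None else np.union1d(support, prelim)
        # Least squares on the merged support, energy again summed over J.
        energy = np.zeros(G * n_t)
        for H, y in zip(H_list, Y):
            energy[cand] += np.abs(np.linalg.pinv(H[:, cand]) @ y) ** 2
        new_support = blocks[rows, energy[blocks].argmax(axis=1)]  # pruning
        # Final least squares and residual update for every block.
        X_hat, R = [], []
        for H, y in zip(H_list, Y):
            c = np.zeros(G * n_t, dtype=complex)
            c[new_support] = np.linalg.pinv(H[:, new_support]) @ y
            X_hat.append(c)
            R.append(y - H @ c)
        if support is not None and np.array_equal(new_support, support):
            break
        support = new_support
    return new_support, X_hat

# Illustrative noise-free example (all sizes are assumptions).
rng = np.random.default_rng(0)
G, n_t, M, J = 6, 4, 16, 2
true_support = np.arange(G) * n_t + rng.integers(n_t, size=G)
H_list = [(rng.standard_normal((M, G * n_t))
           + 1j * rng.standard_normal((M, G * n_t))) / np.sqrt(2)
          for _ in range(J)]
Y = []
for H in H_list:
    x = np.zeros(G * n_t, dtype=complex)
    x[true_support] = np.exp(2j * np.pi * rng.random(G))  # unit-modulus symbols
    Y.append(H @ x)
est_support, X_hat = gsp(Y, H_list, G, n_t)
print("true support:     ", true_support.tolist())
print("estimated support:", est_support.tolist())
```

Restricting each `argmax` to a group is what distinguishes this from classical SP, which would pick the largest entries globally and could therefore place two nonzeros inside the same sub-vector.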

8.3.2.2 Step 2. Acquisition of Signal Constellation Symbols

Following Step 1, we can also acquire a rough estimate of the signal constellation symbol for each user in each time slot. By searching for the minimum Euclidean distance between this rough estimate of the signal constellation symbol and the legitimate constellation symbols of L, we can obtain the final estimate of signal constellation symbols.
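The minimum-Euclidean-distance decision of Step 2 can be sketched as below. The function name `nearest_symbol` and the 4-QAM alphabet are assumptions made for a compact example; the simulations in this chapter use 64QAM.

```python
import numpy as np

def nearest_symbol(rough, constellation):
    """Return the legitimate symbol closest (in Euclidean distance) to `rough`."""
    idx = np.argmin(np.abs(constellation - rough))
    return constellation[idx]

# Illustrative alphabet: unit-energy 4-QAM.
qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
rough_estimate = 0.6 + 0.9j          # e.g., an LS output on the detected antenna
decided = nearest_symbol(rough_estimate, qam4)
print("decided symbol:", decided)
```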

8.3.3 Computational Complexity

The optimal ML signal detector has a prohibitively high computational complexity of $O((L \cdot n_t)^{K \cdot Q})$ according to (8.6). The sphere decoding detectors [6] are indeed capable of reducing the computational complexity, but they may still suffer from an unaffordable complexity, particularly for large $K$, $Q$, $L$ and $n_t$ values. By contrast, the conventional MMSE-based detector for LS-MIMO and the CS-based detector [9] for small-scale SM-MIMO enjoy the low complexity of $O(M_{RF} \cdot (n_t \cdot Q \cdot K)^2 + (n_t \cdot Q \cdot K)^3)$ and $O(2 M_{RF} \cdot (Q \cdot K)^2 + (Q \cdot K)^3)$, respectively. For the proposed SCS-based MUD, most of the computational requirements are imposed by the LS operations, which have a complexity of $O(J \cdot (2 M_{RF} \cdot (Q \cdot K)^2 + (Q \cdot K)^3))$ [14]. Consequently, the computational complexity per CPSC block is $O(2 M_{RF} \cdot (Q \cdot K)^2 + (Q \cdot K)^3)$, since $J$ successive aggregate SM signals are jointly processed. Compared with the ML and sphere decoding detectors, the proposed SCS-based MUD benefits from a substantially lower complexity, and its complexity is similar to that of the conventional MMSE-based and CS-based signal detectors.
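To give a feel for these orders of magnitude, the following sketch evaluates the expressions inside the $O(\cdot)$ terms for the parameters used later in the simulations ($K = 8$, $Q = 64$, $n_t = 4$, $M_{RF} = 18$, 64QAM so $L = 64$, $J = 2$). The function names are hypothetical, and the numbers are raw evaluations of the formulas rather than measured operation counts.

```python
import math

def log10_ml_order(L, n_t, K, Q):
    """log10 of (L * n_t)^(K * Q), the ML search-space size."""
    return K * Q * math.log10(L * n_t)

def mmse_order(M_RF, n_t, K, Q):
    n = n_t * Q * K
    return M_RF * n**2 + n**3

def cs_order(M_RF, K, Q):
    n = Q * K
    return 2 * M_RF * n**2 + n**3

def scs_order_per_block(M_RF, K, Q, J):
    total = J * cs_order(M_RF, K, Q)   # J blocks are processed jointly
    return total // J                  # cost attributed to one CPSC block

K, Q, n_t, M_RF, L, J = 8, 64, 4, 18, 64, 2
ml_log10 = log10_ml_order(L, n_t, K, Q)
mmse = mmse_order(M_RF, n_t, K, Q)
cs = cs_order(M_RF, K, Q)
scs = scs_order_per_block(M_RF, K, Q, J)
print(f"ML: about 10^{ml_log10:.0f}, MMSE: {mmse:.3e}, CS: {cs:.3e}, SCS per block: {scs:.3e}")
```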

8.4 Simulation Results

A simulation study was carried out to compare the attainable performance of the proposed SCS-based MUD to that of the MMSE-based signal detector [7] and to that of the CS-based signal detector [9]. In the LS-SM-MIMO system considered, the BS used a ULA relying on a large number of AEs $M$, but a much smaller number of RF chains $M_{RF}$, while $K$ users employing $n_t$ AEs but only a single RF chain simultaneously use the CPSC scheme associated with $P = 8$ and $Q = 64$ to transmit the SM signals to the BS. The total BER including both the spatial constellation symbols and the signal constellation symbols was evaluated.

Fig. 8.2 The total BERs achieved by the proposed SCS-based MUD with different AE selection schemes, where $K = 8$, $J = 2$, 64QAM, $M_{RF} = 18$, $n_t = 4$, and $\rho_{US} = 0$ are considered [15]

Figure 8.2 compares the total BERs achieved by the proposed SCS-based MUD with different AE selection schemes, where $K = 8$, $J = 2$, 64QAM, $M_{RF} = 18$, $n_t = 4$, and $\rho_{US} = 0$ are considered. The contiguous AE selection scheme implies that we select $M_{RF}$ adjacent AEs, i.e., the AEs indexed by $\{\phi + m_{RF}\}_{m_{RF}=0}^{M_{RF}-1}$ with $1 \le \phi \le M - M_{RF} + 1$. By contrast, in the random AE selection scheme, the indices of the selected AEs are randomly chosen from the set $\{1, 2, \ldots, M\}$, while the direct AE selection scheme of [3] has been described in Sect. 8.2.2. Furthermore, the BER achieved by the SCS-based MUD relying on $\rho_{BS} = 0$ is also considered as a performance bound, since the choice of $\rho_{BS} = 0$ and $\rho_{US} = 0$ implies uncorrelated Rayleigh-fading MIMO channels. Observe from Fig. 8.2 that the direct AE selection scheme outperforms the other two AE selection schemes. Moreover, for a given AE selection scheme, the BER performance degrades when $M_{RF}/M$ or $\rho_{BS}$ increases. For the direct AE selection scheme, the BER performance of $\rho_{BS} = 0.8$, $M = 128$ and of $\rho_{BS} = 0.5$, $M = 64$ approaches the BER achieved for transmission over uncorrelated Rayleigh-fading MIMO channels, which confirms the near-optimal performance of the direct AE selection scheme.


Fig. 8.3 The total BERs achieved by the CS-based signal detector and the SCS-based MUD versus the SNR in LS-SM-MIMO, where $K = 8$, $M_{RF} = 18$, $M = 64$, $\rho_{BS} = 0.5$, and the direct AE selection scheme is considered [15]

Figure 8.3 compares the overall BER achieved by the CS-based signal detector and by the proposed SCS-based MUD versus the SNR in our LS-SM-MIMO context, where $K = 8$, $M_{RF} = 18$, $M = 64$, $\rho_{BS} = 0.5$, and the direct AE selection scheme is considered. The SCS-based MUD outperforms the CS-based signal detector even for $J = 1$, since the distributed sparsity of the aggregate SM signal is exploited. For the SCS-based MUD, the BER performance improves when $J$ increases, although this is achieved at the cost of a reduced UL-throughput. To mitigate this impediment, a higher number of AEs can be employed by the users for expanding the spatial constellation symbol set constituted by the AEs. Specifically, by increasing $n_t$ from 4 to 8, the UL-throughput of the SCS-based MUD may be increased, but having more AEs at the user results in a higher $\rho_{US}$. When $n_t$ is increased, the BER performance of the SCS-based MUD associated with $J = 1$ degrades, as expected. By contrast, when $n_t$ is increased, the BER performance loss of the SCS-based MUD using $J = 2$ can be less than 0.2 dB at a BER of $10^{-4}$, even when a higher $\rho_{US}$ associated with a higher $n_t$ is considered.

Fig. 8.4 The total BERs achieved by different signal detectors versus the SNR in the proposed LS-SM-MIMO and conventional LS-MIMO [15]

Figure 8.4 portrays the BER achieved by the different signal detectors as a function of the SNR in the context of the proposed LS-SM-MIMO for $K = 8$, $M_{RF} = 18$, $M = 64$, $n_t = 4$, $\rho_{BS} = 0.5$, $\rho_{US} = 0$, where the direct AE selection scheme is also considered. In Fig. 8.4, we also characterize the 'oracle-assisted' LS-based signal detector, which relies on the assumption that the spatial constellation symbol is perfectly known at the BS, for the proposed LS-SM-MIMO scheme associated with $J = 2$ and 64QAM, as well as the MMSE-based LS-MIMO detector in conjunction with 64QAM. Both of them only consider the BER of the classic signal constellation symbol. Here we assume that the LS-MIMO arrangement uses the same number of RF chains to serve 8 single-antenna users communicating over uncorrelated Rayleigh-fading channels. The superiority of our SCS-based MUD over the MMSE-based and CS-based signal detectors becomes clear. Moreover, the performance gap between the oracle LS-based signal detector associated with 7 bpcu and the proposed SCS-based MUD with 7 bpcu is less than 0.5 dB. Note again that the oracle LS-based signal detector only considers the BER of the classic signal constellation symbol, while the proposed SCS-based MUD considers both the spatial and the classic signal constellation symbols. Finally, compared to the conventional LS-MIMO using the MMSE-based signal detector (6 bpcu), our proposed UL LS-SM-MIMO and the associated SCS-based MUD (7 bpcu) only suffer from a negligible BER loss, which confirms the improved UL-throughput of the proposed LS-SM-MIMO scheme.


8.5 Summary

In this chapter, we propose an LS-SM-MIMO scheme for the UL-transmission. The BS employs a large number of AEs but a much smaller number of RF chains, where a simple receive AE selection scheme is used for improved performance. Each user, equipped with multiple AEs but only a single RF chain, uses CPSC to combat multipath channels. SM has been adopted for the UL-transmission to improve the UL-throughput. The proposed scheme is especially suitable for scenarios where a large number of low-cost AEs can be accommodated, and where both the power consumption and the hardware cost are heavily determined by the number of RF chains. Due to the reduced number of RF chains at the BS and the multiple AEs employed by each user, the UL multi-user signal detection is a challenging large-scale under-determined problem. We propose a joint SM transmission scheme at the users to introduce the group sparsity of multiple aggregate SM signals, and a matching SCS-based MUD at the BS to leverage the inherently distributed sparsity of the aggregate SM signal as well as the group sparsity of multiple aggregate SM signals for reliable multi-user signal detection. The proposed SCS-based MUD has a low complexity, and our simulation results demonstrate that it performs better than its conventional counterparts, even at a much higher UL-throughput.

References
1. Serafimovski, N., Younis, A., Mesleh, R., Chambers, P., Di Renzo, M., Wang, C.X., Grant, P.M., Beach, M.A., Haas, H.: Practical implementation of spatial modulation. IEEE Trans. Veh. Technol. 62(9), 4511–4523 (2013)
2. Som, P., Chockalingam, A.: Spatial modulation and space shift keying in single carrier communication. In: 2012 IEEE 23rd International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 1962–1967. IEEE (2012)
3. Wu, X., Di Renzo, M., Haas, H.: Adaptive selection of antennas for optimum transmission in spatial modulation. IEEE Trans. Wirel. Commun. 14(7), 3630–3641 (2015)
4. Narayanan, S., Chaudhry, M.J., Stavridis, A., Di Renzo, M., Graziosi, F., Haas, H.: Multiuser spatial modulation MIMO. In: 2014 IEEE Wireless Communications and Networking Conference (WCNC), pp. 671–676. IEEE (2014)
5. Wu, X., Claussen, H., Di Renzo, M., Haas, H.: Channel estimation for spatial modulation. IEEE Trans. Commun. 62(12), 4362–4372 (2014)
6. Younis, A., Sinanovic, S., Di Renzo, M., Mesleh, R., Haas, H.: Generalised sphere decoding for spatial modulation. IEEE Trans. Commun. 61(7), 2805–2815 (2013)
7. Rusek, F., Persson, D., Lau, B.K., Larsson, E.G., Marzetta, T.L., Edfors, O., Tufvesson, F.: Scaling up MIMO: opportunities and challenges with very large arrays. IEEE Signal Process. Mag. 30(1), 40–60 (2012)
8. Liu, W., Wang, N., Jin, M., Xu, H.: Denoising detection for the generalized spatial modulation system using sparse property. IEEE Commun. Lett. 18(1), 22–25 (2013)
9. Yu, C.M., Hsieh, S.H., Liang, H.W., Lu, C.S., Chung, W.H., Kuo, S.Y., Pei, S.C.: Compressed sensing detector design for space shift keying in MIMO systems. IEEE Commun. Lett. 16(10), 1556–1559 (2012)
10. Shim, B., Kwon, S., Song, B.: Sparse detection with integer constraint using multipath matching pursuit. IEEE Commun. Lett. 18(10), 1851–1854 (2014)
11. Di Renzo, M., Haas, H., Ghrayeb, A., Sugiura, S., Hanzo, L.: Spatial modulation for generalized MIMO: challenges, opportunities, and implementation. Proc. IEEE 102(1), 56–103 (2013)
12. Duarte, M.F., Eldar, Y.C.: Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011)
13. Dai, W., Milenkovic, O.: Subspace pursuit for compressive sensing signal reconstruction. IEEE Trans. Inf. Theory 55(5), 2230–2249 (2009)
14. Björck, Å., et al.: Numerical Methods in Matrix Computations, vol. 59. Springer (2015)
15. Gao, Z., Dai, L., Wang, Z., Chen, S., Hanzo, L.: Compressive-sensing-based multiuser detector for the large-scale SM-MIMO uplink. IEEE Trans. Veh. Technol. 65(10), 8725–8730 (2016)

Chapter 9

Compressive Sensing Massive IoT Access in Massive MIMO Systems with Media Modulation

Abstract This chapter introduces a media modulation-based mMTC solution to increase the throughput, utilizing a massive multi-input multi-output BS for enhanced detection performance. However, reliable active device detection and data decoding present a serious challenge in such an mMTC scenario. To address this problem, an efficient CS-based massive access solution is introduced, leveraging the sparsity of the UL massive access signals received at the BS. The adopted solution includes a structured orthogonal matching pursuit (StrOMP) algorithm for active device detection, exploiting the block sparsity of the UL access signals across successive time slots and the structured sparsity of media-modulated symbols to enhance detection performance. Furthermore, a successive interference cancellation (SIC)-based SSP algorithm is conceived for data demodulation of the active devices, leveraging the structured sparsity of media modulation-based symbols in each time slot to improve detection performance. Simulation results demonstrate the superiority of the considered scheme over state-of-the-art solutions.1

9.1 Introduction

The emerging paradigm of mMTC is identified as an indispensable component for enabling the massive access of MTDs in the emerging IoT [1]. In stark contrast to conventional human-centric mobile communications, mMTC focuses on UL-oriented communications serving massive MTDs and exhibits sporadic tele-traffic requiring low-latency and high-reliability massive access [1]. The conventional grant-based access approach relies on complex time and frequency-domain resource allocation before data transmission, which would impose prohibitive signaling overhead and latency on massive mMTC [1]. To support low-power MTDs at low latency, the emerging grant-free approach has attracted significant attention for massive access, since it simplifies the access procedure by directly delivering data without scheduling [2–7].

1 The work introduced in this chapter is based on reference [17].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_9


Against this background, we propose to adopt media modulation at the MTDs for improving the UL throughput and to employ a massive multi-input multi-output (mMIMO) scheme at the BS. Moreover, a CS-based active device and data detection solution is proposed by exploiting both the sporadic traffic and the block-sparsity of mMTC as well as the structured sparsity of media-modulated symbols. Specifically, we first propose a StrOMP algorithm for AUD, where the block-sparsity of UL access signals across the successive time slots and the structured sparsity of media-modulated symbols are exploited. Additionally, a successive interference cancellation based structured subspace pursuit (SIC-SSP) algorithm is proposed for demodulating the detected active MTDs, where the structured sparsity of media-modulated symbols in each time slot is exploited for enhancing the decoding performance. Note that the proposed StrOMP and SIC-SSP algorithms belong to the family of greedy algorithms. As a benefit of their near-optimal performance attained at a low complexity, greedy algorithms have been widely used in mMTC scenarios [2–8]. Finally, our simulation results verify the superiority of the proposed scheme over cutting-edge benchmarks.

Notation: Boldface lower and upper-case symbols denote column vectors and matrices, respectively. For a matrix $\mathbf{A}$, $\mathbf{A}^T$, $\mathbf{A}^H$, $\mathbf{A}^{\dagger}$, $\|\mathbf{A}\|_F$, and $\mathbf{A}_{[m,n]}$ denote the transpose, Hermitian transpose, pseudo-inverse, Frobenius norm, and the element in the $m$-th row and $n$-th column of $\mathbf{A}$, respectively. $\mathbf{A}_{[\Omega,:]}$ ($\mathbf{A}_{[:,\Omega]}$) is the sub-matrix containing the rows (columns) of $\mathbf{A}$ indexed in the ordered set $\Omega$. $\mathbf{A}_{[\Omega,m]}$ is the $m$-th column of $\mathbf{A}_{[\Omega,:]}$. For a vector $\mathbf{x}$, $\|\mathbf{x}\|_p$, $[\mathbf{x}]_m$, $[\mathbf{x}]_{m:n}$ and $[\mathbf{x}]_{\Omega}$ are the $l_p$ norm, $m$-th element, $m$-th to $n$-th elements, and entries indexed in the ordered set $\Omega$ of $\mathbf{x}$, respectively. For an ordered set $\Gamma$ and its subset $\Omega$, $|\Gamma|_c$, $\Gamma[m]$, and $\Gamma \setminus \Omega$ are the cardinality of $\Gamma$, the $m$-th element of $\Gamma$, and the complement of subset $\Omega$ in $\Gamma$, respectively. $[K]$ is the set $\{1, 2, \ldots, K\}$.

9.2 System Model

We first introduce the proposed media modulation based mMTC scheme and then focus on our massive access technique relying on joint active device and data detection at the BS.

9.2.1 Media Modulation Aided mMTC

As illustrated in Fig. 9.1, we propose that all $K$ MTDs adopt media modulation for enhanced UL throughput and that the BS employs mMIMO using $N_r$ receive AEs for reliable massive access. In the UL, each symbol consists of the conventional modulated symbol and of the media-modulated symbol, and each MTD relies on a single conventional antenna and $M_r$ extra RF mirrors [9–13]. By adjusting the binary on/off status of the $M_r$ RF mirrors, we have $N_t = 2^{M_r}$ mirror activation patterns (MAPs), and the media-modulated symbol is obtained by mapping $\log_2(N_t) = M_r$ bits to one of the $N_t$ MAPs. Therefore, if the conventional $M$-QAM symbol is adopted, the overall UL throughput of an MTD is $\eta = M_r + \log_2 M$ bits per channel use (bpcu). By contrast, to convey extra bits, SM relying on a single RF chain and multiple TAs will activate one of the TAs for UL transmission [14, 15]. To elaborate a little further, to achieve the same extra throughput, media modulation only requires a single UL TA and a linearly increasing number of RF mirrors, while SM requires an exponentially increasing number of TAs [9–15]. Clearly, media modulation is more attractive for mMTC owing to its increased UL throughput, which is achieved at a negligible power consumption and hardware cost [11–13]. Moreover, employing an mMIMO receiver at the BS is a compelling choice for the UL: by leveraging the substantial diversity gain gleaned from hundreds of antennas, the mMIMO BS is expected to achieve high-reliability UL multi-user detection in the context of mMTC. Integrating media modulation at the MTDs with mMIMO reception at the BS thus combines their complementary benefits in a single mMTC solution.
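The bit-to-symbol mapping described above can be sketched as follows. The natural-binary bit ordering, the function name `media_modulate`, and the particular 4-QAM labeling are assumptions made for illustration; the text only fixes the bit budget $\eta = M_r + \log_2 M$.

```python
import numpy as np

def media_modulate(bits, M_r, qam):
    """Map M_r + log2(M) bits to the equivalent access symbol x = g * d.

    The first M_r bits select one of N_t = 2**M_r mirror activation patterns,
    the remaining bits select the conventional QAM symbol (assumed ordering).
    """
    N_t = 2 ** M_r
    map_index = int("".join(str(b) for b in bits[:M_r]), 2)
    qam_index = int("".join(str(b) for b in bits[M_r:]), 2)
    d = np.zeros(N_t)
    d[map_index] = 1.0                 # media-modulated symbol: one-hot vector
    g = qam[qam_index]                 # conventional modulated symbol
    return g * d

M_r = 2
qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
eta = M_r + int(np.log2(len(qam4)))    # throughput in bpcu
x = media_modulate([1, 0, 1, 1], M_r, qam4)
print("eta =", eta, "bpcu; access symbol:", x)
```

With $M_r = 2$ mirrors and 4-QAM, four bits are conveyed per channel use, matching the simulation setup used later in this chapter.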

9.2.2 Transmission Model

As shown in Fig. 9.1, we assume that the activity patterns of the $K$ MTDs remain unchanged in a frame, which consists of $J$ successive time slots. Hence we only focus our attention on the massive access for a given frame. Specifically, the signal received at the BS in the $j$-th ($\forall j \in [J]$) time slot, denoted by $\mathbf{y}^j \in \mathbb{C}^{N_r \times 1}$, can be expressed as

$$\mathbf{y}^j = \sum_{k=1}^{K} a_k g_k^j \mathbf{H}_k \mathbf{d}_k^j + \mathbf{w}^j = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{x}_k^j + \mathbf{w}^j = \mathbf{H} \tilde{\mathbf{x}}^j + \mathbf{w}^j, \tag{9.1}$$

where the activity indicator $a_k$ is set to one (zero) if the $k$-th MTD is active (inactive), while $g_k^j \in \mathbb{C}$, $\mathbf{d}_k^j \in \mathbb{C}^{N_t \times 1}$, and $\mathbf{x}_k^j = a_k g_k^j \mathbf{d}_k^j \in \mathbb{C}^{N_t \times 1}$ are the conventional modulated symbol, media-modulated symbol, and equivalent UL access symbol of the $k$-th MTD in the $j$-th time slot, respectively. Furthermore, $\mathbf{H}_k \in \mathbb{C}^{N_r \times N_t}$ is the multi-input multi-output (MIMO) channel matrix associated with the $k$-th MTD, $\mathbf{w}^j \in \mathbb{C}^{N_r \times 1}$ is the noise with elements obeying the independent and identically distributed (i.i.d.) complex Gaussian distribution $\mathcal{CN}(0, \sigma_w^2)$, while $\mathbf{H} = [\mathbf{H}_1, \mathbf{H}_2, \ldots, \mathbf{H}_K] \in \mathbb{C}^{N_r \times (K N_t)}$ and $\tilde{\mathbf{x}}^j = [(\mathbf{x}_1^j)^T, (\mathbf{x}_2^j)^T, \ldots, (\mathbf{x}_K^j)^T]^T \in \mathbb{C}^{(K N_t) \times 1}$ are the aggregate MIMO channel matrix and UL access signal in the $j$-th time slot, respectively.

Note that for any given $\mathbf{d}_k^j$, $\forall j \in [J]$ and $\forall k \in [K]$, only one of its entries is one and the others are all zeros, i.e.,

$$\mathrm{supp}\{\mathbf{d}_k^j\} \in [N_t], \quad \|\mathbf{d}_k^j\|_0 = 1, \quad \|\mathbf{d}_k^j\|_2 = 1, \tag{9.2}$$


Fig. 9.1 Proposed media modulation based mMTC scheme, where the UL access signal exhibits block-sparsity in a frame and structured sparsity in each time slot [17]

where $\mathrm{supp}\{\cdot\}$ is the support set of its argument. Furthermore, we consider the Rayleigh MIMO channel model, hence the elements in $\mathbf{H}_k$ for $\forall k \in [K]$ follow the i.i.d. complex Gaussian distribution $\mathcal{CN}(0, 1)$. We assume that the channels remain time-invariant for a relatively long period in typical IoT scenarios, hence $\{\mathbf{H}_k\}_{k=1}^{K}$ can be accurately estimated at the BS via periodic updates.
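The transmission model (9.1) and the structure in (9.2) can be reproduced numerically with a short sketch. The helper `simulate_frame`, the 4-QAM alphabet, and all sizes are illustrative assumptions; device indices are 0-based in the code.

```python
import numpy as np

def simulate_frame(K, K_a, N_t, N_r, J, noise_var, rng):
    """Generate one frame: Y = H X + W with a fixed activity pattern."""
    qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
    active = np.sort(rng.choice(K, size=K_a, replace=False))
    # Rayleigh channel: i.i.d. CN(0, 1) entries.
    H = (rng.standard_normal((N_r, K * N_t))
         + 1j * rng.standard_normal((N_r, K * N_t))) / np.sqrt(2)
    X = np.zeros((K * N_t, J), dtype=complex)
    for k in active:
        for j in range(J):
            # One active MAP per device and slot, scaled by a QAM symbol.
            X[k * N_t + rng.integers(N_t), j] = rng.choice(qam4)
    W = np.sqrt(noise_var / 2) * (rng.standard_normal((N_r, J))
                                  + 1j * rng.standard_normal((N_r, J)))
    return H @ X + W, H, X, active

rng = np.random.default_rng(1)
Y, H, X, active = simulate_frame(K=20, K_a=3, N_t=4, N_r=32, J=5,
                                 noise_var=0.0, rng=rng)
nonzeros = (np.abs(X.reshape(20, 4, 5)) > 0).sum(axis=1)  # per device and slot
print("active devices:", active.tolist())
print("nonzeros per device (rows) and slot (columns):")
print(nonzeros)
```

Rows of `nonzeros` belonging to active devices contain ones in every slot and all other rows are zero, which is the combination of block-sparsity across the frame and structured sparsity within each slot.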

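9.3 CS-Based Massive Access Scheme

In typical IoT scenarios, the MTDs generate sporadic tele-traffic [2–7], which indicates that $\mathbf{a} = [a_1, a_2, \ldots, a_K]^T \in \mathbb{C}^{K \times 1}$ is a sparse vector and $K_a = \|\mathbf{a}\|_0 \ll K$. Moreover, this activity pattern exhibits block-sparsity, since $\mathbf{a}$ typically remains unchanged in $J$ successive time slots within a frame [2–4, 7]. Furthermore, $\mathbf{x}_k^j = a_k g_k^j \mathbf{d}_k^j$ for $\forall j \in [J]$ exhibits structured sparsity [9, 10], due to the sparse nature of the media-modulated symbols illustrated in (9.2). The block-sparsity and structured sparsity of the UL signals inspire us to invoke CS theory to detect the active devices and demodulate the data at the BS. To exploit the block-sparsity of active MTD patterns, we first rewrite the received signals within a frame as

$$\mathbf{Y} = \mathbf{H} \mathbf{X} + \mathbf{W}, \tag{9.3}$$

where we have $\mathbf{Y} = [\mathbf{y}^1, \mathbf{y}^2, \ldots, \mathbf{y}^J] \in \mathbb{C}^{N_r \times J}$, $\mathbf{H} \in \mathbb{C}^{N_r \times (K N_t)}$, $\mathbf{X} = [\tilde{\mathbf{x}}^1, \tilde{\mathbf{x}}^2, \ldots, \tilde{\mathbf{x}}^J] \in \mathbb{C}^{(K N_t) \times J}$, and $\mathbf{W} = [\mathbf{w}^1, \mathbf{w}^2, \ldots, \mathbf{w}^J] \in \mathbb{C}^{N_r \times J}$. Thus the massive access problem can be formulated as the following optimization problem

$$\min_{\mathbf{X}} \|\mathbf{Y} - \mathbf{H}\mathbf{X}\|_F^2 = \min_{\{\tilde{\mathbf{x}}^j\}_{j=1}^{J}} \sum_{j=1}^{J} \|\mathbf{y}^j - \mathbf{H}\tilde{\mathbf{x}}^j\|_2^2 = \min_{\{a_k, \mathbf{d}_k^j, g_k^j\}_{j=1,k=1}^{J,K}} \sum_{j=1}^{J} \Big\| \mathbf{y}^j - \sum_{k=1}^{K} a_k g_k^j \mathbf{H}_k \mathbf{d}_k^j \Big\|_2^2 \quad \text{s.t. (9.2) and } \|\mathbf{a}\|_0 \ll K. \tag{9.4}$$

In the following subsections, we will first utilize the proposed StrOMP algorithm to determine the indices of active devices. On that basis, the associated data is further detected based on the proposed SIC-SSP algorithm. Finally, we will discuss the computational complexity of the proposed algorithms.

Algorithm 9.1 The StrOMP Algorithm
Input: $\mathbf{Y} \in \mathbb{C}^{N_r \times J}$, $\mathbf{H} \in \mathbb{C}^{N_r \times (K N_t)}$, and threshold $P_{th}$.
Output: The index set of estimated active MTDs $\Omega \subseteq [K]$, $\hat{K}_a = |\Omega|_c$.
1: Initialization: The iterative index $i = 1$, the residual matrix $\mathbf{R}^{(0)} = \mathbf{Y}$, $\Omega^{(0)} = \emptyset$. We define $\mathbf{m} \in \mathbb{C}^{K \times 1}$ as an intermediate block correlation variable. For possible active MTDs given their temporary index set $\Gamma$, their MAP's index set is $\tilde{\Gamma} = \{\tilde{\Gamma}_n\}_{n=1}^{|\Gamma|_c}$, where $\tilde{\Gamma}_n = \{N_t(\Gamma[n] - 1) + u\}_{u=1}^{N_t}$ is the MAP's index set of the $n$-th MTD in $\Gamma$ for $n \in [|\Gamma|_c]$;
2: while 1 do
3: $[\mathbf{m}]_k = \sum_{l=(k-1)N_t+1}^{k N_t} \sum_{j=1}^{J} |(\mathbf{H}_{[:,l]})^H \mathbf{R}^{(i-1)}_{[:,j]}|^2$, for $k \in [K]$;
4: $k^{\star} = \arg\max_{k \in [K]} [\mathbf{m}]_k$;
5: $\Gamma = \Omega^{(i-1)} \cup k^{\star}$; {Possible support estimate}
6: $\mathbf{B}_{[\tilde{\Gamma},:]} = (\mathbf{H}_{[:,\tilde{\Gamma}]})^{\dagger} \mathbf{Y}$, $\mathbf{B}_{[[K N_t] \setminus \tilde{\Gamma},:]} = \mathbf{0}$; {Coarse signal estimate via LS}
7: $\eta^{\star}_{n,j} = \arg\max_{\tilde{\eta}_{n,j} \in \tilde{\Gamma}_n} |\mathbf{B}_{[\tilde{\eta}_{n,j}, j]}|^2$, for $n \in [|\Gamma|_c]$, $j \in [J]$;
8: $\Lambda^{(j)} = \{\eta^{\star}_{n,j}\}_{n=1}^{|\Gamma|_c}$, for $j \in [J]$;
9: $\mathbf{A}_{[\Lambda^{(j)}, j]} = (\mathbf{H}_{[:,\Lambda^{(j)}]})^{\dagger} \mathbf{Y}_{[:,j]}$, $\mathbf{A}_{[[K N_t] \setminus \Lambda^{(j)}, j]} = \mathbf{0}$, for $j \in [J]$; {Fine signal estimate via LS}
10: $\mathbf{R}^{(i)} = \mathbf{Y} - \mathbf{H}\mathbf{A}$; {Residue Update}
11: if $\|\mathbf{R}^{(i-1)}\|_F - \|\mathbf{R}^{(i)}\|_F < P_{th}$ then
12: break; {Terminates the while-loop}
13: else
14: $\Omega^{(i)} = \Gamma$; {Support estimate update}
15: $i = i + 1$;
16: end if
17: end while
18: Result: $\Omega = \Omega^{(i-1)}$, $\hat{K}_a = |\Omega|_c$.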

9.3.1 The StrOMP Algorithm for AUD

The proposed StrOMP procedure listed in Algorithm 9.1 [17] was evolved from the OMP algorithm of [17]. Specifically, line 3 calculates the sum correlation $\mathbf{m}$ associated with all $N_t$ MAPs in $J$ time slots for each MTD; line 5 combines $k^{\star}$ (i.e., the most likely active MTD) with $\Omega^{(i-1)}$ to update the possible support set $\Gamma$; in line 6, the coarse signal estimate is obtained by the LS algorithm; lines 7∼9 exploit the structured sparsity of media-modulated symbols to estimate the possible MAPs based on the coarsely estimated signal $\mathbf{B}$, and then the fine signal estimate is obtained in line 9 for enhancing the robustness to noise; line 10 updates the residual by using the finely estimated signal $\mathbf{A}$. In line 11, if the energy difference of the residual in adjacent iterations, $\|\mathbf{R}^{(i-1)}\|_F - \|\mathbf{R}^{(i)}\|_F$, falls below a predefined threshold, the loop is terminated; otherwise the iteration continues. The classical OMP algorithm requires the sparsity level $K_a$, whereas the proposed StrOMP algorithm adaptively acquires the number of active MTDs without knowing $K_a$. Compared to the OMP algorithm, the proposed StrOMP achieves an improved detection performance by exploiting the block-sparsity (line 3) and the structured sparsity (lines 7∼9) of the UL signals.
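The steps of StrOMP can be sketched in NumPy as below. This is an illustrative reading of the procedure rather than the authors' code: indices are 0-based, `make_frame` is a hypothetical data generator following (9.3), and already accepted devices are masked out of the block correlation as a small safeguard that the listing does not contain.

```python
import numpy as np

def make_frame(K, K_a, N_t, N_r, J, noise_var, rng):
    """Synthetic frame Y = H X + W (hypothetical helper, 4-QAM assumed)."""
    qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
    active = np.sort(rng.choice(K, size=K_a, replace=False))
    H = (rng.standard_normal((N_r, K * N_t))
         + 1j * rng.standard_normal((N_r, K * N_t))) / np.sqrt(2)
    X = np.zeros((K * N_t, J), dtype=complex)
    for k in active:
        for j in range(J):
            X[k * N_t + rng.integers(N_t), j] = rng.choice(qam4)
    W = np.sqrt(noise_var / 2) * (rng.standard_normal((N_r, J))
                                  + 1j * rng.standard_normal((N_r, J)))
    return H @ X + W, H, X, active

def str_omp(Y, H, N_t, P_th):
    """Return the sorted list of detected active devices (0-based)."""
    J = Y.shape[1]
    K = H.shape[1] // N_t
    R, omega = Y.copy(), []
    while len(omega) < K:
        # Line 3: block correlation summed over the N_t MAPs and J slots.
        m = (np.abs(H.conj().T @ R) ** 2).reshape(K, N_t, J).sum(axis=(1, 2))
        if omega:
            m[omega] = -np.inf          # safeguard, not part of the listing
        gamma = omega + [int(np.argmax(m))]               # lines 4-5
        # Line 6: coarse LS over all MAP columns of the candidate devices.
        cols = np.concatenate([np.arange(k * N_t, (k + 1) * N_t) for k in gamma])
        B = np.linalg.pinv(H[:, cols]) @ Y
        # Lines 7-8: strongest MAP per candidate device and slot.
        best = np.abs(B.reshape(len(gamma), N_t, J)).argmax(axis=1)
        R_new = np.empty_like(Y)
        for j in range(J):
            sel = np.array(gamma) * N_t + best[:, j]
            coef = np.linalg.pinv(H[:, sel]) @ Y[:, j]    # line 9: fine LS
            R_new[:, j] = Y[:, j] - H[:, sel] @ coef      # line 10
        # Lines 11-16: stop once the residual norm no longer drops by P_th.
        if np.linalg.norm(R) - np.linalg.norm(R_new) < P_th:
            break
        omega, R = gamma, R_new
    return sorted(omega)

rng = np.random.default_rng(0)
Y, H, X, active = make_frame(K=20, K_a=3, N_t=4, N_r=64, J=8,
                             noise_var=0.01, rng=rng)
detected = str_omp(Y, H, N_t=4, P_th=2.0)
print("true active:", active.tolist(), "detected:", detected)
```

Note that the number of active devices is never passed to `str_omp`; the threshold test on the residual norm decides when to stop, mirroring the adaptive behaviour described above.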

9.3.2 SIC-SSP Algorithm for Data Detection

Based on the estimated active MTDs $\Omega$ obtained from Algorithm 9.1, the data detection problem in formula (9.4) reduces to the same CS problem as in [16] (i.e., Eq. (10) for $J = 1$ in [16]), which can be solved by the GSP algorithm of [16]. To further improve the performance, the proposed SIC-SSP algorithm, as listed in Algorithm 9.2 [17], intrinsically integrates the idea of SIC with the GSP algorithm. Specifically, the outer for-loop recovers $\{\hat{\mathbf{x}}^j\}_{j=1}^{J}$ separately. For each $\hat{\mathbf{x}}^j$ with $j \in [J]$, the inner for-loop recovers a structured sparse signal with $\hat{K}_a$ sparsity by performing $(\hat{K}_a - 1)$ SIC operations. In contrast to the existing GSP algorithm, the inner for-loop of the proposed algorithm incorporates the SIC operation (lines 17∼22). Specifically, line 18 selects the index of the maximum element of the finely estimated signal $\mathbf{e}$ and subsequently line 19 eliminates it from the measurement vector $\mathbf{v}$; line 20 records the maximum element in $\hat{\mathbf{x}}^j$ ($j \in [J]$) and reduces the size of the remaining set of active MTDs $\Gamma$ by 1, which corresponds to reducing the column dimension of the channel matrix in the next iteration for improving the data detection performance. Moreover, lines 9 and 13 improve the performance by exploiting the signal's structured sparsity. Finally, the algorithm is terminated when $\hat{\mathbf{X}}$ is fully reconstructed.
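The SIC loop of Algorithm 9.2 can be sketched for a single time slot as follows. This is a simplified illustration under stated assumptions: the active set is taken as known (here the true set from a hypothetical `make_frame` generator), indices are 0-based, and the function name `sic_ssp_slot` is not from the source.

```python
import numpy as np

def make_frame(K, K_a, N_t, N_r, J, noise_var, rng):
    """Synthetic frame Y = H X + W (hypothetical helper, 4-QAM assumed)."""
    qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
    active = np.sort(rng.choice(K, size=K_a, replace=False))
    H = (rng.standard_normal((N_r, K * N_t))
         + 1j * rng.standard_normal((N_r, K * N_t))) / np.sqrt(2)
    X = np.zeros((K * N_t, J), dtype=complex)
    for k in active:
        for j in range(J):
            X[k * N_t + rng.integers(N_t), j] = rng.choice(qam4)
    W = np.sqrt(noise_var / 2) * (rng.standard_normal((N_r, J))
                                  + 1j * rng.standard_normal((N_r, J)))
    return H @ X + W, H, X, active

def sic_ssp_slot(y, H, active, N_t):
    """Recover one slot's structured-sparse signal for the given devices."""
    x_hat = np.zeros(H.shape[1], dtype=complex)
    remaining, v, K_a = list(active), y.copy(), len(active)
    for _ in range(K_a):                                   # one SIC stage each
        blocks = [np.arange(k * N_t, (k + 1) * N_t) for k in remaining]
        support, r = np.array([], dtype=int), v.copy()
        for i in range(1, K_a + 1):
            p = H.conj().T @ r                             # correlation
            cand = np.array([b[np.argmax(np.abs(p[b]))] for b in blocks])
            prelim = np.union1d(cand, support)             # preliminary support
            e = np.zeros_like(x_hat)
            e[prelim] = np.linalg.pinv(H[:, prelim]) @ v   # coarse LS
            new_support = np.array([b[np.argmax(np.abs(e[b]))] for b in blocks])
            e = np.zeros_like(x_hat)
            e[new_support] = np.linalg.pinv(H[:, new_support]) @ v  # fine LS
            r = v - H @ e                                  # residue update
            if i >= K_a or np.array_equal(new_support, support):
                break
            support = new_support
        # SIC: keep the strongest device, cancel it, shrink the remaining set.
        n_star = int(np.argmax(np.abs(e[new_support])))
        idx = new_support[n_star]
        x_hat[idx] = e[idx]
        v = v - H[:, idx] * e[idx]
        remaining.pop(n_star)
    return x_hat

rng = np.random.default_rng(1)
Y, H, X, active = make_frame(K=20, K_a=3, N_t=4, N_r=64, J=4,
                             noise_var=1e-4, rng=rng)
X_hat = np.column_stack([sic_ssp_slot(Y[:, j], H, active.tolist(), N_t=4)
                         for j in range(Y.shape[1])])
print("max reconstruction error:", np.max(np.abs(X_hat - X)))
```

Each SIC stage removes one device from `remaining`, so later stages solve smaller LS problems on a measurement vector from which the strongest contributions have already been cancelled.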

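Algorithm 9.2 The SIC-SSP Algorithm
Input: $\mathbf{Y} = [\mathbf{y}^1, \mathbf{y}^2, \ldots, \mathbf{y}^J] \in \mathbb{C}^{N_r \times J}$, $\mathbf{H} \in \mathbb{C}^{N_r \times (K N_t)}$, and the output of Algorithm 9.1: $\Omega$, $\hat{K}_a$.
Output: Reconstructed UL access signal $\hat{\mathbf{X}} = [\hat{\mathbf{x}}^1, \hat{\mathbf{x}}^2, \ldots, \hat{\mathbf{x}}^J]$.
1: for $j = 1 : J$ do
2: for $s = 1 : \hat{K}_a$ do
3: if $s = 1$ then
4: $\mathbf{v} = \mathbf{y}^j$, $\Gamma = \Omega$, where $\mathbf{v}$ is the measurement vector and $\Gamma$ is the remaining set of MTDs to be decoded, and the definitions of $\tilde{\Gamma}$ and $\tilde{\Gamma}_n$ are the same as those in Algorithm 9.1; {Initialization}
5: end if
6: $i = 1$, $\Lambda^{(0)} = \emptyset$, $\mathbf{r}^{(0)} = \mathbf{v}$; {Initialization}
7: while true do
8: $[\mathbf{p}]_{\tilde{\Gamma}} = (\mathbf{H}_{[:,\tilde{\Gamma}]})^H \mathbf{r}^{(i-1)}$, $[\mathbf{p}]_{[K N_t] \setminus \tilde{\Gamma}} = \mathbf{0}$; {Correlation}
9: $\tau^{\star}_n = \arg\max_{\tilde{\tau}_n \in \tilde{\Gamma}_n} |[\mathbf{p}]_{\tilde{\tau}_n}|^2$, for $n \in [|\Gamma|_c]$;
10: $\Delta = \{\tau^{\star}_n + (\Gamma[n] - 1) N_t\}_{n=1}^{|\Gamma|_c}$; {$|\Gamma|_c$ most likely MAPs}
11: $\Theta = \Delta \cup \Lambda^{(i-1)}$; {Preliminary support estimate}
12: $[\mathbf{e}]_{\Theta} = (\mathbf{H}_{[:,\Theta]})^{\dagger} \mathbf{r}^{(0)}$, $[\mathbf{e}]_{[K N_t] \setminus \Theta} = \mathbf{0}$; {Coarse LS}
13: $\eta^{\star}_n = \arg\max_{\tilde{\eta}_n \in \tilde{\Gamma}_n} |[\mathbf{e}]_{\tilde{\eta}_n}|^2$, for $n \in [|\Gamma|_c]$;
14: $\Lambda^{(i)} = \{\eta^{\star}_n + (\Gamma[n] - 1) N_t\}_{n=1}^{|\Gamma|_c}$; {Pruning support set}
15: $[\mathbf{e}]_{\Lambda^{(i)}} = (\mathbf{H}_{[:,\Lambda^{(i)}]})^{\dagger} \mathbf{r}^{(0)}$, $[\mathbf{e}]_{[K N_t] \setminus \Lambda^{(i)}} = \mathbf{0}$; {Fine LS}
16: $\mathbf{r}^{(i)} = \mathbf{r}^{(0)} - \mathbf{H}\mathbf{e}$; {Residue Update}
17: if $i \ge \hat{K}_a$ or $\Lambda^{(i)} = \Lambda^{(i-1)}$ then
18: $\Lambda = \Lambda^{(i)}$, $n^{\star} = \arg\max_{\tilde{n} \in [|\Gamma|_c]} |[\mathbf{e}]_{\Lambda[\tilde{n}]}|^2$;
19: $\mathbf{v} = \mathbf{v} - \mathbf{H}_{[:,\Lambda[n^{\star}]]} [\mathbf{e}]_{\Lambda[n^{\star}]}$; {Measurement vector update}
20: $[\hat{\mathbf{x}}^j]_{\Lambda[n^{\star}]} = [\mathbf{e}]_{\Lambda[n^{\star}]}$, $\Gamma = \Gamma \setminus \{\Gamma[n^{\star}]\}$;
21: break; {Terminates the while-loop}
22: end if; $i = i + 1$
23: end while
24: end for
25: end for
26: Result: $\hat{\mathbf{X}} = [\hat{\mathbf{x}}^1, \hat{\mathbf{x}}^2, \ldots, \hat{\mathbf{x}}^J]$.

9.3.3 Computational Complexity

The computational complexity of the proposed StrOMP algorithm (Algorithm 9.1) in the $i$-th iteration mainly depends on the following operations.
Signal correlation (line 3): The matrix multiplication used has a complexity order of $O(J K N_t N_r)$.
Coarse signal estimate via LS (line 6): The coarse LS solution has a complexity order of $O[J(2 N_r (i N_t)^2 + (i N_t)^3)]$.
Fine signal estimate via LS (line 9): The fine-grained LS solution has a complexity order of $O[J(2 N_r i^2 + i^3)]$.
Residue update (line 10): Since the signal $\mathbf{A}$ acquired in line 9 is represented by a sparse matrix, the complexity of computing the residual is $O(J N_r i)$.

The computational complexity of the proposed SIC-SSP algorithm (Algorithm 9.2) in the $s$-th ($1 \le s \le \hat{K}_a$) inner for-loop mainly depends on the following operations.
Correlation (line 8): The matrix multiplication involved has a complexity order of $O[(\hat{K}_a - s + 1) N_t N_r]$.
Coarse LS (line 12): The coarse LS solution has a complexity order of $O[2 N_r (2(\hat{K}_a - s + 1))^2 + (2(\hat{K}_a - s + 1))^3]$.
Fine LS (line 15): The fine LS solution has a complexity order of $O[2 N_r (\hat{K}_a - s + 1)^2 + (\hat{K}_a - s + 1)^3]$.
Residue update (line 16): The complexity of computing the residual is $O[(\hat{K}_a - s + 1) N_r]$.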


9.4 Simulation Results

Let us now evaluate the AUD error rate ($P_e$) and the BER of the proposed CS-based massive access solution. Here we have $P_e = \frac{E_u + E_f}{K}$ and $\mathrm{BER} = \frac{E_u J \eta + B_m + B_c}{K_a J \eta}$, where $E_u$ is the number of active MTDs missed by activity detection, $E_f$ is the number of falsely detected inactive MTDs, $B_m$ and $B_c$ are the total numbers of error bits in the media-modulated symbols and conventional symbols for detected active MTDs within a frame, respectively, and $K_a J \eta$ is the total number of bits transmitted by the $K_a$ active MTDs within a frame. In our simulations, the total number of MTDs is $K = 100$ with $K_a = 8$ active MTDs. Furthermore, each media modulation based MTD adopts $M_r = 2$ RF mirrors and 4-QAM ($M = 4$), hence the overall throughput becomes $\eta = M_r + \log_2 M = 4$ bpcu. Finally, $P_{th}$ in the proposed StrOMP algorithm is set to 2, which is selected experimentally. For comparison, we consider the following benchmarks.

Benchmark 1: Zero forcing MUD for the traditional mMIMO UL [16] supporting $K_a$ single-antenna users adopting 16-QAM to achieve the same 4 bpcu.

TLSSCS: The TLSSCS detector from [7], using the scaling factor of $\alpha = 4$ (i.e., $\alpha$ in Eq. (6) of [7]).

StrOMP+GSP: The proposed StrOMP algorithm and the existing GSP algorithm of [16] are successively used to detect the active MTDs and the data.

AUD lower bound: A modified StrOMP algorithm relying on perfect knowledge of $K_a$, which performs the iterations including lines 3∼10 and lines 14∼15 for $K_a$ times, so that the estimated output support set contains $K_a$ elements.

BER lower bound: The Oracle LS based detector, relying on perfectly known index sets of the active MTDs and the support set of the media-modulated symbols, is considered as the BER lower bound of the proposed mMTC scheme.

From Figs. 9.2b, 9.3b, and 9.4b, it can be seen that the BER performance of the proposed mMTC scheme outperforms the traditional mMIMO UL (Benchmark 1) for the same throughput when $P_e$ is small enough, thanks to the extra bits introduced by media modulation. Note that it is actually unfair for the proposed scheme to be compared with Benchmark 1 in terms of BER performance, since the latter does not consider the AUD error.

Figure 9.2a, b compare the AUD performance and BER performance versus the signal-to-noise ratio (SNR), respectively. It is clear that the AUD performance of the proposed StrOMP algorithm is better than that of the TLSSCS algorithm, and it is hence closer to the AUD lower bound. We find that the BER performance of our “StrOMP+SIC-SSP” solution outperforms both the TLSSCS detector and the “StrOMP+GSP” solution, which demonstrates the efficiency of the proposed solution. Moreover, the BER advantage of our “StrOMP+SIC-SSP” solution over the “StrOMP+GSP” solution grows as the SNR increases, which confirms the benefit of the SIC operation.

Figure 9.3a, b compare the AUD performance and BER performance versus the frame length $J$, respectively. Owing to the exploitation of the block sparsity, it can be seen that the AUD performance of the proposed StrOMP improves upon increasing $J$. Furthermore, as for the AUD performance, the advantage of the proposed


Fig. 9.2 Performance comparison of different solutions versus the SNR (Nr = 50, J = 12) [17]: a AUD performance; b BER performance

Fig. 9.3 Performance comparison of different solutions versus the frame length J (SNR = 2 dB, Nr = 50) [17]: a AUD performance; b BER performance

StrOMP algorithm over the TLSSCS algorithm becomes more obvious upon increasing J . We also find that except for the Oracle LS (BER lower bound), the proposed “StrOMP+SIC-SSP” solution has the lowest BER floor, for sufficiently large J . Figure 9.4a, b compare the AUD performance and BER performance versus the number of receive antennas Nr , respectively. Observe from Fig. 9.4 that when Nr becomes large, the AUD performance or BER of the proposed “StrOMP+SIC-SSP” solution is better than that of the TLSSCS detector and the “StrOMP+GSP” solution. This indicates the superiority of the proposed solution for employment in mMIMO.
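The two performance metrics defined at the start of this section can be evaluated with a few lines of code. The following is a minimal sketch, assuming the definitions $P_e = (E_u + E_f)/K$ and $\mathrm{BER} = (E_u J \eta + B_m + B_c)/(K_a J \eta)$; the function names and the error counts used in the example are hypothetical and are not simulation results from this chapter.

```python
def aud_error_rate(num_missed, num_false, num_mtds):
    """P_e = (E_u + E_f) / K."""
    return (num_missed + num_false) / num_mtds


def bit_error_rate(num_missed, err_bits_media, err_bits_conv,
                   num_active, frame_len, bpcu):
    """BER = (E_u * J * eta + B_m + B_c) / (K_a * J * eta).

    Every bit of a missed active MTD is counted as erroneous.
    """
    total_bits = num_active * frame_len * bpcu
    lost_bits = num_missed * frame_len * bpcu
    return (lost_bits + err_bits_media + err_bits_conv) / total_bits


# Parameters quoted in this section: K = 100, K_a = 8, eta = 4 bpcu; J = 12 as in Fig. 9.2.
K, K_A, J, ETA = 100, 8, 12, 4

# Hypothetical error counts for a single frame (illustrative only).
p_e = aud_error_rate(num_missed=1, num_false=1, num_mtds=K)
ber = bit_error_rate(num_missed=1, err_bits_media=3, err_bits_conv=2,
                     num_active=K_A, frame_len=J, bpcu=ETA)
print(p_e, ber)
```

Note that a single missed MTD already contributes $J\eta = 48$ erroneous bits out of $K_a J \eta = 384$, which is why the BER of any scheme is dominated by the AUD errors unless $P_e$ is small.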


Fig. 9.4 Performance comparison of different solutions versus the number of receive antennas $N_r$ (SNR = 2 dB, $J = 12$) [17]: a AUD performance; b BER performance

Table 9.1 Computational complexity comparison of different algorithms [17]

| Function | Algorithm | Computational complexity | Complex-valued multiplications^a ($\times 10^6$), $N_r = 50$ | Complex-valued multiplications^a ($\times 10^6$), $N_r = 100$ |
|---|---|---|---|---|
| AUD | Proposed StrOMP | $O\{(K_a + 1) J K N_t N_r + \sum_{s=1}^{K_a+1} [J N_r (s + 2s^2 + 2(s N_t)^2) + J(s^3 + (s N_t)^3)]\}$ | 9.6 | 17.6 |
| AUD | AUD part of TLSSCS [7] | $O\{(K_a + 1)[N_r^2 (K N_t + J) + N_r^2 + N_r J K N_t] + \sum_{s=1}^{K_a+1} [2N_r (s N_t)^2 + (s N_t)^3]\}$ | 12.5 | 44.2 |
| AUD | AUD lower bound | $O\{K_a J K N_t N_r + \sum_{s=1}^{K_a} [J N_r (s + 2s^2 + 2(s N_t)^2) + J(s^3 + (s N_t)^3)]\}$ | 7.1 | 13.2 |
| Data detection | Proposed SIC-SSP | $O\{J \sum_{s=1}^{K_a} [2s N_r (N_t + 1) + 14 N_r s^2 + 11 s^3]\}$ | 2.1 | 4.0 |
| Data detection | Data detection part of TLSSCS [7] | $O[J N_r K_a N_t + 2N_r (K_a N_t)^2 + (K_a N_t)^3]$ | 0.15 | 0.28 |
| Data detection | GSP [8] | $O\{J [2K_a N_r (N_t + 1) + 14 N_r K_a^2 + 11 K_a^3]\}$ | 0.65 | 1.2 |
| Data detection | BER lower bound | $O(J N_r K_a + 2N_r K_a^2 + K_a^3)$ | 0.01 | 0.02 |
| Data detection | Benchmark 1 | $O(J N_r K_a + 2N_r K_a^2 + K_a^3)$ | 0.01 | 0.02 |

^a The number of complex-valued multiplications is calculated under the parameters $J = 12$, $N_t = 4$, $K = 100$, $K_a = 8$


The computational complexities of the different solutions in our simulations are compared in Table 9.1, where the algorithms are divided into two parts based on their functions (i.e., AUD or data detection). When $N_r = 50$, the number of complex-valued multiplications of the proposed StrOMP algorithm ($9.6 \times 10^6$) is somewhat lower than that of the AUD part of the TLSSCS algorithm ($12.5 \times 10^6$, i.e., lines 1–14 of Algorithm 1 in [7]). If $N_r$ is doubled, the number of complex-valued multiplications of the proposed StrOMP algorithm increases approximately linearly with $N_r$, whereas the complexity of the AUD part of the TLSSCS algorithm is nearly proportional to the square of $N_r$. Hence, our StrOMP algorithm is more suitable for mMIMO in conjunction with large antenna arrays. Furthermore, after obtaining the active MTDs, the data detection part of the TLSSCS algorithm becomes an LS operation (i.e., line 15 of Algorithm 1 in [7]), which is associated with a limited BER performance for the media-modulated signal. Hence, our proposed SIC-SSP algorithm attains a better data detection performance at the cost of a higher computational complexity.
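The multiplication counts quoted from Table 9.1 can be reproduced by evaluating the complexity expressions directly. The sketch below does this for the proposed StrOMP algorithm, the AUD lower bound, and the proposed SIC-SSP algorithm, assuming the expressions as they are read from Table 9.1 and the footnote parameters $J = 12$, $N_t = 4$, $K = 100$, $K_a = 8$. The function names are hypothetical.

```python
def stromp_mults(J, K, K_a, N_t, N_r):
    """Multiplications of the proposed StrOMP (K_a + 1 iterations)."""
    total = (K_a + 1) * J * K * N_t * N_r
    for s in range(1, K_a + 2):
        total += J * N_r * (s + 2 * s**2 + 2 * (s * N_t)**2)
        total += J * (s**3 + (s * N_t)**3)
    return total


def aud_lower_bound_mults(J, K, K_a, N_t, N_r):
    """Same per-iteration cost as StrOMP, but exactly K_a iterations."""
    total = K_a * J * K * N_t * N_r
    for s in range(1, K_a + 1):
        total += J * N_r * (s + 2 * s**2 + 2 * (s * N_t)**2)
        total += J * (s**3 + (s * N_t)**3)
    return total


def sic_ssp_mults(J, K_a, N_t, N_r):
    """Multiplications of the proposed SIC-SSP data detector."""
    return J * sum(2 * s * N_r * (N_t + 1) + 14 * N_r * s**2 + 11 * s**3
                   for s in range(1, K_a + 1))


J, N_T, K, K_A = 12, 4, 100, 8
for n_r in (50, 100):
    print(n_r,
          round(stromp_mults(J, K, K_A, N_T, n_r) / 1e6, 1),
          round(aud_lower_bound_mults(J, K, K_A, N_T, n_r) / 1e6, 1),
          round(sic_ssp_mults(J, K_A, N_T, n_r) / 1e6, 1))
```

Every term of the StrOMP expression except $J \sum_s (s^3 + (s N_t)^3)$ is proportional to $N_r$, which is consistent with the near-linear growth from $N_r = 50$ to $N_r = 100$ discussed above.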

9.5 Summary

In this chapter, a media modulation based mMTC UL scheme relying on mMIMO detection at the BS is proposed for achieving reliable massive access with an enhanced throughput. The sparse nature of the mMTC traffic motivates us to propose a CS-based solution. First, a StrOMP algorithm is proposed to detect the active MTDs by exploiting the block sparsity and structured sparsity of the UL signals, which improves the detection performance. Then, an SIC-SSP algorithm is proposed for detecting the data of the detected MTDs by exploiting the structured sparsity of media-modulated symbols for enhancing the performance. Furthermore, we analyze the computational complexity of the proposed algorithms. Finally, our simulation results quantify the benefits of the proposed solution.

References

1. Bockelmann, C., Pratas, N., Nikopour, H., Au, K., Svensson, T., Stefanovic, C., Popovski, P., Dekorsy, A.: Massive machine-type communications in 5G: physical and MAC-layer solutions. IEEE Commun. Mag. 54(9), 59–65 (2016)
2. Wang, B., Dai, L., Mir, T., Wang, Z.: Joint user activity and data detection based on structured compressive sensing for NOMA. IEEE Commun. Lett. 20(7), 1473–1476 (2016)
3. Du, Y., Cheng, C., Dong, B., Chen, Z., Wang, X., Fang, J., Li, S.: Block-sparsity-based multiuser detection for uplink grant-free NOMA. IEEE Trans. Wirel. Commun. 17(12), 7894–7909 (2018)
4. Jeong, B.K., Shim, B., Lee, K.B.: MAP-based active user and data detection for massive machine-type communications. IEEE Trans. Veh. Technol. 67(9), 8481–8494 (2018)
5. Wang, B., Dai, L., Zhang, Y., Mir, T., Li, J.: Dynamic compressive sensing-based multi-user detection for uplink grant-free NOMA. IEEE Commun. Lett. 20(11), 2320–2323 (2016)
6. Du, Y., Dong, B., Chen, Z., Wang, X., Liu, Z., Gao, P., Li, S.: Efficient multi-user detection for uplink grant-free NOMA: prior-information aided adaptive compressive sensing perspective. IEEE J. Sel. Areas Commun. 35(12), 2812–2828 (2017)
7. Ma, X., Kim, J., Yuan, D., Liu, H.: Two-level sparse structure-based compressive sensing detector for uplink spatial modulation with massive connectivity. IEEE Commun. Lett. 23(9), 1594–1597 (2019)
8. Choi, J.W., Shim, B., Ding, Y., Rao, B., Kim, D.I.: Compressed sensing for wireless communications: useful tips and tricks. IEEE Commun. Surveys Tutor. 19(3), 1527–1550 (2017)
9. Zhang, L., Zhao, M., Li, L.: Low-complexity multi-user detection for MBM in uplink large-scale MIMO systems. IEEE Commun. Lett. 22(8), 1568–1571 (2018)
10. Shamasundar, B., Jacob, S., Theagarajan, L.N., Chockalingam, A.: Media-based modulation for the uplink in massive MIMO systems. IEEE Trans. Veh. Technol. 67(9), 8169–8183 (2018)
11. Khandani, A.K.: Media-based modulation: a new approach to wireless transmission. In: 2013 IEEE International Symposium on Information Theory, pp. 3050–3054. IEEE (2013)
12. Naresh, Y., Chockalingam, A.: On media-based modulation using RF mirrors. IEEE Trans. Veh. Technol. 66(6), 4967–4983 (2016)
13. Basar, E.: Media-based modulation for future wireless systems: a tutorial. IEEE Wirel. Commun. 26(5), 160–166 (2019)
14. Xiao, L., Yang, P., Xiao, Y., Fan, S., Di Renzo, M., Xiang, W., Li, S.: Efficient compressive sensing detectors for generalized spatial modulation systems. IEEE Trans. Veh. Technol. 66(2), 1284–1298 (2016)
15. Xiao, L., Xiao, Y., Yang, P., Liu, J., Li, S., Xiang, W.: Space-time block coded differential spatial modulation. IEEE Trans. Veh. Technol. 66(10), 8821–8834 (2017)
16. Gao, Z., Dai, L., Wang, Z., Chen, S., Hanzo, L.: Compressive-sensing-based multiuser detector for the large-scale SM-MIMO uplink. IEEE Trans. Veh. Technol. 65(10), 8725–8730 (2015)
17. Qiao, L., Zhang, J., Gao, Z., Chen, S., Hanzo, L.: Compressive sensing based massive access for IoT relying on media modulation aided machine type communications. IEEE Trans. Veh. Technol. 69(9), 10391–10396 (2020)

Chapter 10

Sparse Channel Estimation in TDS-OFDM Systems

Abstract This chapter introduces a low-complexity sparse channel estimation (CE) scheme for time-domain synchronous (TDS)-OFDM systems to address the performance challenge in the presence of doubly selective fading channels. To tackle the issue, an overlap-add technique in the time-domain transmission symbol (TS) processing is adopted to coarsely estimate the channel length, path delays, and path gains. Even when the severe fading channel has a long delay spread, the robustness of the coarse CE can be improved by leveraging the temporal correlation of channels. Based on the coarse results, a priori-information aided iterative hard threshold (PA-IHT) algorithm is introduced to facilitate accurate channel estimation of doubly selective fading channels. Note that the considered method eliminates the convergence constraint and reduces the number of iterations compared to the classical IHT algorithm, and has lower computational complexity and higher accuracy than the existing compressive sensing-based CE solutions. Simulation results verify the superiority of the considered method for CE of TDS-OFDM systems over the existing schemes, particularly when dealing with severely doubly selective fading channels.¹

¹ The work introduced in this chapter is based on reference [23].

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_10

10.1 Introduction

OFDM technology has been widely applied in high-speed broadband wireless communication systems [1]. In the digital terrestrial television broadcasting (DTTB) field, both the European second generation digital video broadcasting standard (DVB-T2) [2] and the Chinese digital terrestrial multimedia broadcasting standard (DTMB) [3] adopt OFDM as the key modulation technology. DVB-T2 uses CP based OFDM, where a CP, serving as the guard interval, is inserted between successive OFDM data blocks to eliminate the inter-block-interference (IBI) caused by multipath channels [4]. Unlike DVB-T2, DTMB uses TDS-OFDM, which replaces the CP with a time-domain TS. Compared with the classical CP based OFDM, TDS-OFDM has superior performance in terms of fast synchronization and CE, and it also achieves a higher spectral efficiency [5–7]. Owing to the good performance of TDS-OFDM, DTMB has been officially approved as an international DTTB standard, and has been successfully deployed in China and several other countries [5].

However, because of the mutual interference between the TS and the OFDM data block, an iterative interference cancellation has to be used in TDS-OFDM systems to decouple the TS and the OFDM data block for CE and frequency-domain demodulation [6]. This iterative interference cancellation results in performance degradation under doubly selective fading channels [5], where the perfect removal of the IBI is difficult to realize. To solve this problem, several schemes have been proposed [7–9]. Among these solutions, the dual pseudo-noise OFDM (DPN-OFDM) has attracted more attention owing to its simplicity and its ability to deliver an accurate channel estimate without imposing complex iterative interference cancellation [7], at the cost of sacrificing some spectral efficiency. Recently, a CE method was proposed based on CS for the current DTMB system [10], whereby the small IBI-free region in the received TS is exploited to estimate the long multipath channel by using greedy signal recovery algorithms for CS, such as the subspace pursuit (SP) algorithm [11] and the CoSaMP algorithm [12]. However, this scheme suffers from the drawback of high computational complexity owing to the matrix inversion operation required, as well as the inevitable performance degradation under the adverse conditions of severe multipath channels with very long delay spread.

Against this background, in this chapter we develop a low-complexity high-accuracy CS based CE scheme for TDS-OFDM systems. Our contribution is twofold.
Firstly, we derive the overlap-add method of the TS to obtain the coarse estimates of the channel length, path delays and path gains of the wireless channel, whereby the temporal correlation of the multipath delays and multipath gains among several consecutive TDS-OFDM symbols is exploited. More specifically, the TS tail part caused by the multipath channel is superposed on the preceding TS main part, and then this overlap-add result is circularly correlated with the local pseudo-noise (PN) sequence to obtain some coarse CSI. Moreover, the temporal correlation of the multipath delays and gains among several consecutive TDS-OFDM symbols is jointly exploited to improve the robustness and accuracy of the coarse CE.

Secondly, assisted by the prior information of the wireless channel obtained by the overlap-add method, our proposed CS based CE method, referred to as the PA-IHT, is capable of obtaining an accurate channel estimate while only imposing a low computational complexity. In particular, unlike the classical iterative hard threshold (IHT) algorithm [13], whose convergence requires the $\ell_2$ norm of the measurement matrix to be less than 1, our PA-IHT algorithm utilizes the available priori information of the wireless channel to remove such a restriction as well as to reduce the number of required iterations. Also benefiting from the acquired priori information of the wireless channel, our PA-IHT algorithm significantly improves the signal recovery accuracy and considerably reduces the computational complexity, in comparison to the existing CS based CE methods, such as the modified CoSaMP algorithm of [10]. Moreover, unlike the modified CoSaMP algorithm, which suffers from inevitable CE performance degradation under severe multipath channels with very long delay spread, the PA-IHT algorithm remains robust and accurate under such adverse channel conditions.

Notation: The boldface capital and lower-case letters stand for matrices and column vectors, respectively. The exception is for frequency-domain vectors, whereby the DFT of the time-domain vector $\mathbf{x}$ is denoted by $\mathbf{X}$. The operators $*$ and $\otimes$ represent the linear convolution and circular correlation, respectively, while $\lfloor \cdot \rfloor$ denotes the integer floor operator and $\mathrm{abs}\{\mathbf{x}\}$ is the vector whose elements are the absolute values of the corresponding elements of the vector $\mathbf{x}$. The transpose and conjugate transpose operations are denoted by $(\cdot)^T$ and $(\cdot)^H$, respectively, while $(\cdot)^{\dagger}$ denotes the Moore-Penrose matrix inversion and the $\ell_p$ norm operation is given by $\|\cdot\|_p$. The $r$-sparse vector of $\mathbf{x}$ is denoted by $\mathbf{x}_r$, which is generated by retaining the $r$ largest elements of $\mathbf{x}$ and setting the rest of the elements to zero. The support of the vector $\mathbf{x}$ is denoted by $\mathrm{supt}\{\mathbf{x}\}$, and $\mathbf{x}|_{\Gamma}$ denotes the entries of $\mathbf{x}$ defined in the set $\Gamma$, while $\mathbf{\Phi}|_{\Gamma}$ denotes the sub-matrix whose columns comprise the columns of $\mathbf{\Phi}$ that are defined in the set $\Gamma$. Additionally, $|\cdot|$ denotes the absolute value, and $|\cdot|_c$ is the cardinality of a set. Finally, $\delta(\cdot)$ represents the unit impulse function.

10.2 System Model

In the time domain, TDS-OFDM signals are grouped in symbols, and each TDS-OFDM symbol consists of a TS, which is a known PN sequence $\mathbf{c} = [c_0\ c_1\ \cdots\ c_{M-1}]^T$ of length $M$, and the following OFDM data block of length $N$, which can be expressed as $\mathbf{x}_i = [x_{i,0}\ x_{i,1}\ \cdots\ x_{i,N-1}]^T$ with $i$ denoting the TDS-OFDM symbol index. Hence, the $i$th TDS-OFDM symbol is represented by $\mathbf{s}_i = [\mathbf{c}^T\ \mathbf{x}_i^T]^T = [\mathbf{c}^T\ (\mathbf{F}_N^H \mathbf{X}_i)^T]^T$, where $\mathbf{F}_N$ is the DFT matrix of size $N \times N$ and $\mathbf{X}_i = [X_{i,0}\ X_{i,1}\ \cdots\ X_{i,N-1}]^T$ is the frequency-domain $i$th OFDM data block.

At the receiver, the received $i$th OFDM symbol can be written as $\mathbf{r}_i = \mathbf{s}_i * \mathbf{h}_i + \mathbf{n}_i$, where $\mathbf{n}_i$ is the channel AWGN vector having a zero mean, while $\mathbf{h}_i = [h_{i,0}\ h_{i,1}\ \cdots\ h_{i,L-1}]^T$ is the time-varying CIR of length $L$, which can be considered to be quasi-static during the time period of the $i$th TDS-OFDM symbol. Since the wireless channel is sparse in nature [14], its CIR comprises only $P$ resolvable propagation paths, where $P \ll L$. In other words, only $P$ coefficients of $\mathbf{h}_i$ are nonzero, and therefore the coefficients of the CIR can be expressed by the following model [15, 16]

$$h_{i,l} = \sum_{p=0}^{P-1} \alpha_{i,p}\, \delta\left(l - \tau_{i,p}\right), \quad 0 \le l \le L - 1, \tag{10.1}$$


Fig. 10.1 Existing CE schemes for TDS-OFDM [23]: a the scheme based on iterative interference cancellation, b the scheme based on dual PN sequences, and c the scheme based on compressive sensing

where $\alpha_{i,p}$ is the $p$th path gain and $\tau_{i,p}$ is the $p$th path delay. Obviously, we have

$$h_{i,l} = \begin{cases} \alpha_{i,p}, & \text{if } l = \tau_{i,p}, \\ 0, & \text{otherwise}. \end{cases} \tag{10.2}$$

For CE and data demodulation in TDS-OFDM, the existing schemes may not be able to achieve satisfactory performance, especially under severe multipath channels. Figure 10.1 illustrates several existing CE schemes for TDS-OFDM systems. As shown in Fig. 10.1a, the conventional CE scheme for TDS-OFDM using a single PN sequence has the advantage of maintaining a high spectral efficiency. However, this scheme suffers from the mutual interference between the PN sequence and the OFDM data block. Therefore, an iterative interference cancellation has to be adopted to decouple the mutual interference [6], which reduces the accuracy of CE in the doubly selective fading channel [5]. In the DPN-OFDM scheme, as shown in Fig. 10.1b, an extra PN sequence is inserted to prevent the second PN sequence from being contaminated by the preceding OFDM data block. In this way, the DPN-OFDM scheme removes the complex iterative interference cancellation and improves the CE performance, but at the cost of reducing the spectral efficiency.

The PN sequence length in a TDS-OFDM system is designed to be longer than the maximum CIR length in order to ensure reliable system performance. In wireless scenarios where the actual CIR length $L$ is less or even much less than the length of the guard interval $M$, there is an IBI-free region of the small size $G = M - L + 1$ at the end of the received PN sequence, as illustrated in Fig. 10.1c. In this IBI-free region, the received signal $\mathbf{y} = [y_{L-1}\ y_L\ \cdots\ y_{M-1}]^T$ can be represented by

$$\mathbf{y} = \mathbf{\Phi} \mathbf{h} + \mathbf{n}, \tag{10.3}$$



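where $\mathbf{n}$ is the related channel AWGN vector, and

$$\mathbf{\Phi} = \begin{bmatrix} c_{L-1} & c_{L-2} & \cdots & c_0 \\ c_L & c_{L-1} & \cdots & c_1 \\ \vdots & \vdots & \ddots & \vdots \\ c_{M-1} & c_{M-2} & \cdots & c_{M-L} \end{bmatrix}_{G \times L} \tag{10.4}$$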

is a Toeplitz matrix of size $G \times L$ which is completely determined by the TS. Generally speaking, the IBI-free region, as shown in Fig. 10.1c, is usually small. Thus it is difficult to obtain a unique solution to the unknown channel $\mathbf{h}$ in (10.3), since the observation dimension $G$ is usually smaller than the CIR dimension $L$. Fortunately, the CS theory [12] has proved that a high-dimensional signal can be accurately reconstructed from low-dimensional uncorrelated observations if the target signal is sparse or approximately sparse. A wireless channel is sparse in nature [14] and the actual number of resolvable paths $P$ usually satisfies $P \ll L$. Therefore, even though the CIR dimension $L$ is larger or even much larger than the observation dimension $G$, we may have $P \le G$. In [10], a modified CoSaMP algorithm is proposed to solve the under-determined equation (10.3). This scheme inherits the advantage of high spectral efficiency without changing the current TDS-OFDM signal structure. Furthermore, it improves the CE performance without the need for iterative interference cancellation. However, this scheme has a high computational complexity due to the matrix inversion operations required in the CS algorithm, and moreover its performance degrades over severe multipath channels with long delay spread, where the observation dimension $G$ may become too small.
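The structure of the observation model (10.3)–(10.4) can be checked numerically. The following is a minimal sketch, assuming a noise-free channel and a short hypothetical ±1 training sequence (not the actual DTMB PN sequence); the function names are illustrative. It builds the $G \times L$ Toeplitz matrix from the TS and confirms that its product with the CIR equals samples $L-1, \ldots, M-1$ of the linear convolution of the TS with the CIR, i.e., the samples that depend only on the TS.

```python
def toeplitz_observation_matrix(c, L):
    """Build the G x L matrix of Eq. (10.4): entry (r, l) = c[L - 1 + r - l]."""
    M = len(c)
    G = M - L + 1
    return [[c[L - 1 + r - l] for l in range(L)] for r in range(G)]


def linear_convolution(a, b):
    """Plain linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# Hypothetical training sequence (M = 8) and sparse CIR (L = 6, P = 2 paths).
c = [1, -1, 1, 1, -1, -1, 1, -1]
h = [0.8, 0.0, 0.0, 0.0, 0.0, -0.3]
L = len(h)

phi = toeplitz_observation_matrix(c, L)          # G x L = 3 x 6
y_from_phi = [sum(p * x for p, x in zip(row, h)) for row in phi]
y_from_channel = linear_convolution(c, h)[L - 1:len(c)]

print(len(phi), len(phi[0]))
print(y_from_phi)
print(y_from_channel)
```

In this toy setting $G = 3 < L = 6$, so (10.3) is under-determined, yet $P = 2 \le G$, which is the regime in which sparse recovery becomes feasible.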

10.3 PA-IHT Based Channel Estimation

In this section, we present the CE method based on the proposed PA-IHT algorithm for TDS-OFDM systems and also provide the complexity analysis for this CE scheme.

10.3.1 The Proposed PA-IHT Based CE Method

The proposed CE method consists of four steps, as shown in Fig. 10.2. To be more specific, in the first step, a coarse CIR length and path delays are estimated, while in the second step, coarse channel gains are obtained. With the aid of the coarse


Fig. 10.2 The proposed PA-IHT based CE, which consists of four steps [23]. The first two steps use the proposed overlap-add method of the TS, whereby the temporal correlation of the wireless channel is exploited to obtain some priori information of the wireless channel. In the remaining two steps, the proposed PA-IHT algorithm and the ML criterion are used to obtain an accurate CE

information of the wireless channel obtained in the first two steps, the proposed PA-IHT algorithm estimates the accurate path delays in the third step, and finally, in the fourth step, the accurate path gains are obtained based on an ML criterion.

In the first two steps, we exploit the temporal correlation of the wireless channel to estimate the coarse CIR length, path delays, and path gains. For time-varying channels, the path delays usually vary more slowly than the path gains [17]. Even for mobile scenarios, although the path gains will change over adjacent TDS-OFDM symbols, the path delays may remain relatively unchanged. This is because the duration $T_{delay}$ for the delay of a particular path to change by one tap is inversely proportional to the signal bandwidth $f_s$, while the coherence time of the path gains $T_{gain}$ is inversely proportional to the carrier frequency $f_c$ [17, 18]. Since $f_s \ll f_c$ for all practical wireless systems, path delays change much more slowly than path gains, i.e., $T_{gain} \ll T_{delay}$. Figure 10.3 depicts the CIRs of four adjacent TDS-OFDM symbols over the International Telecommunications Union Vehicular B (ITU-VB) channel [19] with a 120 km/h receiver velocity. From Fig. 10.3, we observe that although the path gains are different in adjacent TDS-OFDM symbols, the path delays remain nearly invariant. The characteristics of the time-varying channel over several TDS-OFDM symbols can therefore be exploited to assist the coarse CE.

Let us now elaborate this formally. Firstly, for a time-varying channel, the path delays of the channel in a time interval of length $T_{delay}$, or the duration of $\left\lfloor \frac{T_{delay}}{T_s (M+N)} \right\rfloor$ TDS-OFDM symbols, can be considered to remain nearly unchanged [17, 18]. In other words, the CIR can be considered to share the same sparsity pattern during $T_{delay}$, or in the time duration of $2R_d - 1$ TDS-OFDM symbols [20], where $R_d = \left\lfloor \frac{T_{delay}}{2 T_s (M+N)} \right\rfloor$, so that the temporally common sparsity of the wireless channel is guaranteed [18]. In practice, $R_d$ can be very large since $T_{delay} \propto \frac{T_s c}{v} = \frac{c}{f_s v}$, where $c$ is the velocity of light and $v$ is the speed of the mobile receiver, while $T_s$ is the data symbol duration, which may therefore be regarded as the resolution of delays. Hence we may obtain approximately $R_d \approx \left\lfloor \frac{c}{2v(M+N) f_s T_s} \right\rfloor$. Taking the example of $v = 100$ m/s and $M + N = 256 + 4096$, for instance, the path delays can be considered to be unchanged during a superframe in


Fig. 10.3 The CIRs of four adjacent TDS-OFDM symbols over the ITU-VB channel with a 120 km/h receiver velocity [23]. The DTTB carrier frequency is $f_c = 634$ MHz and the symbol rate is $1/T_s = 7.56$ MHz

the DTMB system, where a DTMB superframe consists of multiple TDS-OFDM symbols.

Secondly, over the coherence time of the path gains, $T_{gain}$, the channel gains can be expressed as $\alpha_{i,p} = |\alpha_{i,p}| \exp\left(j(\phi_0 + 2\pi f_d t)\right)$, where $\phi_0$ is the initial phase, $t$ denotes the time, and $f_d$ is the Doppler frequency offset, which can be estimated at the receiver [21]. Hence, the phase variation of a complex path gain is less than $\pi$ within a time interval of length $\frac{1}{2 f_d}$, or the duration of $R_{g1} = \left\lfloor \frac{1}{2 f_d T_s (M+N)} \right\rfloor$ TDS-OFDM symbols. In other words, the path gains of the channel are highly correlated during $R_{g1}$ TDS-OFDM symbols. By averaging the CIR estimate over $R_{g1}$ adjacent TDS-OFDM symbols, therefore, we can reduce the effects of the channel AWGN and improve the accuracy and reliability of the path delay estimation. Clearly, for this averaging to be effective, we must have $R_{g1} > 1$. It is well known that $f_d \propto \frac{f_c v}{c}$. Hence we may obtain approximately $R_{g1} \approx \left\lfloor \frac{c}{2v(M+N) f_c T_s} \right\rfloor$. Since $f_s \ll f_c$, clearly we have $2R_d - 1 > R_{g1} > 1$. In practice, the receiver can adaptively choose appropriate values for the parameters $R_d$ and $R_{g1}$ based on the channel status and the estimated $f_d$ of the time-varying channel. Furthermore, in order to achieve a reliable coarse CE in the case of instantaneous deep channel fading occurring during $R_{g1}$ adjacent TDS-OFDM symbols, the CIR estimates over the $2R_d - 1$ adjacent TDS-OFDM symbols are jointly exploited to further improve the coarse estimation of the channel length and path delays.
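To get a feeling for the orders of magnitude involved, the approximations $R_d \approx \lfloor c / (2v(M+N) f_s T_s) \rfloor$ and $R_{g1} \approx \lfloor c / (2v(M+N) f_c T_s) \rfloor$ can be evaluated for the parameters quoted in this section. The sketch below assumes $c = 3 \times 10^8$ m/s, $f_s = 1/T_s = 7.56$ MHz, $f_c = 634$ MHz, $v = 100$ m/s and $M + N = 4352$; the function name is hypothetical, and in practice the receiver would choose these parameters adaptively rather than from such a closed-form rule.

```python
import math


def correlation_windows(v, symbol_len, f_c, f_s, c=3.0e8):
    """Approximate R_d and R_g1 (in TDS-OFDM symbols).

    v          : receiver speed in m/s
    symbol_len : M + N, samples per TDS-OFDM symbol
    f_c, f_s   : carrier frequency and signal bandwidth in Hz
    """
    t_s = 1.0 / f_s                       # data symbol duration T_s
    r_d = math.floor(c / (2 * v * symbol_len * f_s * t_s))
    r_g1 = math.floor(c / (2 * v * symbol_len * f_c * t_s))
    return r_d, r_g1


r_d, r_g1 = correlation_windows(v=100.0, symbol_len=256 + 4096,
                                f_c=634e6, f_s=7.56e6)
print(r_d, r_g1)
print(2 * r_d - 1 > r_g1 > 1)
```

Because $R_{g1} / R_d \approx f_s / f_c$, the averaging window for the path gains is roughly two orders of magnitude shorter than the window over which the sparsity pattern is shared, which is the ordering $2R_d - 1 > R_{g1} > 1$ stated above.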


Lastly, let us define the wireless channel as being quasi-static during the duration of $2R_{g2} - 1$ TDS-OFDM symbols. In general, we must assume that the path delays and path gains of the wireless channel remain unchanged at least during one TDS-OFDM symbol. Therefore, for time-varying channels, we can choose $R_{g2} = 1$. For static channels in particular, where $f_d = 0$, both the path delays and path gains are time-invariant. We can simply choose a desired value of $R_{g1} > 1$ for averaging, and further set $2R_d - 1 = 2R_{g2} - 1 = R_{g1}$. We now detail our proposed PA-IHT algorithm.

10.3.1.1 Step 1. Acquisition of Coarse Channel Length and Path Delays

We propose the overlap-add method of the TS, which jointly utilizes the received TSs from the $(i - R_d + 1)$th to the $(i + R_d)$th TDS-OFDM symbols to exploit the temporal correlation of the wireless channel. The proposed overlap-add method of the TS is illustrated in Fig. 10.4, and its operation is represented by

$$\mathbf{r}_k = \mathbf{r}_{k,main} + \mathbf{r}_{k,tail}, \quad i - R_d + 1 \le k \le i + R_d, \tag{10.5}$$

in which the TS main part $\mathbf{r}_{k,main}$ and the TS tail part $\mathbf{r}_{k,tail}$ can be expressed respectively by

$$\mathbf{r}_{k,main} = \mathbf{\Psi}_k \mathbf{h}_k + \mathbf{n}_{k,main}, \quad i - R_d + 1 \le k \le i + R_d, \tag{10.6}$$

$$\mathbf{r}_{k,tail} = \mathbf{\Theta}_k \mathbf{h}_k + \mathbf{n}_{k,tail}, \quad i - R_d + 1 \le k \le i + R_d, \tag{10.7}$$

where $\mathbf{n}_{k,main}$ and $\mathbf{n}_{k,tail}$ are the corresponding AWGN vectors, while

Fig. 10.4 Illustration for overlap-add of the TS in the ith TDS-OFDM symbol [23]. Note that in Step 1, the length of the TS tail part is M, while in Step 2, the length of the TS tail part is the estimated CIR length




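$$\mathbf{\Psi}_k = \begin{bmatrix} c_0 & x_{k-1,N-1} & x_{k-1,N-2} & \cdots & x_{k-1,N-L+1} \\ c_1 & c_0 & x_{k-1,N-1} & \cdots & x_{k-1,N-L+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ c_{L-1} & c_{L-2} & c_{L-3} & \cdots & c_0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ c_{M-1} & c_{M-2} & c_{M-3} & \cdots & c_{M-L} \end{bmatrix}_{M \times L}, \tag{10.8}$$

$$\mathbf{\Theta}_k = \begin{bmatrix} x_{k,0} & c_{M-1} & c_{M-2} & \cdots & c_{M-L+1} \\ x_{k,1} & x_{k,0} & c_{M-1} & \cdots & c_{M-L+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{k,L-1} & x_{k,L-2} & x_{k,L-3} & \cdots & x_{k,0} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{k,M-1} & x_{k,M-2} & x_{k,M-3} & \cdots & x_{k,M-L} \end{bmatrix}_{M \times L}. \tag{10.9}$$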

Subsequently, the overlap-add results of the TS of $R_{g1}$ adjacent TDS-OFDM symbols are averaged, and then circularly correlated with the known TS, whereby the good auto-correlation and circular cross-correlation properties of the TS are exploited. Specifically, we have

$$\tilde{\mathbf{h}}_q = \frac{1}{M R_{g1}} \left( \mathbf{c} \otimes \sum_{k=q}^{q+R_{g1}-1} \mathbf{r}_k \right) = \frac{1}{R_{g1}} \sum_{k=q}^{q+R_{g1}-1} \mathbf{h}_k + \mathbf{v}_k, \quad i - R_d + 1 \le q \le i + R_d - R_{g1}, \tag{10.10}$$

where $\mathbf{v}_k$ represents the circular correlation of the interference plus the channel AWGN with the PN sequence, averaged over the TS of $R_{g1}$ adjacent TDS-OFDM symbols. Note that by averaging over $R_{g1}$ adjacent TDS-OFDM symbols, the effect of the AWGN is significantly reduced. Consequently, the coarse CE $\bar{\mathbf{h}}$ is given by

$$\bar{\mathbf{h}} = \frac{1}{2R_d - R_{g1}} \sum_{q=i-R_d+1}^{i+R_d-R_{g1}} \mathrm{abs}\left\{ \tilde{\mathbf{h}}_q \right\}. \tag{10.11}$$
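The core of the overlap-add operation can be illustrated in an idealized setting. The sketch below assumes a noise-free received TS with no interference from adjacent OFDM data blocks (so the data-dependent entries of the main-part and tail-part matrices are ignored), a single symbol ($R_{g1} = 1$), and a length-31 m-sequence as a stand-in for the PN sequence; all function names are hypothetical. Under these assumptions, adding the tail onto the main part turns the linear convolution into a circular one, and the circular correlation with the PN sequence returns each tap $h_l$ minus a small sidelobe term $\frac{1}{M}\sum_{j \ne l} h_j$.

```python
def pn_sequence_31():
    """One period of a length-31 m-sequence mapped to +/-1.

    Recurrence a[n+5] = a[n+2] XOR a[n]; the choice of sequence is an
    assumption made for illustration only.
    """
    bits = [1, 0, 0, 0, 0]
    while len(bits) < 31:
        bits.append(bits[-3] ^ bits[-5])
    return [1 - 2 * b for b in bits]


def linear_convolution(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def overlap_add(received, M):
    """Add the tail (samples M, M+1, ...) onto the first M samples."""
    main = received[:M]
    tail = received[M:] + [0.0] * (2 * M - len(received))
    return [m + t for m, t in zip(main, tail)]


def circular_correlation(c, r):
    """z[l] = (1/M) * sum_n c[n] * r[(n + l) mod M]."""
    M = len(c)
    return [sum(c[n] * r[(n + l) % M] for n in range(M)) / M
            for l in range(M)]


c = pn_sequence_31()
h = [1.0, 0.0, 0.5, 0.0]                    # hypothetical sparse CIR
received = linear_convolution(c, h)         # length M + L - 1
coarse = circular_correlation(c, overlap_add(received, len(c)))
print([round(x, 4) for x in coarse[:6]])
```

In a real TDS-OFDM frame the residual data interference and noise are not zero, which is why the chapter averages the overlap-add results over several adjacent symbols before thresholding.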

For time-varying channels, $2R_d - R_{g1}$ $(> 1)$ estimates $\tilde{\mathbf{h}}_q$ are utilized to obtain the coarse CE $\bar{\mathbf{h}}$, and this reduces the effect of instantaneous deep channel fading occurring during the duration of a particular TDS-OFDM symbol. For static channels, $2R_d - R_{g1} = 1$, and only a single estimate $\tilde{\mathbf{h}}_q$ is used to obtain the coarse CE $\bar{\mathbf{h}}$, since the channel delays and gains are constant over adjacent TDS-OFDM symbols. Finally, only the propagation path delays of the most significant taps

$$D_0 = \left\{ \tau_1 : \left| \bar{h}_{\tau_1} \right| \ge E_{th} \right\}_{\tau_1=0}^{L-1} \tag{10.12}$$

are retained, and the retained $\left\{ \bar{h}_{\tau_1} \right\}_{\tau_1=0}^{L-1}$ form the resulting coarse estimate $\bar{\mathbf{h}}$, where $E_{th}$ is the power threshold, which can be determined according to [22]. In this way, the channel length can be estimated from the coarse CE according to

$$\hat{L} = \max_{\tau_1 \in D_0} \tau_1 + a, \tag{10.13}$$

where $a$ is a variable parameter used to define the IBI-free region comprising the last $G$ samples of the received TS, which can be determined according to [18]. With an initial channel sparsity level given by $S_0 = |D_0|_c$, the channel sparsity level is then determined according to $S = S_0 + b$, where $b$ is a positive number used to combat the interference effect, since some low power paths may be treated as noise, and the value of $b$ can be calculated according to [18]. Effectively, $S$ is a coarse estimate of the number of resolvable propagation paths $P$.
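The final part of Step 1 reduces to a thresholding operation followed by two simple offsets. The sketch below assumes the coarse CE magnitudes are already available, and it uses hypothetical values for the threshold $E_{th}$ and the margins $a$ and $b$, which the chapter determines according to [22] and [18] rather than by the fixed numbers used here.

```python
def coarse_delays_length_sparsity(h_bar_mag, e_th, a, b):
    """Return (D0, L_hat, S) from the coarse CE magnitudes.

    D0    : indices of taps whose magnitude is at least e_th (Eq. 10.12)
    L_hat : max(D0) + a                                      (Eq. 10.13)
    S     : |D0| + b
    """
    d0 = [tau for tau, mag in enumerate(h_bar_mag) if mag >= e_th]
    if not d0:
        raise ValueError("no tap exceeds the threshold")
    l_hat = max(d0) + a
    s = len(d0) + b
    return d0, l_hat, s


# Hypothetical coarse CE magnitudes; E_th, a and b are illustrative values.
h_bar_mag = [0.95, 0.03, 0.45, 0.02, 0.00, 0.12, 0.01]
d0, l_hat, s = coarse_delays_length_sparsity(h_bar_mag, e_th=0.1, a=2, b=1)
print(d0, l_hat, s)
```

A larger $a$ makes the channel length estimate more conservative but shrinks the IBI-free region, while a larger $b$ leaves room for weak paths that fell below the threshold.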

10.3.1.2 Step 2. Acquisition of Coarse Channel Path Gains

The received TSs from the $(i - R_{g2} + 1)$th to the $(i + R_{g2})$th TDS-OFDM symbols are then used to provide the coarse estimate of the channel path gains according to

$$\bar{\mathbf{h}} = \mathbf{c} \otimes \frac{1}{2R_{g2} M} \sum_{k=i-R_{g2}+1}^{i+R_{g2}} \left( \mathbf{r}_{k,main} + \tilde{\mathbf{r}}_{k,tail} \right), \tag{10.14}$$

where $\tilde{\mathbf{r}}_{k,tail}$ is the vector whose first $\hat{L}$ elements are the first $\hat{L}$ elements of $\mathbf{r}_{k,tail}$, while the rest of its elements are all zeros. The coarse estimates of the channel length, the channel path delays and the path gains acquired in Steps 1 and 2 provide the priori information of the wireless channel to assist the accurate CE using the PA-IHT algorithm in the following two steps.

10.3.1.3 Step 3. Acquisition of Accurate Path Delay Estimate

We now present the proposed PA-IHT algorithm, which utilizes the priori information from the coarse CE to improve the signal recovery accuracy and to reduce the computational complexity. Define the measurement vector as

$$\bar{\mathbf{y}} = \frac{1}{2R_{g2}} \sum_{k=i-R_{g2}+1}^{i+R_{g2}} \mathbf{y}_k, \tag{10.15}$$

where $\mathbf{y}_k$ is the received SV in the IBI-free region of the $k$th TDS-OFDM symbol as given in (10.3), but its size is $\hat{G} = M - \hat{L} + 1$. The corresponding Toeplitz matrix $\mathbf{\Phi}$ of size $\hat{G} \times \hat{L}$ can be formed according to (10.4). The pseudocode of the proposed


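PA-IHT algorithm is summarized in Algorithm 10.1 [23]. The final estimated channel path delays are $D = \left\{ \tau_2 : |\hat{h}_{\tau_2}| > 0 \right\}_{\tau_2=0}^{\hat{L}-1}$, with $\left\{ \hat{h}_{\tau_2} \right\}_{\tau_2=0}^{\hat{L}-1}$ being the elements of $\hat{\mathbf{h}}$.

Algorithm 10.1 Priori-Information Aided Iterative Hard Threshold (PA-IHT).

Input: (1) Initial path delay set $D_0$, coarse CE $\bar{\mathbf{h}}$, channel sparsity level $S$; (2) Noisy measurements $\bar{\mathbf{y}}$, observation matrix $\mathbf{\Phi}$.
Output: $S$-sparse estimate $\hat{\mathbf{h}}$.
1: $\mathbf{x}^0|_{D_0} \leftarrow \bar{\mathbf{h}}|_{D_0}$;
2: $u_{current} = \|\bar{\mathbf{y}} - \mathbf{\Phi}\mathbf{x}^0\|_2$;
3: $u_{previous} = 0$;
4: while $u_{previous} \le u_{current}$, do
5: $k \leftarrow k + 1$;
6: $\mathbf{z} = \mathbf{x}^{k-1} + \mathbf{\Phi}^H \left( \bar{\mathbf{y}} - \mathbf{\Phi}\mathbf{x}^{k-1} \right)$;
7: $\Gamma = \mathrm{supt}\{\mathrm{abs}\{\mathbf{z}\}_S\}$;
8: $\mathbf{x}^k \leftarrow \mathbf{x}^{k-1}$;
9: $\mathbf{x}^k|_{\Gamma} \leftarrow \bar{\mathbf{h}}|_{\Gamma}$;
10: $\mathbf{x}^k \leftarrow \left(\mathbf{x}^k\right)_S$;
11: $u_{previous} = u_{current}$;
12: $u_{current} = \|\bar{\mathbf{y}} - \mathbf{\Phi}\mathbf{x}^k\|_2$;
13: end while
14: $\hat{\mathbf{h}} \leftarrow \mathbf{x}^{k-1}$.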

In contrast to the classical IHT algorithm or other CS based algorithms, the proposed algorithm has several attractive features. Firstly, the PA-IHT algorithm exploits the available a priori information of the coarse path delays and gains (or equivalently the locations and values of some of the large components in the target signal) as the initial condition, which significantly enhances the signal recovery accuracy and reduces the number of required iterations. Secondly, unlike the modified CoSaMP algorithm [10], the sizes of the IBI-free region and the measurement matrix are adaptively determined by the coarse channel length estimate $\hat{L}$. Thirdly, the coarse path gains serve as the nonzero element values of the target signal in every iteration. By contrast, in order to obtain these values, the modified CoSaMP algorithm has to apply LS estimation with a high-complexity matrix inversion operation, while the classical IHT algorithm uses the correlation between the measurement matrix and the residual error [13], whose convergence requires that $\|\boldsymbol{\Phi}\|_2 < 1$.
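The iteration described above can be sketched in Python as follows. This is a minimal illustration in the spirit of Algorithm 10.1, not the authors' code. Two choices are assumptions made here: the loop stops as soon as the residual no longer decreases (and is capped by a hypothetical `max_iter`), and "keeping S entries" is implemented as keeping the S largest magnitudes. The names `pa_iht` and `hard_threshold` are hypothetical.

```python
import numpy as np

def hard_threshold(x, S):
    """Keep the S largest-magnitude entries of x and zero the rest."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-S:]
    out[keep] = x[keep]
    return out

def pa_iht(y_bar, Phi, h_bar, D0, S, max_iter=20):
    """A-priori-aided IHT loop: supports are searched, gains come from h_bar."""
    x = np.zeros(Phi.shape[1], dtype=complex)
    x[D0] = h_bar[D0]                                   # initialise from the coarse CE
    u = np.linalg.norm(y_bar - Phi @ x)                 # initial residual norm
    for _ in range(max_iter):
        z = x + Phi.conj().T @ (y_bar - Phi @ x)        # gradient-type update
        gamma = np.argsort(np.abs(z))[-S:]              # indices of the S largest |z|
        x_new = x.copy()
        x_new[gamma] = h_bar[gamma]                     # reuse coarse gains, no LS step
        x_new = hard_threshold(x_new, S)                # keep at most S entries
        u_new = np.linalg.norm(y_bar - Phi @ x_new)
        if u_new >= u:                                  # assumed stopping rule
            break
        x, u = x_new, u_new
    return x
```

Because the nonzero values are always copied from the coarse estimate, no matrix inversion appears inside the loop, which is the complexity advantage the text attributes to Step 3.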

10.3.1.4

Step 4. Accurate Path Gain Estimation Based on ML

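The final channel estimate is obtained as the ML estimate [10]

$$\hat{\mathbf{h}}|_{D} = \boldsymbol{\Phi}|_{D}^{\dagger}\, \bar{\mathbf{y}} = \left( \boldsymbol{\Phi}|_{D}^{H}\, \boldsymbol{\Phi}|_{D} \right)^{-1} \boldsymbol{\Phi}|_{D}^{H}\, \bar{\mathbf{y}}, \qquad (10.16)$$

where $\hat{\mathbf{h}}$ is a vector of length $M$, whose elements outside the set $D$ are zeros. We also give the CRLB of the proposed CE method [10]

$$\mathrm{Var}_{\mathrm{CRLB}} = E\left\{ \left\| \bar{\mathbf{h}} - \mathbf{h} \right\|_2^2 \right\} = \frac{S}{2 R_{g2} G \rho}, \qquad (10.17)$$

where $\rho$ is the signal-to-noise ratio (SNR).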
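As a rough numerical illustration of Step 4, the sketch below solves the least-squares problem restricted to a given support and evaluates the bound $S/(2R_{g2}G\rho)$. It is a sketch under assumptions: `ml_path_gains` and `crlb` are hypothetical names, the pseudo-inverse is computed with `numpy.linalg.lstsq`, and the SNR is passed in dB.

```python
import numpy as np

def ml_path_gains(y_bar, Phi, D, M):
    """Least-squares gains on the support D, zeros elsewhere (cf. (10.16))."""
    D = np.asarray(D)
    h_hat = np.zeros(M, dtype=complex)
    # lstsq returns the pseudo-inverse solution of Phi[:, D] @ g = y_bar.
    gains, *_ = np.linalg.lstsq(Phi[:, D], y_bar, rcond=None)
    h_hat[D] = gains
    return h_hat

def crlb(S, Rg2, G, snr_db):
    """Evaluate S / (2 * Rg2 * G * rho) with rho converted from dB (cf. (10.17))."""
    rho = 10.0 ** (snr_db / 10.0)
    return S / (2.0 * Rg2 * G * rho)
```

Only an S-column submatrix is inverted, so the cost is governed by G and S rather than by the full channel length.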

10.3.2 Convergence Properties

In contrast to the conventional greedy algorithms for CS [12], which are generally used to solve the problem

$$\min_{\mathbf{x}} \left\{ \mathbf{x} \text{ is an } S\text{-sparse vector} : \|\mathbf{y} - \boldsymbol{\Phi}\mathbf{x}\|_2 \right\},$$

the proposed PA-IHT algorithm in Step 3 is used for support detection. This support detection can be written as

$$\min_{D} \left\{ D : \left\| \mathbf{y} - \boldsymbol{\Phi}|_{D}\, \bar{\mathbf{h}}|_{D} \right\|_2 \right\}, \qquad (10.18)$$

where $D$ is an $S$-dimensional set whose elements are in ascending order and are specified by the indexes of the elements in $\bar{\mathbf{h}}$. Obviously, $D$ is uniquely determined when $\mathbf{y}$, $\boldsymbol{\Phi}$ and $\bar{\mathbf{h}}$ are given. The proposed algorithm solves the problem of (10.18) in a greedy manner. When the iterative procedure meets the stopping criterion, the algorithm converges to at least a locally optimal solution.

Moreover, from (10.14), it is clear that h¯ is an unbiased estimate of the channel, because the data part mixed in the overlap-add result of the TS can be regarded as a noise with zero mean. Thus the estimate of the coarse channel delays in Step 3 contains the true channel path delays with high probability [22]. By exploiting the priori information of the coarse channel path delays, the number of required iterations can be reduced, and the detected support tends to be a globally optimal solution.

10.3.3 Computational Complexity

Steps 1 and 2 implement the $M$-point circular correlation using the FFT, whose complexity is on the order of $O\left((M \log_2 M)/2\right)$. In Step 3, owing to the a priori information of the acquired coarse channel gains, our algorithm avoids the matrix inversion operation. In Step 4, the ML estimate requires a matrix inversion with a complexity of $O\left(G S^2 + S^3\right)$. Obviously, the main computational burden comes from Step 4, and the complexity of our proposed algorithm is $C_{\mathrm{PA\text{-}IHT}} = O\left(G S^2 + S^3\right)$.

The conventional CoSaMP algorithm and the modified CoSaMP algorithm can be shown to have the computational complexity of $C_{\mathrm{CoSaMP}} = O\left(4 G S^3 + 8 S^4\right)$ and $C_{\mathrm{mCoSaMP}} = O\left((S - S_0)(4 G S^2 + 8 S^3)\right)$, respectively [10]. The main computational burden of those two algorithms comes from the matrix inversion operation required to obtain the sparsity information or nonzero element values in the target signal. By contrast, our algorithm acquires this information at the very low cost of $O\left((M \log_2 M)/2\right)$. Consider the typical case of the ITU-VB channel [19], where we have $S = 6$, $G = 104$ and $S_0 = 3$. The computational complexities of the three schemes are then given respectively by $C_{\mathrm{PA\text{-}IHT}} = O(3960)$, $C_{\mathrm{CoSaMP}} = O(100224)$ and $C_{\mathrm{mCoSaMP}} = O(50112)$. We then have $C_{\mathrm{PA\text{-}IHT}} / C_{\mathrm{CoSaMP}} \approx 4\%$ and $C_{\mathrm{PA\text{-}IHT}} / C_{\mathrm{mCoSaMP}} \approx 8\%$.
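The quoted figures can be checked with a few lines of Python. The sketch reads the three complexity expressions as $GS^2 + S^3$, $4GS^3 + 8S^4$ and $(S - S_0)(4GS^2 + 8S^3)$; the helper name `complexity_counts` is hypothetical, and the counts are leading-order operation counts rather than measured run times.

```python
def complexity_counts(S, G, S0):
    """Leading-order operation counts for the three schemes in Sect. 10.3.3."""
    c_pa_iht = G * S**2 + S**3                          # Step 4 of PA-IHT
    c_cosamp = 4 * G * S**3 + 8 * S**4                  # conventional CoSaMP
    c_mcosamp = (S - S0) * (4 * G * S**2 + 8 * S**3)    # modified CoSaMP
    return c_pa_iht, c_cosamp, c_mcosamp

pa, co, mco = complexity_counts(S=6, G=104, S0=3)
print(pa, co, mco)                                      # 3960 100224 50112
print(round(100 * pa / co, 1), round(100 * pa / mco, 1))  # 4.0 7.9
```

The ratios of about 4% and 8% follow directly from these counts for the ITU-VB parameters.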

10.4 Simulation Results

A simulation study was carried out to compare the performance of the proposed PA-IHT scheme with those of the existing state-of-the-art methods for the TDS-OFDM system, including the modified CoSaMP based TDS-OFDM scheme [10] and the DPN-OFDM based scheme [7]. The simulation system parameters were set as follows: $f_c = 643$ MHz, $1/T_s = 7.56$ MHz, $N = 2048$, and $M = 256$ for the conventional TDS-OFDM transmission and $M = 2 \times 256$ for the DPN-OFDM transmission, while perfect synchronization was assumed. We adopted the ITU-VB channel [19] and the China digital television test 8th channel model (CDT-8) [22] in the simulation, where both the static and mobile scenarios were investigated. The parameters $R_{g1}$ and $R_{g2}$ were adaptively set based on the channel state, while we considered $R_d = 40$ in both the mobile and static scenarios. The simulation was carried out using MATLAB R2012a. In the simulation, the variable parameter $a$ in (10.13) was chosen as approximately $a \approx 0.1 \max_{\tau_1 \in D_0} \tau_1$, while the positive number $b$ used in determining the channel sparsity level $S$ was empirically set to $b \in [0, 5]$, where the chosen value of $b$ was inversely related to the SNR.

Figure 10.5 shows the signal recovery probabilities as a function of the IBI-free region size $G$ achieved by the four different algorithms for the static ITU-VB channel, given SNR = 20 dB. In this simulation, if the MSE of the signal estimate was lower than $10^{-2}$, the recovery result was considered to be correct [10] and hence the signal recovery probability was taken to be 1. It can be clearly seen from Fig. 10.5 that the proposed PA-IHT algorithm outperforms the other three algorithms significantly. The original IHT algorithm fails to work in this case, because its convergence requires that $\|\boldsymbol{\Phi}\|_2 < 1$ [13], but the measurement matrix (10.4) does not meet this condition.
Compared with the CoSaMP algorithm and the modified CoSaMP algorithm, which require an IBI-free region of size 40 and 30, respectively, to recover the signal correctly with probability 1, the proposed PA-IHT algorithm only needs an IBI-free region of size 7. This means that the PA-IHT algorithm reduces the required observation samples by 82.5% and 76.7%, respectively, compared with the CoSaMP algorithm and the modified CoSaMP algorithm. This is because the proposed algorithm benefits from the a priori information acquired, in terms of both the locations and the values of some of the large components in the target signal. Therefore, our PA-IHT algorithm is particularly effective in combating a CIR with a very long delay spread, while the existing CS based schemes may suffer from serious performance degradation under such adverse channel conditions.

Fig. 10.5 Target signal recovery probabilities versus IBI-free region size attained by the four schemes for the static ITU-VB channel given SNR = 20 dB [23]

Figure 10.6 compares the CIR estimates obtained by the three schemes for the time-varying CDT-8 channel with 120 km/h receiver velocity given SNR = 10 dB. It can be clearly seen from Fig. 10.6b that the modified CoSaMP based scheme performs poorly. Four actual channel path taps, including the strongest echo path with a long delay spread, are missing from the CIR estimate provided by this scheme. By contrast, only one relatively insignificant channel path tap is missing from the CIR estimate obtained by the proposed scheme, as can be observed from Fig. 10.6c. This is because the CDT-8 channel has a very strong 0 dB echo with an extremely long delay spread. The coarse CE method in the modified CoSaMP scheme of [10] only uses the TS main part and discards the TS tail part (see the illustration of Fig. 10.4). Therefore, it cannot effectively detect the path delays with long delay spreads. By contrast, the proposed overlap-add method of the TS resolves this problem effectively. Moreover, by exploiting the temporal correlation of the wireless channel, the proposed PA-IHT scheme significantly improves the robustness of the coarse CE. Since the modified CoSaMP scheme [10] only utilizes the TSs preceding and following the current


Fig. 10.6 Time-domain CIR estimates of the three different schemes for the CDT-8 channel with the mobile speed of 120km/h and given SNR = 10 dB [23]: a the DPN-OFDM based scheme, b the modified CoSaMP based scheme, and c the proposed PA-IHT based scheme

OFDM data block, its coarse path delay detection may fail to work properly under instantaneous deep fading channel conditions. From Fig. 10.6a, it can be observed that the estimated gains of the first and fourth paths obtained by the DPN-OFDM based scheme are lower than the noise floor. Hence, the delays of the first and fourth paths cannot be determined from the obtained CIR estimate. Similar to the modified CoSaMP scheme, the DPN-OFDM based scheme suffers from serious performance degradation under instantaneous deep fading channel conditions, as some of the estimated channel taps may be buried in the noise. Furthermore, the DPN-OFDM based scheme has the additional drawback of lower spectral efficiency. Figures 10.7 and 10.8 compare the achievable CE MSE performance and the data demodulation BER performance of the three schemes, respectively, where the dynamic channel refers to the CDT-8 or the ITU-VB channel with a mobile speed of 120 km/h. The modulation scheme employed was quadrature phase shift keying (QPSK). It is clear that the existing modified CoSaMP based scheme achieves better performance than the existing DPN-OFDM based scheme for the ITU-VB channel,


Fig. 10.7 MSE performance comparison of the proposed PA-IHT scheme with the existing DPNOFDM based and modified CoSaMP based schemes [23]: a the ITU-VB channel, and b the CDT-8 channel


Fig. 10.8 BER performance comparison of the proposed PA-IHT scheme with the existing DPNOFDM based and modified CoSaMP based schemes [23]: a the ITU-VB channel, and b the CDT-8 channel. The modulation scheme employed is QPSK


but the modified CoSaMP based method completely fails for the CDT-8 channel. This again confirms that the modified CoSaMP scheme suffers from serious performance degradation under severe multipath propagation environments. The results of Figs. 10.7 and 10.8 clearly demonstrate that our PA-IHT based scheme significantly outperforms the two existing schemes in various wireless scenarios, especially in fast time-varying and severe multipath propagation scenarios, such as the CDT-8 channel with 120 km/h receiver velocity. More specifically, for the static ITU-VB channel, the MSE performance of the proposed PA-IHT based scheme is more than 20 dB and 30 dB better than that of the modified CoSaMP and DPN-OFDM based schemes, respectively, while it outperforms the other two schemes by approximately 8 dB and 15 dB, respectively, for the dynamic ITU-VB channel. For the dynamic and static CDT-8 channels, the MSE performance attained by our method is more than 5 dB and 20 dB better than that of the DPN-OFDM based method, respectively. Moreover, the MSE performance of our proposed method is very close to the theoretical CRLB for the two static channels. In terms of achievable BER, the superior performance of our scheme over the two existing ones is evident in Fig. 10.8, where the performance gain of our method over the existing methods is particularly noticeable under doubly selective fading channel environments. This is owing to the following reasons. The overlap-add method of the TS based on several consecutive TDS-OFDM symbols significantly improves the robustness and accuracy of the coarse estimates of the channel length and path delays. This provides accurate a priori information to assist the PA-IHT algorithm. Furthermore, the sizes of the IBI-free region and the measurement matrix are adaptive, which further improves the CE accuracy of the PA-IHT algorithm.
It is also worth pointing out again that the proposed scheme does not alter the current TDS-OFDM signal structure and that it achieves a higher spectral efficiency than the existing DPN-OFDM scheme.

10.5 Summary

In this chapter, we propose a low-complexity and high-accuracy CS based CE method, referred to as the PA-IHT algorithm, for widely deployed TDS-OFDM systems, which significantly outperforms the existing state-of-the-art methods in terms of both estimation accuracy and computational complexity. The classical IHT algorithm for TDS-OFDM requires that the l2 norm of the measurement matrix be smaller than 1 in order to guarantee convergence. By contrast, our proposed PA-IHT algorithm removes this restriction and requires only a few iterations. We also demonstrate that our scheme significantly outperforms the conventional DPN-OFDM based scheme. Compared with the DPN-OFDM scheme, our proposed PA-IHT based scheme has the additional advantage of achieving a higher spectral efficiency, and it does not alter the current TDS-OFDM signal structure. Compared with the existing CS based methods, such as the modified CoSaMP algorithm for TDS-OFDM, our PA-IHT algorithm significantly improves the accuracy of the channel estimate while imposing a much lower computational complexity. Most significantly, our scheme maintains its effectiveness under fast time-varying severe multipath environments. Under such adverse channel conditions, the existing CS based scheme for TDS-OFDM fails to work completely.

References

1. Bingham, J.A.: Multicarrier modulation for data transmission: an idea whose time has come. IEEE Commun. Mag. 28(5), 5–14 (1990)
2. Digital Video Broadcasting (DVB); Frame structure channel coding and modulation for a second generation digital terrestrial television broadcasting system (DVB-T2). ETSI Standard, EN 302 755, V1.3.1 (2012)
3. Framing Structure, Channel Coding and Modulation for Digital Television Terrestrial Broadcasting System. International DTTB Standard, GB 20600-2006 (2006)
4. Van Waterschoot, T., Le Nir, V., Duplicy, J., Moonen, M.: Analytical expressions for the power spectral density of CP-OFDM and ZP-OFDM signals. IEEE Signal Process. Lett. 17(4), 371–374 (2010)
5. Dai, L., Wang, Z., Yang, Z.: Next-generation digital television terrestrial broadcasting systems: key technologies and research trends. IEEE Commun. Mag. 50(6), 150–158 (2012)
6. Wang, J., Yang, Z.X., Pan, C.Y., Song, J., Yang, L.: Iterative padding subtraction of the PN sequence for the TDS-OFDM over broadcast channels. IEEE Trans. Consum. Electron. 51(4), 1148–1152 (2005)
7. Fu, J., Wang, J., Song, J., Pan, C.Y., Yang, Z.X.: A simplified equalization method for dual PN-sequence padding TDS-OFDM systems. IEEE Trans. Broadcast. 54(4), 825–830 (2008)
8. Huemer, M., Onic, A., Hofbauer, C.: Classical and Bayesian linear data estimators for unique word OFDM. IEEE Trans. Signal Process. 59(12), 6073–6085 (2011)
9. Dai, L., Wang, Z., Yang, Z.: Time-frequency training OFDM with high spectral efficiency and reliable performance in high speed environments. IEEE J. Sel. Areas Commun. 30(4), 695–707 (2012)
10. Dai, L., Wang, Z., Yang, Z.: Compressive sensing based time domain synchronous OFDM transmission for vehicular communications. IEEE J. Sel. Areas Commun. 31(9), 460–469 (2013)
11. Dai, W., Milenkovic, O.: Subspace pursuit for compressive sensing signal reconstruction. IEEE Trans. Inf. Theor. 55(5), 2230–2249 (2009)
12. Duarte, M.F., Eldar, Y.C.: Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011)
13. Blumensath, T., Davies, M.E.: Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 14, 629–654 (2008)
14. Bajwa, W.U., Haupt, J., Sayeed, A.M., Nowak, R.: Compressed channel sensing: a new approach to estimating sparse multipath channels. Proc. IEEE 98(6), 1058–1076 (2010)
15. Yang, B., Letaief, K.B., Cheng, R.S., Cao, Z.: Channel estimation for OFDM transmission in multipath fading channels based on parametric channel modeling. IEEE Trans. Commun. 49(3), 467–479 (2001)
16. Iyer, A., Rosenberg, C., Karnik, A.: What is the right model for wireless channel interference? IEEE Trans. Wirel. Commun. 8(5), 2662–2671 (2009)
17. Telatar, I.E., Tse, D.N.C.: Capacity and mutual information of wideband multipath fading channels. IEEE Trans. Inf. Theory 46(4), 1384–1400 (2000)
18. Dai, L., Wang, J., Wang, Z., Tsiaflakis, P., Moonen, M.: Spectrum-and energy-efficient OFDM based on simultaneous multi-channel reconstruction. IEEE Trans. Signal Process. 61(23), 6047–6059 (2013)
19. Zhang, C., Wang, Z., Pan, C., Chen, S., Hanzo, L.: Low-complexity iterative frequency domain decision feedback equalization. IEEE Trans. Vehic. Technol. 60(3), 1295–1301 (2011)
20. Van Den Berg, E., Friedlander, M.P.: Theoretical and empirical results for recovery from multiple measurements. IEEE Trans. Inf. Theor. 56(5), 2516–2527 (2010)
21. Cai, J., Song, W., Li, Z.: Doppler spread estimation for mobile OFDM systems in Rayleigh fading channels. IEEE Trans. Consum. Electron. 49(4), 973–977 (2003)
22. Wan, F., Zhu, W.P., Swamy, M.: Semi-blind most significant tap detection for sparse channel estimation of OFDM systems. IEEE Trans. Circ. Syst. I Regular Pap. 57(3), 703–713 (2009)
23. Gao, Z., Zhang, C., Wang, Z., Chen, S.: Priori-information aided iterative hard threshold: A low-complexity high-accuracy compressive sensing based channel estimation for TDS-OFDM. IEEE Trans. Wirel. Commun. 14(1), 242–251 (2015)

Correction to: Sparse Signal Processing for Massive MIMO Communications

Correction to: Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3 In the original version of the book, the following belated corrections have been incorporated: In Chapters 1 to chapter 10, many text corrections have been updated in the chapter content. The book and the chapters have been updated with the changes.

The updated version of the book can be found at https://doi.org/10.1007/978-981-99-5394-3

© Beijing Institute of Technology Press 2024 Z. Gao et al., Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3_11


Appendix A

Proof of Theorem 3.1

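We first provide the definition of SRIP for $\boldsymbol{\Phi}$ in our problem $\mathbf{Y} = \boldsymbol{\Phi}\mathbf{D} + \mathbf{W}$ (3.9), where $\mathbf{D}$ has the structured sparsity as illustrated in (3.10). Particularly, the SRIP can be expressed as

$$\sqrt{1-\delta}\, \|\mathbf{D}_{\Omega}\|_F \le \|\boldsymbol{\Phi}_{\Omega}\mathbf{D}_{\Omega}\|_F \le \sqrt{1+\delta}\, \|\mathbf{D}_{\Omega}\|_F, \qquad (A.1)$$

where $\delta \in [0, 1)$, $\Omega$ is an arbitrary set with $|\Omega|_c \le P$, and $\delta_P$ is the infimum of all $\delta$ satisfying (A.1). Note that for (A.1), $\boldsymbol{\Phi} = [\boldsymbol{\Phi}_1, \boldsymbol{\Phi}_2, \ldots, \boldsymbol{\Phi}_L] \in \mathbb{C}^{N_p \times ML}$ with $\boldsymbol{\Phi}_l \in \mathbb{C}^{N_p \times M}$ for $1 \le l \le L$, $\mathbf{D} = [\mathbf{D}_1^T, \mathbf{D}_2^T, \ldots, \mathbf{D}_L^T]^T \in \mathbb{C}^{ML \times R}$ with $\mathbf{D}_l \in \mathbb{C}^{M \times R}$ for $1 \le l \le L$, $\boldsymbol{\Phi}_{\Omega} = [\boldsymbol{\Phi}_{\Omega(1)}, \boldsymbol{\Phi}_{\Omega(2)}, \ldots, \boldsymbol{\Phi}_{\Omega(|\Omega|_c)}]$ and $\mathbf{D}_{\Omega} = [\mathbf{D}_{\Omega(1)}^T, \mathbf{D}_{\Omega(2)}^T, \ldots, \mathbf{D}_{\Omega(|\Omega|_c)}^T]^T$, and $\Omega(1) < \Omega(2) < \cdots < \Omega(|\Omega|_c)$ are the elements in the set $\Omega$. Clearly, for two different sparsity levels $P_1$ and $P_2$ with $P_1 < P_2$, we have $\delta_{P_1} \le \delta_{P_2}$. Moreover, for two sets with $\Omega_1 \cap \Omega_2 = \phi$ and the structured sparse matrix $\mathbf{D}$ with the support set $\Omega_2$, we have

$$\|\boldsymbol{\Phi}_{\Omega_1}^H \boldsymbol{\Phi}\mathbf{D}\|_F = \|\boldsymbol{\Phi}_{\Omega_1}^H \boldsymbol{\Phi}_{\Omega_2}\mathbf{D}_{\Omega_2}\|_F \le \delta_{|\Omega_1|_c+|\Omega_2|_c} \|\mathbf{D}\|_F, \qquad (A.2)$$

$$\left(1 - \frac{\delta_{|\Omega_1|_c+|\Omega_2|_c}}{\sqrt{(1-\delta_{|\Omega_1|_c})(1-\delta_{|\Omega_2|_c})}}\right) \|\boldsymbol{\Phi}_{\Omega_2}\mathbf{D}_{\Omega_2}\|_F \le \|(\mathbf{I} - \boldsymbol{\Phi}_{\Omega_1}\boldsymbol{\Phi}_{\Omega_1}^{\dagger})\boldsymbol{\Phi}_{\Omega_2}\mathbf{D}_{\Omega_2}\|_F \le \|\boldsymbol{\Phi}_{\Omega_2}\mathbf{D}_{\Omega_2}\|_F, \qquad (A.3)$$

which will be proven in Appendices B and C, respectively. To prove (3.21), we need to investigate the upper bound of $\|\mathbf{D} - \hat{\mathbf{D}}\|_F$, which can be expressed as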

© Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al. (eds.), Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3

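$$\begin{aligned}
\|\mathbf{D} - \hat{\mathbf{D}}\|_F &\le \|\mathbf{D}_{\hat{\Omega}} - \boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\mathbf{Y}\|_F + \|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F \\
&= \|\mathbf{D}_{\hat{\Omega}} - \boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}(\boldsymbol{\Phi}_{\Omega_T}\mathbf{D}_{\Omega_T} + \mathbf{W})\|_F + \|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F \\
&\le \|\mathbf{D}_{\hat{\Omega}} - \boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\boldsymbol{\Phi}_{\Omega_T}\mathbf{D}_{\Omega_T}\|_F + \|\boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\mathbf{W}\|_F + \|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F \\
&= \|\boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\boldsymbol{\Phi}_{\Omega_T/\hat{\Omega}}\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F + \|\boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\mathbf{W}\|_F + \|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F,
\end{aligned} \qquad (A.4)$$

where $\hat{\Omega}$ is the estimated support set, $\Omega_T$ is the correct support set, and $\Omega_T/\hat{\Omega}$ denotes a set whose elements belong to $\Omega_T$ except for $\hat{\Omega}$. The first inequality is due to $\|\mathbf{D}\|_F^2 = \|\mathbf{D}_{\hat{\Omega}}\|_F^2 + \|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F^2$. The second equality is due to $\boldsymbol{\Phi}_{\Omega_T}\mathbf{D}_{\Omega_T} = \boldsymbol{\Phi}_{\Omega_T/\hat{\Omega}}\mathbf{D}_{\Omega_T/\hat{\Omega}} + \boldsymbol{\Phi}_{\Omega_T\cap\hat{\Omega}}\mathbf{D}_{\Omega_T\cap\hat{\Omega}}$ and $\mathbf{D}_{\hat{\Omega}} = \boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\boldsymbol{\Phi}_{\Omega_T\cap\hat{\Omega}}\mathbf{D}_{\Omega_T\cap\hat{\Omega}}$. For $\|\boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\boldsymbol{\Phi}_{\Omega_T/\hat{\Omega}}\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F$, we have

$$\|\boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\boldsymbol{\Phi}_{\Omega_T/\hat{\Omega}}\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F = \|(\boldsymbol{\Phi}_{\hat{\Omega}}^{H}\boldsymbol{\Phi}_{\hat{\Omega}})^{-1}\boldsymbol{\Phi}_{\hat{\Omega}}^{H}\boldsymbol{\Phi}_{\Omega_T/\hat{\Omega}}\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F \le \frac{\delta_{2P}}{1-\delta_P}\|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F, \qquad (A.5)$$

where the inequality of (A.5) is due to (A.1) and (A.2). Similarly, we have $\|\boldsymbol{\Phi}_{\hat{\Omega}}^{\dagger}\mathbf{W}\|_F \le \frac{\sqrt{1+\delta_P}}{1-\delta_P}\|\mathbf{W}\|_F$. Thus we have

$$\|\mathbf{D} - \hat{\mathbf{D}}\|_F \le \frac{1-\delta_P+\delta_{2P}}{1-\delta_P}\|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F + \frac{\sqrt{1+\delta_P}}{1-\delta_P}\|\mathbf{W}\|_F. \qquad (A.6)$$

Then we will investigate the relationship between $\|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F$ and $\|\mathbf{W}\|_F$. It should be pointed out that, after we get $\hat{\Omega}$, we have $\|\mathbf{R}^{k-1}\|_F \le \|\mathbf{R}^{k}\|_F$, which inspires us to first study the relationship between $\|\mathbf{R}^{k}\|_F$ and $\|\mathbf{R}^{k-1}\|_F$. For $\|\mathbf{R}^{k}\|_F$, we can obtain

$$\begin{aligned}
\|\mathbf{R}^{k}\|_F &= \|\boldsymbol{\Phi}\mathbf{D} + \mathbf{W} - \boldsymbol{\Phi}_{\tilde{\Omega}^{k}}\boldsymbol{\Phi}_{\tilde{\Omega}^{k}}^{\dagger}(\boldsymbol{\Phi}\mathbf{D} + \mathbf{W})\|_F \\
&\le \|(\mathbf{I} - \boldsymbol{\Phi}_{\tilde{\Omega}^{k}}\boldsymbol{\Phi}_{\tilde{\Omega}^{k}}^{\dagger})\boldsymbol{\Phi}_{\Omega_T/\tilde{\Omega}^{k}}\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k}}\|_F + \|\mathbf{W} - \boldsymbol{\Phi}_{\tilde{\Omega}^{k}}\boldsymbol{\Phi}_{\tilde{\Omega}^{k}}^{\dagger}\mathbf{W}\|_F \\
&\le \|\boldsymbol{\Phi}_{\Omega_T/\tilde{\Omega}^{k}}\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k}}\|_F + \|\mathbf{W}\|_F \\
&\le \sqrt{1+\delta_P}\,\|\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k}}\|_F + \|\mathbf{W}\|_F,
\end{aligned} \qquad (A.7)$$

where we have $\boldsymbol{\Phi}\mathbf{D} = \boldsymbol{\Phi}_{\Omega_T\cap\tilde{\Omega}^{k}}\mathbf{D}_{\Omega_T\cap\tilde{\Omega}^{k}} + \boldsymbol{\Phi}_{\Omega_T/\tilde{\Omega}^{k}}\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k}}$, $\boldsymbol{\Phi}_{\Omega_T\cap\tilde{\Omega}^{k}}\mathbf{D}_{\Omega_T\cap\tilde{\Omega}^{k}} = \boldsymbol{\Phi}_{\tilde{\Omega}^{k}}\boldsymbol{\Phi}_{\tilde{\Omega}^{k}}^{\dagger}\boldsymbol{\Phi}_{\Omega_T\cap\tilde{\Omega}^{k}}\mathbf{D}_{\Omega_T\cap\tilde{\Omega}^{k}}$, and the second inequality is due to (A.3) and $\|\mathbf{W} - \boldsymbol{\Phi}_{\tilde{\Omega}^{k}}\boldsymbol{\Phi}_{\tilde{\Omega}^{k}}^{\dagger}\mathbf{W}\|_F \le \|\mathbf{W}\|_F$.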

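On the other hand, we consider $\|\mathbf{R}^{k-1}\|_F$, which can be expressed as

$$\begin{aligned}
\|\mathbf{R}^{k-1}\|_F &\ge \|(\mathbf{I} - \boldsymbol{\Phi}_{\tilde{\Omega}^{k-1}}\boldsymbol{\Phi}_{\tilde{\Omega}^{k-1}}^{\dagger})\boldsymbol{\Phi}_{\Omega_T/\tilde{\Omega}^{k-1}}\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k-1}}\|_F - \|\mathbf{W}\|_F \\
&\ge \frac{1-\delta_P-\delta_{2P}}{1-\delta_P}\|\boldsymbol{\Phi}_{\Omega_T/\tilde{\Omega}^{k-1}}\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k-1}}\|_F - \|\mathbf{W}\|_F \\
&\ge \frac{1-\delta_P-\delta_{2P}}{\sqrt{1-\delta_P}}\|\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k-1}}\|_F - \|\mathbf{W}\|_F,
\end{aligned} \qquad (A.8)$$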

where the second inequality is due to (A.3). To further investigate the relationship and (A.8), we will derive   between (A.7)     the relationship between DT /˜ k  and DT /˜ k−1  . For convenience, we denote F F   L  = s {Zl  F }l=1 in Step 2.3 of Algorithm 3.1, then we can get    H k−1    R 

F

   H  =  (Y −  ˜ k−1  †˜ k−1 Y)    F    H  † =  (D + W −  ˜ k−1  ˜ k−1 (D + W))    F      H   H  ≤  (D −  ˜ k−1  †˜ k−1 D)  +  (W −  ˜ k−1  †˜ k−1 W)  . 





F







F

(A.9) = For the first part of the right-hand in the inequality of (A.9), we denote R † D − ˜ k−1 ˜ k−1 D, and  k−1

R

k−1

= (I−˜ k−1 †˜ k−1 )(T /˜ k−1 DT /˜ k−1 + T ∩˜ k−1 DT ∩˜ k−1 )  D /˜ k−1 T = [T /˜ k−1 , ˜ k−1 ] −†˜ k−1 T /˜ k−1 DT /˜ k−1

(A.10)

˜ k−1 , = T ∪˜ k−1 D where 

˜ T ∪

k−1

= [

˜ T /

k−1

˜ k−1 = [DT , ˜ k−1 ] and D 

˜ T /

k−1

, −(†˜ k−1 T /˜ k−1

DT /˜ k−1 )T ]T . The second equality of (A.10) is due to T ∩˜ k−1 DT ∩˜ k−1 −˜ k−1 †˜ k−1 T ∩˜ k−1 DT ∩˜ k−1 = 0. It should be pointed out that if W = 0, we have R k−1 = Rk−1 . For the second part of the right-hand in the inequality of (A.9), we have         H † †  (W − ˜ k−1 ˜ k−1 W)  = H (I − ˜ k−1 ˜ k−1 )W  F F  ≤ 1 + δ P W F .

(A.11)

By substituting (A.10) and (A.11) into (A.9), we have     H k−1  H  R  ≤  ˜ k−1    + 1 + δ P W F   k−1 D ˜    ∪   T F F     k−1  = H R  + 1 + δ P W F , F

(A.12)


On the other hand, we have    H k−1    R 

   H k−1  ≥  R  T F F       H   H † † ≥  (D −   D) − (W −   W)    k−1 k−1  k−1 k−1 ˜ ˜ ˜ ˜ T T     F F     H  k−1  ≥  R  − 1 + δ P W F . T F

(A.13)

Combining (A.12) and (A.13), we have        H  k−1  k−1   ≥ HT R  − 2 1 + δ P W F .  R F

F

(A.14)

Due to the following inequality              H  k−1   ≥ HT R k−1  ≥ HT /˜ k−1 R k−1  ,  R F

F

F

(A.15)

(A.14) can be further expressed as the following inequality by removing the common ˜ k−1 , i.e., set of  and T /         H   H k−1  R  /T R k−1  ≥ {  − 2 1 + δ P W F , ˜ k−1 }/ T / F

  H here {

˜ T /

k−1 }/



F

(A.16)

   R k−1  can be expressed as F

       H    {T /˜ k−1 }/ R k−1  = HT /˜  k R k−1  F F    H ˜ k−1  =  /˜  k T ∪˜ k−1 D  T F   H ˜ k−1 k−1 =  /˜  k ({T ∪˜ k−1 }/{T /˜  k } D ˜ ˜ k } {T ∪ }/{T / T  ˜ k−1  k ) + T /˜  k D  ˜ T / F    H  k−1 ˜ ≥  /˜  k T /˜  k D /˜  k  T T F     ˜ k−1 k−1 − H /˜  k {T ∪˜ k−1 }/{T /˜  k } D k  ˜ ˜ {T ∪ }/{T / } F T      ˜ k−1   ˜ k−1  D ≥ (1 − δ P )D − δ     3P ˜ k T /  F  F    ˜ k−1   = (1 − δ P )DT /˜ k  − δ3P D  , F F (A.17) ˜ k−1 = φ and  ∪  ˜ k−1 =  ˜ k , the secwhere the first equality is due to  ∩  ˜ k−1 . ond equality is due to (A.10), of D   and the last equalityis due tothe definition     ˜ k−1    ˜ k−1  Since H /T R k−1  = H /T T ∪˜ k−1 D  ≤ δ3P D  , by substitutF F F ing (A.17) into (A.16), we have

Appendix A: Proof of Theorem 3.1

  (1 − δ P )D

 

˜ k  T / F

209

    ˜ k−1  ≤ 2δ3P D  + 2 1 + δ P W F . F

(A.18)

   ˜ k−1  It should be pointed out that for D  , we can further get F

       ˜ k−1    †   D  ≤ DT /˜ k−1  + ˜ k−1 T /˜ k−1 DT /˜ k−1  F F F     −1 H     H = D /˜ k−1  + (˜ k−1 ˜ k−1 ) ˜ k−1 T /˜ k−1 DT /˜ k−1  T F F    (A.19) δ2P      ≤ D /˜ k−1  + DT /˜ k−1  T F F 1 − δP  1 − δ P + δ2P    = DT /˜ k−1  , F 1 − δP ˜ k−1 . By substituting (A.18) into where the first inequality is due to the definition of D (A.19), we have     DT /˜ k−1  ≥ F

√   1 + δ P (1−δ P ) (1 − δ P )2   W F . (A.20) D ˜  k  − 2δ3P (1−δ P +δ2P ) T / F δ3P (1−δ P +δ2P )

Then, we investigate DT /˜ k , which can be expressed as         DT /˜ k  = DT ∩{˜  k /˜ k }+T /˜  k  F     F      ≤ DT ∩{˜ k /˜ k }  + DT /˜  k  F F          = D˜ k /˜ k  + DT /˜  k  , F

(A.21)

F

   ˜k ⊂ ˜  k . For  where we use the fact that  D˜  k /˜ k  , we can further obtain F

         D˜ k /˜ k  = D˜  k ∩{˜  k /˜ k } + E˜  k /˜ k  F F            ≤ D˜ k ∩{˜ k /˜ k }  + E˜ k /˜ k  F F           ≤ D˜ k ∩  + E˜ k /˜ k  F   F       = D˜  k ∩ − E F + E˜  k /˜ k   F   ≤ D  F + E  F + E˜  k /˜ k  F     = 0 + E  F + E˜  k /˜ k  F

≤ 2E F ,

(A.22)




where we introduce the error variable E = D˜  k − D˜  k ( D˜  k is obtained in Step ˜  k , and 2.3 of Algorithm 3.1), and  is an arbitrary set satisfying  c = P,  ⊂  ˜  k / ˜ k is the  ∩ T = φ. The second inequality in (A.22) is due to the fact that  discarded support in the step of support pruning in Algorithm 3.1. According to the definition of E, we further obtain      E F = D˜  k − D˜  k  F      † = D˜ k − ˜  k Y F      † (A.23) = D˜ k − ˜  k (D + W)    F      ≤ D˜  k − †˜  k D +†˜  k W F   F        † = D˜ k − ˜  k T DT  +†˜  k W . F

F

    For D˜  k − †˜  k T DT  , we can have F

         † † D˜ k − ˜  k T DT  = D˜  k − ˜  k (T ∩˜  k DT ∩˜  k + T /˜  k DT /˜  k ) F F    = (D˜  k−†˜  k T ∩˜  k DT ∩˜  k )−†˜  k T /˜  k DT /˜  k  F    † † = (D˜  k − ˜  k ˜  k D˜  k ) − ˜  k T /˜  k DT /˜  k  F      †    = D˜ k − D˜ k − ˜  k T /˜ k DT /˜ k  F     † = ˜  k T /˜  k DT /˜  k  F  δ3P    ≤ D ˜  k  , 1−δ2P T / F (A.24)     †  k ˜ where the last inequality is due to | |c = 2P. While for ˜  k W in (A.23), we F have      † (A.25) ˜  k W ≤ δ2P / 1−δ2P W F . F

By substituting (A.22)–(A.25) into (A.21), we can obtain   √     (1−δ2P )DT /˜ k  − 2δ P 1−δ2P W F   F . DT /˜  k  ≥ F 1−δ2P + 2δ3P

(A.26)


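Furthermore, by substituting (A.26) into (A.20), we can obtain

$$\|\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k-1}}\|_F \ge C_1 \|\mathbf{D}_{\Omega_T/\tilde{\Omega}^{k}}\|_F - C_2 \|\mathbf{W}\|_F, \qquad (A.27)$$

with

$$C_1 = \frac{(1-\delta_P)^2(1-\delta_{2P})}{2\delta_{3P}(1-\delta_P+\delta_{2P})(1-\delta_{2P}+2\delta_{3P})}, \qquad C_2 = \frac{1-\delta_P}{\delta_{3P}(1-\delta_P+\delta_{2P})}\left(\frac{\delta_P(1-\delta_P)\sqrt{1-\delta_{2P}}}{1-\delta_{2P}+2\delta_{3P}} + \sqrt{1+\delta_P}\right).$$

As we have discussed, if $\|\mathbf{R}^{k-1}\|_F \le \|\mathbf{R}^{k}\|_F$, the iteration quits, which indicates that the estimation of the $P$-sparse signal $\mathbf{D}$ is obtained, and $\hat{\Omega} = \tilde{\Omega}^{k-1}$. Then we can combine (A.7), (A.8), and (A.27) to obtain

$$\|\mathbf{D}_{\Omega_T/\hat{\Omega}}\|_F \le C_3 \|\mathbf{W}\|_F, \qquad (A.28)$$

where $C_3 = \dfrac{2C_1\sqrt{1-\delta_P} + C_2\sqrt{1-\delta_P^2}}{C_1(1-\delta_P-\delta_{2P}) - \sqrt{1-\delta_P^2}}$. By substituting (A.28) into (A.6), we have

$$\|\mathbf{D} - \hat{\mathbf{D}}\|_F \le C_4 \|\mathbf{W}\|_F, \qquad (A.29)$$

where $C_4 = \dfrac{C_3(1-\delta_P+\delta_{2P}) + \sqrt{1+\delta_P}}{1-\delta_P}$. Thus we prove (3.21). Finally, in the iterative process, we have $\|\mathbf{R}^{k-1}\|_F > \|\mathbf{R}^{k}\|_F$, and by substituting (A.7) and (A.8) into (A.27), we can obtain

$$\|\mathbf{R}^{k-1}\|_F > \frac{C_1(1-\delta_P-\delta_{2P})}{\sqrt{1-\delta_P^2}}\|\mathbf{R}^{k}\|_F - \left(1 + \frac{(1-\delta_P-\delta_{2P})(C_1 + C_2\sqrt{1+\delta_P})}{\sqrt{1-\delta_P^2}}\right)\|\mathbf{W}\|_F. \qquad (A.30)$$

In this way, we prove (3.22).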

Appendix B

Proof of (A.2)

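We consider two matrices $\mathbf{D}$ and $\mathbf{D}'$ that have the structured sparsity as illustrated in (3.10), with the respective structured support sets $\Omega_2$ and $\Omega_1$, where $\Omega_1 \cap \Omega_2 = \phi$. Moreover, we consider $\bar{\mathbf{D}}' = \mathbf{D}'/\|\mathbf{D}'\|_F$ and $\bar{\mathbf{D}} = \mathbf{D}/\|\mathbf{D}\|_F$. According to (A.1), we can obtain

$$2(1-\delta_{|\Omega_1|_c+|\Omega_2|_c}) \le \left\| [\boldsymbol{\Phi}_{\Omega_1}, \boldsymbol{\Phi}_{\Omega_2}] \begin{bmatrix} \bar{\mathbf{D}}'_{\Omega_1} \\ \bar{\mathbf{D}}_{\Omega_2} \end{bmatrix} \right\|_F^2 \le 2(1+\delta_{|\Omega_1|_c+|\Omega_2|_c}), \qquad (B.1)$$

$$2(1-\delta_{|\Omega_1|_c+|\Omega_2|_c}) \le \left\| [\boldsymbol{\Phi}_{\Omega_1}, \boldsymbol{\Phi}_{\Omega_2}] \begin{bmatrix} \bar{\mathbf{D}}'_{\Omega_1} \\ -\bar{\mathbf{D}}_{\Omega_2} \end{bmatrix} \right\|_F^2 \le 2(1+\delta_{|\Omega_1|_c+|\Omega_2|_c}). \qquad (B.2)$$

From (B.1) and (B.2), we obtain

$$-\delta_{|\Omega_1|_c+|\Omega_2|_c} \le \mathrm{Re}\{\langle \boldsymbol{\Phi}_{\Omega_1}\bar{\mathbf{D}}'_{\Omega_1}, \boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2} \rangle\} \le \delta_{|\Omega_1|_c+|\Omega_2|_c}, \qquad (B.3)$$

where for two matrices $\mathbf{A}$ and $\mathbf{B}$, we have $\mathrm{Re}\{\langle \mathbf{A}, \mathbf{B} \rangle\} = \frac{\|\mathbf{A}+\mathbf{B}\|_F^2 - \|\mathbf{A}-\mathbf{B}\|_F^2}{4}$. Moreover, we exploit the Cauchy-Schwarz inequality $\|\mathbf{A}\|_F\|\mathbf{B}\|_F \ge |\langle \mathbf{A}, \mathbf{B} \rangle|$, where the equality holds only for $\mathbf{A} = c\mathbf{B}$ and $c$ is a complex constant. Particularly,

$$\begin{aligned}
\|\boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2}\|_F = \|\bar{\mathbf{D}}'_{\Omega_1}\|_F\|\boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2}\|_F
&= \max_{\bar{\mathbf{D}}'_{\Omega_1} = c\boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2}} \left| \langle \boldsymbol{\Phi}_{\Omega_1}\bar{\mathbf{D}}'_{\Omega_1}, \boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2} \rangle \right| \\
&= \max_{\bar{\mathbf{D}}'_{\Omega_1} = c\boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2}} \mathrm{Re}\{\langle \boldsymbol{\Phi}_{\Omega_1}\bar{\mathbf{D}}'_{\Omega_1}, \boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2} \rangle\} \\
&\le \delta_{|\Omega_1|_c+|\Omega_2|_c},
\end{aligned} \qquad (B.4)$$

where $c$ is a complex constant, and the second equality of (B.4) is due to $\mathrm{Im}\{\langle \boldsymbol{\Phi}_{\Omega_1}\bar{\mathbf{D}}'_{\Omega_1}, \boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2} \rangle\} = c\,\mathrm{Im}\{\langle \boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2}, \boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\bar{\mathbf{D}}_{\Omega_2} \rangle\} = 0$. In this way, we have

$$\|\boldsymbol{\Phi}_{\Omega_1}^H\boldsymbol{\Phi}_{\Omega_2}\mathbf{D}_{\Omega_2}\|_F \le \delta_{|\Omega_1|_c+|\Omega_2|_c}\|\mathbf{D}_{\Omega_2}\|_F, \qquad (B.5)$$

and (A.2) is proven.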

Appendix C

Proof of (A.3)

Clearly, we have           † † (I−1 1 )2 D2  ≥ 2 D2  F −1 1 2 D2  , F

F

(C.1)

 2   For 1 † 1 2 D2  , we have F

 2     † † † 1 1 2 D2  = 1 1 2 D2 , 1 1 2 D2 F   = Re{ 1 † 1 2 D2 , 1 † 1 2 D2 }  = Re{ 1 † 1 2 D2 , 1 † 1 2 D2  +2 D2 −1 † 1 2 D2 }   = Re{ 1 † 1 2 D2 , 2 D2 }       ≤ δ|1 |c +|2 |c † 1 2 D2  D2  F F       † 1 1 2 D2   D  2 2 F F   ≤ δ|1 |c +|2 |c , 1−δ|2 |c 1−δ|1 |c

(C.2)

where the first inequality in (C.2) is due to (B.3), and the third equality in (C.2) is due to the following equality, 

1 † 1 2 D2 , 2 D2 −1 † 1 2 D2



† H † H H H = DH 2 2 (1 ) (1 2 D2 −1 1 1 2 D2 )

(C.3)

† H H H H = DH 2 2 (1 ) (1 2 D2 −1 2 D2 )

= 0. © Beijing Institute of Technology Press 2024, corrected publication 2024 Z. Gao et al. (eds.), Sparse Signal Processing for Massive MIMO Communications, https://doi.org/10.1007/978-981-99-5394-3

215

216

Appendix C: Proof of (A.3)

Here † 1 = (H1 1 )−1 H1 . Moreover, (C.2) can be expressed as     δ|1 |c +|2 |c 2 D2  F   † . 1 1 2 D2  ≤  F (1 − δ|1 |c )(1 − δ|2 |c )

(C.4)

By substituting (C.4) into (C.1), we have     † (I−1 1 )2 D2  ≥ (1 −  F

δ|1 |c +|2 |c (1 − δ|1 |c )(1 − δ|2 |c )

  )2 D2  F , (C.5)

Thus, the right inequality of (A.3) is proven. Finally, due to (C.3), we have  2  2      † †  D 2 =    D +  ) D  (I−  1 1 2 2 1 1 2 2  , 2 2 F F

which indicates

F

     †  D  ≥   ) D . (I−    2 2 F 1 2 2 1 F

Hence the left inequality of (A.3) is proven.

(C.6)

(C.7)
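The orthogonal-projection argument behind (C.3), (C.6) and (C.7) can be checked numerically. The sketch below is only an illustration with random matrices: `Phi1` stands in for $\boldsymbol{\Phi}_{\Omega_1}$ and `B` for the product $\boldsymbol{\Phi}_{\Omega_2}\mathbf{D}_{\Omega_2}$. The helper name `projection_split` and the matrix sizes are assumptions made here.

```python
import numpy as np

def projection_split(Phi1, B):
    """Split B into its component in range(Phi1) and the orthogonal remainder."""
    P = Phi1 @ np.linalg.pinv(Phi1)       # orthogonal projector Phi1 Phi1^dagger
    return P @ B, B - P @ B

rng = np.random.default_rng(1)
Phi1 = rng.standard_normal((20, 4)) + 1j * rng.standard_normal((20, 4))
B = rng.standard_normal((20, 3)) + 1j * rng.standard_normal((20, 3))

proj, resid = projection_split(Phi1, B)
lhs = np.linalg.norm(B, "fro") ** 2                                  # left side of (C.6)
rhs = np.linalg.norm(resid, "fro") ** 2 + np.linalg.norm(proj, "fro") ** 2
```

Up to rounding, `lhs` and `rhs` agree because the projected part and the remainder are orthogonal, and dropping the non-negative projected term gives the bound in (C.7).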

Appendix D

Derivation of Eq. (6.7)

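Sampling the delay-domain continuous $\mathbf{H}_q(\tau)$ in (6.30) with the sampling period $T_s$ yields

$$\begin{aligned}
\mathbf{H}_q(nT_s) &= \beta_q \sum_{n=-\infty}^{\infty}\sum_{l=1}^{L_q} \mathbf{H}_{q,l}\, p(\tau - \tau_{q,l}) \star \delta(\tau - nT_s) \\
&= \beta_q \sum_{n=-\infty}^{\infty}\sum_{l=1}^{L_q} \mathbf{H}_{q,l}\, p(nT_s - \tau_{q,l}),
\end{aligned} \qquad (D.1)$$

where $\star$ and $\delta(\cdot)$ represent the linear convolution operation and the Dirac delta function, respectively. The Fourier transform of $\mathbf{H}_q(nT_s)$ is then given by

$$\mathbf{H}_q(f) = \frac{\beta_q}{T_s}\sum_{l=1}^{L_q}\sum_{n=-\infty}^{\infty} \mathbf{H}_{q,l}\, P(f)\, e^{-j2\pi f \tau_{q,l}}\, \delta(f - n f_s), \qquad (D.2)$$

where $P(f)$ is the Fourier transform of $p(\tau)$. Obviously, $\mathbf{H}_q(f)$ exhibits periodicity with period $f_s$. Thus, $\mathbf{H}_q(f)$ within a period of $f \in [-f_s/2, f_s/2]$ can be expressed as

$$\mathbf{H}_q(f) = \frac{\beta_q}{T_s}\sum_{l=1}^{L_q} P(f)\, \mathbf{H}_{q,l}\, e^{-j2\pi f \tau_{q,l}} \approx \frac{\beta_q}{T_s}\, C \sum_{l=1}^{L_q} \mathbf{H}_{q,l}\, e^{-j2\pi f \tau_{q,l}}. \qquad (D.3)$$

The approximation in (D.3) is valid because the PSF $p(\tau)$ is designed to realize the ideal passband filter characteristics of $P(f) = C$ for $f \in [-f_s/2, f_s/2]$ and $P(f) \approx 0$ for $f \notin [-f_s/2, f_s/2]$. For convenience, we consider $C = T_s$. Therefore, the frequency-domain channel matrix $\mathbf{H}_q[k]$ at the $k$th subcarrier, where $0 \le k \le K - 1$, can be written as

$$\mathbf{H}_q[k] = \mathbf{H}_q\!\left(\frac{k f_s}{K}\right) = \beta_q \sum_{l=1}^{L_q} \mathbf{H}_{q,l}\, e^{-j\frac{2\pi k f_s \tau_{q,l}}{K}}. \qquad (D.4)$$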
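To make the last step concrete, the per-subcarrier sum $\beta_q \sum_l \mathbf{H}_{q,l} e^{-j2\pi k f_s \tau_{q,l}/K}$ can be evaluated as sketched below. This is an illustration under assumptions: the function name `freq_domain_channel` is hypothetical, the path gains may be scalars or matrices stacked along the first axis, and the delays are given in seconds.

```python
import numpy as np

def freq_domain_channel(path_gains, delays, fs, K, beta=1.0):
    """Evaluate beta * sum_l H_l * exp(-j 2 pi k fs tau_l / K) for k = 0..K-1.

    path_gains : array of shape (L,) or (L, Nr, Nt), one gain (matrix) per path
    delays     : array of shape (L,), path delays in seconds
    Returns an array of shape (K,) or (K, Nr, Nt).
    """
    path_gains = np.asarray(path_gains, dtype=complex)
    delays = np.asarray(delays, dtype=float)
    k = np.arange(K)
    phase = np.exp(-2j * np.pi * np.outer(k, delays) * fs / K)   # shape (K, L)
    return beta * np.tensordot(phase, path_gains, axes=(1, 0))   # sum over the paths
```

When every delay is an integer multiple of the sampling period, the result coincides with the K-point DFT of the corresponding tap vector, which gives a quick sanity check; off-grid delays are handled by the same expression.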
