English Pages 773 Year 2023
Signals and Communication Technology
Xingqin Lin Jun Zhang Yuanwei Liu Joongheon Kim Editors
Fundamentals of 6G Communications and Networking
Signals and Communication Technology Series Editors Emre Celebi, Department of Computer Science, University of Central Arkansas, Conway, AR, USA Jingdong Chen, Northwestern Polytechnical University, Xi’an, China E. S. Gopi, Department of Electronics and Communication Engineering, National Institute of Technology, Tiruchirappalli, Tamil Nadu, India Amy Neustein, Linguistic Technology Systems, Fort Lee, NJ, USA Antonio Liotta, University of Bolzano, Bolzano, Italy Mario Di Mauro, University of Salerno, Salerno, Italy
This series is devoted to fundamentals and applications of modern methods of signal processing and cutting-edge communication technologies. The main topics are information and signal theory, acoustical signal processing, image processing and multimedia systems, mobile and wireless communications, and computer and communication networks. Volumes in the series address researchers in academia and industrial R&D departments. The series is application-oriented. The level of presentation of each individual volume, however, depends on the subject and can range from practical to scientific. Indexing: All books in “Signals and Communication Technology” are indexed by Scopus and zbMATH For general information about this book series, comments or suggestions, please contact Mary James at [email protected] or Ramesh Nath Premnath at [email protected].
Editors
Xingqin Lin, NVIDIA, Santa Clara, CA, USA
Jun Zhang, Department of ECE, Hong Kong University of Science and Technology, Kowloon, Hong Kong
Yuanwei Liu, Queen Mary University of London, London, UK
Joongheon Kim, Korea University, Seoul, Korea (Republic of)
ISSN 1860-4862 ISSN 1860-4870 (electronic) Signals and Communication Technology ISBN 978-3-031-37919-2 ISBN 978-3-031-37920-8 (eBook) https://doi.org/10.1007/978-3-031-37920-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.
Contents

Introduction to 6G Communications and Networking
Xingqin Lin, Jun Zhang, Yuanwei Liu, and Joongheon Kim

Part I 6G Vision and Driving Forces

6G: Vision, Applications, and Challenges
Daniel Benevides da Costa, Qiyang Zhao, Marwa Chafii, Faouzi Bader, and Mérouane Debbah

6G Visions and Requirements
Yifei Yuan

Next G Applications and Use Cases
Mitch Tseng

Bridging the Digital Divide
Maurilio Matracia, Aniq Ur Rahman, Ruibo Wang, Mustafa A. Kishk, and Mohamed-Slim Alouini

Part II Enabling Technologies for 6G Communications

AI-Native Air Interface
Jiajia Guo, Chao-Kai Wen, and Shi Jin

Waveform and Modulation Design of Terahertz Communications
Yongzhi Wu and Chong Han

OTFS and Delay-Doppler Domain Modulation: Signal Detection and Channel Estimation
Qinghua Guo, Zhengdao Yuan, Fei Liu, and Jinhong Yuan

Index Modulation: From Waveform Design to Reconfigurable Intelligent Surfaces
Ali Tugberk Dogukan, Miaowen Wen, and Ertugrul Basar

Advanced Channel Coding for 6G
Kai Niu

Developing NOMA to Next-Generation Multiple Access
Wenqiang Yi, Yuanwei Liu, and Zhiguo Ding

Near-Field Beamforming and Multiplexing Using Extremely Large Aperture Arrays
Parisa Ramezani and Emil Björnson

Orbital-Angular-Momentum-Embedded Massive-MIMO: Achieving Multiplicative Spectrum Efficiency for 6G Communications
Wenchi Cheng, Liping Liang, Haiyue Jing, Hailin Zhang, and Zan Li

Integrated Sensing and Communications for Emerging Applications in 6G Wireless Networks
Zhen Du and Fan Liu

Reconfigurable Intelligent Surfaces Toward 6G: From Reflection Only to Simultaneous Transmission and Reflection (STAR)
Xidong Mu, Jiaqi Xu, and Yuanwei Liu

Full-Duplex Transceivers for Next-Generation Wireless Communication Systems
Ian P. Roberts and Himal A. Suraweera

Optical Wireless Communication
Iman Tavakkolnia, Hossein Kazemi, Elham Sarbazi, and Harald Haas

Part III Network Evolution Towards 6G

Cell-Free Massive MIMO
Giovanni Interdonato and Stefano Buzzi

6G Radio Access Implementation: Challenges and Technologies
Alan Gatherer, Chaitali Sengupta, and Sudipta Sen

Network Disaggregation
Soohyun Park, Chanyoung Park, Jae Pyoung Kim, Minseok Choi, and Joongheon Kim

AI-Native Communications
Hankyul Baek, Haemin Lee, Soohyun Park, Hyunsoo Lee, Jihong Park, and Joongheon Kim

AI-Native Network Algorithms and Architectures
Haemin Lee, Soohyun Park, Hankyul Baek, Chanyoung Park, Seokbin Son, Jihong Park, and Joongheon Kim

Pareto Deterministic Policy Gradients and Its Application in 6G Networks
Zhou Zhou, Yan Xin, Hao Chen, Charlie Zhang, Lingjia Liu, and Kai Yang

Ultra-Reliable and Low-Latency Communications in 6G: Challenges, Solutions, and Future Directions
Changyang She and Yonghui Li

Deterministic Network
Jessie Hui Wang, Yipeng Zhou, and Yuedong Xu

UAV Communications and Networks
Soohyun Park, Ju-Hyung Lee, Soyi Jung, and Joongheon Kim

Non-terrestrial Network
Xinrui Li, Yijia Huang, Cheng Zhan, and Yong Zeng

Convergence of 6G and Wi-Fi Networks
Hyunsoo Lee, Soohyun Park, Minjae Yoo, Chanyoung Park, Hankyul Baek, and Joongheon Kim

Semantic Communications and Networking
Won Joon Yun, Soohyun Park, Rhoan Lee, Jihong Park, Young-Chai Ko, and Joongheon Kim

Network Security and Trustworthiness
Soyi Jung, Soohyun Park, Seok Bin Son, Haemin Lee, and Joongheon Kim

Index
Introduction to 6G Communications and Networking Xingqin Lin, Jun Zhang, Yuanwei Liu, and Joongheon Kim
1 Book Objectives

The 3rd Generation Partnership Project (3GPP) completed the first release of the fifth-generation (5G) mobile communication systems in its Release 15 in June 2018. Since then, service providers have been deploying commercial 5G networks worldwide. The 5G networks provide new communication capabilities and new services that transform our society. In February 2021, the International Telecommunication Union (ITU) Radiocommunication Sector (ITU-R) published the Recommendation ITU-R M.2150 on the International Mobile Telecommunication (IMT)-2020 specifications (a.k.a. 5G specifications). This major milestone marks the official ITU approval of the IMT-2020 technologies. 3GPP continues 5G evolution to further improve performance and address new use cases after Release 15. The next significant wave of 5G evolution is 5G Advanced, starting from 3GPP Release 18, which addresses enhanced mobile broadband and expanded vertical use cases across the end-to-end 5G system.
X. Lin (✉)
NVIDIA, Santa Clara, CA, USA
e-mail: [email protected]

J. Zhang
Hong Kong University of Science and Technology, Hong Kong, China
e-mail: [email protected]

Y. Liu
Queen Mary University of London, London, UK
e-mail: [email protected]

J. Kim
Korea University, Seoul, South Korea
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_1
The expectations from society, industries, and consumers on future networks are growing. The future networks must be trustworthy, resilient, secure, and privacy-preserving. Sustainability must be at the core of the future networks to achieve carbon neutrality in the coming decades. The future networks need to be cost-efficient to be deployed in various environments, including urban, suburban, rural, and remote areas, to bridge the digital divide. The new capabilities provided by the future networks will need to enable multisensory experiences, enhance personalized services, and improve human-human, human-machine, and machine-machine interactions.

The growing expectations and needs of society, industries, and consumers about the future networks will eventually be beyond what 5G can address. This calls for further evolution of mobile communication systems to the sixth generation (6G). To meet the expectations of the future networks, we need to develop new advanced technologies such as artificial intelligence (AI) native communication and networking, distributed cloud, and network virtualization, among others.

6G research is ramping up. We anticipate seeing large-scale 6G deployments in the 2030s. ITU-R Working Party 5D (WP 5D) has been working on a new draft Recommendation on “IMT Vision for 2030 and beyond,” including user and application trends, the evolution of IMT, usage scenarios, capabilities, framework, and objectives for 6G. It is expected that the ITU will issue its IMT-2030 vision document in 2023. WP 5D is also developing a new draft on future technology trends to provide information on the technical and operational characteristics of terrestrial IMT systems. Meanwhile, we anticipate that 3GPP will start the studies on 6G around 2025 and complete the first release of 6G specifications around 2028. Once 3GPP starts the 6G work, the chances of new technologies becoming part of the 6G standards and commercial products diminish quickly. We therefore need to act with urgency in the coming years to develop the basic and advanced technologies that will underpin 6G systems.

The main objective of this book is to provide an accessible, complete state-of-the-art treatment of the fundamentals of 6G communications and networking. This book will cover the foundation of an end-to-end 6G system, key radio technology components for 6G communications, and major enablers for intelligent 6G networking. We anticipate that a substantial portion of the book will be of interest even beyond 6G.

The following sections present more details about the intended scope of this book. We first provide a historical overview of the evolution of cellular systems from 1G to 5G. We then present a vision of what 6G will be. The following section describes the major 6G research activities worldwide. We close the chapter by outlining the contents of the remaining chapters of this book.
2 Evolution of Cellular Systems

The evolution of wireless communications can be traced back to the nineteenth century. Following the pioneering research on electromagnetic radiation by Maxwell
and Hertz, Marconi demonstrated that signals could be transmitted to an unprecedented distance of 2000 miles across the Atlantic Ocean using radio waves. By the end of the 1920s, wireless communications were mainly utilized by police, gangsters, and the rich and famous. With the development of wireless technology, early mobile telephone systems were introduced in Europe, including the Swedish system in 1957 and the Norwegian system in 1966. By developing cellular technology, Bell Labs launched the world’s first cellular system in Chicago in 1977. Later in 1979, Nippon Telegraph and Telephone Corporation (NTT) established the first nationwide commercial cellular system in Japan, which signified the dawn of the first-generation (1G) network. Soon after, various nationwide 1G systems were developed during the 1980s. However, 1G systems mainly employed analog frequency modulation combined with frequency division multiple access (FDMA), which offered limited capability for digital signal processing. Hence, the achieved speech quality and capacity were poor. Due to the lack of encryption, 1G systems were also highly susceptible to eavesdropping. Moreover, the heavy weight, large size, and excessively short battery life of the 1G mobile phones limited the practical wireless applications.

To circumvent the drawbacks of 1G systems, the standardization of the second-generation (2G) system, namely, the Global System for Mobile Communications (GSM), was established by the European Telecommunications Standards Institute (ETSI) in 1988. GSM replaced the 1G analog radio communications with digital international mobile communications and rapidly became the predominant 2G standard across the globe. The time-division multiple-access (TDMA) technology was exploited to improve spectrum sharing.
Following the success of GSM, a variety of digital standards were put forward, including the Digital Advanced Mobile Phone System (D-AMPS) and the Interim Standard (IS)-95 system. IS-95 was the first digital communication technology to introduce code-division multiple-access (CDMA), which enabled multiple radios sharing the same time and frequency resource to be distinguished in the code domain. Owing to these disruptive wireless technologies, the data speed achieved by 2G increased to 64 kbps, compared with 2.4 kbps for 1G. Moreover, unlike 1G, which only supported voice service, 2G introduced the Short Message Service (SMS) in 1992 to offer texting in mobile networks. 2G systems also supported roaming and digital conversation encryption, which had previously been infeasible.

In pursuit of higher bit rate communications, the third-generation (3G) systems attracted increasing research attention during the early 1990s. 3G systems aimed to provide high data speeds and global roaming. Moreover, advanced mobile applications were to be supported, such as video, mobile TV, and location-based services (LBS). In 1999, the family of 3G standards was formally named IMT-2000 by ITU. During the 3G standardization process, CDMA became the dominant multiple access technology owing to its spectral efficiency and economic advantage. As a part of the ITU IMT-2000 family, Universal Mobile Telecommunications System (UMTS) was developed by 3GPP, which upgraded the core network of GSM and introduced the Wideband CDMA (W-CDMA) as the multiple access scheme. Thereafter, it became the predominant 3G system in Europe and Japan. As another
competitive world-leading 3G system, the CDMA2000 system emerged as an evolution of the IS-95 standard, developed by a parallel organization, 3GPP2, which provided the first multi-carrier CDMA system.

During the 2000s, the popularity of smartphones such as the iPhone spurred demand for mobile broadband and mobile Internet access. This motivated 3GPP to develop Long-Term Evolution (LTE) and subsequently the LTE-Advanced (LTE-A) systems, leading to the standardization of the fourth-generation (4G) systems. Different from previous generations, LTE systems fully relied on Internet Protocol (IP)-based packet-switched networks, whereas the conventional circuit-switched networks were completely abandoned. Orthogonal frequency division multiple access (OFDMA) was introduced, which divided the channel bandwidth into orthogonal OFDM subcarriers to prevent inter-subcarrier interference and inter-symbol interference. By flexibly allocating OFDM subcarriers to multiple users, the capacity can be improved compared to CDMA and TDMA.

Despite the impressive achievements of 4G, the explosion of smart devices and innovative applications in the past decade presented unprecedented quality-of-service (QoS) requirements in terms of multi-gigabit-per-second (multi-Gbps) data speed, massive connectivity, and ultrahigh reliability and low-latency communications. These ambitious aspirations overwhelmed the network capability of 4G systems and expedited the evolution toward 5G. In 2017, 3GPP defined and specified a novel radio interface, known as 5G New Radio (NR). A flexible slot structure was introduced to meet different stringent latency requirements. Moreover, scalable OFDM waveforms and numerology were enabled to support both the sub-6 GHz spectrum and the newly exploited millimeter-wave (mmWave) spectrum.
While 4G LTE supported basic multiple-input multiple-output (MIMO) communications, 5G NR further introduced massive MIMO with advanced large-scale antenna architecture to enhance the system capacity and connectivity. To meet the requirements of vertical industries, 5G NR has since evolved with enhanced features such as sidelink communications for vehicle-to-everything (V2X) applications, high-precision positioning, and support for satellite communications, among others.
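The scalable numerology mentioned above can be illustrated with a short sketch. It assumes the commonly cited NR relationship in which the subcarrier spacing is 15 kHz multiplied by a power of two and a 1 ms subframe holds the same power of two slots; the function and field names are illustrative and do not come from this chapter, and the index range 0 to 4 is an assumption reflecting early NR releases.

```python
# Minimal sketch of 5G NR scalable OFDM numerology.
# Assumption: subcarrier spacing = 15 kHz * 2^mu and 2^mu slots per 1 ms subframe,
# with numerology index mu in 0..4. Names here are illustrative only.

SYMBOLS_PER_SLOT = 14  # normal cyclic prefix


def nr_numerology(mu: int) -> dict:
    """Return subcarrier spacing and slot timing for numerology index mu."""
    if mu not in range(5):
        raise ValueError("mu must be an integer between 0 and 4")
    slots_per_subframe = 2 ** mu          # more, shorter slots as mu grows
    return {
        "mu": mu,
        "scs_khz": 15 * 2 ** mu,          # wider subcarriers at higher mu
        "slots_per_subframe": slots_per_subframe,
        "slot_duration_ms": 1.0 / slots_per_subframe,
        "symbols_per_slot": SYMBOLS_PER_SLOT,
    }


if __name__ == "__main__":
    for mu in range(5):
        print(nr_numerology(mu))
```

Wider subcarrier spacing shortens the slot, which is one way a single waveform family can serve both sub-6 GHz bands (smaller spacing) and mmWave bands (larger spacing) while also supporting lower latency.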
3 What Will 6G Be?

As the 5G standard is being further evolved, and commercial 5G networks are being deployed around the world, it is time to think about 6G. While being very ambitious with brand-new application scenarios and stringent performance targets, 5G ends up more as an evolutionary upgrade than a revolutionary leap. For example, the deployed 5G networks mainly improve performance for the enhanced mobile broadband (eMBB) scenario, but with limited support for the two other key scenarios, namely, massive machine-type communications (mMTC) and ultra-reliable and low-latency communications (URLLC). Moreover, mmWave
communication, regarded as a key enabler for 5G, has encountered many practical difficulties and has not been widely deployed. It is expected that the later versions of 5G will partially address some of these issues and achieve further performance enhancement. Nevertheless, given the emerging applications and new performance requirements, a new generation of wireless networks, i.e., 6G, will be required to fully address the demands, and disruptive technologies and architectural innovations will be needed.

Unprecedented enthusiasm has surrounded the discussion of and research on 6G, with many visionary papers and articles published since 2019, when the deployment of 5G networks had barely started. On the one hand, this phenomenon reflects the increasingly significant role of wireless networks in our society and daily life, as well as the great interest from different parties in their further evolution; on the other hand, it demonstrates the diverse views on the application scenarios and performance indicators, which in turn indicate the tremendous challenges faced by 6G. As we are at the nascent stage of 6G standardization, the overall picture is not clear yet. In this section, we will provide some of our visions and summarize a few key trends in the 6G development, while detailed elaborations on individual topics will be covered in the later chapters.
3.1 New Application Scenarios and Demands

Mobile Internet is a representative revolution brought by wireless networks. Particularly, the mobile broadband access of 4G LTE networks has empowered various groundbreaking mobile applications, such as online-to-offline commerce, mobile payment, and mobile social networks. This great success is driving broader application scenarios. Besides enhancing mobile broadband services, 5G is expected to support a plethora of new applications, ranging from smart city and smart home, augmented/virtual reality (AR/VR), to industry automation. These mMTC and URLLC scenarios are highly challenging, and it is unlikely that 5G will be able to fully support them. Moreover, emerging applications pose even more stringent performance requirements, e.g., eHealth, holographic telepresence, Industry 4.0, and digital twin. 6G will have to meet the full specifications of these application scenarios by achieving very high data rates (up to 1 Tb/s), extremely low latency, massive and near-instantaneous connectivity, and ubiquitous broadband space-air-ground-sea coverage.

Meanwhile, fueled by recent breakthroughs in deep learning, we are witnessing an upsurge in AI-enabled use cases, e.g., smart healthcare, autonomous vehicles, and service robots. AI as a service (AIaaS) has been envisioned as a critical new application scenario for 6G networks, which are expected to provide seamless support for ubiquitous AI-enabled services. Thus, 6G will revolutionize wireless from connected things to connected intelligence, enabling interconnections between humans, things, and intelligence within a deeply intertwined and hyper-connected cyber-physical world. To support effective distributed learning (e.g., federated
learning) and inference (e.g., multi-vehicle perception), 6G will incorporate innovative network architectures that can facilitate the integration of communications, intelligence, sensing, control, and computing, with a device-edge-cloud synergy. This trend will place edge AI as one main theme in 6G, which will leverage nearby edge computing resources for effective data acquisition, storage, and processing. Moreover, new communication protocols and design principles will also be developed, e.g., semantic and task-oriented communication.

6G networks also need to consider other performance requirements, including sustainability and security. The wide deployment of 5G networks has raised serious concerns about their carbon footprint and environmental effects, as well as the operational cost. 6G networks must achieve higher energy efficiency, reduce the dependency on nonrenewable energy sources, and use more renewable and ambient sources of energy. They should also support regenerative/zero-energy devices with energy harvesting capabilities. As wireless networks become more complex, and more user data and novel IoT devices are involved, trust, security, and data privacy issues become ever more prominent. A trustworthy 6G is needed, which requires efforts spanning technology, regulation, techno-economics, politics, and ethics. Particularly, from the technical aspect, potential solutions include physical layer security techniques, quantum communication and computing technologies, and blockchain-based solutions.
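To make the distributed learning idea mentioned earlier in this subsection more concrete, the following is a minimal sketch of the aggregation step in federated averaging, where a server combines model parameters trained on separate devices without collecting their raw data. The text only names federated learning; the weighting by local dataset size and all function and variable names are assumptions made for illustration.

```python
# Minimal sketch of federated averaging (aggregation step only).
# Assumption: each client reports a flat list of model parameters together
# with the number of local training samples; names are hypothetical.

from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighted by each client's sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must report the same number of parameters")
    # Weighted sum over clients for every parameter position.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two edge devices with 100 and 300 local samples, respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300]))
```

Only model parameters cross the network in this scheme, which is one reason such methods are often discussed together with edge computing and data privacy.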
3.2 Key Technologies for 6G

The performance enhancement in wireless communications is essentially obtained by exploiting additional radio resources in the frequency and space domains. Particularly, the three key technologies for 5G are network densification, massive MIMO, and mmWave communication. While the first two increase the network throughput and spectral efficiency by deploying more access points and more antennas at each access point, i.e., exploiting the space domain resources, mmWave communication expands the available spectrum. To meet the new performance requirements, the development of 6G technologies will follow similar principles.

In the space domain, 6G will continue the trend of network densification that started in 5G. 3D networking that integrates terrestrial, airborne, and satellite networks will be explored to provide ubiquitous space-air-ground-sea coverage. Meanwhile, the cell-free or cell-less network architecture will help to boost the network throughput and provide more uniform coverage, which will be supported by high-capacity backhaul and fronthaul networks, as well as advanced distributed signal processing techniques. On the other hand, massive MIMO has proven to be the most successful technology for 5G, and its evolution will continue in 6G, toward ultra-massive MIMO antenna systems with thousands of antennas. The implementation of ultra-massive MIMO is made possible by new plasmonic materials such as graphene and nanomaterials that can be used to build nano-antennas and
transceivers. Meanwhile, effective high-dimensional channel estimation and signal processing techniques will be essential to realize the substantial performance gains.

New spectrum technologies will be a critical enabler for 6G, and the efforts in unleashing more spectrum resources will continue. MmWave communication will be made more mature, in both hardware and algorithms. Meanwhile, THz communication is promising to address some of the critical problems in 6G by providing spectrum resources to achieve terabits per second (Tbps) data rates. Many countries are opening up such bands, e.g., the US FCC already allocated the 95 GHz to 3 THz spectrum for 6G experimental purposes. More research will be needed on new packaging/interconnect techniques, THz transceiver design, channel measurements, and modeling. Considering the new characteristics and potential impairments of mmWave and THz hardware components, algorithm-hardware codesign will be important. Moreover, spectrum sharing and coexistence, supported by intelligent spectrum sensing and access algorithms, will also play important roles in 6G. Consequently, an old idea, namely, cognitive radio, will enjoy a revival and evolve into intelligent radio.

There are other disruptive technologies that are also promising for 6G. For example, intelligent surface-assisted communication has recently received great attention. There are many names for the same concept, such as reconfigurable intelligent surfaces (RIS), holographic beamforming, and holographic MIMO. It is envisaged as massive MIMO 2.0. Such systems have the capability of manipulating electromagnetic waves, e.g., steering, backscattering, and absorption, and they can effectively control the amplitude, phase, and frequency of RF signals without resorting to complex signal processing techniques. The orbital angular momentum (OAM) provides another opportunity to increase the spectrum efficiency.
Electromagnetic waves carry angular momentum, which comprises spin angular momentum (SAM) and orbital angular momentum (OAM). While SAM has been used in radar systems, OAM has not been exploited yet. Different from frequency/time/code-domain-based orthogonal division, OAM offers a new mode domain to support the orthogonal access of multiple users and thus can significantly increase the spectrum efficiency in wireless communications.
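The orthogonality of OAM modes described above can be sketched numerically. The example below assumes an idealized uniform circular array of N elements, where mode l applies the phase exp(j 2π l n / N) to element n; this array model, the lossless channel, and all names are assumptions for illustration and are not taken from this chapter. Under these assumptions, two symbols sent on different modes at the same time and frequency can be separated by projecting onto each mode.

```python
# Minimal sketch of OAM mode orthogonality on an idealized uniform circular
# array (UCA). Assumptions: ideal alignment, no propagation loss or noise;
# function names are hypothetical.

import cmath
from typing import Dict, List


def oam_mode_vector(mode: int, num_elements: int) -> List[complex]:
    """Per-element phase weights exp(j*2*pi*mode*n/N) for one OAM mode."""
    return [cmath.exp(2j * cmath.pi * mode * n / num_elements)
            for n in range(num_elements)]


def multiplex(symbols: Dict[int, complex], num_elements: int = 8) -> List[complex]:
    """Superimpose one symbol per OAM mode across the array elements."""
    tx = [0j] * num_elements
    for mode, symbol in symbols.items():
        weights = oam_mode_vector(mode, num_elements)
        tx = [t + symbol * w for t, w in zip(tx, weights)]
    return tx


def demultiplex(rx: List[complex], mode: int) -> complex:
    """Recover the symbol on one mode by projecting onto its phase profile."""
    weights = oam_mode_vector(mode, len(rx))
    return sum(r * w.conjugate() for r, w in zip(rx, weights)) / len(rx)


if __name__ == "__main__":
    sent = {1: 1 + 1j, -2: -1 + 0j}      # two users on modes +1 and -2
    received = multiplex(sent)            # ideal channel: received == transmitted
    for mode in (1, -2, 0):
        print(mode, demultiplex(received, mode))
```

Modes that differ by a multiple of N are indistinguishable on an N-element array, so the number of usable modes in this toy model is bounded by the number of elements. Practical OAM links also face beam divergence and alignment issues that the sketch ignores.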
3.3 Native Artificial Intelligence (AI) in 6G

6G networks need to serve a vast number of heterogeneous mobile devices, provide diverse mobile applications, and meet stringent demands for data rate, energy efficiency, and latency. 6G is in turn characterized by densely deployed access points, enormous antenna arrays, and a wide range of frequency bands (including the mmWave and THz spectrum). This brings formidable modeling difficulties (e.g., to capture complex temporal and spatial variations in wireless channels), algorithmic challenges (e.g., to solve non-convex mixed integer programming problems with high-dimensional variables), and prohibitive operating expenses (OPEX) and capital expenditure (CAPEX). Conventional design approaches based
on explicit modeling and mathematical optimization are losing their effectiveness. While specifications differ, standards organizations and industrial alliances broadly agree that AI will serve as a defining technology and that AI-enabled technologies will be key enablers for 6G networks. For example, 3GPP has already defined a network data analytics function in 5G that provides interfaces for AI model development and implementation, and the European Telecommunications Standards Institute (ETSI) has also established an industry specification group to leverage AI for network management. AI will play critical roles in designing and optimizing 6G architectures, protocols, and operations. Particularly, the design of the 6G architecture shall follow an AI-native approach where intelligentization will allow the network to be smart, agile, and able to learn and adapt itself according to the changing network dynamics.

Thanks to their flexibility, context awareness, adaptivity, and the powerful capability to automatically extract useful features from data, AI-based design approaches have recently been developed for different layers in wireless networks. For the physical layer, AI methods have been applied to problems such as channel estimation, signal detection, precoder design, and end-to-end transceiver design. For the medium access control (MAC) layer, AI-based algorithms have been developed for radio resource management, user scheduling, and power control. AI-enabled network planning, optimization, and management have also been explored.

Nevertheless, there are major obstacles to the practical application of AI-based solutions, ranging from model training and deployment to trustworthiness and safety. The training of deep learning models heavily depends on high-quality large-size training datasets, which are not as widely available for wireless networks as for other application domains.
The intensive computation requirement of large-size models poses serious challenges for real-time execution, which calls for effective design and compression of neural network architectures, as well as the development of AI accelerators. The black box nature of deep learning-based methods causes serious concerns on the safety and robustness, for which model interpretability and theoretical guarantees will be imperative. It is largely open how deep learningbased methods can achieve the same level of reliability as traditional methods. Thus, it is important to carefully identify the suitable application scenarios of AIbased methods. The intelligentization of 6G will require joint efforts from both the wireless communication community and machine learning/AI community, with endeavors from both the academia and industry.
4 Book Outline This book provides an accessible, comprehensive state-of-the-art treatment of the fundamentals of 6G communications and networking. The rest of this book consists of three parts.
• Part I—6G Vision and Driving Forces: The first part presents the 6G vision, driving forces, key performance indicators, and societal requirements on digital inclusion, sustainability, and intelligence.
  – Chapter “6G: Vision, Applications, and Challenges” presents a vision of what 6G will be as well as its potential applications and the main challenges for its successful implementation.
  – Chapter “6G Visions and Requirements” discusses three key scenarios for 6G mobile networks and proposes key performance indicators and potential enabling technologies.
  – Chapter “Next G Applications and Use Cases” focuses on 6G applications and use cases that will improve the quality of life by providing an immersive digital world experience.
  – Chapter “Bridging the Digital Divide” summarizes recent developments aimed at bridging the digital divide from a technical and economic standpoint.
• Part II—Enabling Technologies for 6G Communications: The second part presents key enabling technology components for 6G communications to deliver extreme performance.
  – Chapter “AI-Native Air Interface” explores the use of AI as a key technology in the 6G air interface.
  – Chapter “Waveform and Modulation Design of Terahertz Communications” discusses waveform and modulation design of terahertz communications in 6G.
  – Chapter “OTFS and Delay-Doppler Domain Modulation: Signal Detection and Channel Estimation” provides an overview of the orthogonal time frequency space (OTFS) waveform and discusses signal detection and channel estimation in an OTFS system.
  – Chapter “Index Modulation: From Waveform Design to Reconfigurable Intelligent Surfaces” offers a comprehensive overview of index modulation technologies and outlines potential research directions.
  – Chapter “Advanced Channel Coding for 6G” reviews the advancement of channel coding for 6G wireless communications.
  – Chapter “Developing NOMA to Next Generation Multiple Access” focuses on applying non-orthogonal multiple access (NOMA) to 6G networks.
  – Chapter “Near-Field Beamforming and Multiplexing Using Extremely Large Aperture Arrays” describes near-field beamforming and multiplexing using extremely large aperture arrays (ELAA).
  – Chapter “Orbital-Angular-Momentum Embedded Massive MIMO: Achieving Multiplicative Spectrum Efficiency for 6G Communications” proposes orbital angular momentum (OAM)-embedded massive MIMO to improve spectral efficiency for 6G communications.
  – Chapter “Integrated Sensing and Communications for Emerging Applications in 6G Wireless Networks” provides a comprehensive overview of the integrated sensing and communications (ISAC) technology for emerging applications in 6G wireless networks.
  – Chapter “Reconfigurable Intelligent Surfaces Toward 6G: From Reflection Only to Simultaneous Transmission And Reflection (STAR)” presents an overview of reconfigurable intelligent surface (RIS)-assisted wireless communications with a focus on the simultaneous transmission and reflection RIS.
  – Chapter “Full-Duplex Transceivers for Next-Generation Wireless Communication Systems” discusses full-duplex transceivers for 6G wireless communication systems.
  – Chapter “Optical Wireless Communication” describes the fundamental aspects of optical wireless communications.
• Part III—Network Evolution Toward 6G: The third part describes key network evolution directions toward 6G.
  – Chapter “Cell-free Massive MIMO” provides a comprehensive treatment of distributed MIMO with a focus on its latest enhanced embodiment known as cell-free massive MIMO.
  – Chapter “6G Radio Access Implementation: Challenges and Technologies” examines the implementation challenges and technologies for open radio access networks (open RAN).
  – Chapter “Network Disaggregation” reviews the newly proposed network disaggregation technology for 6G networks.
  – Chapter “AI-native Communications” discusses AI-aided and AI-native optimization methods with a focus on the physical layer.
  – Chapter “AI-Native Network Algorithms and Architectures” discusses AI-aided and AI-native optimization methods with a focus on the medium access control (MAC) layer.
  – Chapter “Pareto Deterministic Policy Gradients and Its Application in 6G Networks” proposes a reinforcement learning-based approach to jointly optimize cell load balance and network throughput for 6G cellular systems.
  – Chapter “Ultra-Reliable and Low-Latency Communications in 6G: Challenges, Solutions, and Future Directions” focuses on ultra-reliable and low-latency communications (URLLC) in 6G by presenting their challenges, solutions, and future directions.
  – Chapter “Deterministic Network” reviews the latest developments for deterministic networks, including framework, key techniques, and protocol stack, among others.
  – Chapter “UAV Communications and Networks” discusses unmanned aerial vehicle (UAV) communications and networks.
  – Chapter “Non-terrestrial Network” presents an overview of the non-terrestrial network (NTN), which is anticipated to become an integral part of 6G mobile communication networks.
  – Chapter “Convergence of 6G and Wi-Fi Networks” focuses on the convergence of 6G and Wi-Fi networks.
  – Chapter “Semantic Communications and Networking” reviews the concepts of semantic communication and semantic networking.
  – Chapter “Network Security and Trustworthiness” discusses network security and trustworthiness issues in 6G.

Acknowledgment This work of Jun Zhang was supported by the Hong Kong Research Grants Council under the Areas of Excellence scheme grant AoE/E-601/22-R.
Part I
6G Vision and Driving Forces
6G: Vision, Applications, and Challenges Daniel Benevides da Costa, Qiyang Zhao, Marwa Chafii, Faouzi Bader, and Mérouane Debbah
1 Brief Overview of 5G The development of 5G has empowered a technological revolution from connecting people to connecting things. Since 5G supports stringent bandwidth, latency, and reliability requirements, new services and applications have emerged, such as extended reality (XR), Internet of Things (IoT), Internet of Vehicles (IoV), industry automation, and mission-critical communications. This has been stimulated by thorough end-to-end (E2E) innovations across the radio access network (RAN), core network (CN), and transport network (TN), promoting enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (URLLC), and massive machine-type communications (mMTC) capabilities, which push network performance up to 10 Gbit/s peak data rate, 1 ms E2E latency, 99.9999% reliability, 1,000,000 devices/km², and 500 km/h mobility. The impressive 5G capabilities and applications have been driven by a number of novel features, technologies, and architectures standardized in the E2E telecommunication system [1, 2]. Standardization of 5G in 3GPP began with Release 15, which specifies the 5G Core (5GC) and 5G New Radio (5G NR) to support eMBB, URLLC, and mMTC scenarios [3]. The non-standalone (NSA) and standalone (SA) access architectures were specified to support dual connectivity (DC) between 4G and 5G. In 5GC, network
D. B. da Costa (O) · Q. Zhao · F. Bader · M. Debbah Technology Innovation Institute, Masdar City, Abu Dhabi, UAE e-mail: [email protected]; [email protected]; [email protected]; [email protected] M. Chafii New York University Abu Dhabi, Abu Dhabi, UAE NYU Wireless, NYU Tandon School of Engineering, New York, NY, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_2
functions (NFs) constitute the system architecture. Control and user plane separation (CUPS) allows the network operator to place the user plane function (UPF) close to the application and the edge, thereby reducing latency. The service-based architecture (SBA) has been developed to simplify NF connections through a common pipeline over the HTTP/2 protocol. The network repository function (NRF) allows autonomous management of network services, and the network exposure function (NEF) enables communication with application functions (AF). The network data analytics function (NWDAF) empowers data collection and analysis, with integration of machine learning (ML) and artificial intelligence (AI) in further releases. Network slicing has been built to support diverse services and performance levels. In 5G NR, two frequency ranges were defined: FR1 (450 MHz–6 GHz) and FR2 (24.25–52.6 GHz). Carrier bandwidth reaches 100 MHz in FR1 and 400 MHz in FR2. The physical layer supports carrier aggregation and dual connectivity (DC/CA) with up to 16 carriers. Massive multiple-input multiple-output (MIMO) has been supported by hybrid digital-analog beamforming at high frequencies, providing up to 256 antenna elements at the base station and 32 at the user equipment (UE). A flexible frame and physical channel structure was developed with multiple numerologies, with subcarrier spacing ranging from 15 to 120 kHz. 5G support for vertical industries was enhanced in Release 16 [4]. Specifically, time-sensitive networking (TSN) has been integrated into 5G to support the industrial Internet of Things (IIoT), such as smart factories, smart grids, and smart ports. This has enabled 5G NR to support deterministic networking in manufacturing with strict time synchronization and URLLC requirements. URLLC was further enhanced with PDCCH monitoring, multiple hybrid automatic repeat request ACK (HARQ-ACK), multiple physical uplink shared channel (PUSCH) scheduling, and inter-UE priority and multiplexing.
Packet data convergence protocol (PDCP) duplication has been supported on four DC/CA legs to enhance reliability in poor radio channels. The nonpublic network (NPN) was developed for 5G private networks. It includes a non-standalone deployment, which leverages 5G E2E network slicing to share the control plane of the RAN and CN with operators, and a standalone deployment, which allows dedicated control plane, user plane, and spectrum. 5G NR in unlicensed spectrum (NR-U) was developed for the 5 and 6 GHz bands. It operates either in licensed-assisted access (LAA) mode, which utilizes the operator’s spectrum as an anchor to support high-density scenarios, or in standalone mode, which supports individual 5G access points or private networks. Vehicle-to-everything (V2X) was studied over 5G NR to support platooning, sensor cooperation, and remote driving. 5G NR positioning was developed over beam-based MIMO to support 3 m indoor and 10 m outdoor accuracy. Furthermore, several enhancements to 5G functionalities have been studied. The two-step random access (RACH) procedure was introduced to reduce latency and signaling cost. Integrated access and backhaul (IAB) was developed for wireless backhaul on mmWave and in-band. For mobility enhancement, the dual active protocol stack (DAPS) was introduced to enhance handover reliability using DC. Multi-TRP with massive MIMO was supported to enhance cell edge performance, and the wakeup signal (WUS) was introduced for cross-slot scheduling to reduce UE power consumption.
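The multiple-numerology frame structure mentioned above can be illustrated with a short sketch. It assumes the standard NR scaling rule, in which numerology index μ gives a subcarrier spacing of 15·2^μ kHz and 2^μ slots per 1 ms subframe; the function names are illustrative and are not taken from any library.

```python
# Sketch of 5G NR numerology scaling: the subcarrier spacing doubles with each
# numerology index mu, and the slot duration halves accordingly.
# Function names are hypothetical helpers chosen for this illustration.

def subcarrier_spacing_khz(mu: int) -> int:
    """Subcarrier spacing in kHz for numerology index mu (15 * 2^mu)."""
    return 15 * 2 ** mu


def slots_per_subframe(mu: int) -> int:
    """Number of slots that fit into one 1 ms subframe (2^mu)."""
    return 2 ** mu


def slot_duration_ms(mu: int) -> float:
    """Duration of one slot in milliseconds (1 ms / 2^mu)."""
    return 1.0 / slots_per_subframe(mu)


def numerology_table(max_mu: int = 3):
    """Rows of (mu, spacing in kHz, slots per subframe, slot duration in ms)."""
    return [
        (mu, subcarrier_spacing_khz(mu), slots_per_subframe(mu), slot_duration_ms(mu))
        for mu in range(max_mu + 1)
    ]


if __name__ == "__main__":
    for mu, scs, slots, duration in numerology_table():
        print(f"mu={mu}: {scs} kHz, {slots} slot(s)/subframe, {duration} ms per slot")
```

Indices 0 to 3 cover the 15 to 120 kHz range quoted in the text. Shorter slots at wider spacings are one of the levers behind the low-latency targets, at the cost of a shorter cyclic prefix.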
Release 17 completed the first phase of 5G [5]. The FR2 spectrum was extended to 71 GHz. Reduced capability (RedCap) was developed to support low-power, low-complexity, massive-connection, and wide-coverage mid-range IoT devices, including industry sensors, cameras, and wearable devices. The non-terrestrial network (NTN) was studied to integrate satellites with narrowband IoT (NB-IoT) and enhanced MTC (eMTC) networks. NR coverage enhancement was developed for physical channels at low frequency. Indoor positioning accuracy has been enhanced to 20 cm. Extended reality (XR) over NR was evaluated for the first time to optimize latency, processing, and power consumption. The radio resource control (RRC) inactive state was introduced to support NR small data transmission, by maintaining the UE connection (RAN context, UE capability) in sleep mode to enable fast access, low latency, and low power. NR sidelink was enhanced for V2X to support public safety, emergency services, and V2P communications. Further enhancements in UE power saving, multi-RAT DC, multi-SIM, dynamic spectrum sharing, and RAN slicing were studied. 5G-Advanced research began in Release 18 [6]. It aimed to offer improved experience for people and machines, extensions for new use cases, and expansions to offer new services beyond pure communication. The enhanced experience was empowered by better support of XR over NR, with further development of MIMO, improved mobility, and flexible duplexing. The extended use cases included improved coverage [8], enhanced low-cost massive IoT, and further support of NTN and drones. The expanded services include enhanced positioning with sub-10 cm accuracy indoors and outdoors, and time synchronization as a service for smart power grid control, industrial automation, and real-time financial transactions. 5G-Advanced will be powered by the gradual introduction of AI and ML enablers, network slicing enhancements, wireline and wireless convergence, network coordination, and energy efficiency enhancement.
Furthermore, operational enhancements for the energy efficiency of network infrastructure and devices will be in focus to reduce operational expenses (OPEX). Work on the major Release 18 items of 5G-Advanced is expected to be completed around the end of 2023, with availability in early 2024. Research will continue to innovate for Releases 19 and 20, which provide a bridge into the 6G era. The roadmap of 5G standardization is shown in Fig. 1. Although 5G has successfully realized the eMBB, URLLC, and mMTC scenarios, and 5G-Advanced has expanded its use cases, experience, services, and operation, a capability gap remains that 5G cannot close at its completion. New applications and businesses will continue to proliferate in the coming years.
Fig. 1 5G standardization roadmap [7]
For example, XR services will expand from point-to-point links to worldwide coverage, providing immersive and haptic experience everywhere. This will require the communication network to integrate, digitize, and transfer information between immersive sensors and haptic devices effectively. For instance, the network should understand how to fuse the data generated by image, sound, smell, and radio-wave sensors. 5G can hardly meet the capacity and performance requirements to deliver such information for the target applications. Firstly, the bandwidth-consuming data inherent to immersive videos or point clouds can hardly be afforded by the 5G network. Secondly, the extreme latency and reliability requirements of haptic experience cannot be achieved by 5G technologies. Thirdly, the power consumption of immersive and haptic applications will be significantly higher than that of today’s mobile services, and the 5G system design is not sufficient to jointly support energy-efficient sensing, communication, and computing for such applications. The trend of digitization and transformation envisaged by key industry sectors, such as transportation, logistics, commerce, production, health, smart cities, and public administration, is expected to keep growing. In 2030, human intelligence will be augmented by being tightly coupled and seamlessly intertwined with network and digital technologies, and AI will enable machines to transform data into reasoning and decisions that assist humans to understand and act better in the world. The network will be able to provide seamless integration of the physical, digital, and human worlds. This will call for the communication system to be redesigned, not only to serve the massively growing traffic and exploding number of devices and services but also to understand the information from diverse physical and digital domains, and to interact and reason effectively with different humans, devices, robots, vehicles, and industries.
Furthermore, the network will enable the highest possible standards of energy efficiency and strong security, privacy, and efficiency in deployment and operation to enable sustainable growth in a trustworthy way.
2 6G: Objectives, Technologies, and Roadmap 6G is popularly known as the next generation of wireless communication, transitioning from an era of connecting people and connecting things to connecting an intelligent society. The rise of AI, together with the creation of digital twins, will become a key foundation driving future technologies. 6G will be designed for and built on these foundations, expanding the network from everything connected to everything sensed and intelligent. On the one hand, the connection performance will be pushed further to multi-Tbit/s data rates, sub-millisecond latency, and seven nines reliability. On the other hand, sensing and intelligence will be native to the wireless network. Furthermore, sustainability and trustworthiness will be the foundation of wireless network design. To target the 6G visions and capabilities, new technologies have been widely studied by the scientific community. Spectrum can be seen as the foundation of wireless communications, in which 5G has expanded frequencies from sub-6 GHz
to mmWave frequency bands. 6G will investigate the sub-THz band to provide ultrawide bandwidths beyond tens of GHz. This is expected to support extreme data rate, latency, and capacity for applications like immersive XR, digital twins, and the haptic Internet. However, THz communication faces challenges due to the hostile propagation environment and high power consumption, requiring innovations in signal processing, transceivers, antennas, electronics, and so on. Furthermore, visible light communication (VLC) has the potential to provide Tbit/s data rates for short-range communication with low power consumption. This can be utilized for vehicle-to-vehicle (V2V), sensor, and server communications. Integration of sensing into communication is envisioned as a disruptive technology in 6G. The communication system has the potential to expand simple positioning to object detection, tracking, imaging, and reconstruction. This will be empowered by mmWave and THz frequencies with short wavelengths, as well as massive antenna arrays and cells providing high accuracy. Compared to sensing with dedicated devices for radar and light detection and ranging (LiDAR), radio sensing has the benefit of wide coverage with massive numbers of devices, and it can also complement standalone sensing to deliver a fully immersive sensing system. Furthermore, sensing has the capability to improve communication systems, for example through better prediction of device mobility and channel blockage, and to further optimize cell deployment, beam management, and resource and power allocation. AI and the wireless system will be tightly integrated from the design stage of 6G. This should include utilizing AI as a service and feature in 6G and connecting intelligent AI agents. Firstly, AI powered by data-driven mechanisms will be integrated with classical model-driven design in different components of communication systems. The traditional wireless PHY, MAC, and RRM functionalities will be driven by AI algorithms, models, and frameworks.
This does not mean that AI will act as an optimization tool for a specific feature, as considered in 5G, but rather that it will act as a holistic processing function in both user and control plane protocols learned from data. Significant innovations in mathematical theories for wireless AI are essential, including model architectures, learning algorithms, and computing frameworks optimized for wireless signals, environment, and traffic. Secondly, the 6G network will integrate AI into devices and links. This calls for distributed intelligence on devices, including distributed training, inference, reasoning, decision, observation, and storage. Wireless devices have limited processing power and energy, which demands disruptive innovation in lightweight AI and networking. Furthermore, a new paradigm of communication that understands the meaning of data and coordinates intelligent devices for diverse applications is demanded. With such innovations, the 6G network will no longer be a pipeline to transfer data, but will intelligently emerge with protocols that bridge the correlations between information and application. Integration of terrestrial and non-terrestrial communications has arisen as another key objective of 6G. To this end, satellite communication has been widely studied to expand eMBB, URLLC, and mMTC services to uncovered areas. Large-scale deployment of low-cost lightweight satellites, such as low Earth orbit and very low Earth orbit (LEO, VLEO) satellites, will become a reality in the coming years. Moreover,
low- and high-altitude platforms (LAP and HAP) can be seamlessly integrated with the terrestrial network to enhance capacity in temporary events and coverage in emergency scenarios. Furthermore, 6G is also expected to connect aerial devices, such as unmanned aerial vehicles (UAVs) and aircraft. This is challenging due to the high mobility, low power, and severe interference inherent to these use case scenarios. Underwater communication has also been studied for integration with 6G, providing connections to both industry and consumer services. Thus, with the above technologies developed, 6G is expected to provide an integrated space, aerial, terrestrial, and underwater communication system. Green communication has been a key milestone in wireless system design, and 6G will further develop toward a zero-energy network. Technology innovations in radio devices, interfaces, and architectures are evolving. In particular, the intelligent reflecting surface (IRS) has been a hot research domain in recent years; it leverages passive antenna arrays or meta-surfaces to reflect radio signals to dedicated areas controlled by intelligent beamforming. This will significantly enhance coverage, capacity, and radio transmission in complex propagation environments without increasing the power of base stations, access points, and radio units. Wireless backscattering is another sustainable strategy, which relays the signal plus coded information without power amplifiers. Moreover, wireless power transfer has been studied as a potential 6G technology that utilizes radio signals to power small devices. Together with distributed AI technologies, 6G is expected to provide energy-efficient deployment and operation to reduce operating expenditure (OPEX) and capital expenditure (CAPEX). 6G research was initiated in 2018 by both industry and academia around the globe, including Europe, East Asia, and North America. It quickly expanded to emerging regions.
The ITU-R has envisaged IMT for 2030 and beyond to study the next-generation IMT technologies of 6G. Europe has been the powerhouse of several communication standards from 2G to 5G. It started with the Horizon 2020 framework program to promote research on different technologies for 6G. The flagship Hexa-X project gathered diverse industry and academic partners to provide a comprehensive 6G solution. This was accompanied by a number of technology research projects in IRS, sensing, satellite, edge AI, and so on. China has participated deeply in IMT-2030 and initiated a wide range of research in 6G radio access and networking technologies. A national R&D working group in the China Communications Standards Association (CCSA) was established. Furthermore, research on native AI has been initiated in the 6GANA global forum. Japan has initiated a roundtable to formulate a beyond-5G promotion strategy roadmap toward 6G. South Korea has also invested deeply in research and development of technological innovations in 6G to promote them in the global market, while the USA has established a call for promoting leadership on the path to 6G, gathering cross-government, academic, and industry collaboration to maintain US technology leadership in 6G. Standardization of 6G is expected to begin in 2023 with initial objective studies, technology proposals, and performance evaluations, which will be in line with the evolutionary path of 5G-Advanced (Releases 18 and 19) until 2025. 6G
is likely to be in the first phase study of 3GPP from Release 20, and specification might be studied from the end of 2027, to achieve the goal of the first release at around 2030.
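One reason this section calls the sub-THz propagation environment hostile can be illustrated with the free-space path loss between isotropic antennas, FSPL = 20·log10(4πdf/c). The sketch below is a simplified illustration only: the three carrier frequencies and the 100 m distance are assumed example values, not figures from this chapter, and the model ignores molecular absorption, blockage, and antenna gains.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def fspl_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB between isotropic antennas (Friis model)."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S
    )


# Assumed example carriers, chosen only to represent the three spectrum regimes.
EXAMPLE_BANDS_HZ = {
    "sub-6 GHz (3.5 GHz)": 3.5e9,
    "mmWave (28 GHz)": 28e9,
    "sub-THz (140 GHz)": 140e9,
}

if __name__ == "__main__":
    distance_m = 100.0  # assumed link distance
    for name, frequency_hz in EXAMPLE_BANDS_HZ.items():
        print(f"{name}: {fspl_db(distance_m, frequency_hz):.1f} dB at {distance_m:.0f} m")
```

Under this model, every tenfold increase in carrier frequency adds 20 dB of loss at a fixed distance, so moving from 3.5 GHz to 140 GHz costs about 32 dB. This gap is what large antenna arrays and narrow beams are expected to recover.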
3 Toward Hyper-connected, Intelligent, Secure, and Sustainable World In this section, we will discuss 6G from the perspectives of connectivity, sensing, intelligence, sustainability, and trustworthiness. The applications, enabling technologies, and challenges will be introduced, followed by a holistic view to integrate all technical domains together for 6G network.
3.1 All Things Sensed and Connected Sensing, network, computation, and intelligence are the main components of many advanced technical systems. This concept is inspired by the human nervous system, which consists of senses, nerves, brain, and muscles. The common five senses collect information about the world by stimulating the sensory cells and converting the physical stimulus to signals. These signals are transferred via a network of nerves to the brain for processing. Thereafter, the brain generates responses in the form of commands conveyed via other nerves to the actuating muscles. The concepts of sensing, processing, actuation, and feedback are applied in different fields [9], such as closed-loop industrial control systems, software engineering, and client service models. Moreover, humans developed communication for the purposes of entertainment, education, collaborative work, and social contact, by exchanging messages by means of speech, vision, or haptic contact. However, the feasibility of communication over long distances is limited by the necessity of physical presence and the abilities of the human senses. To overcome such limitations, many developments have emerged over time to emulate the human senses: microphones for hearing, cameras for sight, and haptic sensors for touch. These sensors work similarly to the equivalent human senses, but they convert the physical stimulus to electrical signals, which can then be transmitted over a communication system and converted back to a physical stimulus. The device that changes the representation of physical signals into electrical signals is called a transducer. Inspired by human communication, other types of communication between humans and machines and among machines have been invented. Early analog wireless communications focused on voice transmission and on video and audio broadcasting.
The next revolution came with digital communications, where the main scope was to improve the quality of voice and TV pictures. This led to the digitization of signals, representing them by
digits suitable for computer processing. Digital communication systems were then used to transport any type of digital signal, leading to the data-focused communication in 4G, initially to access the Internet, and then to be used for more services in 5G. In parallel to the advancement in communications, sensor technology progressed, and different types of sensors have been developed for sensing a variety of physical phenomena. Moreover, computing power has significantly increased with diverse platform capabilities, which enables the implementation of smart devices. This results in a global vision where 6G needs to jointly reconsider all the components (sensing, connection, computation, and intelligence) in the design to provide a holistic platform rather than focusing on add-on services on top of the network. In particular, the 6G network is expected to provide ubiquitous connectivity and sensing capabilities to enable new applications beyond the abilities of existing systems [10].
3.1.1 Applications
The emergence of new applications and services is one of the main drivers of 6G development, where 5G cannot respond well to extreme requirements in communication and sensing [11]. These include applications with extremely immersive experience in XR with haptic feedback and holographic display. The realization of such applications requires massive sensory information from dedicated sensors and other sensing technologies, as well as extreme connectivity with an exponential increase of the data rate per device, strict latency and reliability, and extreme system capacity to support many connections. Moreover, emerging industrial applications require a high level of synchronization and timing accuracy, deterministic latency, and low jitter, in addition to high availability and reliability. Such critical requirements are important for human and machine collaboration to enable security and a seamless experience. The 6G applications are expected to improve human life in different areas including public services, education, healthcare, social interaction, manufacturing, science, and entertainment [13]. It is envisioned that 6G will be built on connecting the physical, digital, and human/virtual worlds [14]. The physical world will be transformed into a digital world using a variety of sensors and sensing technologies and augmented by a virtual world. Humans can be connected and involved by means of new devices. The three worlds will intersect in a metaverse, where humans can enter that world and interact with other people and virtual objects by means of existing devices such as smartphones, smart watches, and augmented reality (AR)/virtual reality (VR) glasses, in addition to new devices such as smart clothes, haptic gloves, and in-body implants. Huge data traffic is needed to synchronize the digital and virtual worlds by bidirectional communications from the sensing, processing, and actuation [12].
Based on this vision, many use cases and applications are considered for 6G, some of which are described in this section as examples of the extreme connectivity and sensing requirements.
Fully Immersive XR Experience XR refers to the combination of real and virtual environments, where the interaction between human and other machines is generated by means of computing and special human-machine interfaces. It involves AR, where physical objects are enhanced by generated information; mixed reality (MR), in which physical and virtual objects coexist and interact; and VR, which is based on simulated experience similar or different from the real world [16]. Extended reality will be used to create applications in entertainment, education, remote working, remote operation, manufacturing, navigation, and tourism. Operation in both indoor and outdoor environments is required, and the experience needs to be seamless in mobility conditions. Therefore, the connectivity should be provided along different routes and areas, which requires wide coverage anytime and everywhere. For example, people will be able to play sport games anytime and anywhere or watch live events with new experience that is even not available in the real life. For instance, people will be able to watch sport events from different perspective and augmented with information when needed. This allows getting information about performers, such as football players, and also enables virtual city tours merging live images with augmented historical details. Thus, the visitors would not only see the current state of buildings, but switch between different historical views. The immersive experience will bring value to people and society by providing alternatives to avoid unnecessary transportation and to improve remote education and healthcare. Extreme connectivity and high positioning accuracy are needed to deliver experience without feeling dizziness. The video resolution in number of pixels, and color depth with high refresh rate, leads to a data rate requirement of 100x of the current raw data rate. In addition, a motion-to-photon latency of 10 ms is required to approach the human vision limit. 
Because low-complexity devices need to be used, rendering needs to be performed in the cloud, which requires transferring a huge amount of data in the uplink. Consequently, the communication system needs to provide high peak and user-experienced data rates and a low transmission latency on the order of 2 ms, in addition to high capacity to allow many simultaneous users. 6G will bring the fusion of physical and digital worlds by means of digital services that extend the human senses via digital tactile solutions and the integration of ambient data. This allows humans in different places to interact as if they were sitting in the same room, in addition to seamless interaction with machines and exploring and browsing physical products digitally. Extended reality will rely on other applications, such as digital twins, to transform the physical world into a digital representation. In addition, achieving a truly immersive experience of the interaction with the digital world requires tactile and haptic communications to deliver information about surfaces, touches, vibration, forces, friction, motion, and actuation. Holographic representation of objects with multisensory holographic teleportation is foreseen to go beyond conventional 3D representation, which enables acting as in the real world with all senses included.
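The dependence of the raw data rate on resolution, color depth, and refresh rate can be illustrated with a short calculation. The following is a minimal sketch rather than a figure from the cited work: the function name `raw_video_rate_bps` and the headset parameters (4K per eye, 30-bit color, 90 Hz) are assumptions chosen only for illustration.

```python
def raw_video_rate_bps(width, height, bits_per_pixel, fps, views=1):
    """Uncompressed video data rate in bit/s for `views` parallel streams."""
    return width * height * bits_per_pixel * fps * views


# Hypothetical headset: 3840 x 2160 pixels per eye, 30-bit color, 90 Hz, two eyes.
rate = raw_video_rate_bps(3840, 2160, 30, 90, views=2)
print(f"{rate / 1e9:.1f} Gbit/s")  # prints "44.8 Gbit/s"
```

Even this modest configuration yields tens of Gbit/s before compression, which indicates why higher resolutions and refresh rates quickly push the requirement far beyond current raw rates.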
D. B. da Costa et al.
Digital Twins

A digital twin refers to the cloning of a physical object into a virtual object, in a comprehensive software representation of its structure, properties, conditions, and behavior [17]. It comprises models that can simulate the behavior in the respective environment and reflect the status of the physical twin, which can be used to forecast and predict the outcomes of actions and behaviors. In addition, digital twins will be used to evaluate, optimize, and test actions before sending the commands to the physical systems. Many applications involve digital twins for the representation of the environment and other physical assets in digital form. Digital twins were first proposed for industry to manage the infrastructure and evaluate different scenarios based on what-if simulations. Moreover, when designing a new product, a virtual object can first be designed and tested in the digital twin of the operating environment. After the virtual object passes the tests and meets the exact specifications, a physical product is manufactured. Finally, the product is linked to its twin by means of sensors and actuators to track its life cycle. In smart city applications, digital twins will enable building an interactive 3D map of the city with live updates about people and other systems. The smart city encompasses a large number of parameters corresponding to wide application areas in infrastructure, climate, healthcare, education, and culture. By providing real-time data, digital twins will be used for evaluation and planning to enhance the operation and control of transportation, energy, water, heating, etc. For sustainable food supply, digital twins can be used to optimize the agricultural process by monitoring soil conditions, inspecting plant development, optimizing plant treatment, and controlling ground robots.
Digital twins have the potential to preserve the environment by reducing material waste in industry, energy consumption in transportation, and CO2 emissions through the optimization of roads and traffic. Moreover, they can help monitor the environment to predict hazards and prevent accidents. In the implementation of digital twins, a massive number of sensors are deployed to gather information about the environment in order to accurately update and synchronize the model with its physical twin, and different actuators are integrated to perform actions. This requires a communication network with extreme performance in transporting massive information volumes, with ultralow delays, ultrahigh reliability, and high capacity, which is beyond the 5G URLLC capabilities. Moreover, a high-resolution and interactive 4D spatiotemporal map is needed, in conjunction with means to influence the physical world, to provide high-resolution indoor and outdoor mapping. In addition, other usages of digital twins will be enabled by 6G extreme coverage. For instance, digital twins of different earth environments, built using bio-friendly and energy-harvesting sensors, are connected via non-terrestrial networks for real-time surveillance and monitoring of environmental aspects, such as weather and climate, in order to enable early warning and actions in response to natural disasters or to protect endangered species from illegal poaching.
6G: Vision, Applications, and Challenges
Haptic Communications

Haptic communication involves extending people's senses through the digital domain and exchanging real-time haptic information about surfaces, touch, vibration, forces, friction, motion, and actuation. This information is transmitted alongside the conventional audiovisual data, changing the way the flat screen is used for audio and video by allowing multidimensional, multiparty, and multisensory techniques [18]. This communication enables remote social contact, such as delivering hugs to a faraway family member. In addition, it will realize teleoperation of machines, telesurgery, or telediagnosis, where the operator receives full haptic feedback as if handling the physical object directly. The human brain will be stimulated by the haptic feedback to help the operator to adjust. Remote sensing, haptic feedback, and actuation are combined to enable XR interaction with remote or inaccessible objects. This has applications in remote surgery, in remote inspection and repair, and in merged reality use cases. For instance, in gaming, some players are present in person with special wearables, whereas other players join virtually with visual, haptic, and olfactory sensation. Similarly, a digital meeting can be arranged with some attendees in person and others represented by their holograms. On the one hand, haptic communication requires developing new haptic interfaces, such as smart clothes and gloves full of sensors that make it possible to feel texture, weight, and pressure. On the other hand, to achieve a seamless experience, high-resolution sensory data should be exchanged with high throughput and deterministic low latency. Typically, haptic interaction takes place with a latency on the order of 1 ms, and therefore, the transmission latency needs to go below 0.2 ms. Besides, the synchronization between different sensors needs to be very accurate to avoid motion sickness.
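To give a feeling for what a 0.2 ms transmission latency implies, the sketch below computes the longest path over which propagation delay alone fits in that budget. It is an illustration under stated assumptions: the helper name `max_one_way_distance_km` and the fiber velocity factor of 0.67 are not taken from the text, and processing, queuing, and retransmission delays are ignored.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def max_one_way_distance_km(latency_budget_s, velocity_factor=1.0):
    """Longest path whose propagation delay alone fits in the latency budget."""
    return SPEED_OF_LIGHT_M_S * velocity_factor * latency_budget_s / 1e3


budget_s = 0.2e-3  # transmission latency target stated in the text
print(f"free space: {max_one_way_distance_km(budget_s):.1f} km")
print(f"fiber (assumed 0.67c): {max_one_way_distance_km(budget_s, 0.67):.1f} km")
```

Under these assumptions the physical reach is limited to roughly 60 km in free space and about 40 km over fiber, which suggests that the processing for such services has to be placed close to the user.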
High-Fidelity Hologram

In conventional 3D displays, the users need to focus on the screen all the time, which leads to dizziness and other unwanted effects after a short time. The next-generation technology will focus on glasses-free holographic presentation, exploiting light-field and holographic displays. This makes it possible to bring a remote person close to the viewer without special glasses [19]. Holographic representation of objects with multisensory holographic teleportation will go beyond the 3D representation of objects and people, allowing all senses to be transmitted. Usage includes attending meetings in a virtual room while hearing the discussion and being able to interact with other participants with all senses included, as in a physical environment. Mixed reality and holographic telepresence will be the norm for work and social networks. Holographic presence allows one to appear in a certain place while being physically in another, which facilitates collaboration and remote working beyond the home office, enables telepresence meetings with friends and family, and enhances the student-teacher experience in e-learning classes. The hologram of a person can be seen as a digital twin that is constructed by means of rich sensing of
multiple types, attached and synchronized to the body to enhance the experience, with accurate reflection of body language in gesture, intonation, expression, surrounding sounds, and touch. Holographic communication requires an extremely high data rate for transmitting 3D images over the network. This data rate depends on the resolution, the refresh rate, and the color depth. It can vary from 1 Tbit/s to a few hundred Tbit/s depending on the compression involved. In addition, as haptic communications are also involved, low latency and high reliability are needed as well.
Radio Sensing

Radio sensing allows new types of utilization of the RAN in applications beyond communications. The sensing information can be used for localization, imaging, activity and gesture recognition, and reconstruction of the environment. By processing the Doppler, delay, and angular spectrum resulting from the scattering and reflection of the electromagnetic signals in the environment, localization information in terms of coordinates and orientation can be obtained about connected devices and passive objects in 3D space. Performance metrics include detection probability and resolution in terms of range, angle, and velocity. High-accuracy localization and fine resolution for resolving small objects are key requirements for cyber system applications. For instance, when robots are used in cooperative tasks, they need to be able to retrieve closely spaced objects and localize themselves with high accuracy in relation to the others. Such high resolution enables imaging, which, unlike visible-light cameras, operates in all weather conditions and offers a wider view, such as seeing around corners [24]. Both imaging and localization procedures can be exploited to construct a map iteratively. Objects are first detected from captured images and then localized to set the map entries. This map can help to improve the localization accuracy gradually by simultaneous localization and mapping (SLAM) techniques, which in turn enhances the imaging quality as well [21]. This functionality will be enhanced in 6G by fusing sensing information from many distributed devices. Radio sensors can be used to augment different human senses, relying on the interaction characteristics of electromagnetic waves with materials. For instance, the penetration properties make it possible to overcome the limitations of the human eye by seeing behind nontransparent objects to perform security scans [23].
Moreover, with a large frequency range, it is feasible to replace other medical sensors and see through the skin using the spectrogram of the electromagnetic and photonic waves [22]. Furthermore, based on the absorption characteristics of different materials, it may be possible to identify food ingredients. Other applications, such as gesture detection, enable macro and micro recognition [20]. Macro recognition allows detecting breathing and heartbeats, which gives more detail than using a camera for tracking patients. Micro recognition enables tracking of finger movements and facial expressions, which has the potential of replacing conventional control interfaces, especially in extended reality.
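The range and velocity resolution metrics mentioned above are tied to bandwidth and observation time through standard radar relations, namely ΔR = c / (2B) and Δv = λ / (2T). The sketch below evaluates them; the function names and the example values (1 GHz bandwidth, 28 GHz carrier, 10 ms observation) are illustrative assumptions, and a monostatic setup is assumed.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_resolution_m(bandwidth_hz):
    """Monostatic range resolution: delta_R = c / (2 * B)."""
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz)


def velocity_resolution_m_s(carrier_hz, observation_time_s):
    """Velocity resolution: delta_v = wavelength / (2 * T_obs)."""
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    return wavelength_m / (2.0 * observation_time_s)


# Illustrative values only.
print(f"range resolution at 1 GHz bandwidth: {range_resolution_m(1e9) * 100:.1f} cm")
print(f"velocity resolution at 28 GHz, 10 ms: {velocity_resolution_m_s(28e9, 10e-3):.2f} m/s")
```

With 1 GHz of bandwidth the range resolution is about 15 cm, so resolving objects at centimeter or millimeter scale calls for the multi-GHz bandwidths available at higher frequencies.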
3.1.2 Enabling Technologies
The 6G applications discussed above need connectivity and sensing with one or more extreme requirements. Therefore, the 6G network should integrate new technical solutions at different layers to support heterogeneous capabilities and new services everywhere at any time [25]. In particularly extreme cases, the 6G RAN should be able to achieve transport speeds comparable to optical fiber, with a peak data rate of 1 Tbit/s and a user-experienced data rate of 10–100 Gbit/s to support applications with ultrahigh video quality, leading to a network capacity increase of 1000x in comparison to 5G. Critical applications demand strict timing and reliability, leading to an air interface latency as low as 0.1 ms and a jitter of 0.1 μs, which must be met with a reliability higher than 99.99999%. Considering the massive need for sensory information from light IoT sensors and wearable devices, the access network should provide a connection density exceeding 10 million connections per km² while ensuring an energy efficiency 100x better than what is available in 5G. The massive use of digital twins pushes the positioning accuracy toward 50 cm for outdoor and 1 cm for indoor environments, in addition to the need for 1–3 mm resolution for imaging services. Other sensing-based services require extreme resolution in estimating Doppler, attenuation, and angular parameters. Moreover, deployed IoT sensors require techniques to extend their battery life to more than 20 years [145, 146]. This section presents several key enabling technologies for wireless access that have the potential to contribute to the fulfillment of 6G communication and/or sensing requirements.
Integrated Sensing and Communications

Modern applications in robotics, drones, and autonomous vehicles require wireless communication and sensing capabilities. However, sensing has traditionally been a standalone function with dedicated modules and devices, implemented in radar, LiDAR, computed tomography (CT), magnetic resonance, cameras, ultrasonic, and other special-purpose sensors. Nevertheless, many of these technologies provide services that can be implemented by radio sensing. For example, radar and LiDAR are both used for detecting objects and measuring range, but their use depends on the operating environment and the required performance. Thus, by enhancing the radio sensing capabilities, it becomes practical to replace other standalone sensing modules by a unified radio interface for communications and sensing. In fact, any radio communication system can be turned into a bistatic or multi-static passive radar, such that the signals emitted by the transmitters are used for sensing and the receivers act as radar sensors [15, 26]. Passive radar has already gained significant attention, especially in Wi-Fi networks, due to their wide deployment. Two main approaches have been investigated: one considers the statistics of the signal and the other exploits the reference signal embedded in the communications frame, such as pilots or preambles [27]. In the first case, the received signal strength (RSS) can be used to estimate the range and location by means of a wireless sensor network
[28]. In the second approach, the channel state information (CSI) is first obtained from the reference signal for coherent detection in the communications receiver and is then further processed to extract information, such as delay estimates [29]. With the increase of the available bandwidth in 5G, especially at mmWave, higher-resolution ranging can be achieved. Nevertheless, passive radar neglects the potential of exploiting the transmitted signal as a probing signal to enhance the communications devices with active radar. Unlike current 5G systems, where positioning of the connected devices is an elementary add-on sensing function, 6G radio will consider the sensing capability as a native functionality jointly designed with communications, which will reduce the cost of using dedicated sensing equipment. The integration of communications and sensing can be introduced at different levels: by sharing a single hardware platform, sharing the spectrum, and exploiting the same signal. Moreover, 6G will exploit the availability of widely deployed radio and user devices to improve the sensing capability. Therefore, the network can serve as a sensor that infers information from the radio waves interacting with the environment via reflection and scattering, in addition to the different penetration behavior at different frequencies. This will provide information about the physical environment, including not only geometrical information but also spectroscopic features of the material types and the medium. Sensing will be part of the 6G network, where the whole network acts as a sensor, and sensing will be integrated into the new devices. Signals from many nodes will be used to improve the estimation of parameters such as delay, Doppler, and beam direction, which are provided to services such as localization and monitoring.
The sensing functionality will assist communications, for instance, in beamforming, by reducing the training and tracking overhead, which becomes important for 6G because high frequencies lead to narrow beams. Accordingly, by predicting the user location and using the mobility information, beamforming can be proactive, which leads to handover-free mobility. ISAC as a design focus of 6G radio will be enabled by other technologies that have the potential of supporting extreme communications and providing high-resolution and high-accuracy sensing.
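The CSI-based approach described above, in which the channel estimate obtained from reference signals is further processed to extract a delay, can be sketched with an inverse DFT across subcarriers. This is a minimal illustration under assumptions that are not from the source: a single noise-free propagation path, a hypothetical helper `estimate_delay_s`, 64 subcarriers, and 120 kHz subcarrier spacing.

```python
import cmath


def estimate_delay_s(csi, subcarrier_spacing_hz):
    """Return the delay of the strongest tap found by an inverse DFT of the CSI."""
    n_sc = len(csi)
    best_bin, best_mag = 0, -1.0
    for n in range(n_sc):
        # Inverse DFT: frequency-domain CSI -> delay-domain tap n.
        tap = sum(h * cmath.exp(2j * cmath.pi * k * n / n_sc)
                  for k, h in enumerate(csi)) / n_sc
        if abs(tap) > best_mag:
            best_bin, best_mag = n, abs(tap)
    # Each delay bin spans 1 / (N * delta_f), i.e. one over the bandwidth.
    return best_bin / (n_sc * subcarrier_spacing_hz)


# Synthetic single-path channel (assumed values, for illustration only).
N_SC, DELTA_F = 64, 120e3
true_delay_s = 5 / (N_SC * DELTA_F)  # chosen to fall exactly on delay bin 5
csi = [cmath.exp(-2j * cmath.pi * k * DELTA_F * true_delay_s) for k in range(N_SC)]

estimated_delay_s = estimate_delay_s(csi, DELTA_F)
print(f"estimated delay: {estimated_delay_s * 1e9:.1f} ns")
```

The delay bin width equals the inverse of the occupied bandwidth, which is why the wider bandwidths at mmWave and above translate directly into finer ranging resolution.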
New Spectrum

As in each generation, the increased demand for data rate drives the need to exploit new spectrum and larger bandwidth. In 5G, mmWave was introduced, and 6G research currently considers exploiting the spectrum available at frequencies beyond mmWave, namely sub-THz and THz [32]. However, because of the propagation challenges at those frequencies in reaching long distances, lower frequencies will also be considered for providing wider coverage. Thus, 6G will integrate all the available spectrum ranges in order to deliver ultimate connectivity. Namely, low-band (100–900 MHz) will be used for large cell coverage, mid-band (3–5 GHz) for medium cell size, and higher frequencies for short range.
Additionally, more spectrum allocation in the mid-band is needed to support the continuous growth of traffic demand. It is foreseen that a bandwidth of 1–1.5 GHz will be allocated at 6 and 10 GHz, with reduced range. The allocated mmWave band at 26 GHz is of particular importance: its technology will become mature in 6G, it offers large bandwidth for ultrahigh data rates, it can achieve cm-level sensing resolution, and it allows a small form factor in comparison with mid-band frequencies. Additionally, more advanced radio techniques will improve the use of the E-bands (71–76 GHz and 81–86 GHz), which provide alternative solutions to replace optical fiber by integrating access and backhauling. The THz spectrum will open new opportunities with the availability of 230 GHz of bandwidth in the range of 100–450 GHz. However, as the propagation is challenging, the THz spectrum will only be suitable for short-range or medium-range fixed access. Moreover, THz communications will bring new sensing capabilities owing to the large bandwidth and short wavelength, which enables the integration of THz sensing in user mobile devices. The THz band creates revolutionary sensing-based applications such as air quality sensing, health monitoring, gesture detection, explosive detection and gas sensing, seeing in the dark, HD video radar, body scanning, and cm to mm positioning [30]. Moreover, because its radiation is nonionizing, the THz band with its short wavelength can produce higher resolution for imaging, e.g., in medicine and in nondestructive testing for quality monitoring of pharmaceutical products. As a result, 6G will encompass different spectrum bands, offering wideband and dense deployment and extended coverage, where massive antenna arrays will be distributed to provide wider coverage. With that, the radio communication signal can be used to provide sensing capabilities for optimizing the network. Thus, integrated radio sensing and communications will assist each other.
In addition, the wide spectrum enables the applications of spectroscopy, where the interaction with different materials at different frequencies allows analyzing parameters such as absorption, emission, scattering, and reflection to determine the material type [31].
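The propagation challenge that restricts the higher bands to short range can be made concrete with the free-space path loss, FSPL(dB) = 20 log10(4πdf/c). The sketch below compares three carrier frequencies at a fixed distance. The chosen frequencies and the 100 m distance are illustrative assumptions, and the formula assumes isotropic antennas, ignoring the additional molecular absorption that occurs at THz frequencies.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m, carrier_hz):
    """Free-space path loss between isotropic antennas, in dB."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * carrier_hz / SPEED_OF_LIGHT_M_S)


# Illustrative carriers: mid-band, mmWave, and sub-THz, all at 100 m.
for carrier_hz in (3.5e9, 26e9, 300e9):
    loss_db = free_space_path_loss_db(100.0, carrier_hz)
    print(f"{carrier_hz / 1e9:6.1f} GHz: {loss_db:6.1f} dB")
```

Every tenfold increase in carrier frequency adds 20 dB of loss at the same distance, which has to be recovered through antenna gain or shorter links.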
Optical Wireless Communications

Optical wireless communications (OWC) technologies are envisaged to complement traditional radio frequency (RF) communications and to overcome restrictions on RF use in places where RF causes interference to other communication and navigation systems. In addition, OWC can provide a large amount of bandwidth with license-free operation. Moreover, OWC enjoys intrinsic security, as the optical signal does not penetrate walls, and thus, unlike RF signals, it can be confined within a controllable location without leaking outside. Furthermore, OWC is robust against interference from other electromagnetic sources. OWC systems operate in a wide range of spectra and take different forms, including free-space optical (FSO) communication and VLC. Moreover, in the context of ISAC, OWC can be designed to provide the functionality of LiDAR. FSO communication employs optical radiation in the infrared range to convey information in free space. It employs simple infrastructure, where a laser diode
is used at the transmitter to convert the electrical signal to an optical signal, and photodiode receivers perform the inverse operation. The light beam propagates through the atmosphere to convey information in free space. Thus, FSO has the capacity to replace optical fibers in data centers, to provide high-speed backhaul, and to overcome the data rate limitations of RF fixed links. It can also be used by UAVs for deploying flying networks, especially for connections between flying platforms and for the gateway connection to the ground. VLC operates in the globally unlicensed visible band, employing LED light whose intensity is varied [33]. It offers a high degree of spatial confinement and can be used for small cell networks. The modulation needs to be fast to avoid flickering. As a green solution, VLC has a dual function, in that the same energy can be used for illumination and connectivity, which supports the wider spread of the 6G network. Moreover, VLC is a candidate to provide connectivity where the use of RF is undesired, insecure, unsafe, or impossible, in areas such as hospitals, nuclear power plants, and underwater communications.
Heterogeneous Networks

To support sustainability goals, the 6G RAN needs to be flexible so that the network can be optimized according to the use cases, achieving the required performance while maintaining low energy consumption. This requires the integration of different types of deployments, from dense networks for short distances to centralized RAN with fewer cell sites and higher resource efficiency, as well as device-to-device (D2D) links as a form of cooperation between user equipment and base station [35]. To provide worldwide coverage, it is important to integrate terrestrial networks (TNs) and NTNs, where NTNs can use satellites or UAVs to provide connectivity to remote or inaccessible areas. NTNs are currently designed separately, but in 6G they will be part of the network and will be used to optimize the network performance. Thus, NTN integration will consider the satellite as a new type of node in the 6G network, which helps achieve seamless mobility with low power consumption and deployment cost. This can be achieved thanks to the decreased cost of launching huge fleets of LEO or VLEO satellites. These satellites also help reduce the latency inherent in conventional geostationary earth orbit (GEO) and medium earth orbit (MEO) satellites. The dense deployment of VLEO satellites can provide higher positioning accuracy for applications with mobility, such as autonomous driving, and is also important for earth imaging and sensing. Moreover, the latency of backhauling will be similar to what is achieved by fiber over a long distance. For temporary coverage in crowded events and disasters, flying nodes such as drones, UAVs, and high-altitude platform stations (HAPSs) will be integral parts of 6G, acting as infrastructure access points [34].
Another implication of NTN integration is the ability to provide 3D coverage with dynamically reconfigurable network deployment, such as changing the coverage size, or realizing multi-connectivity for high reliability. In the context of sensing,
NTN nodes will provide means for high-accuracy localization as an alternative to global satellite navigation solutions.
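The latency advantage of low orbits over GEO and MEO follows directly from the propagation distance. The sketch below computes the one-way delay to a satellite directly overhead. The GEO altitude of 35,786 km is the standard geostationary value; the MEO, LEO, and VLEO altitudes are illustrative assumptions, and slant paths, processing, and routing delays are ignored.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

# Altitudes in km; all except GEO are assumed example values.
ORBIT_ALTITUDES_KM = {
    "GEO": 35_786,
    "MEO (example)": 8_000,
    "LEO (example)": 550,
    "VLEO (example)": 300,
}


def one_way_delay_ms(altitude_km):
    """Propagation delay from the ground to a satellite at zenith, in ms."""
    return altitude_km * 1e3 / SPEED_OF_LIGHT_M_S * 1e3


for orbit, altitude_km in ORBIT_ALTITUDES_KM.items():
    print(f"{orbit:15s} {one_way_delay_ms(altitude_km):8.2f} ms")
```

The GEO delay of roughly 119 ms per hop drops to a few milliseconds or less for LEO and VLEO altitudes, which is the basis for treating such satellites as regular nodes of the network.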
Antenna and Smart Material

The dependency on the spectrum arises when high frequencies are used, such that the antenna aperture decreases and thus the effective path loss increases with the carrier frequency. This can be compensated by increasing the aperture via antenna arrays, which helps to focus the power in certain directions by means of beamforming. Beamforming is the transmission of coherent signals that are delayed to form a concentrated field in a certain direction, where the signal-to-noise ratio (SNR) is increased due to the constructive superposition gain. Beamforming can be realized by analog phase shifters and gains, which provide a single beam, or by means of digital precoding to enable multiple beams. Narrow 3D beamforming in mmWave and sub-THz helps mitigate the path loss and provides space division with minimized interference to other locations. Channel measurement with respect to the beam space, by mono-static or multi-static transmitters and receivers, provides information about the link quality as well as embedded sensing information on objects and humans, which can be extracted by further processing to estimate the angle of arrival (AoA) or angle of departure (AoD). Inferring such angles allows estimating the direction of active devices and other objects. The spatial resolution increases with the pencil beams of mmWave and THz. Additionally, ultrawide bandwidth increases the delay resolution, and combining both measures leads to enhanced localization [37]. The intelligent reflecting surface (IRS) has been proposed to control the propagation environment, and thus the radio channel features, in terms of scattering, reflection, and refraction, to overcome the conventional random nature of the channel.
This provides operators with infrastructure to shape and control the electromagnetic wave propagation by dynamically adapting steering parameters, such as phase, gain, and polarization, to improve the link budget and reduce the complexity of the transceiver algorithms. Thus, IRS can help extend the coverage and range and provide non-line-of-sight (NLOS) communication. An IRS is an electromagnetic surface realized by an array of reflective elements, liquid crystal, or software-defined metasurfaces, and it is not equipped with a radio front-end to radiate its own signals [38]. Reflectors have already been employed in radar and satellite communications but are not yet used in mobile communications. IRS becomes more appealing in THz and mmWave because, first, the implementation of a regular large antenna array is challenging as the size of the antenna elements becomes tiny; second, IRSs are needed to compensate for the penetration loss and low scattering and to extend the range; and third, the physical size of an IRS decreases with the wavelength, which allows the realization of IRS in small footprints.
In relation to sensing, IRS can be exploited for tracking and localization in NLOS. Moreover, a large surface leads to near-field effects at a reasonable distance, such that the wavefront becomes curved, and this provides an opportunity for improving localization accuracy and enabling highly accurate synchronization between reference points. Conversely, IRS operation benefits from localization information, which allows the IRS to be configured efficiently. In addition, IRSs can be deployed on mobile nodes, in which case their locations need to be tracked [39]. Holographic MIMO corresponds to the design of a MIMO system that is able to fully exploit the characteristics of the electromagnetic channel. It is an evolution of the massive MIMO concept deployed in 5G, using a dense antenna array that is large in relation to the wavelength [40]. Thus, the concept is similar to IRS, and the goal of its design is to produce the antenna array with low cost, light weight, and low power consumption. When such antennas are electrically large, the radio propagation at mmWave and even in the THz band may occur in the near-field region of the antenna at practical distances. Therefore, traditional assumptions based on a planar wavefront can no longer be considered valid. The near field provides the advantage of exploiting spatial diversity in line-of-sight (LoS) without the need for multipath propagation. As a result, holographic MIMO removes the limitation of far-field LoS, which leads to a low-rank channel, and opens opportunities for increasing the spectral efficiency. As an example, in the mmWave band, when antennas between 10 cm and 1 m are considered, typical operating distances between 1 and 100 m fall in the Fresnel region, where the plane wave approximation does not hold and a spherical wavefront should be considered.
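The extent of the near-field region can be estimated with the Fraunhofer distance, d = 2D²/λ, a commonly used boundary beyond which the plane wave approximation is considered acceptable. The sketch below evaluates it for the antenna sizes quoted above; the 28 GHz carrier is an assumption chosen as a representative mmWave frequency, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def fraunhofer_distance_m(aperture_m, carrier_hz):
    """Near-field/far-field boundary: d = 2 * D^2 / wavelength."""
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    return 2.0 * aperture_m ** 2 / wavelength_m


CARRIER_HZ = 28e9  # assumed representative mmWave carrier
for aperture_m in (0.1, 0.5, 1.0):
    boundary_m = fraunhofer_distance_m(aperture_m, CARRIER_HZ)
    print(f"aperture {aperture_m:.1f} m -> near field up to about {boundary_m:.1f} m")
```

Under this assumption the boundary grows from about 2 m for a 10 cm aperture to almost 190 m for a 1 m aperture, and it scales further with the carrier frequency, so larger arrays place typical user distances inside the near field.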
Transmission Schemes and Algorithms

In order to exploit the previous technologies, adaptive transmission schemes need to be integrated in the 6G design. The air interface will be flexible, intelligent, and tailored to the use cases while taking the channel into account. This allows the air interface to be parameterized per user, enhancing the performance without sacrificing the system capacity. First, the spectrum will be utilized in an optimized way, depending on the use case scenario and requirements, for instance, by adapting the waveform based on the available bandwidth, channel conditions, and user demands. Thus, flexible waveform design is a key enabler, as it offers different classes with respect to the spectral availability and data rate demand [41]. Conventional linear waveforms, such as orthogonal frequency-division multiplexing (OFDM) and single-carrier (SC), and new ones, such as orthogonal time-frequency space (OTFS) and sparse Walsh-Hadamard transform (SWHT) [42], can still be used for high spectral efficiency, whereas nonlinear waveforms, such as zero-crossing modulation (ZXM) [44], index modulation and chirp-based [43], and spike [45] waveforms, can be adopted for low spectral efficiency while focusing on reducing the energy consumption. In addition, new waveforms that combine the conventional waveform concept with beamforming, such as orbital angular momentum (OAM) [36], will be exploited.
Fig. 2 All things sensed and connected
Approaches to increase spectral efficiency that were proposed for 5G remain applicable in 6G. One example is full duplex, which may become practical owing to the possibility of using beam separation between uplink and downlink to enhance isolation [46]. Full duplex is a key enabler for using the base station as a radar. Moreover, nonorthogonal multiple access (NOMA) schemes combined with different waveforms [47] and rate-splitting multiple access (RSMA) [48] provide a solution for dense connectivity by exploiting channel variations and multiuser joint decoding. New algorithms that exploit sensing and localization information will enable a systematic approach to obtaining CSI. Unlike conventional channel estimation, which relies on pilots and feedback and thus increases the overhead and latency, localization-based CSI will not only save radio resources but will also contribute to reducing the processing cost. Therefore, localization and channel estimation will assist each other, similarly to SLAM. Accordingly, the transceiver design will be adapted to the availability of CSI at the transmitter and receiver. Figure 2 illustrates the key technologies as well as some application scenarios for all things sensed and connected.
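The idea of localization-based CSI can be illustrated in its simplest form: if the positions of transmitter and receiver are known and the link is a pure free-space LoS path, the complex channel coefficient follows from geometry alone, without pilots. The sketch below makes exactly these assumptions, which are not from the source; the function name `los_channel_coefficient`, the positions, and the 3.5 GHz carrier are hypothetical, and real channels additionally contain multipath and hardware effects.

```python
import cmath
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def los_channel_coefficient(tx_pos_m, rx_pos_m, carrier_hz):
    """Free-space LoS channel: amplitude lambda/(4*pi*d), phase -2*pi*d/lambda."""
    distance_m = math.dist(tx_pos_m, rx_pos_m)
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    amplitude = wavelength_m / (4.0 * math.pi * distance_m)
    return amplitude * cmath.exp(-2j * math.pi * distance_m / wavelength_m)


# Hypothetical geometry: base station at 10 m height, user 30 m away on the ground.
h = los_channel_coefficient((0.0, 0.0, 10.0), (30.0, 0.0, 1.5), 3.5e9)
print(f"|h| = {abs(h):.3e}, phase = {cmath.phase(h):.3f} rad")
```

Because the phase changes by a full cycle per wavelength of displacement, the position accuracy needed for a usable phase prediction is a fraction of the wavelength, which links this approach to the cm-level positioning targets discussed earlier.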
3.1.3 Challenges
Several challenges still need to be addressed in developing the 6G system according to the ambitious worldwide goals. Firstly, in order to ensure interoperability and roaming across the globe, spectrum allocation needs to be unified, since the antenna and radio are typically designed and optimized for certain bands.
The challenges of high-frequency spectrum, and in particular THz, lie in the design of the front-end hardware. New technology is required for developing high-power devices, new materials for antennas, and radio power transistors, in addition to other components of the THz transceiver. New materials are needed to continue the evolution toward the utilization of sub-THz and THz through the integration of hybrid photonic and electric components on the same silicon to tackle the energy consumption. Moreover, THz antennas need to be integrated in the front-end circuit, but this is challenging because of the surface wave, which results in poor performance. Another solution is antenna in package, where the interconnection cost is considerable. The alternative OWC solution also has difficulties. The main concern is the power efficiency of the LEDs, as an array with a large number of elements will be required to achieve extreme data rates with lower consumption, small form factor, and low cost. Other challenges include handling mobility, uplink transmission, blockage, interference from ambient light, and the impact of atmospheric conditions such as fog and rain, all of which affect the transceiver design. Further research is needed to overcome these challenges at the design and signal processing levels. Moreover, spatial mode multiplexing such as OAM is promising for boosting the FSO capacity. Therefore, a hybrid system of RF and OWC is of significant interest, as it can combine the best of both technologies in an optimized way. Channel modeling is needed to design and simulate radio systems and to compare different approaches before prototyping, testing, manufacturing, and operating in real systems. In practice, the channel model depends on the number and distribution of antennas, the bandwidth, and the carrier frequency, in addition to the environment.
One class of channel modeling methods is deterministic modeling, which considers the properties of the real physical propagation environment and uses computational methods such as ray tracing and location-based measurement. Statistical modeling, in contrast, is derived from extensive measurements and assumes randomly distributed scattering clusters. Statistical models are suitable for large-scale scenarios, where deterministic computation becomes too complex. However, they cannot provide a direct relation to the environment, which sensing applications need. For sensing, a deterministic model with a high-accuracy representation of the environment is therefore required. Additionally, to enable high-accuracy sensing, the channel model must account for different target types and sizes and for the properties of the materials involved. These properties are also relevant for IRSs to work effectively. As a result, different channel modeling approaches should be tailored to specific scenarios, including hybrid stochastic and deterministic approaches that trade off computational cost against accuracy. The hardware also needs to be considered in channel modeling to improve performance beyond conventional assumptions, especially in relation to sensing. Rather than relying on simplified, inaccurate models, hardware impairments that affect signal quality, especially nonlinearity and phase noise, need to be modeled and handled with nonlinear processing. Such modeling activities are relevant to AI/ML.
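To make the idea of a hybrid stochastic and deterministic model more concrete, the following minimal Python sketch combines a deterministic line-of-sight component, computed from the geometry with the free-space (Friis) formula, with a stochastic scattered component drawn from a complex Gaussian distribution, i.e., a Rician-type model. The function names, the K-factor, and the numerical values are illustrative assumptions and are not taken from this chapter; a realistic deterministic part would rely on ray tracing over a detailed description of the environment.

```python
import cmath
import math
import random

C = 299_792_458.0  # speed of light in m/s


def free_space_path_gain(distance_m: float, carrier_hz: float) -> float:
    """Deterministic part: linear free-space power gain between isotropic antennas."""
    wavelength = C / carrier_hz
    return (wavelength / (4.0 * math.pi * distance_m)) ** 2


def hybrid_channel_sample(distance_m: float, carrier_hz: float,
                          k_factor: float, rng: random.Random) -> complex:
    """One complex channel coefficient: deterministic LoS plus random scattering.

    k_factor is the (assumed) ratio of LoS power to scattered power.
    """
    gain = free_space_path_gain(distance_m, carrier_hz)
    # The LoS phase follows directly from the propagation distance.
    los_phase = -2.0 * math.pi * distance_m * carrier_hz / C
    los = math.sqrt(k_factor / (k_factor + 1.0)) * cmath.exp(1j * los_phase)
    # Scattered part: zero-mean complex Gaussian with unit total variance.
    sigma = math.sqrt(0.5)
    scatter = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    nlos = math.sqrt(1.0 / (k_factor + 1.0)) * scatter
    return math.sqrt(gain) * (los + nlos)


if __name__ == "__main__":
    rng = random.Random(0)
    loss_db = -10.0 * math.log10(free_space_path_gain(1.0, 300e9))
    print(f"Free-space loss at 1 m and 300 GHz: {loss_db:.1f} dB")
    h = hybrid_channel_sample(10.0, 300e9, k_factor=5.0, rng=rng)
    print("Sample channel coefficient magnitude:", abs(h))
```

In this sketch the K-factor controls the tradeoff: a large value makes the model almost fully deterministic, while a value of zero reduces it to a purely statistical (Rayleigh) model.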
6G: Vision, Applications, and Challenges
Other challenges relate to signal processing and algorithmic solutions. For instance, since IRSs do not have their own radio front-end, they need to collect channel information for the communication segments they connect in order to adjust their steering parameters. This requires energy-efficient channel estimation methods and suitable deployment strategies. Device-free localization is also challenging, because the target location must be distinguished from background objects, which becomes harder with multiple targets. As a practical solution, information about the static environment can be collected offline, and run-time measurements can then be taken continuously to observe changes. Finally, new devices are expected to be developed gradually for emerging applications, which makes it challenging to optimize the design. To handle this issue, the solutions developed at all network levels need to be generalized, with a sufficient number of parameters to facilitate customized optimization.
3.2 All Things Intelligent
3.2.1 Introduction
AI and ML have achieved great success in communication applications in recent years, particularly due to breakthroughs in deep learning (DL) empowered by fast-growing computing power and big data. In line with this trend, AI/ML has been extensively exploited in wireless communications and has been identified as a key enabling technology for 6G. In information theory, Claude E. Shannon pointed out the potential of machines for designing communication networks, including filters, equalizers, relays, switching circuits, routers, and symbolic mathematical operations, based on individual circumstances rather than fixed patterns [49]. Such machines were anticipated to extend the ordinary use of numerical computers, since (i) the entities dealt with are not primarily numbers, but rather applications; (ii) the proper procedure involves general principles, something of the nature of judgment, and considerable trial and error, rather than a strict, unalterable computing process; and (iii) the solutions to these problems are not merely right or wrong but have a continuous range of “quality” from the best down to the worst. From this viewpoint, people might be satisfied with a machine that provides reasonable results rather than the best possible results. The evolution of AI/ML in recent decades has followed Shannon’s perspective on machine intelligence. Specifically, data-driven algorithms have shown significant advances in computer vision (CV), natural language processing (NLP), control and robotics, neuroscience, and games. In such use cases, one can collect a relatively large amount of labeled data (e.g., ImageNet) to train a deep neural network (NN) model with millions of parameters (e.g., ResNet) and accomplish various downstream tasks. Such a process usually requires large computing power for iterative mathematical optimization. However,
the application of AI/ML in wireless communications emerged much later. A notable milestone was the proposal of cognitive radio (CR) by Joseph Mitola [50], in which a radio knowledge representation language (RKRL) was created to describe radio network elements, such as devices, protocols, propagation, user needs, and application scenarios. With RKRL, a CR agent is supposed to perform automatic reasoning to support the needs of the user, e.g., to manipulate the protocol stack or to exploit unused radio spectrum. CR inspired a significant amount of pioneering research on AI/ML in wireless communications [51]. For example, reinforcement learning (RL) has been widely studied in CR to exploit temporal spectrum opportunities for communications and subsequently optimize resource allocation, power control, user association, and routing. In recent years, the self-organizing network (SON) has brought the potential of AI/ML into telecom network operation, management, and maintenance. For example, the operator can rely on AI to detect an incident in a network element or to configure network parameters according to the dynamic traffic, effectively reducing operational and capital expenses. Even though AI has been a hot research area in the telecom community, its application in the wireless industry is still in its infancy. This domain poses several unique challenges. In particular, the limited computing power and energy of wireless devices and the limited capacity of the radio network constrain the ability to implement large ML models. The requirement of ultralow latency makes conventional centralized learning approaches infeasible. The performance (e.g., accuracy) of most state-of-the-art ML models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), cannot meet the stringent requirement of ultrahigh reliability (e.g., 99.9999%).
Furthermore, data collection has become a challenging task for network operators, since the reliability requirements of telecom systems do not permit extensive trial-and-error exploration. To guarantee a successful and effective implementation of AI in 6G systems, a native network design integrated with AI is essential. In this context, AI will act as a driving technology for the wireless radio interface and protocols, rather than solely as an optimization tool, as in current 5G systems. It is foreseen that AI can drive the evolution of wireless networks toward new scenarios, applications, and requirements without human design. Furthermore, the whole network should natively support AI applications. In particular, the network and mobile devices will be intelligent agents with computing and learning capabilities. The key performance indicators (KPIs) will go beyond the throughput, delay, and reliability of transmitting data bits. Instead, they will target the final goal of the agents, for example, the accuracy of a prediction task under a given energy cost during training and inference. Efficient communication will deliver the information that is essential to achieving such goals. This will require a new design of communication systems that understand the meaning of information and transfer it in a way that improves collaboration between agents. In this realm, communication and learning will be deeply integrated to empower distributed AI in networks, bringing promising gains in terms of KPIs (e.g., accuracy, latency, and scalability) and costs (e.g., resources, energy, and operation) compared to cloud and edge AI.
In the following sections, we discuss intelligent 6G from two aspects: (i) AI for network, which sees AI as an enabler for 6G communications, and (ii) network for AI, which redesigns communication to drive intelligent agents, followed by the architecture evolution toward a native AI network.
3.2.2 AI for Network: Learn to Communicate
AI is currently being studied as a native enabler for 6G. In the air interface, AI can potentially empower transceiver components such as coding, modulation, and symbolization [52], while in network protocols, AI can drive channel access and resource management, as well as radio signaling, toward new mechanisms with self-adaptation and optimization for targeted scenarios and applications [53]. Moreover, in network management and orchestration, AI can enhance network operation and optimization in terms of closed-loop automation, network function placement, proactive alerting, hidden pattern discovery, and preventive incident analysis [54]. Finally, the evolution of the digital twin has enabled the development of the telecom brain, in which new ML models need to be built and trained on big network data for seamless network optimization [55].
AI Native Air Interface
The AI native air interface aims to serve an application in the most efficient way by taking into account the constraints of the available hardware and the radio environment. From this perspective, the objective of communications is not only to transmit bits reliably but also to serve an application with the proper data in an optimal way. The conventional radio air interface consists of encoding, symbol mapping, modulation, waveform design, synchronization, channel estimation [147, 148], equalization, symbol demapping, and decoding. AI was first studied to replace or enhance individual processing blocks, then evolved to the joint design of multiple processing blocks, and has finally been used to design parts of the air interface that need no specification. For example, a fully CNN-based receiver was proposed in [56] to replace equalization, channel estimation, and symbol demapping, enabling the detection of uncoded bits directly from the frequency-domain antenna signals. It was shown to outperform conventional receivers in terms of bit error rate (BER), approaching the performance obtained with full channel knowledge. Moreover, AI has been used to design error correction codes. An RL-driven polar code with a successive cancellation list decoder was investigated in [57]. The NN-based decoder showed reduced complexity and the ability to compensate for nonlinearity. Furthermore, joint source and channel coding has also been explored with AI. A popular approach is the autoencoder, which shows improved information recovery with higher capacity in transmitting images [61] and text [62]. The potential benefits of AI in the radio air interface come from more efficient usage of spectrum and power, with optimized adaptation to the hardware and channel.
AI is particularly well suited to problems that can hardly be solved analytically with closed-form expressions. On the other hand, the conventional goal of the physical layer is to deliver bits correctly between transceivers. This eases data collection and objective specification for training ML models, where the “labels” can be taken as the effective reconstruction of the information. The receiver tasks can be trained in a supervised or self-supervised manner. For example, demodulation and decoding can be modeled as classification tasks that minimize the cross-entropy with respect to the original information, while channel estimation and synchronization can be formulated as regression tasks that minimize the absolute error with respect to the actual signal and timing. ML can also balance optimality and scalability by searching for a near-optimal solution with better generalization to different scenarios. This is seen as an advantage over conventional approaches, where the solution is optimal only under specific assumptions.
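The two training objectives mentioned above can be illustrated with a small Python sketch. It treats QPSK demodulation as a four-class classification problem scored by the cross-entropy loss and channel estimation as a regression problem scored by the mean absolute error. The "model" here is a fixed distance-based scorer that stands in for a trainable NN; the constellation, the function names, and the noise variance are assumptions made for illustration only.

```python
import math

# Unit-energy QPSK constellation; the class index is the position in this list.
QPSK = [complex(1, 1) / math.sqrt(2), complex(-1, 1) / math.sqrt(2),
        complex(-1, -1) / math.sqrt(2), complex(1, -1) / math.sqrt(2)]


def softmax(logits):
    """Numerically stable softmax over a list of real-valued logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def demod_probabilities(received: complex, noise_var: float):
    """Class probabilities for one received symbol.

    The logits are scaled negative squared distances, which gives the
    symbol posteriors for complex Gaussian noise and equal priors. A
    trainable receiver would produce these logits with an NN instead.
    """
    logits = [-abs(received - s) ** 2 / noise_var for s in QPSK]
    return softmax(logits)


def cross_entropy(probs, true_index: int) -> float:
    """Classification loss: negative log-probability of the transmitted symbol."""
    return -math.log(probs[true_index])


def mean_absolute_error(estimates, targets) -> float:
    """Regression loss, e.g., between estimated and actual channel coefficients."""
    return sum(abs(e - t) for e, t in zip(estimates, targets)) / len(targets)


if __name__ == "__main__":
    received = QPSK[0] + complex(0.1, -0.05)  # symbol 0 plus a small disturbance
    probs = demod_probabilities(received, noise_var=0.1)
    print("Loss if symbol 0 was sent:", cross_entropy(probs, 0))
    print("Loss if symbol 2 was sent:", cross_entropy(probs, 2))
```

Training would adjust the model parameters to lower these losses averaged over many labeled transmissions; the sketch only evaluates the losses.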
AI Native Network Protocol
The AI native radio protocol is another important part of the native AI architecture. The emergence of RL and DL has led to successful applications in channel access, scheduling, resource management, power control, spectrum sharing, mobility management, sleep mode control, and cell breathing. The majority of tasks in MAC and radio resource management (RRM) can be seen as partially observable Markov decision processes (POMDPs). For example, a mobile device agent can partially sense the spectrum occupancy of its surroundings by measuring the interference power at different frequencies and then decide to transmit data on a selected resource block and power level. This action changes the spectrum environment, which transitions to a new state. However, the change can hardly be observed by an agent that is blocked or far away. In this case, multi-agent reinforcement learning (MARL) is a helpful tool to formulate the POMDP behavior and search for a decision policy that benefits multiple agents. This relies heavily on a properly defined reward function. The objective of communication can be seen as a combination of multiple KPIs, which makes it more difficult to specify the rewards than in games with simple goals. Furthermore, the balance between long-term global rewards and short-term local rewards should be carefully designed to avoid aggressive behavior by a single agent. The study of AI in radio protocols began with enabling individual or joint processing blocks. Link adaptation aims to assign the optimal modulation and coding scheme for the given SINR and reliability target. Q-learning has been studied for link adaptation to improve throughput over the traditional outer-loop approach [63] and to satisfy the packet delay budget in XR with reduced resource usage [64].
These problems are usually investigated jointly with semi-coordinated scheduling, including UE attachment, signal-to-interference-plus-noise ratio (SINR) prediction, channel quality indicator (CQI) mapping, and UE grouping for interference avoidance [65], in which improvements in spectral efficiency and reliability can be achieved. Resource and power allocation is another popular application of ML, which
has been extensively explored in various scenarios. For example, distributed RL was shown to improve the throughput of uplink power control in 5G [66] by adapting to the UE radio conditions. Furthermore, deep RL was shown to effectively duplicate data packets to improve reliability [67] and split data flows to improve throughput [68] in 5G multi-connectivity with carrier aggregation, by learning the dynamics of propagation, interference, mobility, and load. In mobility management, an RNN was shown to reduce outages and failures by learning the SINR pattern [69], while trajectory prediction with a seq2seq model was shown to improve conditional handover in 5G networks [70].
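As a simple illustration of the Q-learning-based link adaptation discussed in this subsection, the following Python sketch learns which modulation and coding scheme (MCS) to select for each quantized SINR state. It is a toy model under strong assumptions and does not reproduce the methods of the cited works: the MCS rates, the decoding thresholds, and the SINR states are hypothetical values, and the SINR is assumed to evolve independently of the chosen action.

```python
import random

# Hypothetical link model (all values are illustrative assumptions).
MCS_RATE = [1.0, 2.0, 4.0]            # bits per symbol delivered on success
MCS_MIN_SINR_DB = [0.0, 8.0, 16.0]    # minimum SINR needed to decode each MCS
SINR_STATES_DB = [2.0, 10.0, 18.0]    # quantized SINR levels seen by the agent


def reward(state: int, action: int) -> float:
    """Throughput reward: the MCS rate if decoding succeeds, otherwise zero."""
    decoded = SINR_STATES_DB[state] >= MCS_MIN_SINR_DB[action]
    return MCS_RATE[action] if decoded else 0.0


def train_link_adaptation(steps=20000, alpha=0.05, gamma=0.5,
                          epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    n_states, n_actions = len(SINR_STATES_DB), len(MCS_RATE)
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = rng.randrange(n_states)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)      # explore
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])  # exploit
        r = reward(state, action)
        next_state = rng.randrange(n_states)       # assumed random SINR evolution
        td_target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Best MCS index for each SINR state according to the learned Q-table."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    print("Learned MCS per SINR state:", greedy_policy(train_link_adaptation()))
```

In a real system the reward would also reflect reliability and delay targets, and the state would be only partially observed, which is what motivates the POMDP and MARL formulations described above.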
AI Native Immersive Sensing
Immersive sensing has emerged as another key feature of 6G networks. In contrast to conventional 5G, which provides localization of wireless users, 6G is expected to connect and integrate diverse sensors that generate a wider scope of information, such as image, sound, smell, touch, and taste. Such multi-modal information should be abstracted and transferred efficiently in order to accomplish targeted tasks in applications. On the other hand, the sensed information can be used to manage and optimize wireless networks. Here, AI arises as an enabler for mapping wireless signals to environmental information (network sensing), transforming images, sounds, and signals to the digital domain (Internet of senses), reasoning about the wireless network (sensing-assisted communications), and supporting a wider range of downstream applications (task-oriented communications). Network sensing has been extensively studied in recent years, with AI being used to localize, detect, and reconstruct targeted objects in a proactive or passive manner. DL was first studied with the fingerprinting method for Wi-Fi indoor localization [71, 72] and mmWave outdoor localization [73]. The method is based on supervised learning, in which the signal power is first measured on a grid of known locations. A DNN is then trained to approximate the location-dependent signals emitted from multiple spatial locations, and the trained model is used to estimate the location for general received signals. To learn the geometrical signal information, CNNs and graph neural networks (GNNs) have been applied. For example, signals from Wi-Fi access points (APs) deployed in a grid topology can be used to train a CNN, in which the signals of closer APs provide better location estimates. However, such methods have a number of constraints in practical implementations. First, measuring the fingerprinting dataset of location signals is expensive in practice.
It may also be difficult to represent a complex radio environment. Second, the use of a CNN or GNN assumes fixed correlations of the radio signal in 3D space. However, these correlations may change dynamically due to the mobility of users and objects. In this context, a dynamic graph or dynamic Bayesian network could be instrumental, since it can learn the topological information from the varying correlations between raw signals. RL is helpful for calibrating the trained model, with online rewards based on location errors potentially provided by the users. However, it still requires a large number of trial-and-error actions. Self-supervised learning can be a promising direction in the future. For example, a variational autoencoder (VAE) or transformer NN could encode the sensed signal into a latent embedding and reconstruct the signal. With E2E training, it is possible to obtain an embedding that represents the locations without the need to collect real locations. A further extension of radio sensing is simultaneous localization and mapping (SLAM) based on CSI [74]. In this context, acquiring and learning the features of the propagating signals in the full spatial domain is essential, including the time difference of arrival (TDoA) and time of flight (ToF). This enables localization in 3D space and full-range orientation estimation, which cannot be achieved with traditional fingerprinting methods. Unsupervised learning based on NNs can be applied to SLAM. The encoder maps CSI to a predicted user location, and the decoder models the physics of propagation by parameterizing the environment using virtual anchors. The NN is trained E2E to reconstruct the set of ToFs extracted from CSI, and an interpretable latent variable representing the user location is learned. Unsupervised learning removes the need for human measurement of large sets of labeled locations and enables online learning relying on standard CSI. Another application of AI-enabled sensing is the use of sensing to improve wireless communications. Conventional wireless systems use radio measurements to estimate the radio environment and adjust transmissions. For example, 3GPP RAN specifies reference signal received power (RSRP), reference signal received quality (RSRQ), CSI, and CQI for measuring propagation loss, fading, interference, etc. In future 6G networks, immersive sensors integrated in different network devices will potentially provide a holistic view of the wireless environment.
For example, an advanced driver-assistance system (ADAS) can use LiDAR, cameras, and radar to perceive the road environment and assist route and motion planning, optimization, and driving control. Images, point clouds, and radar returns can give a better sense of the environment, including mobility, blockage, light, and weather. These elements have a large impact on radio propagation, especially at high frequencies such as mmWave, THz, and VLC. It is instrumental to fuse multi-modal sensor data to optimize the transmit power, beam, frequency, antenna, and cell selection, to boost capacity, reliability, and latency performance, and to improve energy and resource efficiency. An RNN framework has been studied on vision and GPS data to predict mmWave beam blockage [76, 77], based on a large-scale multi-modal communication and sensing dataset [75]. It shows effective beam prediction and tracking with reduced beam training compared to the conventional 5G approach based on synchronization signal block (SSB) measurements. Research in this area is still at a preliminary stage. A fundamental issue is what information (e.g., the shape and location of obstacles) can or cannot be retrieved from RF signals, and where immersive sensor information can compensate [78]. Furthermore, such applications in the wider communication system, including beam, resource, and power management, multi-connectivity, and handover, are not yet well studied. Finally, abstracting immersive sensor information in different devices and transferring it over the wireless network for multiple integrated tasks is essential to the success of these applications.
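The fingerprinting localization described earlier in this subsection can be sketched in a few lines of Python. Note that the sketch deliberately replaces the DNN of the cited works with a much simpler k-nearest-neighbor lookup in signal space, and that the "measurements" are generated from a log-distance path loss model instead of a real survey. The AP positions, transmit power, path loss exponent, and grid spacing are all hypothetical.

```python
import math

# Hypothetical access point positions in meters.
AP_POSITIONS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]


def rss_dbm(point, ap, tx_dbm=20.0, exponent=3.0):
    """Received power from a log-distance path loss model (assumed values)."""
    distance = max(math.dist(point, ap), 0.1)  # avoid log of zero at the AP
    return tx_dbm - 10.0 * exponent * math.log10(distance)


def fingerprint(point):
    """Vector of received powers from all APs at a given location."""
    return [rss_dbm(point, ap) for ap in AP_POSITIONS]


def build_database(step=1.0, size=10.0):
    """Offline phase: record fingerprints on a grid of known locations."""
    n = int(size / step) + 1
    grid = [(i * step, j * step) for i in range(n) for j in range(n)]
    return [(p, fingerprint(p)) for p in grid]


def locate(measured, database, k=3):
    """Online phase: average the k grid points closest in signal space."""
    nearest = sorted(database, key=lambda entry: math.dist(entry[1], measured))[:k]
    x = sum(p[0] for p, _ in nearest) / k
    y = sum(p[1] for p, _ in nearest) / k
    return (x, y)


if __name__ == "__main__":
    database = build_database()
    true_position = (3.3, 6.7)
    estimate = locate(fingerprint(true_position), database)
    print("Estimate:", estimate,
          "error (m):", math.dist(estimate, true_position))
```

The cost of the offline phase grows with the grid resolution and the size of the area, which is the practical limitation of fingerprinting noted in the text.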
AI Native Telecom System
Telecom network management and orchestration (MANO) has become increasingly complex in 5G, due to the integration of eMBB, URLLC, and mMTC services and a massive number of applications. Meanwhile, software-defined networking (SDN), network function virtualization (NFV), and service-oriented architecture (SOA) have been widely used, enabling ultra-flexible network orchestration. To provide agile, seamless, automatic, and secure management of network functions, services, and features for diverse applications, tenants, and users, AI has been extensively studied and developed for network automation, operation, and maintenance. Autonomous E2E network slicing is a key enabler for multiservice and multitenant operation in 5G. In an E2E network, heterogeneous communication resources, including access network spectrum, transport network bandwidth, and core network processors, should be jointly optimized to guarantee E2E quality of service (QoS) according to application and end-user priorities. This is a complex problem with no analytical solution. Deep RL has been studied for resource management in network slicing [79]. It was shown to implicitly capture deeper relationships between demand (user activities) and supply (allocated resources) in resource-constrained scenarios and to enhance the effectiveness and agility of network slicing. The intent-driven network is an important component of autonomous network management. It aims to understand the higher-level intentions of the users by extracting the required information for different network nodes and to provide a configuration for the involved network functions [80].
ML is a driving force for intent-level inference, given contextual awareness and sufficient data availability from the various network and intent functional blocks. This includes extraction of service primitives, optimization of service orchestration, intent-to-service mapping with historical user behavior, profile generation from service descriptors, and intent and service assurance through proactive monitoring of resources, expectations, and the environment. The telecom brain is an emerging concept that aims to integrate multiple ML-driven components and blocks into a foundation model, which will represent diverse wireless environments and can be configured for a specific task. Foundation models have been studied in the AI domain; they centralize information from data of various modalities and can be adapted to a wide range of downstream tasks. Such a model is trained on broad data for a surrogate task and then adapted to the downstream task of interest via fine-tuning, building on transfer learning and scale. In addition, self-supervised learning has become popular in this area, especially in natural language processing with the transformer architecture. It applies powerful encoders to raw data and scales up to larger models and datasets. The foundation model faces significant challenges when applied to telecommunications. Data collection is extremely expensive in wireless networks, especially when covering generalized scenarios and tasks. For example, the performance of a data transmission can be evaluated only after the transmission has completed, which may harm users if the configuration was not optimal. This is especially serious in POMDP problems, such as scheduling, where the environment is difficult to formulate for an analytical solution. The concept
of the digital twin is attracting increasing attention in industry due to its capability of emulating real networks and scenarios in the digital domain. With a digital twin, the operator can implement and optimize mechanisms and architectures without impacting the real network, which supports the development of foundation models by allowing experiments in the virtual network to collect performance data. Another challenge is the deployment of such large models in wireless networks. Even though model compression has been explored, a consolidated architecture developed for wireless use cases is still lacking. Furthermore, it is still unclear how the architecture of a foundation model transfers between wireless tasks (e.g., modulation and coding) and how the data from different scenarios (e.g., indoor and outdoor) are correlated. Generic data-based modeling of the physical network in the virtual domain also needs to be better understood.
3.2.3 Network for AI: Communicate to Learn
In a conventional Shannon level A communication system, the objective is to transfer and reconstruct information bits effectively. From this perspective, the network is transparent to the content being delivered and to the execution of the applications. Instead, the end user evaluates the performance of the application, “translates” it into a set of human-defined KPIs, and uses them to guide network design and optimization. In future 6G networks, the diversity of applications and the embedding of AI as a service are expected to explode. Level A communication will become extremely complex and will not be able to deliver the end performance effectively. This trend has been visible since the standardization of 5G, where a large number of wireless features, functions, and procedures were defined to support various services, making the interfaces, protocols, system, and architecture very complex. Furthermore, the shallow embedding of AI in the network largely constrains its applicability, performance, and scalability. To overcome these issues and push 6G toward an AI native network, it is worth redesigning the communication system such that it understands and extracts the meaning of information (Shannon’s level B) and transfers it effectively to enable multiple agents to collaboratively solve a task or reach a goal (Shannon’s level C). This enables communication to participate deeply in, and drive, the process of learning, either in the network or in end-user applications. Furthermore, a framework integrating distributed learning into the wireless network (in-network learning) moves computation closer to the data, which not only boosts learning performance but also improves system and energy efficiency. Lastly, level C communication with in-network learning eliminates the transmission of raw information, which largely enhances privacy, robustness, and reliability.
In summary, the evolution of network for AI can be classified as follows: (1) training in the cloud and inference at the edge, (2) training and inference in the network, and (3) goal- and task-oriented communications.
Native AI Network Architecture
The first evolution, from AI as an application toward AI as a Service (AIaaS), has inspired the design of an AI native network architecture. 5G networks were designed to support application-level AI in 3GPP Rel-18, in which the data collection, training, and inference procedures and the interfaces between the RAN and the network data analytics function (NWDAF) were studied [81]. In such a design, the network is expected to support a number of defined AI applications. In contrast, AIaaS first requires exposing the AI performance metrics to the network entities. For example, the latency is no longer the time needed to deliver bits correctly to the receiver, but the time needed to achieve a targeted goal with the AI model, such as a desired prediction accuracy. This should include the iterations of transferring observations, models, intermediate outcomes, and other necessary information during training until convergence, as well as those needed during inference to obtain the final result. The novel network design for AI services includes requirements from the following perspectives. 6G should deeply integrate communication, sensing, computation, intelligence, and storage capabilities [82]. First, the network should be able to discover, sense, and measure the computation capability of the underlying devices, for example, the types of connected sensors and the data modalities they provide, the computing power and energy consumption of a base station, the public and private computing resources of a mobile phone, and the priorities of ongoing services. Second, the network should evaluate the AI algorithms, including their execution time, computing complexity, data requirements, convergence, interactions, and architecture. For example, the network should customize models for devices with different computing power while guaranteeing uniform training and inference performance.
Third, the network should provide data collection, analysis, storage, privacy, and openness capabilities. For example, the network should be aware of and evaluate the heterogeneity of data collection on different devices during the training of a large model, so that it can proactively provide resources and schedule time slots for environment sensing. In the network architecture, 6G is expected to have a knowledge-driven network AI layer for training and evaluating models. A knowledge graph is a mathematical model that represents the correlations between entities and their features. In an AI native network, such knowledge includes the correlations between data, services, applications, models, and algorithms, as well as network devices, functions, wireless links, and so on, both horizontally and vertically. A network knowledge graph can enable automatic management and orchestration (MANO) of AI services by extending its domain from network operations to the description of AI computation, algorithms, and data. It will allow the network to identify the causes of fault, QoS, and traffic issues and to make decisions on self-optimization, self-healing, and maintenance. Furthermore, the network AI layer should be integrated into all network devices to support the operation of AI services and in-network learning. The interfaces between network devices will be able to support the efficient transfer of AI models while preserving data privacy.
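How a network knowledge graph could support fault diagnosis can be sketched with a small dependency graph in Python. The entity names and the "depends on" relation below are purely hypothetical and only illustrate the idea of linking AI services, models, and network elements in one graph; a real knowledge graph would carry typed relations and attributes.

```python
from collections import deque

# Hypothetical "depends on" edges spanning services, AI models, and network elements.
DEPENDS_ON = {
    "xr_service": ["embb_slice", "pose_model"],
    "embb_slice": ["gnb_1", "upf_1"],
    "pose_model": ["edge_gpu_1", "camera_feed"],
    "gnb_1": ["fronthaul_1"],
}


def reachable(graph, start):
    """All entities that `start` depends on, directly or indirectly (BFS)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def root_cause_candidates(graph, degraded, alarms):
    """Alarmed dependencies of `degraded` whose own dependencies are all healthy."""
    faulty = reachable(graph, degraded) & alarms
    return sorted(n for n in faulty if not (reachable(graph, n) & alarms))


if __name__ == "__main__":
    active_alarms = {"gnb_1", "fronthaul_1"}
    print(root_cause_candidates(DEPENDS_ON, "xr_service", active_alarms))
```

Here the alarm on the base station is explained by the alarm on the link it depends on, so only the latter is reported as a candidate root cause.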
Distributed Intelligence in Network
In the evolution of AI beyond 5G, a step forward is the development of distributed learning in wireless networks. Unlike cloud AI or edge AI, in which training is performed above the RAN, network AI is anticipated to empower diverse wireless devices to perform collaborative training and inference, using the growing on-device computing capability that is opened to the network. In-network learning offers three main benefits. First, it reduces the E2E latency for users to obtain results, improves communication efficiency, and ensures privacy. Training can be performed close to the observed or sensed data, which reduces data transfer in the network and thus frees capacity. The local model is optimized for the local environment, which can be beneficial for model personalization. Second, the energy and resource efficiency of communication and computing can be improved. By leveraging on-device computing, the energy consumed for transferring large amounts of data in the network can be saved, and the energy used for training can be reduced thanks to faster convergence on small, personalized data. Third, system robustness and security can be enhanced, since failures of or attacks on communication channels or computing devices can be compensated by other nodes. Despite these benefits, distributed learning faces various challenges in achieving the same generalization performance as centralized learning. Data heterogeneity exists in the observations of devices in different scenarios, which makes the model hard to converge and optimize. Collaborative training is beneficial for learning a globally optimal solution in a distributed manner. However, transferring ML models can be more expensive than transferring data because of the large number of parameters. It is thus essential to study advanced distributed learning algorithms specifically for wireless networks, such as federated, split, swarm, and multi-agent reinforcement learning.
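To give a feel for collaborative training without data exchange, the following Python sketch implements a basic form of federated averaging for a one-dimensional linear model. Each client runs a few gradient steps on its own data, and a central node combines the local models weighted by the number of local samples. The client data, learning rate, and number of rounds are hypothetical, and the sketch ignores the wireless-specific aspects (device selection, resource allocation, compression) that practical schemes must handle.

```python
def local_update(model, data, lr=0.1, epochs=5):
    """A few gradient descent steps on one client's data for y = w * x + b."""
    w, b = model
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def federated_average(models, sizes):
    """Aggregate local models, weighting each by its number of samples."""
    total = sum(sizes)
    w = sum(m[0] * n for m, n in zip(models, sizes)) / total
    b = sum(m[1] * n for m, n in zip(models, sizes)) / total
    return (w, b)


def run_rounds(client_data, rounds=300):
    """Repeat local training and central aggregation; raw data never leaves a client."""
    global_model = (0.0, 0.0)
    for _ in range(rounds):
        local_models = [local_update(global_model, d) for d in client_data]
        global_model = federated_average(local_models,
                                         [len(d) for d in client_data])
    return global_model


if __name__ == "__main__":
    # Two clients observe different input ranges of the same relation y = 2x + 1.
    clients = [
        [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)],
        [(1.5, 4.0), (2.0, 5.0)],
    ]
    print("Global model (w, b):", run_rounds(clients))
```

Only the two model parameters travel between the clients and the central node in each round, which is the property that makes this family of methods attractive when the raw data are large or private.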
Federated learning (FL) is being studied extensively for wireless networks as a promising approach to achieve global optimization through distributed learning without exchanging raw data. It must address challenges unique to wireless communications, including heterogeneous resources, limited bandwidth and battery, and large model sizes, as well as small, non-independent and identically distributed (non-IID), and imbalanced data, information redundancy, and privacy. The essential algorithm design includes (1) device selection, with criteria related to resource constraints and data quality; (2) resource allocation, with adaptive and optimized resource usage; and (3) model aggregation, including weighted, selective, and secure aggregation [83]. Federated learning builds upon distributed training and inference on wireless devices. The locally trained models are transmitted to a central node for aggregation, which produces a globally optimized model for the distributed agents. The model parameters or gradients are transmitted over the network for aggregation. Quantization, sparsification, and distillation have been studied in AI to compress ML models. However, such techniques trade compression against model distortion and introduce additional computation. To overcome this issue, over-the-air computation (AirComp) has been proposed, which leverages the analog addition of signals to perform model aggregation over the air interface [84]. The constraint of this technique is the synchronization required between participating agents. To handle data heterogeneity, personalization has been
6G: Vision, Applications, and Challenges
studied, to estimate the optimal weighted combination of local models and the cluster identities of users. Further, consensus federated learning has attracted considerable attention recently. It aims for a fully distributed model exchange, in which each agent combines its model with those of adjacent agents to approach optimality. The approach leverages mutual cooperation among agents without a central coordinator, which is particularly useful for achieving low latency and energy efficiency in V2X or IoT networks [85, 86].
Split learning (SL) splits the entire NN into parts and deploys them on the distributed agents and the network server, respectively. Each agent retains a part of the NN, and the parts from all agents together constitute a complete model. This relieves network devices of the need to train an E2E model and enables flexible control of computing tasks between agents and of communication resources in the network. It can also achieve better accuracy and convergence than federated learning when the data distribution is unbalanced. The splitting strategy significantly affects learning performance. Specifically, the selection of the cut layer affects the communication and computation load. Gradients must be transmitted iteratively during training, which also increases load and latency. In summary, split learning uses communication to compensate for limited computing resources.
Multi-agent reinforcement learning (MARL) empowers distributed agents to interact with each other to acquire optimal policies, which extends the interaction with the environment. Unlike single-agent RL, MARL is usually modeled as a Markov game, also called a stochastic game, defined over the joint state space, action spaces, and rewards of multiple agents. In each discrete time step, every agent selects an action based on the current state and receives an immediate reward. The environment then transitions to the next state according to the joint action of all agents.
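The Markov game described above can be written compactly as follows. This is a sketch in commonly used notation; the symbols $\mathcal{N}$ (agent set), $\mathcal{S}$ (state space), $\mathcal{A}_i$ (action space of agent $i$), $P$ (transition kernel), $r_i$ (reward of agent $i$), and $\gamma$ (discount factor) are introduced here for illustration and do not appear in the original text.

```latex
\[
\mathcal{G} = \bigl(\mathcal{N},\, \mathcal{S},\, \{\mathcal{A}_i\}_{i \in \mathcal{N}},\, P,\, \{r_i\}_{i \in \mathcal{N}},\, \gamma\bigr),
\qquad
s_{t+1} \sim P\bigl(\cdot \mid s_t, a_t^{1}, \dots, a_t^{N}\bigr),
\]
\[
J_i(\pi_i, \pi_{-i}) = \mathbb{E}\Bigl[\, \sum_{t=0}^{\infty} \gamma^{t}\, r_i\bigl(s_t, a_t^{1}, \dots, a_t^{N}\bigr) \Bigr],
\qquad 0 \le \gamma < 1 .
\]
```

Here $J_i$ is the expected discounted return of agent $i$; it depends on the agent's own policy $\pi_i$ and on the policies $\pi_{-i}$ of all other agents, which is what couples the agents' learning problems.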
Each agent seeks the policy that maximizes its own discounted cumulative reward, which depends on the joint policy of all agents. Each agent observes only part of the environment state, and its actions affect the rewards of other agents through the environment. Thus, MARL involves interaction among multiple agents. A comprehensive and well-designed reward function plays a crucial role in solving these problems. MARL benefits from solving problems in a distributed and parallel manner, with greater scalability and robustness. Each agent learns its own policy, so that sporadic changes in the number of agents have little impact on the learning process of the other agents. However, a significant change in the environment can make it difficult for the agents to obtain optimal policies. Distributed training in MARL also reduces the ability and speed of converging to a globally optimal strategy, because each agent has only a partial observation of the environment. Centralized training with distributed execution has been studied to address these issues, but it increases the communication cost and the privacy risk of uploading data and downloading models.
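Returning to the weighted model aggregation step of federated learning discussed earlier in this subsection, the following is a minimal sketch of sample-count-weighted averaging. The function name `federated_average` and the representation of each local model as a flat list of floats are assumptions made for illustration; device selection, secure aggregation, and any wireless channel effects are deliberately left out.

```python
from typing import List, Sequence


def federated_average(local_models: Sequence[Sequence[float]],
                      num_samples: Sequence[int]) -> List[float]:
    """Average local parameter vectors, weighting each by its local sample count.

    Hypothetical helper for illustration only: each model is a flat list of floats.
    """
    if not local_models or len(local_models) != len(num_samples):
        raise ValueError("need one sample count per local model")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(local_models[0])
    global_model = [0.0] * dim
    for params, n in zip(local_models, num_samples):
        if len(params) != dim:
            raise ValueError("all local models must have the same size")
        weight = n / total  # devices with more data contribute more
        for k, value in enumerate(params):
            global_model[k] += weight * value
    return global_model


if __name__ == "__main__":
    # Two devices: the second holds three times as much data as the first.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only the parameter vectors and sample counts reach the aggregator in this sketch; the raw training data never leaves the devices, which is the property the text attributes to FL.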
Semantic-Native Emergent Communications
Goal- and task-oriented communications are seen as the technology driving fully native AI in the network. The goal of communication goes beyond accurately transferring and reconstructing information bits; it is to enable intelligent agents to collaboratively
accomplish specified tasks [58]. With in-network distributed learning, the agents exchange models and updates (FL), latent vectors and gradients (SL), or states, actions, and rewards (MARL) to collaborate toward optimal models and policies. However, none of these approaches is designed from the communication perspective, that is, deciding when to transfer what information to which agents. Instead, the conventional network AI framework focuses on adapting and optimizing the communication resources to achieve the KPIs required by AI. However, the Shannon capacity limits the ability to serve the growing number of AI applications. Therefore, it is essential to make communication systems understand the meaning of information and transfer it toward the final goal of the specified tasks.
Emergent communications have been studied to better understand the evolution of human language and to build more efficient representations. The concept has been extended to multi-agent systems performing cooperative tasks, in which the agents learn an interpretable communication protocol that enables them to solve the task efficiently and optimally [87]. Such a protocol defines an embedding of the information observed from the environment, which is learned by performing the cooperative task, e.g., robotic navigation. In today's telecommunication networks, data sources have diverse modalities, such as text, image, sound, wave, and touch, and the targeted tasks come from diverse applications, such as reconstruction, recognition, prediction, planning, control, and perception. A common representation of information is critical to abstract the meaning of data and to learn effective transmission and reasoning across multiple related or downstream tasks. This inspires the study of semantic communications [58], which aim to extract from raw data the semantic information that is specific to the targeted task.
Under this perspective, communication efficiency and reliability can be improved by reducing the amount of transmitted information, without the need to accurately reconstruct the original data. With improved efficiency and reliability, the communication system is expected to provide higher capacity and accommodate more users. In turn, the intelligent agents can improve the efficiency of semantic abstraction through iterative communication and learning, becoming able to solve more complex tasks at lower communication cost.
The ambitious goals of semantic communications introduce significant challenges to communication system design, from fundamental information theory to practical implementation on a standardized platform. Unlike traditional source and channel coding, which rely on defined policies to minimize the BER, semantic information varies across data and tasks. For example, a sentence represented as text, speech, and image is coded into completely different embeddings by traditional source coding, whereas semantic coding aims to extract the same context from these data formats. However, theoretical definitions of semantics are lacking, especially for multimodal data. Moreover, the communication needed between agents varies with the task, even for the same raw data. For instance, the semantic information needed to detect an object differs from that needed to localize it in the same image. This also makes it difficult to define generic metrics for evaluating the performance of semantic communications across data and tasks. Furthermore, it is challenging for multiple agents in a network to have
a synchronized knowledge base that guarantees that the receivers can effectively extract the semantic information sent by different transmitters. The above challenges also make it difficult to standardize semantic coding and communication systems [59].
To tackle these issues, research has evolved rapidly in recent years. ML has been studied in depth in data science, where it has been shown to recognize hidden patterns and capture the correlations among different data, resulting in effective decision-making and inference for various applications [60]. In this context, ML is seen as a promising technology for semantic communications. For example, a CNN can enable the agents to extract image patterns and embed them in a compressed representation for communication [61]. The correlations among different images learned by the CNN can help the agents complete related tasks without transferring all the raw data. Similarly, long short-term memory (LSTM) and Transformer networks have been successful in extracting semantic information from text messages [88]. They have been demonstrated to provide a better Bilingual Evaluation Understudy (BLEU) score than conventional separate source (Huffman) and channel (Turbo) coding, especially at low SNR, which benefits from the higher tolerance to distortion of the semantic channel compared with the physical channel. This has been extensively analyzed in [58].
Deep learning has been extensively investigated for semantic communications [88], evolving from deep source-channel coding [89]. In the three-level communication theory of Shannon and Weaver, the level B (semantic) problem is to extract the features of the source information to be transmitted and rebuilt at the receiver [90]. A number of studies in NLP have followed this direction, with wide adoption of the transformer NN.
This includes the adaptive universal transformer proposed in [88, 91, 92], which uses an autoencoder to encode text sentences into a high-dimensional latent representation, with channel state information (CSI) used to adjust the attention weights according to the changing SNR. Similarly, the work of [89] incorporates hybrid automatic repeat request (HARQ) to transmit incremental bits according to the reliability requirement. Since the attention layer weights the correlation between words, embeddings with higher weights can be given more redundancy in channel coding to guarantee overall reliability. A further application of a nonlinear transformer to images is proposed in [93]; it maps the source image into a latent representation and learns the entropy model of the latent distribution, so that a variable-length source-channel encoder can be applied to reduce the source distortion. Weng et al. [94] study speech recognition and synthesis with joint CNN and RNN modules, which recover the text and speech at the receiver. An application of transformer-based semantic communications to massive MIMO is investigated in [95], which adds dedicated transformers for the digital and analog beamformers and shows improved semantic similarity over joint source-channel coding. The transformer NN is powerful at learning the latent distance of sequences with its attention mechanism, but it also significantly increases the computational complexity. Xie and Qin [96] proposed network sparsification and quantization for model compression, enabling deployment on power-constrained devices. Moreover, the knowledge base at the transceivers can be distorted, making the receiver unable to decode the semantic message. Thomas and Saad [97] proposed generative flow networks
to learn the probabilistic structure that generates the observed data, with a measure of semantic distortion that reflects reasoning capability. They show that the system can transmit with higher reliability using fewer transmitted bits. Some cross-disciplinary studies have also been carried out, including quantum semantic communications [98], which proposes to learn semantic representations in a Hilbert space suited to quantum computing and communications.
The majority of the above works aim to improve the semantic similarity of raw image, text, and speech data between the transceivers. Level C, effective communication, aims to transmit the perception: the goal of communication is for the agents to effectively execute sequential tasks. Farshbafan et al. [99] studied a problem in which the transmitter sends a hierarchical belief of its environment observation to the receiver, which reconstructs the observation and takes an action on the environment, resulting in a transition to new states in the next observation. The approach is shown to improve task execution reliability and belief efficiency compared with traditional reinforcement learning.
Despite the recent success of deep learning in semantic communications, the majority of works focus on learning the patterns of raw data in Euclidean space. This can limit the model to compressing and reconstructing specific data domains only. In addition to semantic information abstraction, a major goal of semantic communication is to leverage the learned patterns to infer correlated data or to execute a related task. This demands that the agents understand data relationships. Seo et al. [58] propose the concept (meaning) space, which aims to construct the smallest meaningful representation of data in the semantic domain and which not only compresses the data but also associates different observations based on their semantics.
For example, graph [100] and hypergraph [101] NNs have proved to be effective tools that take into account the irregular correlations between data. Furthermore, simplicial complex [102] and cell complex [103] NNs have been demonstrated to learn higher-order interactions beyond pairwise relations. The integration of semantic and emergent communications enables an efficient connection of intelligent agents. The communication system evolves beyond standards toward paradigms learned from the agents' goals. The ultimate vision is to enable emerging applications to operate over optimal communication paradigms without specifications, in which the network natively empowers AI.
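The BLEU score cited earlier in this subsection, where semantic coding is compared with Huffman and Turbo coding, can be illustrated with a deliberately simplified version. The sketch below computes only the unigram case (clipped unigram precision times the brevity penalty) against a single reference sentence, with whitespace tokenization; the function name `bleu1` is hypothetical, and full BLEU combines n-gram precisions up to a higher order, typically four.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Simplified unigram BLEU: clipped precision times brevity penalty.

    Illustration only: single reference, whitespace tokens, no higher-order n-grams.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    # Each candidate word is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) > len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1.0 - len(ref) / len(cand))
    return penalty * precision


if __name__ == "__main__":
    sent = "the network transfers the meaning of the message"
    print(bleu1(sent, sent))
    print(bleu1("the the the", "the cat sat"))
```

Because the score rewards word overlap rather than bit-exact recovery, a decoded sentence can score well even when individual symbols were corrupted, which is consistent with the distortion tolerance the text attributes to the semantic channel.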
3.2.4 Challenges and Open Directions
As discussed previously, AI will fundamentally change communication system design, and 6G will go beyond connectivity toward training and optimizing intelligent agents. Challenges persist in the development of AI native communications and require significant cross-disciplinary research. AI theory has not been well studied or designed for wireless communications, which are built on information theory. For example, it is unclear which AI algorithms, architectures, and models are optimal for communication problems. The unique features of wireless signals, network topologies,
and propagation environments are not well considered in the design and application of ML mechanisms. As a result, AI struggles to approach the analytical performance in many communication use cases or fails to scale to different scenarios, devices, and tasks. This calls for fundamental research on AI models and algorithms designed specifically for wireless communications. In addition, data collection is expensive in wireless networks. Sensing and measuring the radio environment, and transferring the resulting data, may consume large amounts of energy and resources. Moreover, many “labeled” data describing performance can only be obtained during network operation, which could either harm the user experience because of suboptimal configuration or constrain the observation of different scenarios. The optimal trade-offs between the data collection cost on different devices and the ML performance on different tasks still need to be understood in depth. Furthermore, wireless devices have constrained computation, storage, communication resources, and power, so cloud-based AI models and algorithms are not feasible on them. On the other hand, the communication network is not well utilized for coordinating multiple devices to train and share AI models for common tasks. The upsurge of emergent communications aims to learn the protocols for AI tasks. However, it is still unclear how the entire communication system, including modulation, coding, scheduling, and power control, can learn and evolve toward a task-oriented design. This requires fundamental research ranging from semantic information theory to radio transmission and network design.
In addition to theories and technologies, the AI native network also faces challenges in terms of network operation and architecture. A conventional network is responsible only for delivering information, and so its architecture has limited openness to downstream applications.
To achieve a fully AI native network that learns from data to perform tasks, the network needs to be fully open to data and applications. New architectures integrating the network with data, applications, and intelligence need further research, and a new operation model integrating information producers, deliverers, and consumers needs further development.
3.3 All Things Sustainable
The aim of a carbon-neutral world has made energy efficiency and sustainability central pillars of next-generation communication networks, and sustainable communications have become an active research topic. In this section, we provide an overview of strategies and technologies for accomplishing this ambitious goal from the following technology perspectives: wireless power transfer (WPT) and harvesting, wireless backscattering, dynamic spectrum sharing, and AI-driven network energy management. We anticipate that these will be crucial technologies for achieving all things sustainable in future 6G communication systems, as illustrated in Fig. 3.
Fig. 3 All things sustainable
3.3.1 Wireless Energy Transfer and Harvesting
Wireless energy transfer and harvesting technologies have recently gained momentum as promising candidates for enabling sustainable communication in 6G. WPT holds great potential to support self-powered devices in a plethora of applications, ranging from IoT devices in smart environments to implants and artificial organs in medical applications. Different technologies exist that operate in either the near-field or the far-field propagation region. Near-field WPT technologies rely on inductive coupling or magnetic resonance coupling. The operating range of near-field WPT is limited to a few centimeters, and the energy transmitters and receivers must be perfectly aligned [104], which can be restrictive. In return, near-field WPT delivers the highest energy transfer efficiencies among existing WPT technologies (efficiencies of more than 95% have been reported in [105]). For this reason, and because of the relatively simple hardware, several practical implementations are already commercially available. Radio-frequency identification (RFID) tags and wireless charging of electric toothbrushes, smartwatches, and smartphones are examples of near-field WPT implementations. Charging stations for electric vehicles based on near-field WPT are also expected to be available soon [106]. In far-field WPT systems, on the other hand, energy receivers can be located far away from the power source. Moreover, it is possible to harness and recycle energy both from dedicated power transmitters and from electromagnetic signals broadcast by ambient sources, e.g., TV towers, satellite systems, and cellular base stations. Far-field WPT and harvesting can be realized using legacy antennas and simple diode-based energy-harvesting circuitry [104]. These features allow easy
implementation of simultaneous wireless information and power transfer (SWIPT) techniques [107], which are among the promising technologies for enhancing the energy efficiency of 6G systems. However, compared to the near-field counterpart, the achievable power transfer density is usually very low, and it is also unsteady when relying on ambient signals. Therefore, far-field WPT is attractive for low-power applications, such as sensor networks, or for alleviating energy consumption in larger-scale 6G applications.
IRS has arisen as a disruptive technology capable of optimizing the wireless propagation environment [111]. An IRS is a low-power device engineered with multiple sub-wavelength reflecting elements, also known as meta-atoms or unit cells. The electromagnetic characteristics of each element can be configured in real time via software to induce a distinct phase shift and reflection coefficient on impinging signals. Collectively, the reflecting elements can achieve diverse objectives, such as steering, focusing, polarization control, and absorption [112, 113]. These functions can be harnessed for improving information transfer in communication applications, increasing the power transfer density in WPT systems, or achieving both objectives in SWIPT schemes [114]. Owing to the many optimization possibilities, IRS technology is believed to pave the way toward smart and sustainable propagation environments with ultralow energy costs and to be another key component of 6G.
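To make the per-element phase shift concrete, the sketch below compares the power received through an IRS with random phases against phases chosen so that all reflected paths add coherently at a single receiver. The narrowband cascaded model (one complex gain `h[n]` from the source to element n and one gain `g[n]` from element n to the receiver), the unit reflection amplitude, the absence of a direct path, and all function names are assumptions made for illustration.

```python
import cmath
import math
import random


def aligned_phases(h, g):
    """Phase shift per element that rotates every cascaded path h[n]*g[n] onto the real axis."""
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]


def received_power(h, g, theta):
    """Power of the sum of all reflected paths for the given element phase shifts."""
    total = sum(hn * cmath.exp(1j * t) * gn for hn, t, gn in zip(h, theta, g))
    return abs(total) ** 2


def random_channel(n, rng):
    """Hypothetical Rayleigh-like channel: unit-variance complex Gaussian gains."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_elements = 64
    h = random_channel(n_elements, rng)
    g = random_channel(n_elements, rng)
    random_theta = [rng.uniform(0, 2 * math.pi) for _ in range(n_elements)]
    print("random phases :", received_power(h, g, random_theta))
    print("aligned phases:", received_power(h, g, aligned_phases(h, g)))
```

With aligned phases the path amplitudes add instead of partially cancelling, so under this model the received power equals the square of the sum of the magnitudes |h[n] g[n]|. This is the focusing effect that the text describes as useful for both information and power transfer.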
3.3.2 Backscatter Communication
Backscatter communication is a concept that dates back to 1948 [108]. Despite its age, backscattering has evolved into an energy-efficient technology that remains widely used today, and it is expected to continue to play a crucial role in the low-data-rate and low-power applications foreseen in future 6G communication systems. A classical backscattering system consists of a passive tag (also called the backscatter transmitter) and an active device, which performs power transfer through an unmodulated carrier signal and also acts as the backscatter receiver. Conventionally, the tag exploits inductive coupling to encode its information through load modulation of the carrier signal while reflecting it to the receiver [109]. Thus, backscattering can be considered an application of WPT and energy harvesting. The backscatter-based technologies that exist today can be classified into monostatic, bistatic, and ambient backscatter communication. Monostatic backscattering comprises the strategies based on the original concept, in which the power source and the backscatter receiver are implemented in a single active device. In bistatic backscattering, in turn, the power source, e.g., a WPT transmitter, and the backscatter receiver are implemented as separate, independent devices. This strategy provides greater flexibility and room for deployment optimization than is possible with the original technology [110]. Last, ambient backscatter systems do not rely on a dedicated power source. Instead, ambient backscatter transmitters, such as low-power sensors, harvest the energy of ambient electromagnetic
signals to transmit useful information. Beyond these classical implementations, new strategies for improving and extending the applications of backscattering are constantly being proposed. For instance, the idea of exploiting IRSs for harnessing ambient energy and operating as backscatter transmitters has shown promising results [115]. Besides delivering high energy efficiency, all variations of backscattering technology also enhance the spectral efficiency of communication systems, since they do not require a dedicated licensed spectrum band [116].
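The load modulation principle described in this subsection can be sketched as a tag that switches its reflection coefficient between two values while reflecting an unmodulated carrier, and a reader that correlates each bit interval with the carrier to recover the bits. The two reflection coefficients, the sampling parameters, the additive Gaussian noise, and the function names below are illustrative assumptions; path loss, carrier leakage at the reader, and the energy harvesting stage are not modeled.

```python
import math
import random


def tag_reflect(bits, gamma0=0.2, gamma1=0.8, samples_per_bit=32,
                cycles_per_bit=4, noise_std=0.0, seed=0):
    """Reflected signal: the carrier scaled by the reflection coefficient chosen per bit."""
    rng = random.Random(seed)
    signal = []
    for bit in bits:
        gamma = gamma1 if bit else gamma0  # load switch selects the reflection state
        for k in range(samples_per_bit):
            carrier = math.cos(2 * math.pi * cycles_per_bit * k / samples_per_bit)
            signal.append(gamma * carrier + rng.gauss(0.0, noise_std))
    return signal


def reader_demodulate(signal, gamma0=0.2, gamma1=0.8, samples_per_bit=32,
                      cycles_per_bit=4):
    """Estimate the reflection coefficient per bit by correlation, then threshold it."""
    threshold = (gamma0 + gamma1) / 2
    bits = []
    for start in range(0, len(signal), samples_per_bit):
        chunk = signal[start:start + samples_per_bit]
        corr = sum(x * math.cos(2 * math.pi * cycles_per_bit * k / samples_per_bit)
                   for k, x in enumerate(chunk))
        estimate = 2 * corr / samples_per_bit  # cos^2 averages to 1/2 over whole cycles
        bits.append(1 if estimate > threshold else 0)
    return bits


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    received = tag_reflect(message, noise_std=0.1)
    print(reader_demodulate(received) == message)
```

The tag never generates a carrier of its own in this sketch; it only scales the incident one, which is why the text can describe backscattering as an application of WPT.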
3.3.3 Dynamic Spectrum Sharing
The increasing number of data-hungry applications combined with the explosive growth of connected devices will demand ever-wider bandwidths as the 6G era approaches. Therefore, managing the scarce and expensive frequency spectrum smartly and efficiently has become crucial. In particular, spectrum sharing and cognitive radio techniques, in which multiple applications coexist in the same frequency band, have been widely investigated for improving spectrum utilization. In the cognitive radio paradigm, secondary users, such as sensors and IoT devices, can opportunistically access unoccupied frequency bands, ideally without interfering with primary users [117]. To this end, different schemes have been proposed, including spectrum sensing techniques, the utilization of unused TV radio frequencies, and geolocation-based schemes [118]. Even though these approaches are effective when spectrum utilization is sparse, they do not perform well in highly dynamic and overloaded scenarios and can cause unwanted interference to primary users. To overcome this issue and further improve spectrum sharing efficiency, cognitive radio has recently been combined with different multiple access techniques, such as power-domain NOMA and RSMA [48]. These promising multiple access techniques enable secondary users to efficiently access the same frequency resource used by primary users in a symbiotic manner with manageable interference levels [119]. Owing to their short operating range and low power, backscattering technologies also generate tolerable interference and can thus be easily integrated with different spectrum sharing strategies [120].
Given that the frequency spectrum is a limited resource, appropriate regulations and spectrum allocation policies are also fundamental to driving sustainable economic and technological development. Conventional rigid policies with static spectrum allocation are not suited to future-generation communication networks.
Fortunately, strategies for avoiding possible spectrum shortages are already being studied and standardized, as discussed in the previous section. For instance, 3GPP Release 16 enables 5G NR to operate in unlicensed spectrum bands, e.g., frequencies used by Wi-Fi, both in standalone mode and using spectrum aggregation [121]. Spectrum aggregation allows 5G NR to combine vacant noncontiguous frequency bands into an extended effective bandwidth, thus leading to more efficient and sustainable spectrum utilization. Other recent standardization
efforts include licensed shared access (LSA) in Europe and dynamic spectrum access (DSA) system in the United States and Australia [118].
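The opportunistic access of secondary users described in this subsection relies on deciding whether a band is currently occupied. One common and simple way to do this is energy detection, sketched below; the text only mentions spectrum sensing in general, so the choice of an energy detector, the threshold margin, and the function names are assumptions made for illustration.

```python
from typing import Optional, Sequence


def band_is_occupied(samples: Sequence[complex], noise_power: float,
                     margin: float = 2.0) -> bool:
    """Declare the band occupied when the average sample energy exceeds margin * noise power."""
    if not samples:
        raise ValueError("need at least one sample")
    energy = sum(abs(x) ** 2 for x in samples) / len(samples)
    return energy > margin * noise_power


def pick_free_band(bands: Sequence[Sequence[complex]], noise_power: float,
                   margin: float = 2.0) -> Optional[int]:
    """Return the index of the first band sensed as idle, or None if all appear busy."""
    for index, samples in enumerate(bands):
        if not band_is_occupied(samples, noise_power, margin):
            return index
    return None


if __name__ == "__main__":
    busy = [1.0] * 50    # strong primary-user signal
    idle = [0.05] * 50   # residual noise only
    print(pick_free_band([busy, idle], noise_power=0.01))
```

A fixed threshold of this kind is exactly what struggles in the highly dynamic and overloaded scenarios mentioned in the text, since a weak primary signal can fall below it and the secondary user would then cause interference.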
3.3.4 Smart Energy-Efficient Networks
The foreseen ultra-densification and enormous data traffic in 6G networks are expected to increase energy consumption and make resource management challenging. Fortunately, such massive connectivity should also provide an enormous amount of statistical data, which can be used to optimize the network through smart data-driven approaches. Specifically, advances in AI algorithms and computing paradigms, such as pervasive edge computing [122], can exploit this data availability to drive communication networks into a new era of smart resource management and self-orchestration, which can lead to lower energy consumption. To put this into perspective, a recent report states that AI solutions alone can reduce energy costs by up to 30% even with the current 5G communication infrastructure [123]. Therefore, AI-driven 6G networks, which should employ novel hardware with different energy-efficient technologies, such as IRSs, backscattering, SWIPT, and others, should achieve even more significant energy savings.
The application of AI can enable 6G networks to self-configure and adapt to specific connectivity patterns. For instance, based on current data traffic, an AI-driven base station can optimize its energy consumption and spectrum utilization by autonomously turning on only the radio components necessary to satisfy the requirements of connected users and turning off the rest [124]. Advanced AI techniques such as deep reinforcement learning and transfer learning can be exploited for implementing smart 3D beamforming, dynamic user clustering, channel estimation, sensing, and many other functions. AI can also help to prolong the battery lifetime of user devices, for instance, through smart task offloading. In this regard, federated learning strategies in cooperation with edge computing should play a key role.
In the federated learning paradigm, distributed AI models are trained locally in the user terminals, such that only the model weights are shared with edge servers or upper network layers. These strategies should reduce the need for raw data transmissions and assist computation-intensive tasks, thus reducing the drain on limited battery power [125]. Distributed AI should also enable base stations to operate cooperatively in an autonomous orchestration mode with little human intervention [126, 127]. Furthermore, AI can assist in planning and designing energy-efficient network deployments with improved signal propagation and coverage [128]. To sum up, AI-driven solutions are an essential technology for enabling highly efficient and sustainable 6G communication systems.
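The idea of a base station switching on only the radio components needed for the current traffic can be sketched with a simple rule. The function below is a hand-written threshold rule standing in for the learned policy the text envisions; the notion of identical "carriers" with a fixed capacity, the target utilization, and all names and numbers are hypothetical.

```python
import math


def carriers_needed(offered_load_mbps: float, carrier_capacity_mbps: float,
                    total_carriers: int, target_utilization: float = 0.8,
                    min_active: int = 1) -> int:
    """Number of carriers to keep powered so that each stays below the target utilization."""
    if carrier_capacity_mbps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be positive and utilization in (0, 1]")
    usable = carrier_capacity_mbps * target_utilization
    needed = math.ceil(offered_load_mbps / usable)
    # Never drop below the minimum needed for coverage, never exceed the installed hardware.
    return max(min_active, min(total_carriers, needed))


if __name__ == "__main__":
    for load in (0, 100, 250, 1000):
        active = carriers_needed(load, carrier_capacity_mbps=100, total_carriers=4)
        print(f"load={load} Mbps -> {active} active carrier(s)")
```

An AI-driven controller would replace the fixed rule with a traffic forecast or a learned policy, but the interface stays the same: traffic goes in, and the set of components to power comes out.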
3.3.5 Toward Fully Sustainable Network
Future 6G networks will rely heavily on sophisticated AI strategies and the energy-efficient technologies presented in the previous subsections, such as WPT, SWIPT,
backscattering, and IRSs. These technologies will work together harmoniously in an intelligent orchestration. Smart propagation environments enabled by IRSs will tremendously improve beamforming and energy radiation, resource allocation and network management will be autonomous and efficient with the assistance of distributed AI techniques and edge computing, and advanced harvesting technologies will maximize energy recycling and energy efficiency. With these promising possibilities, 6G networks should enjoy an extended operating lifetime with minimized energy consumption and a sustainable carbon footprint. Smart spectrum sharing schemes will contribute to technological and operational sustainability, where spectrum scarcity will not be a problem. Moreover, flexible spectrum sharing regulations will create opportunities for new business models and make 6G systems also economically sustainable. This 6G concept with all technologies working together contributing to all things sustainable is illustrated in Fig. 3.
3.3.6 Challenges and Open Directions
We have presented key state-of-the-art technologies and strategies that should pave the way to all things sustainable in 6G networks. However, the 6G era is still several years away, which leaves room for further development and enhancement. Several open issues still need to be studied and tackled for all the technologies discussed. Specifically, more efficient WPT and ambient energy harvesting technologies need to be developed to broaden the range of applications. Efficient ways to implement ambient backscattering systems that are robust to the intermittent availability of ambient signals are yet to be discovered. More in-depth investigations of spectrum sharing strategies in the overloaded scenarios expected in ultra-dense 6G networks need to be carried out. Furthermore, large-scale studies on distributed AI strategies that jointly optimize the presented technologies in practical deployments are still missing. To wrap up, each of these technologies offers a plethora of opportunities for future research and raises open problems that need to be solved before all things can become sustainable.
3.4 All Things Secured
The ever-increasing connection density, the extensive deployment of base stations, and concepts such as pervasive edge computing are expected to introduce security vulnerabilities in all communication layers. Classical cryptography methods alone might not be able to cope. Therefore, security and data privacy are more critical concerns than ever. In this section, we discuss featured technologies that are essential to enable secure communication in 6G and beyond. As illustrated in Fig. 4, physical layer technologies, quantum communication, and blockchain are among the key enablers of all things secured in future communication networks.
6G: Vision, Applications, and Challenges
Fig. 4 All things secured
3.4.1
Physical Layer Security
By employing a large number of antenna elements at the base station, massive MIMO systems constitute an indispensable physical layer technology for realizing secure communication. Massive MIMO is essential for supporting high operating frequencies, e.g., mmWave spectrum. Therefore, it is one of the main components of 5G and should also remain in 6G. Aided by signal processing techniques known as beamforming, massive MIMO systems can harness multipath propagation phenomena to focus transmissions only on intended spatial locations [129, 130], thus enabling space division multiple access (SDMA). This feature provides the first layer of security and prevents potential eavesdroppers outside the beam space from receiving unauthorized information. This security layer should be further enhanced in 6G systems that will extend the operating frequencies and include the upper mmWave spectrum up to THz bands, enabling the so-called “pencil” beams [131]. In addition to these high-precision beams, THz-enabled systems have a short coverage range, i.e., limited to a few meters, which makes jamming and eavesdropping extremely difficult. As discussed in [132], attackers would need to be in close proximity to the legitimate user to wiretap the communication link. Moreover, even if attackers are nearby, communication security could be ensured with the aid of dynamic hopping over the large number of frequency channels made possible by the broad THz spectrum.
As anticipated previously, IRS is a powerful spectral- and energy-efficient device for controlling the propagation medium. IRS shall provide a low-cost solution for extending the limited coverage range of mmWave and THz systems in 6G. Besides signal coverage advantages, the flexible properties and capabilities of
D. B. da Costa et al.
IRS can also be harnessed to improve security in the physical layer of 6G. As performed in [133], the reflecting elements of IRS can be efficiently optimized to boost signal reception at intended users while mitigating information leakage to eavesdroppers. IRS can also be configured to minimize malicious jamming transmissions impinging on legitimate users [134], or even, as proposed in [135] and [136], to assist the generation of artificial noise and improve the secrecy rate of communication systems. The use of IRS for secret key generation is another attractive opportunity for assisting physical layer security [137].
Multiple access (MA) techniques should also play an important role in the security of future 6G networks. In particular, in addition to impressive gains in terms of degrees of freedom (DoF) and spectral efficiency, the recent RSMA technique also delivers interesting security properties. At the BS, RSMA splits the uncoded data messages for each user into common and private parts. The common parts of all users are encoded into a common super symbol and the private parts into private symbols. The common symbol is transmitted via a multicast precoder intended for all users, whereas each private symbol is conveyed following an SDMA approach and is intended only for the corresponding user [138]. By executing successive interference cancellation (SIC), each user retrieves a common stream containing the common parts of all users and a private stream comprising its intended private symbol. To reconstruct the original uncoded message, each user needs to decode and recombine both the private and common parts from the received data streams. Even though all users may have access to other users’ (certainly encrypted) common data parts, only partial information can be retrieved, which complicates the eavesdropping of the whole data message.
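The RSMA message splitting described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: messages are byte strings, the split point `common_fraction` is a hypothetical parameter, and precoding, channel coding, and SIC are not modeled. The sketch shows only how common and private parts are formed and recombined, and why a user who sees only the common stream holds partial information about other users' messages.

```python
from dataclasses import dataclass


@dataclass
class SplitMessage:
    common: bytes   # part placed in the shared common stream
    private: bytes  # part placed in the user's private stream


def split_message(message: bytes, common_fraction: float = 0.5) -> SplitMessage:
    """Split one user's message into a common and a private part."""
    cut = int(len(message) * common_fraction)
    return SplitMessage(common=message[:cut], private=message[cut:])


def build_streams(messages: dict[str, bytes], common_fraction: float = 0.5):
    """Form one common stream (all users) and one private stream per user."""
    parts = {user: split_message(m, common_fraction) for user, m in messages.items()}
    # The common stream carries every user's common part, keyed by user.
    common_stream = {user: p.common for user, p in parts.items()}
    private_streams = {user: p.private for user, p in parts.items()}
    return common_stream, private_streams


def recover(user: str, common_stream: dict[str, bytes], private_stream: bytes) -> bytes:
    """A user recombines its own common part with its own private part."""
    return common_stream[user] + private_stream


# Hypothetical example messages for two users.
messages = {"user1": b"meter reading 42", "user2": b"door unlocked"}
common_stream, private_streams = build_streams(messages)
```

In this toy setting, `recover("user1", common_stream, private_streams["user1"])` returns the full message of user 1, whereas the common stream alone exposes only the first half of user 2's message.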
Moreover, recent studies show that the common stream can be exploited to transmit artificial noise to eavesdroppers while ensuring that legitimate users retrieve their common parts [139]. It has also been demonstrated that RSMA remarkably outperforms other conventional MA schemes in terms of secrecy rate [140]. These results show that RSMA is a strong MA candidate with the potential to reinforce the physical layer security of 6G systems.
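To make the link between narrow beams and secrecy rate in this subsection more concrete, the following sketch computes the array gain of a steered beam toward a legitimate user and toward an eavesdropper, and then the resulting secrecy rate. Everything here is an assumption for illustration and is not taken from the cited works: a half-wavelength uniform linear array, pure line-of-sight propagation, conjugate (matched-filter) beamforming, the standard Gaussian wiretap definition of secrecy rate as the positive part of the capacity difference, and a hypothetical `reference_snr`.

```python
import cmath
import math


def steering_vector(n_antennas: int, angle_deg: float) -> list[complex]:
    """Array response of a half-wavelength-spaced uniform linear array."""
    phase = math.pi * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase * k) for k in range(n_antennas)]


def beam_gain(n_antennas: int, steer_deg: float, look_deg: float) -> float:
    """Power gain toward look_deg when the beam is steered to steer_deg."""
    w = steering_vector(n_antennas, steer_deg)  # conjugate beamforming weights
    a = steering_vector(n_antennas, look_deg)
    response = sum(wk.conjugate() * ak for wk, ak in zip(w, a))
    return abs(response) ** 2 / n_antennas      # normalized to unit transmit power


def secrecy_rate(snr_user: float, snr_eve: float) -> float:
    """Secrecy rate in bit/s/Hz: positive part of the capacity difference."""
    return max(0.0, math.log2(1 + snr_user) - math.log2(1 + snr_eve))


n = 64
reference_snr = 1.0  # hypothetical SNR delivered by a single antenna (0 dB)
gain_user = beam_gain(n, steer_deg=0.0, look_deg=0.0)   # user inside the beam
gain_eve = beam_gain(n, steer_deg=0.0, look_deg=20.0)   # eavesdropper off-beam
rate = secrecy_rate(reference_snr * gain_user, reference_snr * gain_eve)
```

Under these assumptions the gain in the steered direction equals the number of antennas, while an off-beam receiver only sees sidelobe leakage, so the secrecy rate grows with the array size. An eavesdropper located inside the main beam would not be penalized in this way.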
3.4.2
Quantum Communication
Quantum computing is another disruptive paradigm that is being rapidly developed. The enormous processing power of quantum computing will pose serious security threats even to the most sophisticated cryptography protocols currently available. For this reason, quantum-safe encryption strategies and quantum communication have been receiving increasing attention and are envisioned to be part of 6G and beyond [141–143]. Quantum communication exploits the exotic features of quantum mechanics to establish theoretically unhackable communication links, regardless of the attackers’ computing power [144]. Such capability is made possible by three fundamental properties: quantum superposition, the no-cloning theorem, and quantum entanglement [141]. Specifically, superposition refers to the behavior of the elementary unit of quantum information, the quantum bit (qubit). Unlike
a classical bit, which takes a deterministic value of 0 or 1 at any time, a qubit can exist in a superposition of both states, and a measurement yields 0 or 1 with probabilities determined by that superposition. This property underlies the leap in processing capabilities offered by quantum computing. The no-cloning theorem states that an arbitrary unknown quantum state (e.g., a qubit) cannot be perfectly copied. Combined with the fact that measuring a quantum state generally disturbs it, this theorem has strong security implications and ensures that an eavesdropper cannot intercept quantum information without causing noticeable disturbance [142]. In turn, quantum entanglement enables the generation of two or more entangled quantum particles whose states are dependent on one another regardless of the physical separation between the particles. In other words, a measurement performed on one of the entangled particles is instantaneously correlated with the outcomes observed on the others, independently of their distances [143]. Existing quantum communication schemes are mainly based on the above properties. In particular, three quantum approaches with attractive prospects for enabling secure communication in 6G systems can be highlighted, namely, quantum teleportation, superdense coding, and quantum key distribution (QKD). Quantum teleportation is an application of the quantum entanglement property, which allows the transmission of quantum information without physically sending the data qubits. To this end, two entangled qubits and two classical bits (to be transmitted over a conventional communication link) are harnessed to teleport a third data qubit. This impressive possibility has been verified experimentally over distances of up to 1400 km [143], and it can be exploited to deploy ultra-secure quantum communication networks in the future since the actual data messages are not transmitted. Superdense coding, on the other hand, exploits two entangled qubits to encode two classical information bits.
In contrast to teleportation, one entangled physical qubit is transmitted over a quantum channel (e.g., an optical fiber link) but the actual bits are not transmitted. Only the receiver with the second entangled qubit is able to recover the information bits [149]. Moreover, the no-cloning theorem can be used to check the integrity of transmitted qubits, making quantum superdense coding another secure communication protocol. Last, QKD relies on quantum communication to implement quantum-secure cryptography schemes. More specifically, QKD-based protocols exploit qubits to share encryption keys securely between transmitters and receivers. Under these protocols, the data transmission is performed via authenticated classical channels. Nevertheless, due to the properties of quantum mechanics, potential eavesdroppers have no access to the encryption keys. QKD is applicable to optical networks, terrestrial wireless networks, and satellite networks [150]. Moreover, due to the practical feasibility of QKD, telecommunication companies, such as SK Telecom and Verizon, have already been testing QKD-based protocols to improve the security of their commercial cellular networks [141].
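The statement that an eavesdropper cannot intercept quantum information without causing noticeable disturbance can be illustrated with a small classical simulation of key sifting. The chapter does not name a specific QKD protocol; the sketch below assumes the well-known BB84 scheme and an intercept-and-resend attacker as an example, and models each qubit simply as a (bit, basis) pair with ideal, noise-free devices. The function name `bb84` and its parameters are hypothetical.

```python
import random


def bb84(n_qubits: int, eavesdrop: bool, seed: int = 0) -> tuple[int, float]:
    """Simulate BB84 sifting; return (sifted key length, error rate)."""
    rng = random.Random(seed)
    sifted = 0
    errors = 0
    for _ in range(n_qubits):
        bit = rng.randint(0, 1)       # Alice's raw key bit
        basis_a = rng.randint(0, 1)   # Alice's preparation basis
        state_bit, state_basis = bit, basis_a

        if eavesdrop:
            basis_e = rng.randint(0, 1)
            if basis_e != state_basis:
                # Measuring in the wrong basis gives a random outcome.
                state_bit = rng.randint(0, 1)
            state_basis = basis_e     # Eve resends in her own basis

        basis_b = rng.randint(0, 1)   # Bob's measurement basis
        if basis_b == state_basis:
            measured = state_bit
        else:
            measured = rng.randint(0, 1)

        if basis_a == basis_b:        # sifting: keep matching bases only
            sifted += 1
            errors += int(measured != bit)

    error_rate = errors / sifted if sifted else 0.0
    return sifted, error_rate


clean_len, clean_qber = bb84(4000, eavesdrop=False)
tapped_len, tapped_qber = bb84(4000, eavesdrop=True)
```

Without an eavesdropper the sifted keys agree exactly in this idealized model, whereas the intercept-and-resend attack introduces errors in roughly a quarter of the sifted bits, which the legitimate parties can detect by comparing a sample of their keys over the authenticated classical channel.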
3.4.3
Blockchain
As a robust distributed ledger concept, blockchain has been recognized by the telecommunication sector as a crucial technology to enhance the security of higher-level layers of 6G communication networks, from data to application layers [151]. The distributed architecture of blockchain makes it well suited to the pervasive edge computing infrastructure expected in 6G. Specifically, under the blockchain paradigm, users’ transactions and data are encrypted, organized into blocks, and stored in a chain, where each block is connected to the previous one via a hash function. This blockchain ecosystem is managed and maintained by all stakeholders of the network in a peer-to-peer fashion, in which a distributed consensus algorithm verifies the validity of new data blocks and keeps the chain in a consistent state [152].
The characteristics and features of blockchain can provide security to communication systems in various aspects. For instance, the stored hash-connected blocks cannot be modified without invalidating the hashes of all subsequent blocks. As a result, attackers are prevented from tampering with the data and transactions unnoticed. Moreover, because all nodes in the network share a copy of the blockchain, i.e., a register of all transactions, any corruption in the chain can be easily traced back [153]. Thus, blockchain has the potential to enable high-privacy transparent operations in 6G, including secure resource management, spectrum scheduling and sharing, infrastructure asset management, safe roaming authentication, and secure IoT services [152, 154]. In particular, using blockchain to store and distribute AI models provides an audit trail, and pairing blockchain and AI can enhance data security. AI can comprehensively read, understand, and correlate data at high speed, bringing a new level of intelligence to blockchain-based business networks.
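The hash-linked structure described above can be sketched as follows. This is a minimal illustration, not a blockchain implementation: it assumes SHA-256 as the hash function and JSON as the block serialization, uses hypothetical example records, and omits encryption, peer-to-peer replication, and the distributed consensus algorithm entirely. It only shows why modifying a stored block is detectable.

```python
import hashlib
import json


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """SHA-256 digest over the block contents and the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain: list[dict], data: str) -> None:
    """Append a block that is linked to the current last block by its hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({"index": index, "data": data, "prev": prev_hash,
                  "hash": block_hash(index, data, prev_hash)})


def is_valid(chain: list[dict]) -> bool:
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        if block["prev"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
    return True


# Hypothetical records of the kind mentioned in the text.
chain: list[dict] = []
for record in ["spectrum lease A->B", "roaming auth for device 17", "model update v2"]:
    append_block(chain, record)
```

Calling `is_valid(chain)` on the untouched chain returns `True`. Changing the `data` field of any stored block makes the recomputed hash differ from the stored one, so validation fails unless the attacker also rewrites every later block on a majority of the nodes.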
3.4.4
Toward Fully Secured Network
It is clear that the security prospects for 6G systems are promising. If the above-discussed technologies are properly implemented, their particular features can complement one another and virtually eliminate the chances of security breaches, thus enabling secure communications effectively. For example, in the physical layer, secure RSMA transmission schemes can be efficiently combined with advanced mmWave or THz beamforming techniques of massive MIMO to form highly directional signal beams. Such systems would already substantially limit the reach and capabilities of possible attackers. To further shield the communication and ensure that only legitimate users receive the transmitted signals, IRSs can be deployed in the communication environment to passively induce signals impinging on unintended receivers to add up destructively. Quantum communication is also among the key technologies for transmitting sensitive data in 6G networks. Protocols based on QKD, for instance, can be implemented to encrypt data exchanges with edge and cloud servers, while blockchain (or even quantum-based blockchain) would be responsible for securing other communication layers. Looking further ahead (potentially beyond 6G), as quantum computing technology evolves, communication systems should progressively shift to fully quantum operation, unlocking ultra-secure high-performance quantum channels. Figure 4 illustrates this vision with all things secured.
3.4.5
Challenges and Future Directions
This section has covered crucial enabling technologies of secure communications in 6G systems. Nevertheless, these technologies are in their initial development stages and suffer from a number of security vulnerabilities. For instance, despite the ultra-narrow beams of THz communication, such systems remain susceptible to eavesdropping in line-of-sight scenarios, and transmitted signals can be easily blocked. Optimizing IRS to prevent data leakage might not always be possible, given its passive operation. Moreover, since RSMA is a relatively recent technique, its security advantages and drawbacks are still not fully clarified. Regarding quantum communication, the full extent of the properties and laws of quantum mechanics is yet to be unveiled and understood. This implies that we still have a long way to go before secure quantum communication systems become part of our everyday life. All in all, these open issues represent exciting research directions.
3.5 All Things Together
6G will integrate the innovation from connectivity, sensing, intelligence, sustainability, and security as a system. Some of these capabilities are already supported by different existing communication technologies as over-the-top services. For example, sensing can be provided by multi-modality dedicated sensors, such as camera, LiDAR, radar, and sonar. 6G aims to provide joint communications and sensing within the cellular network. Another example is the transformation of AI from an application to a service, where the ML data, processing, training, and inference will be integrated into network entities and links. This also requires security and privacy technologies, which include blockchain, quantum keys, and federated learning integrated in the 6G network. The network-native connectivity, sensing, intelligence, sustainability, and security arise as the key innovation and advancement of 6G compared to those operated with other communication technologies. We foresee that native AI can be an enabling technology to achieve the vision of the abovementioned native X. Connectivity can be empowered by AI technologies to achieve the ambitious KPIs of network capacity, data rate, latency, and spectral and energy efficiency in 6G. Applications include deep learning-driven joint source-channel coding and reinforcement learning-driven radio resource management. The management and operation cost of telecom networks can also be reduced through automated network deployment, maintenance, diagnosis, and healing. Furthermore, sensing can be driven by AI across multiple disciplines, such as computer vision, audio, and signal processing, to achieve highly accurate detection, localization, recognition, and reconstruction. For example, classical CNN frameworks, including ResNet, VGG, and LeNet, have been extensively exploited in the field of wireless sensing and localization.
Moreover, recent research on multi-modality sensing with fusion of signal, image, radar, and LiDAR has demonstrated effective enhancement on various
downstream tasks using transformer-based frameworks such as GPT. The attention mechanism has the potential to create an abstraction space for sensory data, such that the modalities can complement one another to achieve high accuracy and resolution in different scenarios. Intelligence will be natively integrated into the 6G network, empowering edge training, inference, and processing. Edge AI technologies such as model quantization, zeroth-order gradient descent, and Bayesian robust optimization can address the challenges of energy consumption, computational constraints, data heterogeneity, security, and privacy of on-device learning. Collaborative learning frameworks such as federated learning, split learning, and swarm learning are widely studied to learn generalized models from representative data while reducing the communication and computation cost. Furthermore, semantic communications have been exploited to enable agents to learn emerging protocols through collaborative communication and control, which has the potential to drive communication theories into the beyond-Shannon era and create breakthroughs in efficiency. Sustainability can be addressed by extensive use of AI technologies, as discussed for smart energy-efficient networks. For instance, machine learning can predict the data traffic and user mobility patterns and optimize base station placement, configuration, tuning, transmit power, coverage, and beamforming, to minimize the energy consumption of transmission, power amplifiers, and cooling. Moreover, self-organized networks driven by ML can reduce the energy footprint caused by operation and maintenance through automation. The advancement of edge AI, federated learning, and semantic communications can significantly reduce the energy consumed for computation and communication of both model and data.
It should be noted that although large AI models traditionally consume extensive energy when trained in the cloud, their generalization ability can bring significant improvements in energy efficiency during operation and, with the support of edge models, make them efficient on power-constrained devices. Security of communication networks will potentially be empowered by quantum and blockchain technologies in 6G. Machine learning has the potential to optimize quantum key distribution and identify central quantum protocols including teleportation, entanglement purification, and the quantum repeater, which are important in long-distance quantum communications. On the other hand, quantum computing has the potential to accelerate machine learning training and inference, which thus improves the end-to-end latency and energy efficiency of 6G networks built upon AI. Blockchain is a key technology in distributed learning to guarantee the security of data. Federated learning and swarm learning have the potential to be widely used in 6G networks to avoid the exchange of data, where blockchain can prevent privacy leakage by assigning training tasks to multiple agents. Furthermore, the rise of semantic communications with pragmatic reasoning naturally encrypts messages where the protocols are learned exclusively between agents, without introducing redundancies.
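The idea that federated learning avoids the exchange of raw data can be sketched with a toy example. The chapter does not prescribe a specific algorithm; the sketch below assumes federated averaging (FedAvg) on a one-parameter least-squares model, with hypothetical client datasets, learning rate, and number of rounds. Only model weights travel between the clients and the server.

```python
def local_update(w: float, data: list[tuple[float, float]],
                 lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient steps of least squares for y ~ w * x on local data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float,
                    clients: list[list[tuple[float, float]]]) -> float:
    """One round: clients train locally, the server averages by data size."""
    total = sum(len(d) for d in clients)
    updates = [(local_update(w_global, d), len(d)) for d in clients]
    return sum(w * n for w, n in updates) / total


# Hypothetical private datasets, all generated from y = 3x.
# The raw samples stay on the clients; only weights are shared.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (3.0, 9.0)],
    [(2.5, 7.5)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
```

After the 50 rounds the global weight `w` is close to the underlying slope of 3, even though no client ever revealed its samples. Practical systems add further safeguards, since shared model updates can still leak information about the training data.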
4 Conclusions
6G, a disruptive innovator in wireless communications, is envisioned to make all things connected, sensed, intelligent, secured, and sustainable by 2030. It will continue enhancing the connection experience of 5G toward Tb/s data rates, millisecond latency, seven nines reliability, mobility of thousands of km/h, and integrated space-aerial-ground-sea coverage. 6G will exploit THz spectrum and visible light to deliver extremely high data rates and low latency. Holographic MIMO will be developed with low cost, size, and weight and will rely on metasurfaces to provide ubiquitous connectivity and extreme capacity. Non-orthogonal, rate splitting, and orbital angular momentum multiple access schemes will empower extreme spectral efficiency. The extreme connectivity will drive innovative vertical applications in industry, people, and society, such as fully immersive XR experience, digital twins, haptic communications, high-fidelity hologram, and so on. 6G will also bring wireless communications to new eras beyond connections. Sensing will be tightly integrated with 6G to leverage the ubiquitous radio transmission; to detect, track, and predict objects; and to analyze and reconstruct environments. Immersive sensors, like LiDAR, radar, and sonar, will be connected to 6G in which multi-modality information will be digitized, understood, and transferred in the network. AI will be natively integrated into 6G design, from air interface to network protocol to management and operation, to enable the network to automatically observe, learn, decide, and reason in diverse environments and applications. Furthermore, 6G network architecture and protocol will natively support distributed AI, to bring computation and learning close to the data and applications. Semantic and emergent communications will transfer effective abstraction of information to enable intelligent devices to collaboratively accomplish diverse application tasks.
Furthermore, 6G will introduce novel devices, materials, antennas, transceivers, and network architectures to reduce energy consumption, such as energy transfer and harvesting, intelligent reflecting surfaces, and backscattering. Secured communication technologies such as blockchain, quantum communication, and federated and swarm learning will drive 6G toward a trustworthy network. 6G research and innovation is at the beginning stage with global innovation and involvement. Collaboration around the globe is key to the success of 6G with a unified standard. We anticipate that 6G will go beyond the support and utilization of innovative technologies and applications in other domains such as AI, security, manufacturing, and vehicles. It will natively integrate their systems, architectures, features, and objectives into wireless network design. Finally, 6G will strengthen and expand business players in telecommunication networks, transforming the business models of telecom operators and vendors into a new era, where industries, consumers, and societies will be deeply integrated together.
References
1. E. Dahlman, S. Parkvall and J. Skold, 5G NR: The Next Generation Wireless Access Technology, 2nd Ed., Elsevier, 2021.
2. H. Holma, A. Toskala and T. Nakamura, 5G Technology: 3GPP New Radio, Wiley, 2019.
3. 3GPP, “Release 15 Description; Summary of Rel-15 Work Items,” TR 21.915 v15.0.0, October 2019.
4. 3GPP, “Release 16 Description; Summary of Rel-16 Work Items,” TR 21.916 v16.2.0, June 2022.
5. 3GPP, “Release 17 Description; Summary of Rel-17 Work Items,” TR 21.917 v1.0.0, September 2022.
6. W. Chen, J. Montojo, J. Lee, M. Shafi and Y. Kim, “The Standardization of 5G-Advanced in 3GPP,” in IEEE Communications Magazine, https://doi.org/10.1109/MCOM.005.2200074.
7. Nokia, “5G evolution: Learn what is behind it and how it paves the way toward 5G-Advanced,” 2022. [Online]. Available: https://www.nokia.com/networks/5g/5g-advanced/.
8. M. Chafii, F. Bader and J. Palicot, “Enhancing coverage in narrow band-IoT using machine learning,” 2018 IEEE Wireless Communications and Networking Conference (WCNC), 2018, pp. 1-6, https://doi.org/10.1109/WCNC.2018.8377263.
9. N. Promwongsa et al., “A comprehensive survey of the tactile internet: State-of-the-art and research directions,” IEEE Communications Surveys & Tutorials, vol. 23, no. 1, pp. 472–523, 2020.
10. W. Saad, M. Bennis, and M. Chen, “A vision of 6G wireless systems: Applications, trends, technologies, and open research problems,” IEEE Network, vol. 34, no. 3, pp. 134–142, 2019.
11. M. Latva-aho, K. Leppänen, F. Clazzer, and A. Munari, “Key drivers and research challenges for 6G ubiquitous wireless intelligence,” 2020.
12. W. Tong and P. Zhu, Eds., “6G: The Next Horizon: From Connected People and Things to Connected Intelligence.” Cambridge: Cambridge University Press, 2021.
13. H. Viswanathan and P. E. Mogensen, “Communications in the 6G Era,” in IEEE Access, vol. 8, pp. 57063-57074, 2020, https://doi.org/10.1109/ACCESS.2020.2981745.
14. M. A. Uusitalo et al., “6G Vision, Value, Use Cases and Technologies From European 6G Flagship Project Hexa-X,” IEEE Access, vol. 9, pp. 160004–160020, 2021.
15. A. Bazzi and M. Chafii, “On Outage-based Beamforming Design for Dual-Functional Radar-Communication 6G Systems,” arXiv preprint arXiv:2207.04921, 2022.
16. Y. Lu and X. Zheng, “6G: A survey on technologies, scenarios, challenges, and the related issues,” Journal of Industrial Information Integration, vol. 19, p. 100158, 2020.
17. L. U. Khan, W. Saad, D. Niyato, Z. Han, and C. S. Hong, “Digital-twin-enabled 6G: Vision, architectural trends, and future directions,” IEEE Communications Magazine, vol. 60, no. 1, pp. 74–80, 2022.
18. L. Bariah et al., “A prospective look: Key enabling technologies, applications and open research topics in 6G networks,” IEEE Access, vol. 8, pp. 174792–174820, 2020.
19. B. Zong, C. Fan, X. Wang, X. Duan, B. Wang, and J. Wang, “6G Technologies: Key Drivers, Core Requirements, System Architectures, and Enabling Technologies,” IEEE Vehicular Technology Magazine, vol. 14, no. 3, pp. 18–27, 2019, https://doi.org/10.1109/MVT.2019.2921398.
20. S. Hazra and A. Santra, “Robust Gesture Recognition Using Millimetric-Wave Radar System,” IEEE Sensors Letters, vol. 2, pp. 1–4, December 2018.
21. H. Durrant-Whyte and T. Bailey, “Simultaneous localization and mapping: part I,” in IEEE Robotics & Automation Magazine, vol. 13, no. 2, pp. 99-110, June 2006, https://doi.org/10.1109/MRA.2006.1638022.
22. J. Neu and C. A. Schmuttenmaer, “Tutorial: An introduction to terahertz time domain spectroscopy (THz-TDS),” Journal of Applied Physics, vol. 124, no. 23, p. 231101, 2018.
23. B. Yektakhah and K. Sarabandi, “All-Directions Through-the-Wall Imaging Using a Small Number of Moving Omnidirectional Bi-Static FMCW Transceivers,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, pp. 2618–2627, May 2019.
24. J. Guan, S. Madani, S. Jog, S. Gupta, and H. Hassanieh, “Through Fog High-Resolution Imaging Using Millimeter Wave Radar,” Jun. 2020.
25. G. Liu et al., “Vision, requirements and network architecture of 6G mobile network beyond 2030,” in China Communications, vol. 17, no. 9, pp. 92-104, Sept. 2020, https://doi.org/10.23919/JCC.2020.09.008.
26. A. Schwind, W. Hofmann, R. Stephan, R. S. Thomä and M. A. Hein, “Bi-static Nearfield Calibration for RCS Measurements in the C-V2X Frequency Range,” in 2020 14th EuCAP, 2020.
27. W. Li, M. J. Bocus, C. Tang, R. J. Piechocki, K. Woodbridge and K. Chetty, “On CSI and Passive Wi-Fi Radar for Opportunistic Physical Activity Recognition,” in IEEE Transactions on Wireless Communications.
28. I. B. F. de Almeida, M. Chafii, A. Nimr and G. Fettweis, “Blind Transmitter Localization in Wireless Sensor Networks: A Deep Learning Approach,” 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), 2021, pp. 1241-1247, https://doi.org/10.1109/PIMRC50174.2021.9569361.
29. Z. Li, A. Nimr, P. Schulz and G. Fettweis, “Superresolution Wireless Multipath Channel Path Delay Estimation for CIR-Based Localization,” 2022 IEEE Wireless Communications and Networking Conference (WCNC), 2022, pp. 1940-1945, https://doi.org/10.1109/WCNC51071.2022.9771756.
30. T. S. Rappaport, “Wireless Communications and Applications Above 100 GHz: Opportunities and Challenges for 6G and Beyond,” IEEE Access, vol. 7, pp. 78729–78757, 2019.
31. B. Hattenhorst, S. M. Schnurre, T. Hülser, C. Baer and T. Musch, “Contactless Flame Reactor State Parameter Investigation Using a Broadband mmWave Radar,” IEEE Sensors Letters, vol. 4, pp. 1–4, May 2020.
32. M. Matinmikko-Blue, S. Yrjölä and P. Ahokangas, “Spectrum Management in the 6G Era: The Role of Regulation and Spectrum Sharing,” 2020 2nd 6G Wireless Summit (6G SUMMIT), 2020, pp. 1-5, https://doi.org/10.1109/6GSUMMIT49458.2020.9083851.
33. F. Nizzi et al., “Data dissemination to vehicles using 5G and VLC for Smart Cities,” 2019 AEIT International Annual Conference (AEIT), Florence, Italy, 2019, pp. 1-5.
34. M. Asad Ullah, K. Mikhaylov and H. Alves, “Massive Machine-Type Communication and Satellite Integration for Remote Areas,” in IEEE Wireless Communications, vol. 28, no. 4, pp. 74-80, August 2021, https://doi.org/10.1109/MWC.100.2000477.
35. S. Zhang, J. Liu, H. Guo, M. Qi and N. Kato, “Envisioning Device-to-Device Communications in 6G,” in IEEE Network, vol. 34, no. 3, pp. 86-91, May/June 2020, https://doi.org/10.1109/MNET.001.1900652.
36. W. Cheng, W. Zhang, H. Jing, S. Gao and H. Zhang, “Orbital Angular Momentum for Wireless Communications,” in IEEE Wireless Communications, vol. 26, no. 1, pp. 100-107, February 2019, https://doi.org/10.1109/MWC.2017.1700370.
37. V. Mishra, M. R. B. Shankar, V. Koivunen, B. Ottersten and S. A. Vorobyov, “Toward Millimeter-Wave Joint Radar Communications: A Signal Processing Perspective,” IEEE Signal Processing Magazine, vol. 36, pp. 100–114, September 2019.
38. E. Basar, M. Di Renzo, J. De Rosny, M. Debbah, M. Alouini, and R. Zhang, “Wireless Communications Through Reconfigurable Intelligent Surfaces,” IEEE Access, vol. 7, pp. 116753–116773, 2019.
39. J. Hu, H. Zhang, B. Di, L. Li, L. Song, Y. Li, Z. Han, and H. V. Poor, “Reconfigurable Intelligent Surfaces based RF Sensing: Design, Optimization, and Implementation,” arXiv preprint arXiv:1912.09198, pp. 1–30, 2019.
40. A. Pizzo, T. L. Marzetta, and L. Sanguinetti, “Spatially-Stationary Model for Holographic MIMO Small-Scale Fading,” IEEE J. Sel. Areas Commun., vol. 38, no. 9, pp. 1964–1979, 2020.
41. G. P. Fettweis and H. Boche, “6G: The Personal Tactile Internet–And Open Questions for Information Theory,” in IEEE BITS the Information Theory Magazine, vol. 1, no. 1, pp. 71-82, 1 Sept. 2021, https://doi.org/10.1109/MBITS.2021.3118662.
42. R. Bomfin, A. Nimr, M. Chafii and G. Fettweis, “A Robust and Low-Complexity Walsh-Hadamard Modulation for Doubly-Dispersive Channels,” in IEEE Communications Letters, vol. 25, no. 3, pp. 897-901, March 2021, https://doi.org/10.1109/LCOMM.2020.3034429.
43. I. Bizon Franco de Almeida, M. Chafii, A. Nimr and G. Fettweis, “Alternative Chirp Spread Spectrum Techniques for LPWANs,” in IEEE Transactions on Green Communications and Networking, vol. 5, no. 4, pp. 1846-1855, Dec. 2021, https://doi.org/10.1109/TGCN.2021.3085477.
44. P. Neuhaus, M. Dörpinghaus and G. Fettweis, “Zero-Crossing Modulation for Wideband Systems Employing 1-Bit Quantization and Temporal Oversampling: Transceiver Design and Performance Evaluation,” in IEEE Open Journal of the Communications Society, vol. 2, pp. 1915-1934, 2021, https://doi.org/10.1109/OJCOMS.2021.3094927.
45. F. Roth et al., “Spike-Based Sensing and Communication for Highly Energy-Efficient Sensor Edge Nodes,” 2022 2nd IEEE International Symposium on Joint Communications & Sensing (JC&S), 2022, pp. 1-6, https://doi.org/10.1109/JCS54387.2022.974350
46. Z. Xiao, P. Xia and X.-G. Xia, “Full-Duplex Millimeter-Wave Communication,” in IEEE Wireless Communications, vol. 24, no. 6, pp. 136-143, Dec. 2017, https://doi.org/10.1109/MWC.2017.1700058
47. M. Sigmund, R. Bomfin, M. Chafii, A. Nimr and G. Fettweis, “Iterative Receiver for Power-Domain NOMA with Mixed Waveforms,” 2022 IEEE Wireless Communications and Networking Conference (WCNC), 2022, pp. 602-607, https://doi.org/10.1109/WCNC51071.2022.9771625.
48. O. Dizdar, Y. Mao, W. Han and B. Clerckx, “Rate-Splitting Multiple Access: A New Frontier for the PHY Layer of 6G,” 2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall), 2020, pp. 1-7, https://doi.org/10.1109/VTC2020-Fall49728.2020.9348672.
49. C. E. Shannon, “Programming a Computer for Playing Chess,” 1950.
50. J. Mitola and G. Q. Maguire, “Cognitive radio: making software radios more personal,” in IEEE Personal Communications, vol. 6, no. 4, pp. 13-18, Aug. 1999, https://doi.org/10.1109/98.788210.
51. D. Grace and H. Zhang, “Cognitive Communications: Distributed Artificial Intelligence (DAI), Regulatory Policy and Economics, Implementation,” Wiley, 2012.
52. J. Hoydis, F. A. Aoudia, A. Valcarce and H. Viswanathan, “Toward a 6G AI-Native Air Interface,” in IEEE Communications Magazine, vol. 59, no. 5, pp. 76-81, May 2021, https://doi.org/10.1109/MCOM.001.2001187.
53. A. Valcarce and J. Hoydis, “Toward Joint Learning of Optimal MAC Signaling and Wireless Channel Access,” in IEEE Transactions on Cognitive Communications and Networking, vol. 7, no. 4, pp. 1233-1243, Dec. 2021, https://doi.org/10.1109/TCCN.2021.3080677.
54. M. A. Uusitalo et al., “6G Vision, Value, Use Cases and Technologies From European 6G Flagship Project Hexa-X,” in IEEE Access, vol. 9, pp. 160004-160020, 2021, https://doi.org/10.1109/ACCESS.2021.3130030.
55. O. Ye et al., “The Next Decade of Telecommunications Artificial Intelligence,” Dec 2021, https://arxiv.org/abs/2101.09163.
56. M. Honkala, D. Korpi and J. M. J. Huttunen, “DeepRx: Fully Convolutional Deep Learning Receiver,” in IEEE Transactions on Wireless Communications, vol. 20, no. 6, pp. 3925-3940, June 2021, https://doi.org/10.1109/TWC.2021.3054520.
57. L. Huang, H. Zhang, R. Li, Y. Ge and J. Wang, “AI Coding: Learning to Construct Error Correction Codes,” in IEEE Transactions on Communications, vol. 68, no. 1, pp. 26-39, Jan. 2020, https://doi.org/10.1109/TCOMM.2019.2951403.
58. H. Seo, J. Park, M. Bennis, and M. Debbah, “Semantics-native communication with contextual reasoning,” 2021. [Online]. Available: https://arxiv.org/abs/2108.05681
59. X. Luo, H.-H. Chen, and Q. Guo, “Semantic communications: Overview, open issues, and future research directions,” IEEE Wireless Communications, vol. 29, no. 1, pp. 210–219, 2022.
60. J.-C. Belfiore and D. Bennequin, “Topos and stacks of deep neural networks,” 2021. [Online]. Available: https://arxiv.org/abs/2106.14587
61. E. Bourtsoulatze, D. B. Kurka, and D. Gündüz, “Deep joint source channel coding for wireless image transmission,” in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 4774–4778.
62. N. Farsad, M. Rao and A. Goldsmith, “Deep Learning for Joint Source-Channel Coding of Text,” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 2326-2330, https://doi.org/10.1109/ICASSP.2018.8461983.
63. S. Wu, G. Tsoukaneri and B. Mouhouche, “Q-Learning based Link Adaptation in 5G,” 2020 IEEE 31st Annual International Symposium on Personal, Indoor and Mobile Radio Communications, 2020, pp. 1-6, https://doi.org/10.1109/PIMRC48278.2020.9217256.
64. P. Kela, T. Höhne, T. Veijalainen and H. Abdulrahman, “Reinforcement Learning for Delay Sensitive Uplink Outer-Loop Link Adaptation,” 2022 Joint European Conference on Networks and Communications and 6G Summit (EuCNC/6G Summit), 2022, pp. 59-64, https://doi.org/10.1109/EuCNC/6GSummit54941.2022.9815746.
65. M. Mitev, M. M. Butt, P. Sehier, A. Chorti, L. Rose and A. Lehti, “Smart Link Adaptation and Scheduling for IIoT,” in IEEE Networking Letters, vol. 4, no. 1, pp. 6-10, March 2022, https://doi.org/10.1109/LNET.2022.3144733.
66. J. Song, I. Z. Kovács, M. Butt, J. Steiner and K. I. Pedersen, “Intra-RAN Online Distributed Reinforcement Learning For Uplink Power Control in 5G Cellular Networks,” 2022 IEEE 95th Vehicular Technology Conference: (VTC2022-Spring), 2022, pp. 1-7, https://doi.org/10.1109/VTC2022-Spring54318.2022.9860770.
67. Q. Zhao, S. Paris, T. Veijalainen and S. Ali, “Hierarchical Multi-Objective Deep Reinforcement Learning for Packet Duplication in Multi-Connectivity for URLLC,” 2021 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), 2021, pp. 142-147, https://doi.org/10.1109/EuCNC/6GSummit51104.2021.9482453.
68. J. J. Hernández-Carlén, J. Pérez-Romero, O. Sallent, I. Vilà and F.
Casadevall, “A Deep QNetwork-Based Algorithm for Multi-Connectivity Optimization in Heterogeneous CellularNetworks,” in Sensors, vol. 22, no. 16, August 2022, https://doi.org/10.3390/s22166179. 69. A. Masri, T. Veijalainen, H. Martikainen, S. Mwanje, J. Ali-Tolppa and M. Kajó, “MachineLearning-Based Predictive Handover,” 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM), 2021, pp. 648-652. 70. A. Prado, H. Vijayaraghavan and W. Kellerer, “ECHO: Enhanced Conditional Handover boosted by Trajectory Prediction,” 2021 IEEE Global Communications Conference (GLOBECOM), 2021, pp. 01-06, https://doi.org/10.1109/GLOBECOM46510.2021.9685348. 71. Njima W, Chafii M, Chorti A, Shubair RM, Poor HV. Indoor localization using data augmentation via selective generative adversarial networks. IEEE Access. 2021 Jul 8;9:98337-47. 72. Njima W, Bazzi A, Chafii M. DNN-based Indoor Localization Under Limited Dataset using GANs and Semi-Supervised Learning. IEEE Access. 2022 Jul 1;10:69896-909. 73. A. Decurninge et al., “CSI-based Outdoor Localization for Massive MIMO: Experiments with a Learning Approach,” 2018 15th International Symposium on Wireless Communication Systems (ISWCS), 2018, pp. 1-6, https://doi.org/10.1109/ISWCS.2018.8491210. 74. S. Kadambi et al., “Neural RF SLAM for unsupervised positioning and mapping with channel state information,” ICC 2022 - IEEE International Conference on Communications, 2022, pp. 3238-3244, https://doi.org/10.1109/ICC45855.2022.9838367. 75. A. Alkhateeb, G. Charan, T. Osman, A. Hredzak, and N. Srinivas, “DeepSense 6G: largescale real-world multi-modal sensing and communication datasets,” to be available on arXiv, 2022. [Online]. Available: https://www.DeepSense6G.net 76. G. Charan, T. Osman, A. Hredzak, N. Thawdar and A. Alkhateeb, “Vision-Position MultiModal Beam Prediction Using Real Millimeter Wave Datasets,” 2022 IEEE Wireless Communications and Networking Conference (WCNC), 2022, pp. 2727-2731, https://doi. 
org/10.1109/WCNC51071.2022.9771835. 77. S. Wu, C. Chakrabarti and A. Alkhateeb, “LiDAR-Aided Mobile Blockage Prediction in Real-World Millimeter Wave Systems,” 2022 IEEE Wireless Communications and Networking Conference (WCNC), 2022, pp. 2631-2636, https://doi.org/10.1109/WCNC51071.2022. 9771651.
66
D. B. da Costa et al.
78. T. Nishio, Y. Koda, J. Park, M. Bennis and K. Doppler, “When Wireless Communications Meet Computer Vision in Beyond 5G,” in IEEE Communications Standards Magazine, vol. 5, no. 2, pp. 76-83, June 2021, https://doi.org/10.1109/MCOMSTD.001.2000047. 79. R. Li et al., “Deep Reinforcement Learning for Resource Management in Network Slicing,” in IEEE Access, vol. 6, pp. 74429-74441, 2018, https://doi.org/10.1109/access.2018.2881964. 80. K. Mehmood et al., “Intent-driven Autonomous Network and Service Management in Future Networks: A Structured Literature Review,” Computer Networks, 2021, https://doi.org/10. 48550/ARXIV.2108.04560. 81. M. K. Shehzad, L. Rose, M. M. Butt, I. Z. Kovács, M. Assaad and M. Guizani, “Artificial Intelligence for 6G Networks: Technology Advancement and Standardization,” in IEEE Vehicular Technology Magazine, vol. 17, no. 3, pp. 16-25, Sept. 2022, https://doi.org/10. 1109/MVT.2022.3164758. 82. Y. Yang et al., “6G Network AI Architecture for Everyone-Centric Customized Services,” 2022, https://doi.org/10.48550/ARXIV.2205.09944. 83. A. Tak and S. Cherkaoui, “Federated Edge Learning: Design Issues and Challenges,” in IEEE Network, vol. 35, no. 2, pp. 252-258, March/April 2021, https://doi.org/10.1109/MNET.011. 2000478. 84. W. Liu, X. Zang, Y. Li and B. Vucetic, “Over-the-Air Computation Systems: Optimization, Analysis and Scaling Laws,” in IEEE Transactions on Wireless Communications, vol. 19, no. 8, pp. 5488-5502, Aug. 2020, https://doi.org/10.1109/TWC.2020.2993703. 85. S. Savazzi, M. Nicoli and V. Rampa, “Federated Learning With Cooperating Devices: A Consensus Approach for Massive IoT Networks,” in IEEE Internet of Things Journal, vol. 7, no. 5, pp. 4641-4654, May 2020, https://doi.org/10.1109/JIOT.2020.2964162. 86. L. Barbieri, S. Savazzi and M. Nicoli, “Decentralized Federated Learning for Road User Classification in Enhanced V2X Networks,” 2021 IEEE International Conference on Communications Workshops (ICC Workshops), 2021, pp. 
1-6, https://doi.org/10.1109/ ICCWorkshops50388.2021.9473581. 87. I. Kajic et al., “Learning to cooperate: Emergent communication in multi-agent navigation,” 2020, https://doi.org/10.48550/ARXIV.2004.01097. 88. H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,” IEEE Transactions on Signal Processing, vol. 69, pp. 2663–2675, 2021. 89. P. Jiang, C.-K. Wen, S. Jin, and G. Y. Li, “Deep source-channel coding for sentence semantic transmission with HARQ,” IEEE Transactions on Communications, pp. 1–1, 2022. 90. E. C. Strinati and S. Barbarossa, “6g networks: Beyond shannon towards semantic and goaloriented communications,” 2020. [Online]. Available: https://arxiv.org/abs/2011.14844 91. Q. Zhou, R. Li, Z. Zhao, C. Peng, and H. Zhang, “Semantic communication with adaptive universal transformer,” 2021. [Online]. Available: https://arxiv.org/abs/2108.09119 92. K. Lu, Q. Zhou, R. Li, Z. Zhao, X. Chen, J. Wu, and H. Zhang, “Rethinking modern communication from semantic coding to semantic communication,” IEEE Wireless Communications, pp. 1–13, 2022. 93. J. Dai, S. Wang, K. Tan, Z. Si, X. Qin, K. Niu, and P. Zhang, “Nonlinear transform sourcechannel coding for semantic communications,” 2021. [Online]. Available: https://arxiv.org/ abs/2112.10961 94. Z. Weng, Z. Qin, X. Tao, C. Pan, G. Liu, and G. Y. Li, “Deep learning enabled semantic communications with speech recognition and synthesis,” 2022. [Online]. Available: https:// arxiv.org/abs/2205.04603 95. Y. Wang, Z. Gao, D. Zheng, S. Chen, D. Gündüz, and H. V. Poor, “Transformer-empowered 6g intelligent networks: From massive mimo processing to semantic communication,” 2022. [Online]. Available: https://arxiv.org/abs/2205.03770 96. H. Xie and Z. Qin, “A lite distributed semantic communication system for internet of things,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 1, pp. 142–153, 2021. 97. C. K. Thomas and W. 
Saad, “Neuro-symbolic artificial intelligence (ai) for intent based semantic communication,” 2022. [Online]. Available: https://arxiv.org/abs/2205.10768
6G: Vision, Applications, and Challenges
67
98. M. Chehimi, C. Chaccour, and W. Saad, “Quantum semantic communications: An unexplored avenue for contextual networking,” 2022. [Online]. Available: https://arxiv.org/abs/2205. 02422 99. M. K. Farshbafan, W. Saad, and M. Debbah, “Curriculum learning for goal-oriented semantic communications with a common language,” 2022. [Online]. Available: https://arxiv.org/abs/ 2204.10429 100. M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, “Geometric deep learning: Going beyond euclidean data,” IEEE Signal Processing Magazine, vol. 34, no. 4, pp. 18–42, jul 2017. 101. Y. Feng, H. You, Z. Zhang, R. Ji, and Y. Gao, “Hypergraph neural networks,” 2018. [Online]. Available: https://arxiv.org/abs/1809.09401 102. S. Ebli, M. Defferrard, and G. Spreemann, “Simplicial neural networks,” ArXiv, vol. abs/2010.03633, 2020. 103. M. Hajij, K. Istvan, and G. Zamzmi, “Cell complex neural networks,” 2020. [Online]. Available: https://arxiv.org/abs/2010.00743 104. H. Zhang, N. Shlezinger, F. Guidi, D. Dardari, M. F. Imani, and Y. C. Eldar, “Near-field wireless power transfer for 6G internet of everything mobile networks: Opportunities and challenges,” IEEE Commun. Mag., vol. 60, no. 3, pp. 12–18, 2022. 105. L. Gu, G. Zulauf, A. Stein, P. A. Kyaw, T. Chen, and J. M. R. Davila, “6.78-mhz wireless power transfer with self-resonant coils at 95% dc–dc efficiency,” IEEE Trans. Power Electron., vol. 36, no. 3, pp. 2456–2460, 2021. 106. J. Pries, V. P. N. Galigekere, O. C. Onar, and G.-J. Su, “A 50-kw three-phase wireless power transfer system using bipolar windings and series resonant networks for rotating magnetic fields,” IEEE Trans. Power Electron., vol. 35, no. 5, pp. 4500–4517, 2020. 107. K. W. Choi, S. I. Hwang, A. A. Aziz, H. H. Jang, J. S. Kim, D. S. Kang, and D. I. Kim, “Simultaneous wireless information and power transfer (SWIPT) for internet of things: Novel receiver design and experimental validation,” IEEE Int. Things Journal, vol. 7, no. 4, pp. 2996–3012, 2020. 108. 
H. Stockman, “Communication by means of reflected power,” in IRE, vol. 36, no. 10, 1948, p. 1196–1204. 109. X. Lu, D. Niyato, H. Jiang, D. I. Kim, Y. Xiao, and Z. Han, “Ambient backscatter assisted wireless powered communications,” IEEE Wireless Commun., vol. 25, no. 2, pp. 170–177, 2018. 110. N. Van Huynh, D. T. Hoang, X. Lu, D. Niyato, P. Wang, and D. I. Kim, “Ambient backscatter communications: A contemporary survey,” IEEE Commun. Surveys Tut., vol. 20, no. 4, pp. 2889–2922, 2018. 111. M. D. Renzo, A. Zappone, M. Debbah, M. Alouini, C. Yuen, J. D. Rosny, and S. Tretyakov, “Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and road ahead,” IEEE J. Sel. Areas Commun., vol. 38, no. 11, 2020. 112. A. S. de Sena, D. Carrillo, F. Fang, P. H. J. Nardelli, D. B. d. Costa, U. S. Dias, Z. Ding, C. B. Papadias, and W. Saad, “What role do intelligent reflecting surfaces play in multi-antenna non-orthogonal multiple access?”, IEEE Wireless Commun., vol. 27, no. 5, pp. 24–31, Oct. 2020. 113. A. S. de Sena, P. H. J. Nardelli, D. B. da Costa, P. Popovski, and C. B. Papadias, “Ratesplitting multiple access and its interplay with intelligent reflecting surfaces,” Available at Early Access Issues, IEEE Commun. Mag., pp. 1–7, 2022. 114. C. Pan, H. Ren, K. Wang, M. Elkashlan, A. Nallanathan, J. Wang, and L. Hanzo, “Intelligent reflecting surface aided MIMO broadcasting for simultaneous wireless information and power transfer,” IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1719–1734, 2020. 115. P. Ramezani and A. Jamalipour, “Backscatter-assisted wireless powered communication networks empowered by intelligent reflecting surface,” IEEE Trans. Veh. Technol., vol. 70, no. 11, pp. 11908–11922, 2021. 116. W. Zhang, Y. Qin, W. Zhao, M. Jia, Q. Liu, R. He, and B. Ai, “A green paradigm for internet of things: Ambient backscatter communications,” China Commun., vol. 16, no. 7, pp. 109–119, 2019.
68
D. B. da Costa et al.
117. W. Zhang, C.-X. Wang, X. Ge, and Y. Chen, “Enhanced 5G cognitive radio networks based on spectrum sharing and spectrum aggregation”, IEEE Trans. Commun., vol. 66, no. 12, pp. 6304–6316, 2018. 118. G. K. Papageorgiou et al., “Advanced dynamic spectrum 5G mobile networks employing licensed shared access,” IEEE Commun. Mag., vol. 58, no. 7, pp. 21–27, 2020. 119. H. Zeng, X. Zhu, Y. Jiang, Z. Wei, and L. Chen, “Hierarchical symbiotic transmission strategy with cooperative NOMA for cognitive radio networks,” IEEE Wireless Commun. Lett., vol. 11, no. 3, pp. 558–562, 2022. 120. Y. H. Al-Badarneh, A. Elzanaty, and M.-S. Alouini, “On the performance of spectrum-sharing backscatter communication systems,” IEEE Int. Things J., vol. 9, no. 3, pp. 1951–1961, 2022. 121. J. Jeon, R. D. Ford, V. V. Ratnam, J. Cho, and J. Zhang, “Coordinated dynamic spectrum sharing for 5G and beyond cellular networks,” IEEE Access, vol. 7, pp. 111 592–111 604, 2019. 122. A. Narayanan, A. S. D. Sena, D. Gutierrez-Rojas, D. C. Melgarejo, H. M. Hussain, M. Ullah, S. Bayhan, and P. H. J. Nardelli, “Key advances in pervasive edge computing for industrial internet of things in 5G and beyond,” IEEE Access, vol. 8, pp. 206 734–206 754, 2020. 123. Nokia, “Nokia AVA – AI energy efficiency for telco,” 2022. [Online]. Available: https://www. nokia.com/networks/services/NokiaAVA/energyefficiency, [Accessed: May 29, 2022.]. 124. H. Fourati, R. Maaloul, L. Fourati, and M. Jmaiel, “An efficient energy-saving scheme using genetic algorithm for 5G heterogeneous networks,” IEEE Systems Journal, pp. 1–10, 2022. 125. Q. Zeng, Y. Du, K. Huang, and K. K. Leung, “Energy-efficient resource management for federated edge learning with CPU-GPU heterogeneous computing,” IEEE Trans. Wireless Commun., vol. 20, no. 12, pp. 7947–7962, 2021. 126. H. Chergui, L. Blanco, L. A. Garrido, K. Ramantas, S. Kuklinski, A. Ksentini, and C. 
Verikoukis, “Zero-touch AI-driven distributed management for energy-efficient 6G massive network slicing,” IEEE Network, vol. 35, no. 6, pp. 43–49, 2021. 127. M. Miozzo, Z. Ali, L. Giupponi, and P. Dini, “Distributed and multi-task learning at the edge for energy efficient radio access networks,” IEEE Access, vol. 9, pp. 12 491–12 505, 2021. 128. A. Zappone, M. Di Renzo, and M. Debbah, “Wireless networks design in the era of deep learning: Model-based, AI-based, or both?” IEEE Trans. Commun., vol. 67, no. 10, pp. 7331– 7376, 2019. 129. A. S. de Sena, D. B. da Costa, Z. Ding, and P. H. J. Nardelli, “Massive MIMO-NOMA networks with multipolarized antennas,” IEEE Trans. Wireless Commun., vol. 18, no. 12, pp. 5630–5642, Dec. 2019. 130. A. S. de Sena, F. R. M. Lima, D. B. da Costa, Z. Ding, P. H. J. Nardelli, U. S. Dias, and C. B. Papadias, “Massive MIMO-NOMA networks with imperfect SIC: Design and fairness enhancement,” IEEE Trans. Wireless Commun., vol. 19, no. 9, pp. 6100–6115, 2020. 131. Y. Karacora, C. Chaccour, A. Sezgin, and W. Saad, “Reliable beam tracking with dynamic beamwidth adaptation in terahertz (THz) communications,” 2022. [Online]. Available: https:// arxiv.org/abs/2201.06541 132. V.-L. Nguyen, P.-C. Lin, B.-C. Cheng, R.-H. Hwang, and Y.-D. Lin, “Security and privacy for 6G: A survey on prospective technologies and challenges,” IEEE Commun. Surv. Tutorials, vol. 23, no. 4, pp. 2384–2428, 2021. 133. J. Chen, Y.-C. Liang, Y. Pei, and H. Guo, “Intelligent reflecting surface: A programmable wireless environment for physical layer security,” IEEE Access, vol. 7, pp. 82 599–82 612, 2019. 134. H. Yang, Z. Xiong, J. Zhao, D. Niyato, Q. Wu, H. V. Poor, and M. Tornatore, “Intelligent reflecting surface assisted anti-jamming communications: A fast reinforcement learning approach,” IEEE Trans. Wireless Commun., vol. 20, no. 3, pp. 1963–1974, 2021. 135. X. Guan, Q. Wu, and R. 
Zhang, “Intelligent reflecting surface assisted secrecy communication: Is artificial noise helpful or not?” IEEE Wireless Commun. Lett., vol. 9, no. 6, pp. 778–782, 2020. 136. S. Hong, C. Pan, H. Ren, K. Wang, and A. Nallanathan, “Artificial-noise-aided secure MIMO wireless communications via intelligent reflecting surface,” IEEE Trans. Commun., vol. 68, no. 12, pp. 7851–7866, 2020.
6G: Vision, Applications, and Challenges
69
137. Z. Ji, P. L. Yeoh, D. Zhang, G. Chen, Y. Zhang, Z. He, H. Yin, and Y. li, “Secret key generation for intelligent reflecting surface assisted wireless communication networks,” IEEE Transactions on Vehicular Technology, vol. 70, no. 1, 2021. 138. A. S. de Sena, P. H. J. Nardelli, D. B. da Costa, P. Popovski, C. B. Papadias, and M. Debbah, “Dual-polarized RSMA for massive MIMO systems,” IEEE Wireless Commun. Lett., pp. 1–1, 2022. 139. H. Fu, S. Feng, W. Tang, and D. W. K. Ng, “Robust secure beamforming design for two-user downlink MISO rate-splitting systems,” IEEE Trans. Wireless Commun., vol. 19, no. 12, pp. 8351–8365, 2020. 140. H. Xia, Y. Mao, X. Zhou, B. Clerckx, S. Han, and C. Li, “Secure beamforming design for rate-splitting multiple access in multi-antenna broadcast channel with confidential messages,” 2022. [Online]. Available: https://arxiv.org/abs/2202.07328 141. C. Wang and A. Rahman, “Quantum-enabled 6G wireless networks: opportunities and challenges,” IEEE Wireless Commun., vol. 29, no. 1, pp. 58–69, 2022. 142. F. Xu, M. Curty, B. Qi, and H.-K. Lo, “Measurement-device-independent quantum cryptography,” IEEE J. Sel. Top. Quantum Electron., vol. 21, no. 3, pp. 148–158, 2015. 143. A. S. Cacciapuoti, M. Caleffi, R. Van Meter, and L. Hanzo, “When entanglement meets classical communications: Quantum teleportation for the quantum internet,” IEEE Trans. Commun., vol. 68, no. 6, pp. 3808–3833, 2020. 144. M. Sasaki, “Quantum key distribution and its applications,” IEEE Secur. Privacy, vol. 16, no. 5, pp. 42–48, 2018. 145. Azim, A.W., Monsalve, J.L.G. and Chafii, M., 2021. Enhanced PSK-LoRa. IEEE Wireless Communications Letters, 11(3), pp.612-616. 146. A. W. Azim, A. Bazzi, R. Shubair and M. Chafii, “Dual-Mode Chirp Spread Spectrum Modulation,” in IEEE Wireless Communications Letters, vol. 11, no. 9, pp. 1995-1999, Sept. 2022, https://doi.org/10.1109/LWC.2022.3190564. 147. Gizzini AK, Chafii M, Nimr A, Shubair RM, Fettweis G. 
CNN aided weighted interpolation for channel estimation in vehicular communications. IEEE Transactions on Vehicular Technology. 2021 Oct 14;70(12):12796-811. 148. A. K. Gizzini and M. Chafii, “A Survey on Deep Learning Based Channel Estimation in Doubly Dispersive Environments,” in IEEE Access, vol. 10, pp. 70595-70619, 2022, https:// doi.org/10.1109/ACCESS.2022.3188111. 149. D. Chandra, M. Caleffi, and A. S. Cacciapuoti, “The entanglement-assisted communication capacity over quantum trajectories,” IEEE Trans. Wireless Commun., vol. 21, no. 6, pp. 3632– 3647, 2022. 150. S. J. Nawaz, S. K. Sharma, S. Wyne, M. N. Patwary, and M. Asaduzzaman, “Quantum machine learning for 6G communication networks: State-of-the-art and vision for the future,” IEEE Access, vol. 7, pp. 46 317–46 350, 2019. 151. Z. Bao, Q. Wang, W. Shi, L. Wang, H. Lei, and B. Chen, “When blockchain meets SGX: An overview, challenges, and open issues,” IEEE Access, vol. 8, pp. 170 404–170 420, 2020. 152. H. Li, P. Gao, Y. Zhan, and M. Tan, “Blockchain technology empowers telecom network operation,” China Commun., vol. 19, no. 1, pp. 274–283, 2022. 153. W. Zheng, Z. Zheng, X. Chen, K. Dai, P. Li, and R. Chen, “NutBaaS: A blockchain-as-aservice platform,” IEEE Access, vol. 7, pp. 134 422–134 433, 2019. 154. R. Khan, P. Kumar, D. N. K. Jayakody, and M. Liyanage, “A survey on security and privacy of 5G technologies: Potential solutions, recent advancements, and future directions,” IEEE Commun. Surv. Tutorials, vol. 22, no. 1, pp. 196–248, 2020.
6G Visions and Requirements Yifei Yuan
1 Driver and Direction of Development
Over the last 50 years, mobile communications have gone through five generations, evolving from 1G voice to 2G messaging, 3G mobile data, 4G mobile broadband, and 5G with its enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (URLLC), and massive machine-type communications (mMTC). Mobile communications play a very important role in the development of a country’s economy and bring real benefits to people, not only increasing their work productivity but also improving many aspects of their daily lives. It is fair to say that mobile communications have significantly changed finance, education, health care, sports, entertainment, publishing, media, industry, transportation, and so on, as well as the interactions between humans. In the process, people’s lifestyles have been redefined. In the future, driven by the demand for further improvement of societies, brand-new application scenarios are expected to emerge for 2030 and beyond, with deeper integration between communications technologies and new disruptive technologies. All of this will bring even more challenging and stringent requirements on mobile communications capabilities, thus propelling the development of next-generation mobile communications systems [1].
Y. Yuan, China Mobile Research Institute, Beijing, China; e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_3
1.1 Driver from Societal Development
Societies are expected to face various social issues in the future. For instance, population aging is a common issue faced by many countries, especially developed ones. As a smaller percentage of people in a society actively contribute to the production of goods and services, and more people become pure consumers who draw down the assets of the entire society, human capital would flow out and the cost of running a society would increase dramatically. In the end, this burden would be shared by everyone in the society. Another social issue is rapid urbanization. While such change is necessary, in particular for developing countries, the accelerating movement of people from rural areas to cities will impose severe strain on education systems, health care networks, transportation, employment, and housing. The resources that cities can provide are always limited, and it will be difficult to keep up with the increasing demand during the urbanization process. Other social issues include public safety and the prevention and control of epidemics, all of which can be extremely challenging because the populations of future societies will be more diverse. People may come from different races, cultures, religions, educational backgrounds, political groups, and so on. To solve these social issues, future societies need to maintain a sufficient level of fairness among different groups of people in accessing resources for education and health care and in enjoying the benefits to which ordinary citizens are entitled. Societies also need to implement adequate safety measures to minimize the loss of lives and work-related injuries. More comprehensive mechanisms should be established to prevent illnesses and to offer efficient treatment of different kinds of diseases. Only by doing so can the quality of living be improved, the productivity of the entire society be increased, and the efficiency of management be enhanced. People are counting on future mobile communications systems as one of the most promising ingredients in solving these social issues.
1.2 Driver from Mobile Services and Scenarios
There is no doubt that mobile communications have made people’s lives more convenient and efficient in almost every aspect. In the 4G era, mobile shopping and mobile payment spawned several new services with explosive growth, attracting people across all age groups, geographical regions, and cultural backgrounds. In 5G, the networks not only support eMBB services such as VR or XR but also enable new application scenarios such as unmanned factories and remote health care, including complicated surgeries that require very high data rates, very low latency, and high reliability. Mobile communications have expanded from the traditional use cases of voice and high-speed data to more diversified services whose requirements can be more challenging and scenario-specific. Consequently, mobile communications are impacting many parts of society, including the economy, government, and lifestyles. Looking ahead toward 2030, many new services and usage scenarios are expected, for instance holographic interactions, the conveyance of human feelings, and integrated space-and-terrestrial communications, all of which set higher requirements for mobile communications. These more stringent performance goals cannot be achieved by current 5G technologies and networks, calling for next-generation mobile networks to fulfill them.
1.3 Technology Drivers
From the technology innovation point of view, continuous developments are seen in the fields of communications, computing, data storage, and transmission. Rapid growth is also seen in new materials, new devices, and new fabrication and manufacturing techniques that do not belong to the traditional communications domain. The integration of new technologies such as cloud computing, big data, blockchain, and artificial intelligence with communications will propel the evolution from 5G to 6G networks. Below are a few examples:
• Artificial intelligence: With the development and wider application of artificial intelligence, tighter integration is expected between communications and artificial intelligence, which would lead to new network architectures with innate intelligence.
• Computing power: Built upon the foundation of 5G networks, computing power is a valuable network capability that should be further utilized so that the potential of huge networks can be fully released. By doing this, edge computing can be employed more efficiently to serve specific application scenarios and various vertical industries.
• Materials technologies: With the fast development of materials science, highly efficient power amplifiers based on new materials, reconfigurable intelligent surfaces (RIS) based on metamaterials, and similar devices will become more feasible and would impose new requirements on the transmission data rate, the network configuration, etc.
1.4 Concept of Five Big Developments
Sustainable growth and the ability to solve social problems are the long-term goals of the economic development of societies. During this development, it is crucial to maintain a balance that ensures a relatively fair distribution of welfare, financial resources, education opportunities, and health care across various groups in societies. It is also important to encourage economic development while protecting the environment from being damaged, polluted, or irreversibly changed. This is a great and challenging project that requires the coordinated efforts of entire societies. We should firmly hold the development concept of “innovation, coordination, green, openness, sharing” and propose remedies for social developments. The “five big developments” concept can be applied to the field of communications as a guide for the sustainable growth of mobile communications networks. With this direction, we may come up with ideas and actionable strategies for the evolution and development from 5G networks to future 6G networks, including the following:
“Innovation” emphasizes breakthroughs in fundamental theories and innovations in basic technologies, which can be reflected in breakthroughs or significant extensions of traditional communications theories, new architectures of integrated communications and computing, and the introduction of disruptive or revolutionary technologies such as holographs, brain-inspired computing, reinforced artificial intelligence, and so on.
“Coordination” means that, in order to build future networks, science and technology breakthroughs in communications alone are not enough. It is crucial that communications, information science, materials science and technology, the energy industry, and various vertical industries work together in a coordinated manner. Only when the innovation chain and the industrial chain are in close collaboration can we achieve a coordinated effort in developing truly global industrial standards that take full advantage of the huge worldwide economies of scale. This in turn would promote the globalization of industrial development, increase the mutual collaboration between regions and countries, and reduce the development gaps between poor regions and rich regions.
“Green” becomes more crucial as operators of 5G networks are getting more concerned about the excessive energy consumption due to massive numbers of antennas, wide system bandwidths, and millimeter-wave band operations.
Electricity bills now constitute a significant portion of the operating expenditure, which makes it challenging to ensure sustainable operations. Hence, future networks should have a low carbon footprint and be energy efficient. The network equipment needs to be more eco-compatible and able to coexist with the surrounding environments. It is desirable that future networks be self-sustainable, powered by renewable sources such as wind and solar. A further goal is that device materials be reconfigurable and that equipment can be 4D-printed using new printing and manufacturing technologies, thereby saving energy and materials compared to traditional fabrication methods.
“Openness” requires that future mobile networks have interoperable interfaces to maintain an open ecosystem in which the capabilities of different network components are sufficiently distributed among diverse vendors. This is in contrast to vertically integrated systems that tend to be monopolized by a few dominant vendors. To ensure such openness, a large portion of software and hardware should be open-sourced to encourage fair competition. This would lead to healthier markets that are open and transparent. By doing so, future networks would be ecologically open and embrace greater integration, interoperability, and mutual intelligence.
“Sharing” stands for the generalization of future mobile networks, where transportation, electricity, urban development, and other infrastructure development of societies can enjoy the benefits offered by mobile communications. As a major component in digitizing entire societies, future mobile networks would play a key role in enabling more centralization and larger economies of scale to upgrade infrastructures and further improve their operational efficiency.
Built upon the foundation laid by 5G networks, 6G will add more power to propel the economic development of societies and to realize the “five big developments.”
In terms of “innovation,” 6G will provide governments, enterprises, and individuals with various capabilities spanning technology innovation, management innovation, business innovation, and cultural innovation. These capabilities are reflected in: the capability of fundamental platforms, which can link together different strata of societies; the capability of information services, where information can take many forms such as text documents, images, videos, audio, and various databases with advanced search engines; the capability of computing, which can be centralized cloud computing, edge computing, or distributed computing power scattered across communications hardware in base stations, mobile terminals, Internet of Things devices, etc.; and the capability of artificial intelligence, which would permeate every corner of the networks to achieve true innate intelligence.
For “coordination,” 6G networks will interconnect isolated information islands and remote countries to form a coordinated chain. This would facilitate the “belt and road” and drive the coordinated effort of global economic development and integration. During the process, new business models and ecosystems will be established that will provide the catalyst for the development of vertical industries.
This will also help to build more comprehensive government administrative systems. Regarding the concept of “green,” 6G networks are expected to offer extremely powerful capabilities for sensing their surrounding environments. Such sensing can benefit from worldwide three-dimensional coverage, from terrestrial to space and from underwater to underground. This would encourage environmental protection proposals in which multiple countries collaborate closely to fight environmental pollution and to push the transformation and upgrading of traditional energy-inefficient industries and practices, so that very low carbon emissions and green, efficient operation and development can be achieved. “Openness” is an innate feature of 6G mobile networks. 6G will be a key enabler of the further opening of the global economy, markets, cultures, and government policies and political systems. In the area of “sharing,” 6G makes it possible to build an artificial intelligence (AI)-based infrastructure shared by most citizens. In such societies, big data analysis and artificial intelligence are no longer the privilege of big enterprises, governments, or very important persons. Instead, ordinary people can conveniently access big data and enjoy the benefits of artificial intelligence. The dream of true “data sharing” can be achieved, and fairness can be guaranteed regarding the digital dividend and digital rights. Future mobile networks would facilitate the further upgrade of the “shared economy” since the economy will no longer
Y. Yuan
be restricted to a local area, a country, or a specific group of people. The shared economy will lead to “shared manufacturing” and “shared infrastructures” across industries on global platforms, helping to create a new shared ecosystem where the benefits of future networks can be shared by every stratum of society. It is crucial to hold firm to the concept of the “five big developments” so that 6G can reshape and reconstruct the network space, which would provide more room for the integration of global economic development and facilitate building a global community of joint development, of security, and of common interests.
2 Vision and Scenarios of 2030+
2.1 Vision: Digital Twins and Ubiquitous Intelligence
“4G mobile networks changed the way people live. 5G mobile networks have changed societies.” This saying underscores the unceasing aspiration of human beings for better mobile network performance and for a higher quality of life, with more comfortable working and living conditions. 4G mobile communications coincided with the explosive growth of data services that came to prevail in many aspects of society. With the rapid growth of smartphone penetration and the expansion of the consumer Internet, everyday life has become much more convenient than before, from clothing, meals, accommodation, and transportation to education, entertainment, sports, and health care. 5G mobile communications have started a new era characterized by the interconnection of a massive number of things. 5G mobile networks can achieve multi-level and comprehensive interconnections between humans, between humans and things, and between things. The applications of 5G have permeated various fields of society, dramatically reviving the impetus of entire societies to move forward. In the future, with the fast expansion of 5G applications, new breakthroughs in science and technology, and the deeper integration of new technologies with communications technology, new requirements at higher levels are expected to emerge. If 5G mobile networks can be said to achieve ubiquitous availability of and access to information, regardless of time, space, social ranking, etc., then 6G mobile networks, building on the foundation of 5G networks, are expected to support a more comprehensive digitization of the entire world. 
Such a higher level of digitization will be assisted by the fast development of artificial intelligence, thereby fulfilling the goal of ubiquitous availability of and access to intelligence and allowing more things to operate intelligently and better serve people and societies. In 2030 and beyond, the whole world will be made up of two worlds: one is the real physical world in which we have lived so far, and the other is the fully
Fig. 1 Digital twins and ubiquitous intelligence
digitized virtual twin world that is based on and mirrors the physical world. The information and intelligence exchanged between humans, between humans and things, and between things in the physical world can be passed on in the mirrored digital world. More specifically, the twin virtual world is a comprehensive, large-scale simulation model that can precisely characterize the real physical world and accurately predict its future status. Based on the predictions of the twin virtual world, precautionary measures can be carried out in advance in the physical world to prevent it from deviating from its normal course. By doing so, human beings can be further relieved from the monotonous activities of monitoring and controlling the things around them. People can engage in more advanced activities that further improve their quality of life. The productivity and management efficiency of entire societies would be elevated. 6G will drive the world toward “digital twins and ubiquitous intelligence” and fulfill the rebuilding of the whole world as illustrated in Fig. 1.
2.2 Application Scenarios
With the overall vision in mind, future mobile communications networks will spawn many brand-new application scenarios in the areas of intelligent sharing of lives, intelligence-empowered production, and intelligence-enlightened societies, as shown in Fig. 2.
Fig. 2 Typical application scenarios for 2030+
2.2.1 Intelligent Sharing of Lives
The capabilities of mobile communications services play a key role in determining the quality of human life. The mobile communications systems of 2030 and beyond will enable holographic-level interaction, integration of communications and sensing, digital twin bodies, and intelligent interactions that fully utilize emerging technologies such as brain-to-machine interaction, artificial intelligence, and molecular communications, to achieve new forms of living, for instance, highly efficient learning, convenient shopping, coordinated offices, and healthy biological bodies.
Holographic Level of Interaction
The combination of 5G mobile communications and VR/AR technologies can realize some 3D display applications and bring a degree of immersive and deeply interactive experience. In the 6G era, 3D holographic display is expected to prevail, thanks to the fast development of display technologies and data communications. With holographic 3D display, end users do not need to wear any special equipment, yet can vividly perceive the full 360-degree 3D effect with the naked eye. Different information can be delivered and displayed from different viewing angles, making users believe that they are actually in the scene, unable to tell whether they are in the digital twin world or the real physical world. Aided by future 6G mobile networks, holographic display technology will be integrated into many application scenarios such as communications, remote health care, office design, military defense, entertainment, and gaming, as shown in Fig. 3. This would completely change the living habits of humans and bring a much
Fig. 3 Holographic interaction
higher quality of living experience. In addition, holographic interaction technology can facilitate interaction between the content projector and the end users. Just as in the Hollywood movie “Iron Man,” the content on the display can be switched simply by sliding the fingers in the air. The technology becomes so powerful that some digital and mechanical device parts can be fabricated and their performance can even be tested. With holographic interactions, information exchange and dissemination will no longer be tied to a fixed mode. Holographic interactions bring information sources and end users much closer together, so that human-experience-oriented communications and information sharing can be realized.
Integrated Communications and Sensing
The development of the Internet has undergone several stages, e.g., the fixed Internet, the mobile Internet, and the Internet of Things (IoT). The Internet is still full of potential and evolving fast. Sight and hearing have been the two fundamental channels of information exchange for humans. In 5G mobile communications networks, the requirement of simultaneous live audio and live video can be met to some extent. However, humans do not receive information only through auditory organs such as ears and visual organs such as eyes. Other senses such as touch, smell, and taste also play very important roles in daily life. In the 6G era, integrated communications and sensing will become the mainstream communication method, in which many sensory organs will take part in communications across the entire network. One of the main development trends will be the coordination of multiple sensory organs and their deep participation in communications. With the network environment supported by 6G communications, integrated communications and sensing will spur new applications and use scenarios, for example, health and medical care, technical training, entertainment, road and transportation, office and production, and psychology and the exchange of feelings, as illustrated in Fig. 4. Integrated communications and sensing can contribute to tackling complex social issues and provide impetus for economic development and innovation.
Fig. 4 Integrated sensing and communications
Digital Twin Bodies
Currently, 5G body networks have already found use in medical care, for instance, in monitoring the health conditions of the human body, assisting disabled persons, and monitoring sports activities. Body networks can also provide highly reliable communications at a certain data rate and thus facilitate remote medical diagnosis, remote surgery, etc. In the 6G era, various sensors can be implanted or deployed very densely inside and outside human bodies. These sensors are capable of collecting, analyzing, and modeling different types of data in real time. As shown in Fig. 5, such
Fig. 5 Application scenarios of digital twin bodies
body networks can help to achieve digital twins of human bodies, which have unique characteristics that are those of neither real humans nor conventional machines or robots. Equipped with future 6G network technologies, “digital twin bodies” can facilitate much more efficient study of the mechanisms and pathology of human organs and of the viruses that afflict human bodies. Digital twin bodies can also help surgeons anticipate more precisely the emergencies or unexpected scenarios that may arise during complex and vital surgical operations. For example, when a surgeon performs an operation, the digital twin body can alert the surgeon to the consequences of starting a scalpel cut from a specific position. Such advice would be very valuable in operating rooms. After the operation is completed and the patient is released from the hospital, the digital twin body can still provide health and medical advice or management to the patient to accelerate recovery. Digital twin bodies would also play important roles in medical research. For instance, the human brain is extremely complicated, and its physiological activities are very difficult to monitor and study. How the brain carries out thinking and how motion perception works are key and challenging research topics in neuroscience. When digital twin bodies are applied to the study of the brain, researchers can easily prepare experiments and conduct comprehensive simulations to uncover the mechanisms deep in the brain. Similarly, digital twin bodies can be employed to mimic attacks by viruses and bacteria. This would provide very useful information for studying the mechanisms of virus-related contagious diseases.
Interactions of Intelligence
Audio, video, and sensing interactions between humans and machines are already supported by 5G mobile communications networks. Looking toward 6G, the interactions of intelligence in 2030+ are expected to see breakthroughs in brand-new research directions such as the interaction of emotions and brain-computer interfaces. In future 6G mobile communications networks, there will be intelligent agents everywhere that have the capabilities of sensing, cognition, and even thinking. These intelligent agents will replace the apathetic and passive machines and equipment of traditional interactions. The pure host-and-server relationship between humans and intelligent agents will become more humane and more equal, with intelligent systems able to communicate feelings by observing voice tones and facial expressions. From these observations, intelligent agents can infer the psychological state and feelings of human beings, so that they can help users regulate their emotions and minimize the consequences of negative moods. For example, an intelligence-assisted driving system can first analyze and try to understand the emotional state of a driver. The intelligent agent can then alert the driver or prevent abnormal driving practices, which is crucial for improving road and transportation safety. In 6G communications networks, machines can be controlled by mind-reading or directly by the brain, and some functions of the human body will be performed by machines. This will significantly restore the confidence of disabled persons and provide the assistance necessary to overcome their physiological disadvantages, allowing them to maintain high productivity, to accumulate a great deal of knowledge and acquire certain technical capabilities in a relatively short period of time, and to achieve the “lossless” transfer of brain information.
2.2.2 Intelligence-Empowered Production
Intelligence-empowered production is a concept for the evolution of production toward 2030. The current production mechanisms in agriculture and industry will be empowered by the application of new communications technologies. This constitutes a powerful force for the healthy growth of production and facilitates the fast development of a digitized economy. With the wider application of 5G, the manufacturing industry can achieve the first step of intelligent production by utilizing more digital information and network connections. For example, intelligent equipment such as drones is now used in agriculture to relieve farmers of many chores; equipment such as robots and VR helmets will be used in manufacturing to assist human work, improving the rate of information collection and the efficiency of production. As new technologies develop further, closer integration is expected between production and digital twins, so that the vision of intelligence-empowered production can be fulfilled, as illustrated in Fig. 6.
Fig. 6 Intelligence-empowered production
Agriculture Twin
Intelligence-empowered production can dramatically lift the restrictions on agricultural practices. As a result, agricultural productivity increases significantly, which is reflected in all the elements of production [2]. Future mobile networks will support ubiquitous coverage that spans aviation altitudes, outer space, and the sea, which further expands the domain of agricultural production. In the future, the fields for growing crops or vegetables will not be limited to regular farmland. They can be extended to underwater or outer-space scenarios. Digital twin technology can model and simulate the process of agricultural production in advance and then provide remedies for any possible negative impacts. Production capabilities and the utilization of agricultural resources can therefore be further improved. Blockchain technologies can integrate farms, certification bodies, sellers, logistics, and storage companies into a chain of allies that share resources. By doing so, the origin of produce can be traced, its subsequent movements can be monitored, and responsibilities can be verified [3]. In the meantime, consumer demand in cities and the supply of rural produce can be tied more closely via information technologies, adding more power to drive the flow of agricultural produce and advancing the development of intelligent agricultural ecosystems. Technologies such as big data, the Internet of Things, and cloud computing will support a massive number of intelligent devices such as drones, robots, and environmental monitoring sensors. Full connections between humans and things, and between things, will be possible, which would propel the development of planting, forestry, animal husbandry, fishing, etc. To guarantee the total supply of agricultural products, agriculture in the future will rely more on biotechnology. Via
gene-editing technology, some genes of seeds would be altered [4], which can fundamentally eliminate the harmful effects of pests.
Industry Twin
With respect to industrial production, intelligence-empowered production means deeper integration between industrialization and digitization. The combination of digital twin technology and industrial production can not only predict the factors that affect industrial output but also enable digital-domain simulation and verification to assist production research in industrial labs, which further improves innovation capabilities. Unlike digitized intelligent factories that are based on sensors and actuators, industry twin technology has a higher goal and is based on data and modeling, which makes it more suitable for new computing concepts such as artificial intelligence and big data [5]. More and more intelligent factories will coordinate humans, machines, and things closely to create new models of intelligent manufacturing. Intelligent robots will replace humans and current robots as the major workforce to achieve agile manufacturing. Industrial manufacturing will be more self-driven and intelligent. The development of nanotechnology has provided a brand-new way of monitoring and inspecting processes in the various stages of industrial production. Hence, nano-robots will be part of the production cycle and carry out monitoring throughout the entire life cycle of the product. Industrial production, storage, and sales strategies will benefit from real-time dynamic analysis of market data, which would help maximize profit. With the further integration of big data platforms with industries, all related industrial activities can be coordinated and optimized more efficiently, which would lead to a new industry twin model in future societies [6].
2.2.3 Intelligence-Enlightened Societies
Mobile communications networks are the key infrastructure for building intelligent societies. Looking forward to 2030+, mobile communications networks will offer ubiquitous coverage that integrates land, air, space, and sea. Future networks can not only dramatically improve system capacity to support more intelligent infrastructure but also extend the coverage of public services to narrow the digital gap between regions, refine the level of social management, and pave the way for building a better, more intelligent society.
“Ubiquitous Coverage” to Assist Intelligent Infrastructure: Super Transportation
As transportation networks become more digitized, coordinated, and intelligent, these trends will continue and push forward deep reform in the field of transportation. Traveling will become an on-demand, immediate service, with service requirements and resources connected by data. Whether a person lives in a city, in remote mountains, or high in the sky, multi-mode transportation vehicles across sea, land, air, and space will enable point-to-point, door-to-door travel and deliver three-dimensional transportation services and superior network performance. This will greatly promote the full-scale upgrade of shopping, leisure and entertainment, and business services and bring a brand-new travel experience. In 2019, the central government of China and the State Council published “The guideline for building a country with strong transportation network,” which set the goal that by 2035, a modernized, high-quality, comprehensive, three-dimensional transportation network will be completed [7]. Such a network features a very high level of intelligence, safety, greenness, and sharing, which can largely eliminate traffic jams in urban areas and achieve universal accessibility to travel services. Toward 2030+, key transportation vehicles and equipment will be much safer and more advanced, thanks to the major technology breakthroughs anticipated in the next few years. It is expected that fully autonomous driving (Level 5), high-speed magnetic levitation trains, and low-pressure (quasi-vacuum) tube-based high-speed trains will be important carriers for ground transportation. Flying cars, personal aviation equipment, and flying buses will become more mature and be part of the transportation vehicles for future free-style trips. 
Amphibious flying ships, trains in tunnels under the sea floor, and submarine buses will provide more convenient ways of traveling by sea. Space travel will be more accessible to ordinary people, and space vehicles such as rockets, space shuttles, and space stations will help to fulfill the human dream of experiencing exotic and spectacular views in outer space. While many of these transportation modes can be envisioned, not all of them may be achieved, especially given sustainability considerations. In the future, super-powered transportation will be capable of multi-level and multi-resource “merging and sharing,” which will facilitate a better life, mobile offices, interaction between family members, and entertainment, as shown in Fig. 7. It will create a transportation environment that is safe and reliable, guaranteed by multi-dimensional escorting. In such a transportation system, the managers and administrative officials in charge can have “a holographic sensing” of the current network conditions, which will enable them to carry out analysis and make the right decisions. The users of this transportation system are able to dynamically monitor instantaneous traffic conditions and make more accurate predictions in advance. With these capabilities, future travel will be more efficient, green, and secure. In addition, with the support of very powerful communications networks, super-powered transportation will assist logistics, information flow, and cash flow and
Fig. 7 Super transportation
help achieve dynamic balance between various transportation resources in urban cities and maintain a steady growth of the economic development of cities [8].
“Ubiquitous Coverage” to Facilitate Universal Accessibility of Public Services: Universal Education, Virtual Traveling
Benefiting from 5G networks that offer high data rates, massive connections, high reliability, and low latency, intelligent medical services, remote education, etc., are becoming available, which helps to some extent to bridge the gap between cities and rural villages in terms of medical resources. However, in order to fully achieve universal access to public services in 2030+ and the ultimate goal of “setting myself free,” it is important to exploit the capability, made available by “ubiquitous coverage,” of filling the coverage holes of communications networks and extending terrestrial coverage. Communications coverage can be further extended to very remote areas or geographically isolated locations, for instance, islands in the open ocean, commercial airplanes, and deep-sea ships. By doing so, public services such as education, medical care, culture, and travel can be significantly improved. In “ubiquitous coverage”-based 6G networks, precision medical care can be extended to more suitable areas and help build a customized “digital human” that corresponds to actual humans living in a wide area. Such a “digital human” can play key roles in services such as risk avoidance for major illnesses, early-stage screening, and targeted medical treatment. Effectively, the function of medical health services will shift from “therapy-oriented” to “prevention-oriented.”
Equipped with the ubiquitous AI computing power of holographic interaction technology and networks, universal education in the 6G era can not only achieve enhanced real-time interactive classes between students who are far apart but also enable one-to-one tutoring tailored to each student’s background and needs. “Digital twin” technology provides an efficient way of customizing education that takes full account of the personal characteristics of students. “Ubiquitous coverage”-enabled networks can also strengthen the tourism business by connecting it to culture and history. When network coverage is everywhere, people will be able to become fully immersed in virtual reality at any time, to soar above the clouds to enjoy the spectacular view of snowy mountains, to dive into the deep oceans to experience the turbulent currents, etc.
“Ubiquitous Coverage” to Facilitate Precise Administration of Social Affairs: Immediate Rescue Operations, Detections in Desolate Areas
Combined with IoT technologies, 5G networks can support administrative services such as security monitoring in hot spots and smart management of urban areas. By 2030+, ubiquitous coverage will become a main feature of networks, able to cover desolate areas such as deep mountains, deep oceans, and deserts and to achieve full space-air-terrestrial-sea coverage, thus advancing the convenient, precise, and intelligent administration of social affairs. Achieving “ubiquitous coverage” will require wide coverage, flexible deployment, extremely low power consumption, extremely high precision, and robustness to natural disasters, which are useful for providing immediate response and rescue operations and for monitoring activities in desolate areas. These services are crucial for the effective administration of social affairs. For instance, as shown in Fig. 8, with “ubiquitous coverage” and “digital twins,” virtual digital buildings
Fig. 8 Precise administration of social affairs
can be “constructed,” making it very convenient to work out plans for evacuation and rescue during natural or man-made disasters such as earthquakes and fires. Through constant real-time monitoring for adverse events in desolate regions, early warnings can be issued well before typhoons, floods, sandstorms, etc., occur, so that enough time is reserved for disaster preparation and protection.
3 Network Performance Requirements for 2030+
Through the analysis of typical services and the prediction of deployment scenarios for 2030+, it can be seen that, in order to provide access to typical services under different scenarios, the required network transmission rate could be as high as terabits per second and the transmission delay could be as short as the millisecond or sub-millisecond level [9].
3.1 Typical Use Scenarios
Regarding the transmission delay in networks, taking VR service as an example, the display resolution of screens on current mobile terminals is around 3840 × 2160 × 8 bits. It is expected that by 2030, the resolution will be 24 K × 13 K × 12 bits, roughly 56 times as many bits per frame if 24 K × 13 K is read as 24,000 × 13,000. According to Moore’s Law, the processing power of semiconductors will improve by a factor of 128 by 2030. Hence, the processing time of rendering, encoding, and decoding for each subframe in a VR service model will be reduced to about 44% of its current value (56/128 ≈ 0.44). In current 5G networks, the processing delays for VR services are roughly 15 ms, 8 ms, and 10 ms for rendering, encoding, and decoding, respectively. In 6G networks, these processing delays will be about 7 ms, 4 ms, and 5 ms, respectively. It is projected that by 2030, the transmission delay of cloud VR networks would need to be around 7 ms, so that transmission delay does not become the bottleneck in the entire service flow. Regarding network bandwidth, video monitoring service can be used as an example, in particular for the data collection units. At present, the deployment density of security cameras in the city of Shenzhen is about 205 per square kilometer. Translated into the number of video cameras per thousand people, this is about 60% of the level in developed countries. It is expected that by 2030, the resolution of video cameras will be 16 K × 16 K and the sampling rate will be 120 frames per second. If RGB samples are quantized with 14 bits for high quality and the compression ratio is 500:1, the uplink data rate per video camera will be about 2.38 Gbps. In the case of holographic video services, the raw size of each 3D video frame is about 24 K × 13 K × 50 pixels. If 24 bits are used for each RGB pixel and the refresh rate is 240 frames per second, the downlink data transmission rate should be
24 K × 13 K × 50 × 24 × 240 = 81.7 Tbps [10]. Even if a high compression ratio, e.g., 500:1, is used, the data rate is still around 167 Gbps. In future applications, in order to match the performance of real networks, resolutions or sampling rates can be reduced to some extent to make remote holographic video communications more feasible. Taking the dense high-rise building scenario as another example, based on the above assumptions about 6G service models, future 6G networks will support regular services such as simultaneous video broadcasting, online gaming, and website browsing, together with IoT devices. For IoT services, it is assumed that there are 50,000 users per square kilometer and each user is equipped with 200 IoT devices. Also considering three-dimensional coverage, there will be 10 terminals per cubic meter. Based on a certain estimation method for key performance indicators, the downlink data rate for this scenario would be about 167 Gbps and the uplink data rate about 2.38 Gbps. The downlink data throughput density is about 3.75 Gbps/m³ and the uplink data throughput density is about 0.61 Gbps/m³. During future deployment, these key performance indicators (KPIs) may be relaxed to some extent according to the parameters of the actual services, which helps to make the services more feasible in future networks.
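The arithmetic behind the figures in this subsection can be retraced with a short script. The sketch below is not taken from the cited sources; it rests on several assumptions that the text does not state: that 24 K × 13 K means 24,000 × 13,000, that 16 K means 15,360 (four times 3840), that 14 bits are used for each of the three RGB samples, and that the factor of 50 is a per-frame multiplier (called `layers` here purely for illustration). Under these readings, the quoted holographic rates appear to correspond to binary prefixes (2^40 and 2^30) rather than decimal ones.

```python
import math


def raw_rate_bps(width, height, bits_per_pixel, fps, layers=1):
    """Uncompressed video bit rate in bits per second."""
    return width * height * layers * bits_per_pixel * fps


# VR processing time: per-frame workload is taken to scale with pixels x bit depth,
# while processing power grows 128x. Assumption: "24 K x 13 K" = 24,000 x 13,000.
workload_growth = (24_000 * 13_000 * 12) / (3840 * 2160 * 8)
time_ratio = workload_growth / 128  # fraction of today's processing time
delays_5g_ms = {"rendering": 15, "encoding": 8, "decoding": 10}
# Rounding up to whole milliseconds is an assumption about how the text rounds.
delays_6g_ms = {k: math.ceil(v * time_ratio) for k, v in delays_5g_ms.items()}

# Camera uplink. Assumptions: "16 K" = 15,360 and 3 x 14 = 42 bits per pixel.
camera_bps = raw_rate_bps(15_360, 15_360, 3 * 14, 120) / 500

# Holographic downlink: 24 bits per pixel, factor 50 per 3D frame, 240 frames/s.
holo_raw_bps = raw_rate_bps(24_000, 13_000, 24, 240, layers=50)
holo_compressed_bps = holo_raw_bps / 500

# IoT density per square metre of ground area (1 km^2 = 1,000,000 m^2).
devices_per_m2 = 50_000 * 200 / 1_000_000

print(f"processing time ratio : {time_ratio:.2f}")
print(f"6G delays (ms)        : {delays_6g_ms}")
print(f"camera uplink         : {camera_bps / 1e9:.2f} Gbps")
print(f"holographic raw       : {holo_raw_bps / 1e12:.1f} Tbps (decimal), "
      f"{holo_raw_bps / 2**40:.1f} with a 2**40 prefix")
print(f"holographic compressed: {holo_compressed_bps / 1e9:.1f} Gbps (decimal), "
      f"{holo_compressed_bps / 2**30:.1f} with a 2**30 prefix")
print(f"IoT devices per m^2   : {devices_per_m2:.0f}")
```

With these readings the script arrives at a time ratio of 0.44, delays of 7, 4, and 5 ms, a camera uplink of 2.38 Gbps, and holographic rates of 81.7 and 167 in binary-prefixed units; with decimal prefixes the same holographic rates would read about 89.9 Tbps and 179.7 Gbps. The IoT figure comes out as 10 devices per square metre of ground area; how the text converts this to a per-cubic-metre value is not stated.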
3.2 Key Performance Indicators of Future Networks
In the 5G standards certified by ITU, the performance requirement for high mobility is only 500 km/h. In order to support integrated space and terrestrial communications, 6G networks should support mobility speeds over 1000 km/h. Clearly, a high mobility requirement together with a high data rate poses significant challenges to the system design of 6G networks. In addition, according to the definition of network energy efficiency in IMT-2020 of ITU, the energy efficiency at low load is measured by the sleeping time ratio when there is no data transmission. At high load, the energy efficiency is measured by the system spectral efficiency. However, the high energy efficiency brought by high spectral efficiency may not guarantee the green, sustainable development of the communications industry. For instance, multi-antenna technology has been widely used in 4G and 5G systems in order to improve the system spectral efficiency. At the same time, the total energy consumption has also increased significantly. More scientific study and design are necessary to balance the system spectral efficiency, the energy efficiency, and the total energy consumption of the networks.
There are many new performance requirements for 6G networks in 2030+, many of which cannot be fulfilled by the current 5G networks and technologies [11]. Therefore, on the one hand, we should push the evolution of 5G technologies and enhance some key performance indicators, based on the current capabilities of 5G networks. On the other hand, future 6G networks will provide more comprehensive performance indicators compared to 5G networks, such as extremely low jitter, ultra-high security, three-dimensional coverage, and ultra-precise positioning.
Y. Yuan
Fig. 9 Key performance indicators of networks in 2030+ and potential enabling technologies
To meet the more stringent requirements of new application scenarios and services, changes are expected in 6G air-interface technologies and architectures. It is important that we start the study of enabling technologies early, for example, reconfigurable intelligent surfaces (RIS), non-orthogonal multiple access (NOMA), new channel coding and modulation, cell-free operation, terahertz and visible light communications, and integrated space and terrestrial communications [12–17]. Based on the above-mentioned technologies as well as other potential technologies, the capabilities of future networks will be dramatically improved in order to support more diversified applications and services. Figure 9 shows the key performance indicators of different application scenarios and the corresponding enabling technologies.
4 Summary
The rapid development of information and communications technologies has accelerated digitization. Societies as a whole have become more information-oriented, pushing human beings towards a world of digital twins that combines virtual space and reality. “Innovative, coordinated, green, open and sharing” has become the only way forward for the future development of the entire communications industry. Looking forward to the development of society in 2030+, new application scenarios will keep emerging, for example, digital twins of human bodies, superpowered transportation, intelligent interaction, and interconnection with integrated communications and sensing. All these new services pose more challenging performance requirements for future communications networks, including ultra-high peak data rates, ultra-low jitter, three-dimensional coverage, ultra-precise
positioning, and deterministic delays. At the same time, stringent requirements are placed on the service models, deployment, and development of networks. With intelligent sharing of lives, intelligence-empowered production, and intelligence-enlightened services, the future world will be made more beautiful and enjoyable via digital twins and ubiquitous intelligence.
References
1. China Mobile, white paper “Vision and requirements for 2030+”, November 2020.
2. Z. Zhang, Y. Xiao, Z. Ma, et al., “6G Wireless Networks: Vision, Requirements, Architecture, and Key Technologies”, IEEE Vehicular Technology Magazine, 2019, 14(3): 28-41.
3. Topic of block-chain becoming hot topic of annual National Congress of China. https://blockchain.hexun.com/2019-03-05/196389348.html/. 2019.
4. Y. Liu, Q. Wang, B. Zhang, et al., The application of gene-editing in agricultural industry, http://www.moa.gov.cn/ztzl/zjyqwgz/kpxc/201804/t20180424_6140883.htm/. 2018.
5. Digital twins becoming key technology for industry internet, https://www.iyiou.com/p/98516.html/. 2019.
6. Deloitte, New opportunities for industries: industry 4.0 and digital twins. 2018, 000(009):4249.
7. Central Committee of Communist Party of China, Guideline for building powerful transportations, 2019, 40(06):628.
8. Deloitte, Future cars: constructing new eco-systems with new transportation technologies. 2016.
9. C. Cui, S. Wang, K. Li, et al., “6G vision, services and key performance indicators,” Journal of Beijing University of Post & Telecommunications, 2020.
10. X. Xu, Y. Pan, P. P. M. Y. Lwin, et al. 3D holographic display and its data transmission requirement. 2011 International Conference on Information Photonics and Optical Communications, Jurong West, 2011:1-4.
11. M. Mozaffari, A. T. Z. Kasgari, W. Saad, et al. Beyond 5G with UAVs: Foundations of a 3D Wireless Cellular Network. IEEE Transactions on Wireless Communications, 2018, 18(01):357-372.
12. S. Han, T. Xie, C.-L. I, Y. Yuan, et al. Artificial Intelligence Enabled Air Interface for 6G: Solutions, Challenges, and Standardization Impacts. IEEE Communications Magazine, 2020.
13. Y. Yuan, S. Wang, Y. Wu, V. Poor, Z. Ding, X. You, and L. Hanzo, “NOMA for next-generation massive IOT: performance potential and technology directions”, IEEE Communications Magazine, 2021.
14. E. Björnson, L. Sanguinetti, H. Wymeersch, et al. Massive MIMO is a Reality – What is Next? Five Promising Research Directions for Antenna Arrays. 2019.
15. Y. Yuan, Q. Gu, A. Wang, D. Wu and Y. Li, Recent progress in research and development of reconfigurable intelligent surface. ZTE Communications, March 2022.
16. H. Yao, L. Wang, X. Wang, et al. The Space-Terrestrial Integrated Network (STIN): An Overview. IEEE Communications Magazine, 2018:2-9.
17. A. A. Boulogeorgos, A. Angeliki, M. Thomas, et al. Terahertz Technologies to Deliver Optical Network Quality of Experience in Wireless Systems Beyond 5G. IEEE Communications Magazine, 2018, 56(6):144-151.
Next G Applications and Use Cases
Mitch Tseng
1 Introduction
“What are the 6G Services?” “How will the 6G services be different from the 5G services?” “What are the use cases or applications that cannot be done well with 5G systems and must rely on the 6G systems for improvement?” These are a few questions people always ask when they hear that the industry is working on 6G wireless communications while wireless operators are still trying to deploy and fine-tune their 5G services. These questions are not easy to answer, as most users, having more or less gone through the evolution of digital wireless communications from 2G through 5G, have not yet experienced a noticeable difference after switching from their 4G services to 5G. This makes proposals for “6G Applications” even more challenging, as the industry has not yet fully exploited the 5G capabilities.
Fortunately, the evolution of wireless services always closely follows the progress of network technologies. Rather than pretending that there is a crystal ball that can magically show what 6G services and major 6G applications will be, it is better to look back and check how the key applications and services evolved with the new technologies provided. Wireless technologies evolved as a result of either addressing deficiencies of the system or trying to satisfy the needs of operators and users. By examining past developments, we can come up with reasonable expectations for the future.
There are many companies, operators, vendors, universities, and research entities that have been looking into “Beyond 5G (B5G)” or “6G” technologies and systems. Some have published “Vision” papers painting a portrait of how the B5G
M. Tseng (O) Plano, TX, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_4
or 6G systems can shape the future [1–11]. Two global consortia, the Next Generation Mobile Network (NGMN) Alliance in Europe and the Next G Alliance (NGA) in the USA, have started to organize interested parties to look into 6G-related matters, including technical, spectral, environmental, societal, and application aspects. Both organizations have published reports on 6G use cases and applications [12, 13] with interesting insights. Rather than starting from scratch, the work herein is largely based on these two reports, with additional viewpoints from discussions in other forums and papers.
Among other aspects depicted by the two reports, a common theme is to build, through 6G systems, an immersive digital world experience (DWE) for the future. Many applications focus on providing a realistic, interactive user experience through a user-machine interface (UMI) with high-definition (HD) video displays, 3D imaging, and holographic displays, accompanied by haptic sensors and somatic sensory networks, to elevate the user experience of extended reality (XR) services to an ultra-realistic level. Although some of the services may not be new, the expectation of extreme performance will be the driver for more stringent system requirements, such as ultra-high data rates of 10 Gbps, 100 Gbps, or higher with very low “end-to-end” delay. Furthermore, in order to show that 6G systems are also “green” in support of the United Nations (UN) Sustainable Development Goals (SDGs) [14, 15], features such as energy efficiency and actions involved in “Reduce, Refurbish, and Recycle” also need to be considered.
Another factor that is crucial but seldom discussed in open forums is the market forecast for an application or service. A 6G application needs not only to be technically feasible, green, and energy efficient, but also to be well received by users. More than technical issues need to be addressed when an application is launched.
In addition to the aforementioned business values and green initiative–related matters, societal issues such as privacy protection, digital equity, and other operational challenges that may not be obvious even in 5G services need to be addressed as well. These may be non-trivial issues that require the industry to look into each application or service seriously; moreover, vendors may need to seek regulatory exemptions or policy changes before a service can be officially launched. For example, when a cell site’s coverage radius is reduced from several kilometers down to tens of meters or even less because of the use of sub-terahertz (sub-THz) radio, the location privacy of the user may be difficult to maintain. Fortunately, forums such as “Thinknet 6G”¹ have been looking into general issues related to 6G; its whitepaper “Six Questions about 6G” [16] provides good insights.
In spite of all these opportunities and challenges, one way for the industry to move toward 6G applications is by extending the scope of digital twins and looking into the “cyber-physical” [1] transition not only at the network or system level but also at the level of the service experience that can benefit from it. Over the past few years, the use of digital twins has been expanded from the original concept of merely a digital
¹ The organization can be found at www.thinknet-6g.com
representation of a system, such as the control board monitoring a production line in a factory. It can now be used to build a physical system, such as constructing a building, and, later on, the digital twin becomes part of the management operation, as promoted² by the Digital Twin Consortium (DTC). The digital or cyber part of the digital twin contains all the information related to the physical part that may directly interact with humans; therefore, the “user experience” can be enhanced when the digital information is tailored or exploited according to the user’s needs. With this approach, applications will be tied to users directly through work, shopping, learning, transportation, healthcare, living, and entertainment, providing an immersive user experience and becoming part of the user’s life. This certainly opens up opportunities for 6G systems by providing an enhanced digital world to all users.
2 Wireless Evolution and Services
Service improvements are definitely the best reason to justify the need for a new generation of wireless services. At first, wireless users were satisfied merely with being connected “anytime, anywhere”; after broadband wireless services with high-definition pictures, the new generation of wireless users expects holographic or 3D displays as the interfaces presented by their devices. While discussing which applications may best fit 6G systems, it is worthwhile to take a historical view and examine how the previous “Gs” fared from the service perspective. This helps in understanding the “Dos” and “Don’ts” for the services provided in each generation of systems. Figure 1 provides a summary of the services highlighted by each generation of wireless systems.
2G: Voice; SMS; capacity gain
3G: Voice; multimedia messaging; broadband wireless (Web services, video streaming)
4G/LTE+: VoIP; BBWS >100 Mbps; wireless internet; “Live TV”; MTC (IoT/IIoT)
5G: VoIP; BBWS (>1 Gbps), 4K TV; autonomous driving; private networks by small cells
6G: BBWS >100 Gbps; network sensing; digital world experience; coordinated robots; AIaaS; extreme sports with HD XR
Fig. 1 Wireless evolution and related key services
² Architecture, Engineering, Construction & Operations, Digital Twin Consortium. https://www.digitaltwinconsortium.org/working-groups/aeco/
Fig. 2 SIM cards help protect the subscription identity and provide some data services
As shown in Fig. 1, “2G” marked the first digital wireless communication system. In addition to voice services, 2G systems also supported short message services (SMS), which provided users with a less intrusive means to connect with others. The major benefit of 2G systems was that they provided a 3× to 6× system capacity gain compared with their analog predecessor. The significantly smaller handsets were appealing compared with the “bricks” people had previously carried. Furthermore, as the price of handsets dropped into a range that users could easily afford, wireless communication became a popular option not only for urban users; the adoption of 2G wireless networks was also surprisingly high in some developing countries, where the Public Switched Telephone Network (PSTN) infrastructure was not well deployed. With voice and SMS services, the focus was on increasing the number of subscribers, based on people’s desire to be connected “anytime, anywhere.” Japan’s NTT even branded its wireless company “DoCoMo”, which means “everywhere” in Japanese. Nokia’s offering of a variety of handsets certainly helped make wireless communication a trendy thing in the 2G era. To a degree, people felt “naked” when they left home without their cellphones.
One key addition that impacted the wireless industry was the adoption of SIM (subscriber identity module) cards. Subscriber data were originally part of the phone, and security concerns were mounting because a phone could be “cloned” by copying the memory map of one phone to another. The introduction of the SIM card not only provided a layer of security protection for subscriptions, but the “SIM services” offered through the SIM toolkit also helped to introduce early mobile data services. SIM cards were issued in three different sizes (standard, micro, and nano), as shown in Fig. 2, at different times; there is also the “embedded SIM,” which integrates the SIM into the phone in secured memory.
The concept of service subscription with security on a mobile device has been fundamental for later generations, and the need for data services other than voice and SMS helped drive the development of 3G and later systems. After setting
Fig. 3 Examples of candy bar, clam shell, and smartphones (author’s collection)
up wireless personal communications as trendy or even a necessity for users, wireless operators looked into applications and services beyond voice and SMS. Personal differentiators, such as ringtone downloads and simple Web purchases, were the starting points for people to get into wireless data services through General Packet Radio Service (GPRS) or Enhanced Data rates for GSM Evolution (EDGE). The digital data services in the early days were not well received due to the limited data capability of GPRS, so the requirements for a “full digital data service” and even “video telephony” were included in the next-generation development. As a result, “3G” offered wireless internet browsing and video streaming capabilities and, with added security protection, 3G-enabled mobile phones became not only the means to connect people but also personal assistants and personal entertainment centers. With email access and basic personal productivity software installed on mobile devices, enterprise users could make “Blackberries” or “Communicators” an extension of their office equipment, and this became an important sector of 3G services. Despite some bad publicity when 3G was first launched,³ 3G paved the way for broadband wireless with many “apps” available for all purposes; furthermore, the broad touch screen and improved hardware, such as the multiple cameras and large storage on smartphones like those shown in Fig. 3, enabled multimedia services on the go in the 4G era.
³ Such as the story about a Korean street vendor who was slapped with a huge bill for watching soap operas on her handset while attending to her business in the night market.
Fig. 4 Evolution path to and features of 6G Systems
This trend of seeking better wireless services continued through Long-Term Evolution (LTE), Voice over LTE (VoLTE), 4G or “LTE Advanced,” and then into 5G. The major offerings of “5G New Radio (NR)” are three key features: enhanced Mobile Broadband (eMBB), Ultra Reliable Low Latency Communications (URLLC), and massive Machine Type Communications (mMTC), as shown in the bottom part of Fig. 4. While eMBB successfully supports wireless broadband services and improves the user experience, the latter two, URLLC and mMTC, are still building up their user base. URLLC targets automotive and some healthcare services requiring low delay or latency in the tens of milliseconds with extremely high system reliability, while mMTC focuses on Internet of Things (IoT) services by providing coverage for a massive number of IoT devices with service continuity.
The wireless network continues to evolve, and after rolling out 5G systems, ITU-R continues working on International Mobile Telecommunications-2030 (IMT-2030). The path to 6G is planned as an evolution through 5G, 5G Advanced (5G+), and then 6G by 2030, as shown in Fig. 4. While each generation of wireless systems will have new or improved features associated with it, the most discussed 6G features in forums like NGA are “Compute Fabric,” “Immersive DWE,” “AI-Native Network,” and “Network Sensing.” Improved system capabilities (or features) will enable new applications and services to be provided in the 5G and 6G era; furthermore, new applications may prompt discussions of a more flexible service architecture or new technologies to support stringent requirements.
The three features of 5G systems (eMBB, URLLC, and mMTC) are the foundation that supports 5G services. It is expected that 5G+ will provide larger bandwidth and shorter delays to enhance 5G services. However, one area that may draw a lot of discussion is the use of 5G systems for massive IoT (mMTC) services.
While it is technically feasible, the question of cost effectiveness for 5G systems to support IoT services that do not need mobility to maintain service continuity, or to cover IoT devices that consume or generate only
small amounts of data intermittently, will need to be examined. Furthermore, use cases like “Ultra-Realistic Interactive Sport – Drone Racing” as one of the 6G use cases discussed in [13] may drive the 5G or 5G Advanced systems to an extreme. In order to support a superior immersive extended reality (XR) experience, more computing power, broader data bandwidth, very short delays, and high network reliability will be required.
3 6G Applications
What exactly should a “6G Application” be? Although different application developers and service providers may have different views, the discussion can be divided into two categories: (1) applications with needs that 5G and 5G Advanced systems cannot satisfy; and (2) services that require aggregating multiple service instances, resulting in technical needs beyond what 5G and 5G Advanced systems can support. These points are further elaborated below:
1. A 6G Application, in a narrow sense, must have very stringent technical requirements that 5G and 5G Advanced systems cannot meet, and it requires the advanced features offered by the new generation of wireless systems. For example, the “Immersive Interactive Sport – Drone Racing” listed in the Next G Applications Report [13] will call for, in its desired stage, a 360-degree view with 8 K to 16 K high-definition pictures and will need a huge bandwidth in the Gbps range for transmission. Furthermore, the application also calls for a very low round-trip operation delay to avoid adverse effects such as motion sickness in users. The requirements needed to support the use case cannot be met by 5G wireless systems.
2. An example of 6G services with aggregated 5G services is a game show, where multiple players, both local and remote, all require high-quality live video feeds and interactive capabilities. Although the technical needs of bringing each player into the show may be met with 5G capabilities, the 5G system may be overloaded when a large number of players participate and the latency requirement must be strictly maintained to ensure the fairness of the game.
Drone Racing: An Example of “6G Applications”
Drones have been used in many industry sectors, mostly to provide flexible video services and goods delivery.
After the first “World Drone Prix”⁴ was held in Dubai in 2016, with $250,000 for the grand prize winner (a 16-year-old British teenager), the visibility and popularity of drone racing were boosted tremendously. Professional drone racing leagues, such as the Drone Racing League,⁵ the MultiGP Drone Racing League,⁶ and The Drone Racing Federation⁷ (TDRF), have been seriously promoting drone racing globally. Although international competitions were cancelled due to the COVID-19 pandemic, technologies for interactive drone racing have been developed continuously. The first 5G Drone Prix in Taiwan was held in December 2021.⁸ In addition to the conventional race track in the stadium, digital objects were injected into the racing paths through the racers’ feeds to provide further challenges to the racers. Within the bounds of the 5G data transmission rate, the picture quality and the response time for flight control were good, but the proponents have been continuously working not only on improving the racing experience for the players, but also on raising the excitement level for audiences through live-casts.
Fig. 5 Ultra-realistic interactive sport – drone racing (Source: ITRI)
Figure 5 is a conceptual drawing that shows how drone racing combining physical drones, realistic displays, and live interactions is powered by a 6G network. The ultimate goal of the developers is to bring an immersive interactive drone racing experience to users, and the technical requirements are challenging for current and future networks. In order to provide a “live” flying experience to the players, a “360-degree picture” that maps the 3D pilot view needs to be generated. This is done by projecting, for example with the equirectangular projection, a 3D spherical panorama view onto a 2D picture with −180 degrees to 180 degrees on the horizontal axis and −90 degrees to 90 degrees on the vertical axis. As a result, the images will be presented to both eyes of the players at a rate of 240 frames per second (fps), at 64 pixels per degree (ppd), and with a 3 × 12-bit RGB coding resolution. The data rate needed for transmitting the picture will be
(360 × 64) × (180 × 64) × (3 × 12) × 240 × 2 ≈ 4.6 Tbit/s
⁴ https://www.youtube.com/watch?v=gIM4zKvsTIQ
⁵ http://www.thedroneracingleague.com/
⁶ https://www.multigp.com/
⁷ https://www.thedroneracingfederation.com/
⁸ See the video at https://www.youtube.com/watch?v=Zi-K3msNu3U
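The 360-degree picture estimate can be checked with a few lines of Python. This is a minimal sketch, not the method of the NGA report: the function names are hypothetical, the pixel-origin convention (top-left corner, with longitude −180 degrees at the left edge) is an assumption, and the bit depth is taken as 3 × 12 = 36 bits per pixel from the RGB coding resolution stated in the text.

```python
def equirect_size(ppd):
    """Width and height in pixels of a full 360 x 180 degree panorama."""
    return 360 * ppd, 180 * ppd


def equirect_pixel(lon_deg, lat_deg, ppd):
    """Map a viewing direction to pixel coordinates (assumed top-left origin).

    lon_deg runs from -180 to 180 (horizontal axis),
    lat_deg runs from -90 to 90 (vertical axis).
    """
    width, height = equirect_size(ppd)
    x = int((lon_deg + 180) * ppd) % width          # +180 wraps back to column 0
    y = min(int((90 - lat_deg) * ppd), height - 1)  # clamp the bottom row
    return x, y


def stream_rate_bps(ppd, bits_per_pixel, fps, eyes=2):
    """Uncompressed bit rate for one panorama per eye at the given frame rate."""
    width, height = equirect_size(ppd)
    return width * height * bits_per_pixel * fps * eyes


width, height = equirect_size(64)
rate = stream_rate_bps(ppd=64, bits_per_pixel=3 * 12, fps=240)
print(f"panorama: {width} x {height} pixels")
print(f"raw rate: {rate / 1e12:.2f} Tbit/s")
print("straight ahead maps to pixel", equirect_pixel(0, 0, 64))
```

Each factor scales the result linearly, so halving the pixel density to 32 ppd alone would cut the raw rate by a factor of four, which is the kind of quality reduction a bandwidth-limited deployment can trade on.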
Furthermore, in order to avoid motion sickness caused by the motion-to-photon (MTP)⁹ latency (the time needed for a user movement to be fully reflected on a display screen), the MTP latency should be less than 20 ms. While the latency is more manageable, the extremely high data transmission rate cannot be fully met by the 5G network. As a result, system developers today have to process images with lower picture quality to reduce the required transmission bandwidth. Nevertheless, it is foreseen that the level of excitement can be elevated with 6G technologies, attracting more fans and players worldwide.
When introducing a new generation of wireless services, it is always crucial to build a vision of what kinds of services will be needed. Many companies have published a variety of new or improved services in their “6G Vision” papers [6–11], and these ideas are eventually being considered and adopted in the ITU-R IMT-2030 process as input through NGA and NGMN. NGA published its “6G Applications and Use Cases” report [13] in May 2022, and NGMN published its report “6G Use Cases and Analysis” in February 2022 [12]. Both documents were used to build contributions to the draft “ITU-R IMT-2030 Vision” document.
3.1 6G Use Cases: Next G Alliance’s View
In North America, 6G applications will be seen both as brand new services and service experiences enabled by advanced features, such as much broader bandwidth and lower latency, and as a continuing evolution of 5G services with improved service quality. This is because existing services cannot be easily phased out; “backward compatibility” is a must not only for the network or system evolution, but also for services. Even though the networks and systems may be migrating from 5G to 6G, the existing 5G applications, services, and capabilities will need to continue to be supported. The voice, messaging, and multimedia services offered by 5G systems will be continued, although new means and technologies may be used to improve the quality, connectivity, security, privacy, and reliability of the services. As a result, a
⁹ http://www.chioka.in/what-is-motion-to-photon-latency/
Networked-Enabled Robotic and Autonomous Systems
• Use sensors, such as sonar, radar, light detection and ranging, GPS, cameras, and odometry, to detect surroundings
• Enable robotic and autonomous systems to interact with humans in natural ways through communications
Multi-sensory Extended Reality
• Leverage technologies including AR and VR with 3D and holographic display capabilities and high-definition pictures to create an immersive user-machine interface
• Add interactive capability to boost user sensation
Distributed Sensing and Communications
• Describe the future state of the world with ubiquitous connectivity through terrestrial and non-terrestrial networks to achieve remote data collection
• Include wide-area coverage and ultra-low-power operating modes
Personalized User Experience
• Leverage real-time, fully automated, and secure personalization of devices, networks, products, and services based on a user’s personal profile and context information to provide ubiquitous user experiences
• Include the user’s preferences, trends, and biometrics
Fig. 6 NGA 6G use case categories
list of initial requirements for 6G systems to support existing applications, services, and capabilities is given in the NGA Applications Report [13]:
1. Native voice services shall be available at the launch of 6G networks.
2. Interworking/handoff with 5G voice service shall be supported.
3. Messaging service(s) shall be available at the launch of 6G networks.
4. All national/regional regulatory requirements shall be met at the launch of 6G networks.
5. Customer security and privacy shall be designed into all applications, services, and capabilities.
In addition to maintaining service continuity from previous generations, NGA is also working to make 6G systems a means to provide an enhanced digital world experience, which is “Goal #2” of the Six Audacious Goals described in the “Next G Alliance Report: Roadmap to 6G” [17]. With this goal in mind, the Applications Working Group of NGA started looking into 6G applications and use cases through four categories:
1. Networked-Enabled Robotic and Autonomous Systems
2. Multi-Sensory Extended Reality
3. Distributed Sensing and Communications
4. Personalized User Experience
Brief descriptions of each category can be found in Fig. 6. There are 16 use cases identified and discussed in the NGA Applications Report, as shown in Table 1.
Table 1 Use cases and categories of 6G applications (Next G Alliance)
Networked-enabled robotic and autonomous systems:
• Online cooperative operation among a group of service robots
• Field robots for hazardous environments
Multi-sensory extended reality:
• Ultra-realistic interactive sport – drone racing
• Immersive gaming/entertainment
• Mixed reality co-design
• Mixed reality telepresence
• Immersive education with 6G
• High-speed wireless connection in aerial vehicle for entertainment service
Distributed sensing and communications:
• Remote data collection
• Untethered wearables and implants
• Eliminating the North American digital divide
• Public safety applications
• Synchronous data channels
• Healthcare – in-body networks
Personalized user experience:
• Personalized hotel experience
• Personalized shopping experience
Each use case in the report is introduced with a description, market trends related to the use case, potential research areas, and, for some, requirements that indicate the technical challenges. Notice that the requirements in the report were deliberately stated in a non-specific fashion, e.g., using “high” or “low” to describe the transmission bandwidth rather than “more than 4 Gbps,” as the use case may be introduced to indicate the planned future state. As such, some readers may ask, “Are these really 6G use cases? We already have some of these use cases today.” Indeed, some of the use cases can probably be implemented with 5G systems, but such an implementation may be only a subset of the true 6G application. Take “interactive drone racing” as an example. According to ITRI, the contributor of the use case, drone racing events are already operated over 5G networks, but the focus has been on providing views from the drone to help pilot it. The user experience and the excitement level of the audience may still be far from what a true 360-degree high-definition picture, combined with digitally injected obstacles added to the real race track, can bring.
This also brings up an important point for developing applications for future wireless systems: the service associated with a use case often needs to start with whatever technologies are available today, and then turn operational deficiencies into motivation for new features that help drive technology evolution. There may be brand new applications implemented with the magic provided by 6G technologies, but we may not be able to see them until we have defined what the actual 6G system is to be.
Nevertheless, NGA is continuously looking into the details of the use cases to help shape 6G requirements. According to the workplan of the Application working
M. Tseng
Enhanced Human Communication
• such as immersive experience, telepresence and multimodal interaction
Enhanced Machine Communication
• such as robotic communication and interaction
Enabling Services
• such as positioning, mapping, automatic protection, smart health, and manufacturing
Network Evolution
• such as Native Artificial Intelligence (AI) exposed as a service, energy efficiency, and coverage
Fig. 7 The four classes for 6G use cases by NGMN
group, two white papers will be published before the end of 2022 to further address the technical aspects of robotics and multi-sensory extended reality, and in-depth discussions of the remaining two categories will be conducted in 2023. The outcomes of these studies will lead to the development of the requirements for 6G systems.
3.2 6G Use Cases: NGMN’s View
To evaluate what 6G use cases may be, the NGMN Alliance started developing the “6G Use Cases and Analysis” in 2021 to collect and assess proposed use cases for 6G. The work was published in 2022 [12] and served as part of the contribution from NGMN to the ITU-R IMT-2030 Vision document. The proposed use cases were grouped into four classes according to their key characteristics and potential technology needs. The four classes are shown in Fig. 7. Rather than proposing specific use cases for 6G, NGMN proposed 14 “potential generic use cases” for discussion. One of the objectives of the discussion is to pave the way for the development of the “6G Requirements,” which is an ongoing activity in NGMN today. The proposed use cases and their respective classes are shown in Table 2. Instead of providing specific quantitative measures, the use cases are presented in a descriptive fashion. At the current stage, NGMN considers these use cases provisional, to be further explored and prioritized. Nevertheless, the direction of deriving the 6G requirements by examining the use cases is clearly set.
Next G Applications and Use Cases
Table 2 Use case classes and potential generic use cases (NGMN)
Enhanced human communications
• XR Immersive Holographic Telepresence Communication
• Multi-modal Communication for Teleoperation
• Intelligent Interaction: Sharing of Sensation, Skills & Thoughts
Enhanced machine communications
• Robot network fabric
• Interacting cobots (coordinated-robots)
Enabling services
• 3D hyper-accurate position, location, and tracking
• Interactive mapping, digital twins, and virtual worlds
• Automatic detection, protection, inspection
• Digital healthcare
• Smart industry
• Trusted composition of services
Network evolution
• Trusted native AI – AIaaS
• Coverage expansion
• Energy efficiency
Although some of the use cases, such as robotics and immersive XR, were discussed in both the NGMN and NGA reports, there is little overlap between the use cases from the two organizations, so the industry can benefit from the broader exploration that the two studies provide together. The use cases on native AI and on position, location, and tracking will help develop the requirements for including AI and network sensing in 6G systems. Furthermore, just like “eMBB, URLLC, and mMTC” for 5G systems, the industry needs a “theme” to help communicate the offerings of future wireless systems to users and the public through the various 6G use cases.
4 Theme of 6G Use Cases: Building an Immersive Digital World Experience
6G systems will support an immersive digital world experience for the future by expanding the use of digital twins to link the cyber and physical worlds. Although most discussions of 6G systems take the viewpoints of technology, spectrum, sustainability, societal and economic needs, and regulation and policy, the objective remains to help create an environment with all sorts of applications and services enabled by the new features and technologies of 6G systems. Wireless communication has transitioned from voice services to data services over the past 20 years. With all the developments in technologies like sensors and sensory networks, data analytics, AI and machine learning (AI/ML), and display technologies to support augmented reality or virtual reality (AR/VR), industrial digital twins have been applied to manufacturing, remote maintenance,
construction, smart city management, and even into social media like the Metaverse over the past few years. Originally, the digital twin was just a digital representation of a physical entity; later on, the industry recognized that the digital twin can be configured as an emulator while building a physical system. How can an industrial digital twin turn a software simulation into a physical system? The work starts by constructing a software simulation of the target system, and gradually connecting each software module of the simulator with the interfaces of physical modules associated with the module. Then the collected real data from the physical module can be fed into the simulator, which literally turns the simulator into an emulator. The physical system can then be built after all the software modules are replaced by physical modules. Moreover, this software simulator is the digital twin of the physical system and it can be expanded as the means of controlling the operation, administration, and management of the physical system. The popularity and extensiveness of digital twins can be seen by merely examining the eight different working groups, ranging from agriculture, aerospace, manufacturing, to healthcare, in the Digital Twin Consortium (DTC, www.digitaltwinconsortium.org). More details about building industrial digital twins with real examples can be found in [18]. The immersive digital world experience in the future can be achieved by connecting the physical systems through their digital/cyber twins operated in 6G systems so that the necessary services and applications can work together to satisfy user needs. DoCoMo introduced their view on the “Cyber-Physical Fusion” [1] by combining sensory network, AI, network compute, and analytics to enhance the performance of the IoT systems. 
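The simulator-to-emulator progression described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Module`, `Twin`, `add_simulated`, and `attach_physical`, as well as the example readings, are hypothetical and do not come from the chapter or from any digital twin toolkit.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Module:
    name: str
    read: Callable[[], float]  # returns the module's current output
    physical: bool = False     # False: simulated model, True: real hardware interface


class Twin:
    """A software twin that starts as a pure simulation (hypothetical sketch)."""

    def __init__(self) -> None:
        self.modules: Dict[str, Module] = {}

    def add_simulated(self, name: str, model: Callable[[], float]) -> None:
        self.modules[name] = Module(name, model, physical=False)

    def attach_physical(self, name: str, sensor: Callable[[], float]) -> None:
        # Swap the simulated data source for the physical module's interface.
        if name not in self.modules:
            raise KeyError(f"unknown module: {name}")
        self.modules[name] = Module(name, sensor, physical=True)

    def stage(self) -> str:
        flags = [m.physical for m in self.modules.values()]
        if not any(flags):
            return "simulator"
        return "physical system" if all(flags) else "emulator"

    def snapshot(self) -> Dict[str, float]:
        # The twin keeps reporting state at every stage, which is what lets it
        # later serve operation, administration, and management.
        return {name: m.read() for name, m in self.modules.items()}


twin = Twin()
twin.add_simulated("conveyor_speed", lambda: 1.0)  # modelled value
twin.add_simulated("motor_temp", lambda: 40.0)     # modelled value
stages = [twin.stage()]

twin.attach_physical("motor_temp", lambda: 42.5)   # stand-in for a real sensor read
stages.append(twin.stage())

twin.attach_physical("conveyor_speed", lambda: 0.9)
stages.append(twin.stage())
print(stages)
```

The design choice here is that every module exposes the same `read` interface whether it is simulated or physical, so modules can be replaced one at a time without touching the rest of the twin.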
The immersive user experience will be served by the AI-native intelligent network, which allows users to enjoy interactive actions without knowing the capabilities of the supporting network and cyber twins behind the scenes, as shown in Fig. 8. Cyber twins of physical entities allow the services and applications to interact through the intelligent network powered by 6G systems to form an immersive digital world experience for users. Table 3 offers a few examples of how the cyber-physical world can be represented to help develop 6G applications. The 6G network is assumed to provide the access and connectivity to support the applications.
5 To 6G and Beyond
On the quest to build a digital world experience while migrating from 5G toward 6G, we should leverage the technologies available and explore all sorts of use cases to build the services that will best fit the future. Figure 9 provides a view of how the wireless industry can migrate from 5G to 6G. On the other hand, regardless of how users may feel about the thrill of entering a new era with state-of-the-art technologies, a 6G service will still be a business, and its success will rely on how well the users embrace it. The wireless
Table 3 Examples of the next-generation services with virtual and physical relations
Application: Social network, conferencing, trade shows, offices
• Virtual entities: Avatars, meeting rooms, conference halls, one-on-one interactions
• Physical entities: Computer communication systems, holographic and 3D high-definition displays, haptic sensors and actuators
• Information needed: Video feeds, voice feeds, data file sharing, avatar movements
Application: Interactive virtual tourism
• Virtual entities: Interactive sightseeing with live scenery
• Physical entities: 360 view HD display (XR); data collector at scenic spots
• Information needed: Data collection for the sights can be updated real time or recorded with short delays; interaction is “real time”
Application: Remote mall (retail)
• Virtual entities: Product showcasing acts; virtual fitting rooms; 3D models
• Physical entities: Studios with real product display; local device at user-end
• Information needed: Product information, product matching results, business transactions
Application: Interactive classroom
• Virtual entities: Live experiments
• Physical entities: HD, 3D, or holographic displays, gesture control interface
• Information needed: Library of curricula, gesture signals
Application: Pet care
• Virtual entities: Play room with pets
• Physical entities: 3D high-definition displays, haptic sensors and actuators
• Information needed: Live images, feedback signal of touch
Application: Automotive
• Virtual entities: Autonomous driving mirror system
• Physical entities: Cars, roadside stations (ITS)
• Information needed: Surrounding obstacle data; in-car data (probe data); roadside information (ITS); routing planning and update
Application: UAVs in a factory
• Virtual entities: Unmanned automated vehicle mirror system
• Physical entities: Carts for production line or warehouse
• Information needed: Payload info; collision prevention; vehicle status (energy, control)
Application: Robots in a factory
• Virtual entities: Status of the robot
• Physical entities: Robot
• Information needed: Detailed robot status according to the planned tasks
Application: Interactive sport – drone racing
• Virtual entities: Live racing experience (pilot view)
• Physical entities: Drones, 5G/6G network, XR displays, haptic sensors and actuators
• Information needed: Drone live feeds, position and location, video feed to the player and the audience
Application: Remote agriculture with drones
• Virtual entities: Agriculture
• Physical entities: Video cameras, sprayers, pruners, environment sensors
• Information needed: Drone status, video feeds, status of the payload, control of the accessory tools, environmental data
Application: Supply chain logistics
• Virtual entities: View and status of the payload
• Physical entities: Freight vehicles (trucks, trains, ships, and airplanes), drones
• Information needed: Information on the payload while shipping
Application: Interactive multi-site gaming/shows
• Virtual entities: Virtual stage
• Physical entities: Studio with physical stage
• Information needed: Game-related information, stage, and lighting control
Application: Home assistant/personal healthcare assistant
• Virtual entities: Virtual home
• Physical entities: Robots, 3D or holographic displays, in-body net
• Information needed: Home data including data related to the physical condition of the user
Application: Public safety
• Virtual entities: Hazardous situation replica
• Physical entities: Mission critical communication systems
• Information needed: Video, audio feed of the situation, environmental data
Fig. 8 Cyber twin – the basis to support immersive digital world experience (labels in the figure: Cyber Twins, Drone, Energy, Smart Factory, Robots, Agriculture, Automotive, Healthcare, Smart Home, School/Labs, Entertainment, Mall, Personal Devices, Offices)
Fig. 9 Pathway to migrate from 5G services to 6G services
service took off in the 2G era only when the users found they could be conveniently connected anytime, anywhere with wireless services in an affordable manner. The same principle holds when the industry is migrating toward 6G.
As the industry progresses, there will be many exciting announcements of innovations and new technical capabilities while the industry moves toward 5G-Advanced and 6G services. Nevertheless, at the end of the day, operators and vendors still need to examine the services and applications they offer and see whether the return on investment (RoI) meets the expectations of their management, boards, and investors. With business and operations in mind, the following are practical points for the industry to consider while moving toward an immersive digital world experience:
1. 5G is the foundation for 6G. From the viewpoints of economics and technology maturity, 6G service will not be a step jump from 5G. The experience gained from implementing the key features, eMBB, URLLC, and mMTC, in the 5G network will make IoT services more mature, and the lessons learned while deploying those features can certainly be extended into 6G services.
2. Tele-presence, tele-sensing, and tele-operation. The COVID-19 pandemic that began in 2019 changed the behavior of wireless users in many ways. The most significant change is that, due to travel bans and working from home, business users are getting used to conducting business remotely. Furthermore, fear of contracting the virus through gatherings prompted students to attend classes online, while homemakers shopped for food and household needs online with curb-side pickup or home delivery. This trend can benefit 6G applications greatly, as resilient and reliable remote operation is the best way to demonstrate the capability and trustworthiness of a wireless system. By leveraging technologies such as network sensing, edge computing, and the coverage provided by non-terrestrial networks (NTN), a 6G system can make tele-operation the norm for many 6G services.
3. Immersive, interactive, and intelligent. An intelligent AI-native network will be the basis to support 6G services. In addition to a high-quality user-machine interface, a short network response time that enables an effective interactive environment will be a key factor in boosting the user experience. 6G applications should take advantage of the success of mMTC and place sensors and sensory networks to free users from accessing the system through handheld devices. The intelligence provided by 6G systems will also improve operational efficiency through machine learning capability.
4. ESG and sustainability. More investors are using environmental, social, and governance (ESG) criteria to evaluate a company’s product risk and growth potential. However, resource conservation should be the obligation of a responsible corporation to contribute to global sustainability.
5. Non-intrusive automatic subscription management. Trustworthiness between the user and the service will be key for a robust business relationship, and a non-intrusive means to help protect a user’s identity and privacy automatically will ease the adoption of a service. Automated subscription management can ease operation for users by adding elements like facial recognition, gait analysis and other biometric data, or behavioral and geolocation history to the normal security protection scheme.
6. End-to-end service view. A successful “proof-of-concept” (PoC) is just the beginning of a successful service. Business aspects and operational logistics, including supply chain and regulatory compliance, must also be carefully planned in an end-to-end manner to demonstrate the value of the service to investors and customers. One good way to assess the viability of a service is to check the cash flow in every stage of the end-to-end service.
New capabilities, such as the AI-native network, network sensing, and ultra-high data rates, make many of the applications and services needed to build an immersive digital world experience feasible in 6G systems. As discussed earlier, although there will be brand new services and applications for 6G, the industry is not likely to make a “step jump” to 6G systems, as it is exploiting the full capability of the current system today and will continue to do so in the near future. According to the roadmap of 3GPP, before we enter 6G in the 2030s, there will be multiple releases, each with feature improvements; meanwhile, current services will be improved and the user experience will be enhanced. In a way, we may enjoy part of the immersive digital world experience portrayed in this chapter sooner than we thought.
References
1. NTT DoCoMo, 5G Evolution and 6G Ver. 4.0, January 2022. https://www.docomo.ne.jp/english/binary/pdf/corporate/technology/whitepaper_6g/DOCOMO_6G_White_PaperEN_v4.0.pdf
2. University of Surrey, 6G Wireless: A New Strategic Vision, November 2020. https://www.surrey.ac.uk/sites/default/files/2020-11/6g-wireless-a-new-strategic-vision-paper.pdf
3. Hexa-X, 6G Vision, Use Cases and Key Societal Values, February 2021. https://hexa-x.eu/wp-content/uploads/2021/02/Hexa-X_D1.1.pdf
4. Samsung, The Next Hyper Connected Experience for All, July 2020
5. Finnish 6G Flagship, White Paper on 6G Drivers and the UN SDGs, April 2020
6. W. Saad, M. Bennis, and M. Chen, A Vision of 6G Wireless Systems: Applications, Trends, Technologies, and Open Research Problems, IEEE Network, May/June 2020
7. G. Liu et al., Vision and requirements of mobile network beyond 2030, China Communications, September 2020. https://ieeexplore.ieee.org/document/9205980
8. 5G IA, European Vision for the 6G Network Ecosystem, June 2021. https://5g-ppp.eu/wp-content/uploads/2021/06/WhitePaper-6G-Europe.pdf
9. KDDI, Beyond 5G/6G Ver. 2.0, October 2021. https://www.kddi-research.jp/sites/default/files/kddi_whitepaper_en/pdf/KDDI_B5G6G_WhitePaperEN_2.0.1.pdf
10. Beyond 5G Promotion Consortium, Beyond 5G White Paper ~Message to the 2030s~ Ver. 1.0, March 18, 2022. https://b5g.jp/doc/whitepaper_en_1-0.pdf
11. NGMN, 6G Drivers and Vision, April 2021. https://www.ngmn.org/wp-content/uploads/NGMN-6G-Drivers-and-Vision-V1.0_final.pdf
12. NGMN, 6G Use Cases and Analysis, February 2022. https://www.ngmn.org/wp-content/uploads/NGMN-6G-Use-Cases-and-Analysis.pdf
13. Next G Alliance, Next G Alliance Report: 6G Applications and Use Cases, May 2022. https://nextgalliance.org/white_papers/6g-applications-and-use-cases/
14. United Nations, 17 Goals to Transform Our World, Sustainable Development Goals, 2022. https://www.un.org/sustainabledevelopment/
15. Next G Alliance, Green G: The Path Toward Sustainable 6G, January 2022. https://nextgalliance.org/white_papers/green-g-the-path-towards-sustainable-6g/
16. Thinknet 6G (www.thinknet-6g.com), Six Questions about 6G, February 2022. https://www.bayern-innovativ.de/en/networks-und-thinknet/digitization-overview/thinknet-6g/page/whitepaper-six-questions-about-6g
17. Next G Alliance, Next G Alliance Report: Roadmap to 6G, February 2022. https://nextgalliance.org/white_papers/roadmap-to-6g/
18. Shyam Varan Nath and Pieter van Schalkwyk, Building Industrial Digital Twins, Packt Publishing, September 2021, ISBN 978-1-83921-907-8
Bridging the Digital Divide
Maurilio Matracia, Aniq Ur Rahman, Ruibo Wang, Mustafa A. Kishk, and Mohamed-Slim Alouini
1 Introduction
The most famous definition of the digital divide comes from the Organisation for Economic Co-operation and Development (OECD), where it is described as “the gap between individuals, households, businesses and geographic areas at different socio-economic levels with regard both to their opportunities to access information and communication technologies (ICTs) and to their use of the Internet for a wide variety of activities” [59]. Although the term is already more than two decades old, the corresponding problem is still quite far from being solved: according to the estimates from the International Telecommunication Union (ITU), until last year nearly three billion people completely lacked broadband connectivity, and the share of Internet users in rural areas was equal to half of its urban counterpart [37]. Furthermore, several low- and middle-income countries (LMICs) are still far from ensuring comprehensive 3G coverage to their citizens, which also implies a strong digital divide between more and less developed countries [55]. Despite these numbers, however, connectivity is still often taken for granted by social media and governments, which sometimes neglect the fact that a significant percentage of the population does not have access (because of geographical location, limited digital skills, economic instability, or even personal choice) to important services and information. In this context, the United Nations (UN) set seven important advocacy targets to be achieved by 2025, including one referred to
M. Matracia (O) · A. U. Rahman · R. Wang · M.-S. Alouini King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia e-mail: [email protected]; [email protected]; [email protected]; [email protected] M. A. Kishk Department of Electronic Engineering, National University of Ireland, Maynooth, Ireland e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_5
M. Matracia et al.
as “Get Everyone Online,” according to which broadband-Internet user penetration should reach 75% worldwide, 65% in LMICs, and 35% in least developed countries (LDCs) [18]. The digital divide can essentially refer to three aspects: (i) the physical and material access, which mainly relates to the diffusion of connectivity by implementing new policies as well as the development and implementation of affordable and effective technologies or platforms, (ii) the digital skills required to make proper use of online services and information, and (iii) the actual usage (in terms of frequency and diversity) of said skills [44]. In this chapter, we will focus on the first aspect. The main challenges faced when trying to ensure physical and material access to everyone relate to both economic and technological limitations. In particular, both the capital expenditures (CapEx) and the operating expenditures (OpEx) of the required telecommunications infrastructures are often prohibitive for the network providers, and similar considerations apply to the required power infrastructures. Finally, important issues deriving from the limited and costly spectrum resources further complicate the problem of global connectivity. Although it is hard to quantify the benefits of connectivity for a generic community, it is quite evident that bridging the digital divide is an important step toward sustainable development. For example, as the job market and the economy of products evolve, it is expected that online learning tools will be even more crucial when it comes to acquiring knowledge and developing skills. In this context, children growing up without easy access to these tools would be less likely to become competitive in their future careers. 
Following the same lines, the recent COVID-19 pandemic highlighted the necessity of enabling remote health services (especially in far-flung areas, where typically there are no advanced medical facilities) while making exurban areas more appealing to most of the workers with concrete opportunities of operating remotely. Nonetheless, remote areas usually lack important resources. For example, in order to establish connectivity in remote regions, it is essential to have a reliable power source. Unfortunately, most of sub-Saharan Africa, which is underconnected, also suffers from the lack of adequate power resources [22]. The same argument can be used for many remote areas all over the world, which are often overlooked by governments and energy providers. Integrating renewable energy sources into power systems can prove to be sustainable in the long run, thereby supporting connectivity in far-flung regions that cannot be easily installed with conventional infrastructure. Another resource necessary for wireless communications is spectrum, an invisible yet precious resource which dictates the users’ quality of service (QoS). The lower frequencies in sub-GHz bands are licensed (because of their better range) and hence also quite expensive. However, strategies to reuse them can support ubiquitous connectivity. To further enhance coverage, innovative mechanisms of spectrum sharing are required. The market dynamics can also play a pivotal role, whereby spectrum licensing can be carried out in a flexible manner among multiple stakeholders. Moreover, community-driven micro operators can exploit the unlicensed spectrum for deploying Wi-Fi networks to broaden coverage zones.
On the institutional side, governments can step in and subsidize spectrum license pricing for mobile network operators (MNOs) in order to incentivize network deployment in remote and rural zones [61]. Finally, the majority of governments have the potential to facilitate infrastructure and spectrum sharing among MNOs to reap the benefits of competition. The remainder of this chapter includes two core sections, entitled “Connectivity” and “Affordability.” While the first core section discusses the most interesting technological innovations (categorized as alternative backhaul paradigms; ground-, air-, and space-based solutions; and their integration) capable of bridging the digital divide, the second core section focuses on relevant economic aspects related to such solutions and the digital divide in general. In particular, we will first discuss ground connectivity by (i) explaining the problem of optimal base station (BS) deployment, (ii) highlighting four important projects that tackle issues related to either the physical infrastructure or wave propagation, and (iii) recalling the potential of endowing wind turbine (WT) towers with BS functionalities. As for non-terrestrial networks, we will go through the various types of aerial and space platforms (respectively drones, helikites, balloons, gliders, and airships in Sect. 2.3 and low, medium, and geostationary Earth orbit satellites, along with their gateways, in Sect. 2.4), before discussing 3D integrated networks and spectrum coexistence. Then, in the subsequent section, the affordability of global connectivity will be discussed with respect to aerial and space solutions, television white space (TVWS), and universal broadband and electrification. Finally, Sect. 4 will summarize the content of the chapter and highlight some possible research directions for the coming years.
2 Connectivity
2.1 Alternative Backhaul Paradigms
2.1.1 E/V Bands for Backhaul
Due to the limited throughput achievable with traditional microwave bands, 5G and beyond applications are expected to rely on millimeter-wave (mmWave) bands such as the so-called V- and E-bands, which cover the 57–71 GHz and 71–86 GHz spectrum ranges, respectively. For example, band and carrier aggregation (BCA) is a technique that combines the E-band with lower bands in order to ensure high reliability and extend point-to-point links up to ten kilometers; while this is an attractive solution for bridging the digital divide, it should be noted that the respective spectrum is not unlicensed [27]. The V-band, on the other hand, is typically unlicensed and can also be used for non-telecom applications. Nonetheless, microwaves (and thus the waves corresponding to both these solutions) suffer from attenuation due to rain, oxygen, water vapor, and fog [29].
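To give a feel for why distance is a concern at these frequencies, the following minimal sketch evaluates the standard free-space path loss formula, FSPL(dB) = 20 log10(d_km) + 20 log10(f_GHz) + 92.45, for a ten-kilometer link. The 18 GHz comparison point is an assumption standing in for a traditional microwave band, the V- and E-band values are simply the centers of the ranges quoted above, and the rain, oxygen, water vapor, and fog attenuation mentioned in the text is deliberately left out.

```python
import math


def fspl_db(distance_km: float, freq_ghz: float) -> float:
    """Free-space path loss in dB (distance in km, frequency in GHz)."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_ghz) + 92.45


# Hypothetical comparison points; only the band edges come from the text.
bands_ghz = {
    "microwave (assumed 18 GHz)": 18.0,
    "V-band center (57-71 GHz)": 64.0,
    "E-band center (71-86 GHz)": 78.5,
}
link_km = 10.0  # the point-to-point distance quoted for BCA

losses_db = {name: fspl_db(link_km, f) for name, f in bands_ghz.items()}
for name, loss in losses_db.items():
    print(f"{name}: {loss:.1f} dB free-space loss over {link_km:.0f} km")
```

Free-space loss alone grows by about 6 dB for every doubling of either distance or frequency, so the weather-related attenuation comes on top of an already larger baseline loss than in lower bands.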
2.1.2 Tarana’s G1
Gigabit 1 (G1), developed by the company Tarana, is a next-generation fixed wireless access (ngFWA) platform that exploits unlicensed spectrum to deliver fiber-class broadband over non-line-of-sight (NLoS) links. In particular, the platform consists of compact base nodes (BNs) mounted on cell towers that communicate with up to 256 remote nodes (RNs) per BN; the RNs are powered over an Ethernet cable. The company raised as much as 170 million USD of funding in March 2022, and the proposed solution is three times cheaper than 5G for fixed wireless access (FWA). G1 has already been deployed in the USA (more precisely in Joplin, Missouri) and achieved impressive results: a maximum downlink rate of up to 438 Mbps with an RN signal-to-noise ratio (SNR) of 11 dB. Moreover, simulation results have shown that G1 can cover over sixty thousand households with at least 200 Mbps when operating at 3 GHz in a suburban environment [83].
2.2 Terrestrial Solutions
2.2.1 Optimal Terrestrial Network Deployment
The Internet point of presence (iPoP) is not widely available in rural areas in low-income countries, due to the fact that the deployment of 4G services is cost prohibitive. Due to the lack of good quality Internet around villages, many inhabitants tend to migrate to nearby cities, which further weakens the motivation for infrastructure deployment. The networks in regions with sparse population should be designed while keeping scalability in mind. For example, population clusters can be interconnected through means of a Wi-Fi mesh network, drawing Internet supply from a nearby iPoP either wirelessly or through wireline modes based on the traffic demands of the population. Furthermore, the placement of the mesh nodes should be done to maximize the coverage of the population. Authors in [66] developed an algorithm to deploy access points based on the population’s spatial distribution through adaptive clustering. In a similar spirit, authors in [23] studied the problem of BS deployment in order to maximize coverage with the help of UAVs.
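As a rough illustration of population-driven placement, the sketch below clusters household coordinates with plain k-means and treats the centroids as candidate access point sites. This is a swapped-in, simplified technique and not the adaptive clustering algorithm of [66]; the coordinates, the number of sites, and the 0.5 km coverage radius are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def kmeans(points: List[Point], k: int, iters: int = 50) -> List[Point]:
    """Plain Lloyd's k-means; returns k centroids (candidate access point sites)."""
    centroids = list(points[:k])  # deterministic init, for illustration only
    for _ in range(iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        updated = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]  # keep an empty cluster's centroid in place
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def coverage(points: List[Point], sites: List[Point], radius_km: float) -> float:
    """Fraction of points within radius_km of at least one site."""
    covered = sum(
        1 for p in points if any(math.dist(p, s) <= radius_km for s in sites)
    )
    return covered / len(points)


# Hypothetical household coordinates (km) forming two hamlets.
households = [
    (0.0, 0.0), (10.0, 10.0), (0.2, 0.1),
    (0.1, 0.3), (10.2, 9.9), (9.9, 10.1),
]
sites = kmeans(households, k=2)
share = coverage(households, sites, radius_km=0.5)
print(sites, share)
```

A realistic planner would also weight points by traffic demand and account for the distance to the nearest iPoP, which this sketch ignores.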
2.2.2 Meta Connectivity’s Bombyx
As already mentioned, connecting the unconnected requires reliable power grids. However, a considerable percentage of rural areas is already served by overhead power lines, usually operating at either medium or low voltages. In 2020, Facebook Connectivity (now called Meta Connectivity) came up with an effective solution that exploits existing overhead power lines: a robot, called Bombyx (meaning silkworm
Fig. 1 An example of existing wind turbines nearby a village [74]
in the Latin language), that is capable of crawling along aerial lines and wrapping optical fiber around them, so that it is possible to avoid the costs of digging. The main advantages of this solution are represented by its (i) autonomy, since it does not require any human assistance; (ii) speed, since it makes fiber deployment multiple times faster compared to manual deployment; and (iii) noninvasiveness, since it can operate even without interrupting power lines’ services. Other important characteristics of this technology are the use of Kevlar braidings (which are stronger and more resistant to high temperatures) and the number of fibers per cable, which has been reduced from 96 to 24 in order to compensate for the weight of the robot itself [45].
2.2.3 Wind Turbine Towers
Almost two decades ago, the idea of providing communications functionality to WTs was proposed for the first time in [3]. In rural and remote areas, cellular coverage could be significantly improved by exploiting large WTs (see Fig. 1) and efficiently combining the power and communications infrastructures in just one tower (which is even more robust and taller than a conventional cell tower). The main circumstances where this solution becomes convenient are the presence of large WTs (either existing or prospective towers) or of government subsidies (e.g., in the form of financial support, tax reductions, or spectral and/or land resources). Then, compared to conventional solutions, proper
deployment of wind turbine-mounted BSs (WTBSs) could (i) improve the continuity of service, because WTs are usually connected to reliable power grids; (ii) lower the overall infrastructure costs and complexity, since the BS is incorporated into the WT tower; and (iii) improve network capacity and coverage, since large WTs’ payload capability and altitude allow installing multiple antennas and reaching wide coverage radii, respectively. The effectiveness of this solution has been evaluated in [48] by simulating the average data rate, based on real-world data sets about the population density and the planar distributions of existing cell towers and WTs. The results showed that, when the locations are chosen strategically, even a moderate deployment of WTBSs can lead to a considerable increase (e.g., as high as 16.7% in a suburban area in southern Argentina) in the average data rate experienced by the typical exurban user.
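The link between antenna altitude and coverage radius can be illustrated with the geometric radio horizon, d = sqrt(2 k R h), where R is the Earth radius and k the effective-Earth-radius factor. The sketch below is only an upper bound on line-of-sight range under assumed values: k = 4/3 is the usual standard-refraction assumption, and the 30 m and 100 m antenna heights are hypothetical. It is not the simulation methodology of [48], and actual coverage also depends on terrain, link budget, and interference.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
K_FACTOR = 4.0 / 3.0      # assumed effective-Earth-radius factor


def horizon_km(height_m: float) -> float:
    """Distance to the radio horizon for an antenna height_m above flat terrain."""
    effective_radius_km = K_FACTOR * EARTH_RADIUS_KM
    return math.sqrt(2.0 * effective_radius_km * height_m / 1000.0)


def max_los_range_km(tower_m: float, user_m: float = 1.5) -> float:
    """Geometric upper bound on line-of-sight range between tower and user."""
    return horizon_km(tower_m) + horizon_km(user_m)


# Hypothetical antenna heights, chosen only for illustration.
sites_m = {"conventional cell tower": 30.0, "large wind turbine tower": 100.0}
ranges_km = {name: max_los_range_km(h) for name, h in sites_m.items()}
for name, r in ranges_km.items():
    print(f"{name}: up to about {r:.1f} km of geometric line of sight")
```

Because the horizon grows with the square root of the height, roughly tripling the antenna height extends the geometric range by less than a factor of two, which is one reason the strategic choice of locations matters.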
2.2.4 Google X’s Project Taara
The first project we want to discuss is Taara by Google X: the idea is to use free-space optics (FSO) to provide high capacity with low cost and minimal infrastructure (possibly existing infrastructure such as poles, towers, or rooftops). This solution is particularly suitable for remote and hard-to-reach areas: for example, the Congo River separates two capitals (Brazzaville and Kinshasa) that are less than five kilometers apart, making wired connectivity extremely expensive. Project Taara has therefore been deployed to solve this problem: almost 700 TB were successfully transmitted in less than 3 weeks, with an excellent availability of 99.9% [28]. Currently, this technology is able to deliver up to 20 Gbps over a distance of 20 km in case of clear line-of-sight (LoS) transmission, but its effectiveness can easily drop in the presence of fog or harsher weather conditions [41].
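The deployment figures quoted above can be turned into an average rate and a downtime budget with a short back-of-the-envelope calculation. The sketch assumes decimal terabytes and a period of exactly 21 days; since the text says “almost 700 TB” in “less than 3 weeks,” the results are only indicative.

```python
# Assumptions: 1 TB = 1e12 bytes, and the period is taken as exactly 21 days.
TB_BYTES = 1e12
data_bytes = 700 * TB_BYTES
period_days = 21
period_seconds = period_days * 24 * 3600

# Average sustained rate implied by the transferred volume.
avg_rate_gbps = data_bytes * 8 / period_seconds / 1e9

# Downtime allowed by 99.9% availability over the same period.
availability = 0.999
downtime_minutes = (1 - availability) * period_days * 24 * 60

print(f"average rate: about {avg_rate_gbps:.2f} Gbps")
print(f"downtime budget: about {downtime_minutes:.1f} minutes")
```

Under these assumptions, the average carried traffic is a few gigabits per second, well below the 20 Gbps peak quoted for clear line-of-sight conditions, and the availability figure corresponds to roughly half an hour of outage over the period.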
2.2.5
Meta Connectivity’s Terragraph
Similarly to project Taara, Terragraph is an extremely compact technology that can be easily installed on any existing infrastructure. However, this solution is based on mmWave communications, and thus it is more appropriate for covering distances that do not exceed a few hundred meters (which is crucial, e.g., in suburban environments lacking optical fiber links). In particular, Terragraph is a multinode wireless software-defined network (SDN) that leverages the 60 GHz unlicensed spectrum. An interesting experimental work that makes use of this technology can be found in [26], where path loss, received power, SNR, and root mean square (RMS) delay spread were measured for installations on both rooftops and street lamp poles.
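One way to see why 60 GHz links are limited to short hops is the free-space path loss (FSPL). The sketch below uses the standard Friis expression; it is not the measurement model of [26], and it ignores oxygen absorption and rain, both of which add further loss near 60 GHz. The distances and the 6 GHz comparison frequency are assumptions chosen for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fspl_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB (Friis formula, isotropic antennas)."""
    return 20 * math.log10(4 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT)


if __name__ == "__main__":
    # Hypothetical hop lengths, from a short street-level hop up to 1 km.
    for d in (100, 300, 1000):
        print(f"{d:>5} m: {fspl_db(d, 60e9):6.1f} dB at 60 GHz, "
              f"{fspl_db(d, 6e9):6.1f} dB at 6 GHz")
```

At equal distance, moving from 6 GHz to 60 GHz costs 20 dB of additional free-space loss, which has to be recovered through antenna gain or shorter hops.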
Bridging the Digital Divide
2.2.6
Meta Connectivity’s SuperCell
In 2020, Facebook Connectivity presented SuperCell: a very tall guyed tower (between 180 and 250 m in height) equipped with commercial off-the-shelf radios to provide high capacity by means of highly sectorized antennas [17]. The main advantage of this solution is the low CapEx deriving from the simplification of the cellular network infrastructure; in other words, since one SuperCell can be seen as equivalent to tens of cell towers, mobile operators would have to provide fewer towers and backhaul links [86]. On the other hand, the main issues deriving from the size of this tower can be summarized as (i) fewer, and therefore more critical, points of failure compared to the conventional network architecture, (ii) an extremely limited relocation flexibility, and (iii) the need for bulky ropes to ensure the stability of the structure. Nonetheless, this product has already been successfully deployed in at least two sites in the USA, specifically in New Mexico and Mississippi, for long-term evolution (LTE) field tests: in the first case, a coverage radius of roughly 40 km was achieved by deploying SuperCell on top of a hill, while in the second case, a 180 m tall SuperCell with 36 sectors provided a coverage area roughly 30 times larger than the one provided by a conventional cell tower.
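As a rough geometric sanity check on these coverage figures, the sketch below computes the radio horizon of a tall tower over smooth terrain and the radius ratio implied by a given area ratio. It relies on the common 4/3-Earth-radius approximation and on circular cells, both of which are assumptions of this illustration and not part of the field tests in [17, 86].

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius
K_FACTOR = 4.0 / 3.0          # standard-atmosphere refraction factor (assumption)


def radio_horizon_km(antenna_height_m: float) -> float:
    """Distance to the radio horizon over smooth terrain, in km."""
    return math.sqrt(2.0 * K_FACTOR * EARTH_RADIUS_M * antenna_height_m) / 1000.0


def radius_ratio_from_area_ratio(area_ratio: float) -> float:
    """Coverage-radius ratio implied by a coverage-area ratio (circular cells)."""
    return math.sqrt(area_ratio)


if __name__ == "__main__":
    for h in (180, 250):
        print(f"tower height {h} m -> horizon of about {radio_horizon_km(h):.1f} km")
    print(f"30x area -> {radius_ratio_from_area_ratio(30):.2f}x radius")
```

Under these assumptions the horizon of a 180–250 m tower lies beyond 50 km, so the reported 40 km radius is geometrically plausible, and a 30-fold area gain corresponds to a radius about 5.5 times larger.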
2.3 Aerial Platforms
As of today, most of the interest in aerial networks relates to the use of unmanned aerial BSs (ABSs), which consist of unmanned aerial vehicles (UAVs) carrying and feeding some BS equipment. In general, UAVs can be broadly classified depending on either their altitude (ultralow, low, medium, high, or ultrahigh), type (fixed or rotary wing), or connection (tethered or untethered). Numerous studies about aerial platforms for wireless communications have been published in the last few years: for example, an interesting vision for future HAPs-enabled networks has been proposed in [38], while comprehensive overviews of drone-mounted BSs and tethered platforms in general can be found in [56] and [13], respectively. For technical insights on the use of ABSs to enhance coverage in rural areas, the reader can refer to the analytical work presented in [47]. In what follows, we give an overview of the main types of UAVs for wireless communications.
2.3.1
Drones
These represent the most famous type of aerial platform, to the point that the terms “UAV” and “drone” are often used interchangeably [56]. Drones can be either tethered or untethered, although the second case does not usually need to be
Fig. 2 A drone flying above a rural area [90]
specified. In both cases, they commonly operate within the ultralow-altitude range (50–150 m). The main advantages of tethered drones (compared to their untethered counterparts) are the higher payload capability, endurance, and often also a better backhaul link (if the tether is connected to the core network). On the other hand, it is evident that the tether also limits the drone’s mobility and relocation flexibility, as extensively discussed in [4]. Low-altitude rotary-wing untethered drones such as quadcopters (an example is shown in Fig. 2) are the most common example of ABS in the literature, mostly because of their exceptional maneuverability and capability of hovering over the same spot. Other advantages of untethered drones are their low cost and weight, although they have a very limited endurance (which in most practical cases does not exceed 1 hour, unless a very small payload is carried or promising technologies such as laser power beaming [40] become mature and get implemented). Finally, it is worth mentioning that untethered drones are extremely versatile, meaning that they can be used for various missions apart from providing connectivity, such as package delivery [15, 72], precision agriculture [52], and environmental monitoring [64].
2.3.2
Helikites
This type of UAV consists of a blimp that has the structure of a kite, and it is connected to the anchor point by means of a simple tether a few hundred meters long (which allows the helikite to operate in the low-altitude range, between 200 and 600 m). The aerodynamic design of the platform allows it to exploit both wind and helium for its lift, and therefore there is generally no need for any additional power supply to stay aloft even for several days; nonetheless, the payload capability
of a helikite strongly depends on the wind speed (which in turn is a function of the altitude) and the volume of the helikite. However, when functioning as a BS, a helikite would need to feed the respective antennas and onboard processing units: then, either a more complex type of tether, a battery, or a photovoltaic panel should be included in the vehicle. An interesting reference about the design of helikites as well as their characteristics compared to drones, airships, and aircraft can be found in [21].
2.3.3
Balloons
Balloons represent the simplest type of aerostat since their envelopes do not present any rigid structure and their shape is quite regular. Similarly to drones, balloons can be either tethered or untethered. However, while a drone’s tether is mainly used to supply the power required to stay aloft, a balloon’s tether predominantly serves to limit its otherwise uncontrollable mobility and prevent it from flying away. Nonetheless, the tether can also ensure a reliable power supply and a wired backhaul link. While tethered balloons are generally deployed as low-altitude platforms, untethered balloons generally operate in the high-altitude range, between 15 and 22 km. The reason for this is twofold: on the one hand, a higher altitude allows harvesting more solar energy via photovoltaic panels, while on the other hand, the low turbulence and wind currents occurring at high altitudes imply that much less energy is needed to float, thus enabling long-term missions of up to several months. However, the main issue comes from the cost of the gas needed to inflate them, helium (He), which is not very abundant on Earth and thus has a considerable cost. The most famous project on untethered helium balloons was Project Loon by Alphabet, which has been able to achieve reliable communications between balloons located at a distance of 100 km through FSO. The effectiveness of this solution has been questioned in [76], where three case studies have been compared by means of a novel analytical approach. After launching over two thousand balloons into the stratosphere, the project has been suspended due to the risks regarding its commercial viability.
2.3.4
Gliders
Gliders are similar to airplanes, but have relatively longer wings, which allow them to float without the need for any fuel or lighter-than-air gas. Due to this, they are currently considered the most promising type of HAP. However, when functioning as BSs (especially for long-endurance missions), gliders need batteries and/or a large number of photovoltaic modules. Similarly to untethered balloons,1 they mostly operate at high altitudes, and hence their fixed wings are generally subject to low temperatures and strong mechanical stresses. The most famous example of a glider was Aquila by Facebook, but the project was suspended in 2018 [24]. Nevertheless, several projects are now focusing on novel designs of this type of vehicle for communications purposes, such as ApusDuo by UAVOS, Zephyr S by Airbus, Sunglider by HAPSMobile [49], and the hydrogen-powered HAP designed by Stratospheric Platforms, whose main characteristics are compared in Table 1.
1 As of today, there is no evidence of tethered gliders.
Table 1 Comparison between various gliders
Company                  Model      Wingspan [m]  Number of propellers  Cruise speed [km/h]  Payload capability [kg]
UAVOS                    ApusDuo    15            1                     27                   2
Airbus                   Zephyr S   25            2                     56                   5
HAPSMobile               Sunglider  78            10                    110                  68
Stratospheric Platforms  HAP        60            2                     Not available        140
2.3.5
Airships
This type of vehicle can be quite similar to balloons or helikites, depending on its structure (nonrigid, semirigid, or rigid). Airships can be filled with helium or hydrogen (despite the safety concerns due to the flammability of the latter), and propellers are often installed to improve controllability and speed. Tethered airships do not usually have any propellers, but are still more controllable than tethered balloons due to their aerodynamic design. The most popular commercial example of a tethered blimp2 is SuperTower, which includes a novel control system designed by Altaeros that allows deploying the platform without the need for any personnel (as opposed to the usual dozen specialized people required). In particular, the American company currently offers two models of SuperTower: the ST-Flex and the ST-300. The main advantage of the former is that it requires less than a day to be deployed, while the latter is more appropriate for permanent installations since it can stay aloft for more than a month and carry up to 300 kg of payload [9]. On the other hand, untethered airships such as AIRLANDER 10 by Hybrid Air Vehicles (HAV) [89] and Stratobus Airship [43] by Thales Alenia Space are capable of flying even at ultrahigh altitudes (within the 20–50 km range) in order to provide wider coverage to remote areas. Apart from communications, untethered airships can also be an attractive solution for shipping heavy goods and materials (up to sixty tons, by means of a rigid structure of 200 m in length and 50 m in nominal
2 A blimp is essentially a nonrigid airship. Its shape depends on its envelope and the pressure of the lifting gas.
Fig. 3 A satellite in space [78]
diameter) without the need for landing, as envisioned by the French company FLYING WHALES [42]; the main limitation, however, is that loading and unloading phases have to be equivalent and simultaneous, to prevent the airship from flying away.
2.4 Space Networks
Until now, most space communications services have been undertaken by satellites (an example is shown in Fig. 3). The two most common modes of satellite communication are (i) satellites serving as space BSs to provide coverage and (ii) satellites serving as relays that establish communication between terrestrial radio stations; given that the technology to operate in the first mode is still at a very early stage, we hereby focus only on the use of satellites as relays. Compared with ground and low-altitude communications, satellite communications can (i) achieve much larger coverage areas, (ii) maintain high reliability when typical disasters occur, and (iii) realize broadcast and multiple access communications by simultaneously receiving from several ground stations [25]. Therefore, space communications can play an important role in promoting global connectivity. According to their orbital altitudes, satellites are classified into geostationary earth orbit (GEO) satellites, medium earth orbit (MEO) satellites, and low earth orbit (LEO) satellites. More than ten satellite companies have already proposed their satellite constellation programs and plan to launch their first satellite into orbit
Table 2 Some of the satellite constellations licensed by the FCC
Constellation  Number of (planned) satellites  Altitude (km)    Number of orbital planes  Type
Space Norway   2                               8089–43,509      1                         GEO
Audacy         3                               13,890           1                         GEO
Karousel       12                              31,569–40,002    Not available             GEO
O3b            42                              8062             3                         MEO
Iridium        81                              778              6                         LEO
OneWeb         648                             1200             10                        LEO
Telesat        298                             1000 & 1248      11                        LEO
Kepler         140                             500–600          19                        LEO
Starlink       11,926                          340 & 550        190                       LEO
Kuiper         3236                            590–630          98                        LEO
within the next 5 years. Table 2 shows the satellite constellations licensed by the Federal Communications Commission (FCC) in recent years.
2.4.1
FSO for Aerial and Space Communications
Conventional non-terrestrial network solutions are mostly based on radio frequency (RF) systems for both feeder and user links operating at the Ku band, Ka band, and Q/V band (which respectively fall in the 12–18, 27–40, and 40–50 GHz ranges); because of this, one ground station is required for every 20 Gbps of satellite capacity [19]. On the other hand, the concept of fiber-through-the-air (FTTA) implemented via FSO technologies allows sending more than 1 Tbps with just one optical ground gateway station (OGGS). This is mostly due to the availability of roughly 10 THz of unlicensed spectrum and the possibility of selecting LoS paths. In addition, the compact, energy-efficient, and lightweight design makes FTTA equipment well suited to being mounted on non-terrestrial platforms, while the absence of atmospheric turbulence at altitudes greater than 17 km facilitates stratospheric and space FSO-based communications [71]. On the other hand, the presence of either atmospheric turbulence or cloud coverage in the troposphere can compromise the reliability of air-to-ground (A2G) and space-to-ground (S2G) FSO feeder links [7]. One option to deal with this issue is to implement hybrid systems that can easily switch from FSO to RF communications in case of fog or strong cloud attenuation [87], although the switching itself causes information packet losses. However, it has also been shown that post- and pre-compensation wavefront corrections can mitigate the effect of atmospheric turbulence [11], while works such as [73, 80] showed the potential advantages of injecting HAPS with FSO relaying capability into the network
architecture to create integrated two-hop ground-air-space links with reduced beam wandering or spreading. Other important challenges derive from the beams’ misalignment losses (which require developing powerful positioning, acquisition, and tracking algorithms) and the technological limitations of transceivers and analogue-to-digital (A2D) and digital-to-analogue (D2A) converters when it comes to processing at rates of the order of 1 Tbps. Moreover, the frequent presence of fog or cloud obstructions requires effective cloud monitoring and characterization at the beginning of the site survey stage, so that the OGGSs can be properly placed.
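The feeder-link comparison made in this subsection (one RF ground station per 20 Gbps versus more than 1 Tbps per OGGS) can be turned into a quick gateway count. The sketch below is plain arithmetic on those two figures; the 1 Tbps target capacity is an assumed example, and the extra optical sites that cloud-cover diversity may require in practice are deliberately ignored.

```python
import math


def gateways_needed(total_capacity_gbps: float, per_gateway_gbps: float) -> int:
    """Minimum number of gateways required to carry the total feeder capacity."""
    if per_gateway_gbps <= 0:
        raise ValueError("per-gateway capacity must be positive")
    return math.ceil(total_capacity_gbps / per_gateway_gbps)


if __name__ == "__main__":
    target_gbps = 1000  # assumed system capacity of 1 Tbps
    print("RF ground stations:", gateways_needed(target_gbps, 20))
    print("Optical gateways:  ", gateways_needed(target_gbps, 1000))
```

For the assumed 1 Tbps target, the RF architecture needs 50 ground stations where a single OGGS would nominally suffice.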
2.4.2
GEO Satellite
GEO satellites rotate synchronously with the Earth, remaining relatively stationary at an altitude of 35,786 km, and are capable of providing network access to terrestrial users at any time. A GEO satellite can relay signals between ground points more than ten thousand kilometers apart in one hop, so it has obvious advantages over ground-based microwave relay systems, whose single-hop distance is limited by LoS propagation. Due to the high altitude of geosynchronous Earth orbits, these platforms benefit from a huge communication range: just three GEO satellites (equally spaced by 120°) are sufficient to cover almost the entire planet. In addition, each of these platforms has a large single-satellite communication capacity matching the large coverage area. On the economic side, since a GEO satellite has to bear an enormous communication burden, it typically costs hundreds of millions of US dollars [88]. Nonetheless, alternative solutions such as micro GEO satellites can halve the overall cost, as recently envisioned by the company Astranis [53].
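The price of the GEO altitude is propagation delay, while its benefit is the visible footprint. The sketch below estimates both from simple geometry: the one-way delay to the sub-satellite point and the fraction of a spherical Earth visible from a given altitude (at zero elevation angle). The spherical-Earth model and the 550 km comparison altitude are assumptions made for illustration.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical-Earth assumption


def one_way_delay_ms(altitude_km: float) -> float:
    """Propagation delay (ms) from the satellite to the point directly below it."""
    return altitude_km / SPEED_OF_LIGHT_KM_S * 1000.0


def visible_fraction(altitude_km: float) -> float:
    """Fraction of the Earth's surface in geometric view (0 deg elevation)."""
    return altitude_km / (2.0 * (EARTH_RADIUS_KM + altitude_km))


if __name__ == "__main__":
    for name, h in (("GEO", 35_786.0), ("LEO (assumed 550 km)", 550.0)):
        print(f"{name}: {one_way_delay_ms(h):.2f} ms one way, "
              f"{visible_fraction(h) * 100:.1f}% of the surface visible")
```

A single GEO satellite sees roughly 42% of the surface, which is consistent with three equally spaced satellites covering nearly all of the planet except the polar regions, at the cost of about 119 ms of one-way delay per hop.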
2.4.3
LEO Satellite
In recent years, we have witnessed an explosive development of LEO satellite systems. As emerging mega satellite systems, they are deployed at altitudes between 200 and 2000 km. The low Earth orbit deployment brings both opportunities and challenges for LEO satellite communication, as discussed in the following paragraphs. The distance between LEO satellites and the ground is much smaller than that of GEO satellites, resulting in shorter propagation latency, smaller channel attenuation, and higher achievable data rates [92]. In addition, photos taken by LEO satellites have higher resolution and are therefore more suitable for monitoring. Most importantly, the LEO satellite system could provide seamless coverage on the ground [93]. Furthermore, the dense deployment of LEO satellite systems is driving the establishment of inter-satellite communication links. However, low-orbit deployment results in a faster orbit speed and smaller coverage. In order to overcome the limitations of small coverage, LEO satellite
systems are often composed of a large number of units, such as the mega satellite constellation of SpaceX, which is expected to deploy 42,000 nodes. Because of their short orbital period (about 90 to 120 minutes), each satellite can provide stable communication within a given area for just a dozen minutes, which makes communication handover challenging. Being out of sync with the rotation of the Earth also places low-orbit satellites over regions with low service requirements (such as the sea) during most of their operating period, resulting in uneven traffic.
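The orbital periods quoted for the different orbit classes follow from Kepler's third law. The sketch below evaluates it for circular orbits around a spherical Earth; both simplifications, as well as the example altitudes other than the 200–2000 km LEO range and the GEO altitude, are assumptions of this illustration.

```python
import math

MU_EARTH = 3.986004418e14     # Earth's gravitational parameter, m^3/s^2
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical-Earth assumption


def orbital_period_minutes(altitude_km: float) -> float:
    """Period of a circular orbit at the given altitude (Kepler's third law)."""
    semi_major_axis = EARTH_RADIUS_M + altitude_km * 1000.0
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis ** 3 / MU_EARTH)
    return period_s / 60.0


if __name__ == "__main__":
    # 20,200 km is an assumed example of a navigation-type MEO altitude.
    for label, h in (("LEO lower edge", 200), ("LEO example", 550),
                     ("LEO upper edge", 2000), ("MEO example", 20_200),
                     ("GEO", 35_786)):
        print(f"{label:15s} {h:>6} km -> {orbital_period_minutes(h):7.1f} min")
```

Under these assumptions the LEO range maps to periods of roughly 88 to 127 minutes, while the GEO altitude yields a period close to one sidereal day.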
2.4.4
MEO Satellite
One of the earliest communication satellites, Telstar, was a MEO satellite. The deployment height of MEO satellites lies between those of their LEO and GEO counterparts, and the same goes for their communication performance, including coverage and communication latency. Most MEO satellites have an orbital period of about 12 hours. Currently, most MEO satellites are used for navigation, such as the global positioning system (GPS) and the BeiDou navigation satellite system. Some MEO satellites are also used to cover the communications blind areas of GEO satellites around the North Pole and the South Pole.
2.4.5
Satellite Gateway
A satellite gateway (GW) is the hub of communication between the satellite and the ground network. The existing satellite communication structure is still mainly a bent-pipe architecture, in which the satellite acts as a relay that transparently forwards the wireless signal from the user to the gateway. Therefore, the ground GW is an indispensable part of the satellite system [94]. In the near future, the traditional GW-satellite-GW communication mode will still have significant advantages over inter-satellite communication in terms of security and reliability. Instead of using inter-satellite links, the company OneWeb plans to deploy 55–75 terrestrial gateways in different countries to keep track of where the communication service is going and relanding. Furthermore, the gateways are able to enhance the stability of communication and expand the coverage range [82]. Compared with direct communication between a satellite and a mobile terminal such as a mobile phone, relaying through the gateway and then forwarding to the mobile terminal can effectively reduce the packet loss rate and allow more users to access the network.
2.4.6
Satellite Constellations
Next, we introduce a few of the mega satellite constellations that are being deployed, which will play an important role in facilitating the global connectivity of next-generation communication networks [96].
• OneWeb: The plan for this constellation is to launch 648 satellites, which will orbit at an altitude of 1200 km and provide Internet access services directly to users through ground gateways [54]. OneWeb has a single-satellite capacity of more than 5 Gbps.
• SpaceX’s Starlink: The size of the constellation proposed by SpaceX should reach nearly 12,000 satellites, consisting of 4408 low-orbit satellites at an altitude of 550 km and 7518 very-low-orbit satellites at an altitude of about 340 km [51]. The Ku/Ka frequency band is selected for the low-orbit shell, which is conducive to better coverage. The very-low-orbit shell uses the V-band for signal enhancement and more targeted services.
• LeoSat: LeoSat plans to build a constellation of 108 satellites. The constellation is deployed in a 1400 km LEO orbit using six orbital planes with 18 satellites on each plane. LeoSat uses the Ka frequency band to provide users with high-speed data transmission services around the world [91]. Unlike OneWeb and SpaceX, LeoSat primarily provides data transfer services to governments and businesses.
• Amazon’s Project Kuiper: Unveiled in 2019 for the first time and authorized by the Federal Communications Commission (FCC) in 2020, this LEO satellite constellation was planned to exceed 3000 units (although it seems that this number has recently doubled). The overall cost of the project exceeds ten billion USD. Nevertheless, the company owned by Jeff Bezos delayed the launch of its prototypes to early 2023, while FCC regulations require at least half of the constellation units to be in orbit by July 2026 [10].
• Telesat’s Lightspeed: This constellation currently accounts for 188 LEO satellites, which use optical inter-satellite links to create a global mesh. Moreover, the satellites move along hybrid orbits, which allows them to cover the entire planet while providing higher capacity to the most populated areas [85]. However, these satellites present important drawbacks in terms of weight and cost.
• Inmarsat’s Orchestra: This heterogeneous communication system combines GEO and LEO satellites, as well as TBSs. The main goal of the project is to serve maritime areas as well as the Arctic region. Nonetheless, the company expects to exploit its satellite constellation also for aviation purposes [36].
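The constellation sizes listed above can be cross-checked with simple arithmetic. The sketch below only reuses numbers quoted in this list (the two Starlink shells and the LeoSat plane layout); the variable and function names are illustrative.

```python
def total_from_planes(planes: int, satellites_per_plane: int) -> int:
    """Constellation size for a uniform layout of orbital planes."""
    return planes * satellites_per_plane


# Shell sizes as quoted in the text for Starlink.
starlink_shells = {
    "low orbit (550 km)": 4408,
    "very low orbit (about 340 km)": 7518,
}
starlink_total = sum(starlink_shells.values())

# LeoSat: six orbital planes with 18 satellites each.
leosat_total = total_from_planes(planes=6, satellites_per_plane=18)

if __name__ == "__main__":
    print("Starlink total:", starlink_total)  # matches the "nearly 12,000" figure
    print("LeoSat total:  ", leosat_total)
```

Both totals agree with the figures stated in the corresponding items (11,926 and 108 satellites, respectively).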
2.5 Integrated Networks
Having described the roles of terrestrial, aerial, and space networks in global connectivity, in this section, we discuss strengths and challenges of space-air-ground integrated networks (SAGINs) as well as the communication technologies linking the various types of single networks. Finally, we provide an overview of the current state of the art of integrated networks.
2.5.1
Strengths
SAGINs are massive, highly heterogeneous, and flexible networks that can be used to boost both coverage and data rate [33]. As for heterogeneity, SAGINs span the space, air, and ground layers and contain wireless communication systems with a variety of access technologies [69]. System integration lets them exploit the advantages of each layer and meet the needs of different communication services. Specifically, different layers of the heterogeneous network can support different protocols, and a SAGIN can select the most appropriate access technology and provide better QoS according to users’ requirements [13]. In addition, a SAGIN integrates a multilayer network in which some devices, such as LEO satellites, are always in high-speed motion and can leave or join the network at any time [20]. The network must therefore tolerate frequent topology changes, which also makes it resilient to damage: the failure of a single node does not stop the whole network from operating. Because of this feature, SAGINs are suitable for post-disaster emergency services and military communications [49]. In short, SAGINs are flexible and self-organized.
2.5.2
Challenges
The main challenges for SAGINs are fourfold: interoperability, management, control, and regulation between different countries [69, 95]. Firstly, the overall SAGIN system is large and complex, and the efficient combination of its different parts is challenging. Different layers may have different communications standards, and dedicated gateways are needed for converting protocols and formats. Secondly, network management and operation are challenging due to the inevitable increase in the number of devices involved, the multidimensional heterogeneity of resources, and the diversified QoS requirements of users. Thirdly, it is not easy to control the system because of the extensive range of mobility of the devices, the time-varying topology, and the need for real-time monitoring of the device locations. In addition, network traffic load and heterogeneous resource availability information also need to be considered in the management. Finally, when a cell covers an area spanning the border between two nations, regulation between the two countries becomes a problem, since there is no uniform international framework for monitoring cross-border data transmission.
2.5.3
Communication Links
In terms of architecture, a SAGIN includes communication links connecting different network layers [69]. Because the devices and communication environments differ widely, SAGINs often need to combine multiple communication link technologies in order to improve system performance:
• RF communication (below 30 GHz) is suitable for links that need to travel long distances in the troposphere, because the variable weather, high concentration of molecules, etc. significantly attenuate high-frequency signals [14]. RF signals have a low transmission data rate but high reliability.
• mmWave is the frequency band between 30 and 300 GHz [77]. Compared with traditional RF communication, mmWave significantly broadens the available bandwidth and improves the transmission rate. The mmWave frequency band can be used for air-space communication or air-ground communication with LoS paths.
• Terahertz communication (0.3–10 THz) is expected to meet the demand for 6G network technology [5]. Compared to mmWave, the terahertz band has no shortage of spectrum and can provide ultra-wide bandwidths to achieve communication rates of more than 100 Gbps. Terahertz antennas are relatively small and can be combined with massive MIMO and other technologies. Therefore, terahertz communication is a good choice for stratospheric and space communication equipment.
• Optical wireless communication technologies, including infrared, visible, and ultraviolet light communication techniques, are also promising [14]. Optical bands do not suffer from multipath fading and the Doppler effect. In environments with low atmospheric turbulence and attenuation, optical communications can provide high capacity over long distances; an interesting use case, for example, is inter-satellite communication.
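The band boundaries given in the list above can be captured in a small lookup helper. The sketch below is only an illustration: the treatment of the exact edge frequencies and the choice to label everything above 10 THz as optical are assumptions, since the text does not specify them.

```python
def classify_band(frequency_hz: float) -> str:
    """Map a carrier frequency to the link technology named in the text.

    Edge handling and the optical lower bound (above 10 THz) are assumptions.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    if frequency_hz < 30e9:
        return "RF"
    if frequency_hz < 300e9:
        return "mmWave"
    if frequency_hz <= 10e12:
        return "terahertz"
    return "optical"


if __name__ == "__main__":
    # Example carriers: Ku band, 60 GHz, 1 THz, and a 1550 nm laser (about 193 THz).
    for f in (12e9, 60e9, 1e12, 193e12):
        print(f"{f:.3e} Hz -> {classify_band(f)}")
```

Such a helper could, for instance, feed a link-selection routine that picks a technology per layer pair, although that logic is outside the scope of this sketch.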
2.6 Spectrum Coexistence
Even though frequency is a continuous quantity and has no physical upper bound, devices have been designed only for certain ranges of frequencies, which limits the amount of spectrum available for wireless communications. Given the increasing population and, consequently, the ever-increasing data traffic demand, most spectra are crowded and hence the QoS delivered to each user is generally low. Furthermore, competition for spectrum resources is not confined to users of the same technology [67], but extends to inter-technology spectrum usage, whereby users of different communication technologies operate on the same set of frequency bands. For licensed spectrum, the problem can be mitigated by legally enforcing the usage of frequency bands by different network providers and users. However, the use of unlicensed spectra cannot be controlled in a similar fashion. For example, the 2.4 GHz band [63] is used by multiple technologies such as Wi-Fi, Zigbee, cellular networks, and Bluetooth. Similarly, Wi-Fi and cellular services compete in the 5 GHz [58] and 6 GHz [68, 75] bands alongside emergency communication systems. On the other hand, cellular systems will be competing against satellite networks within the 28 GHz mmWave band [57]. Finally, cellular users share the Citizens Broadband Radio Service (CBRS) band [30] with the incumbent radars and fixed satellite systems. When it comes to space networks, the satellites in different orbits can cause interference to each other while sharing the Ka and Ku bands.
Therefore, research is needed to mitigate the effect of interference through distributed control or minimize real-time spectrum overuse through a spatial spectrum database (as done in the case of TVWS). To protect the QoS of the primary users of a given band, exclusion zones [68] can be used to push the secondary users of the band out of this zone.
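An exclusion zone can be illustrated with a simple geometric filter: secondary users located within a given radius of a primary user are not allowed to transmit. The sketch below assumes a circular zone and planar coordinates in kilometers; the function name, the coordinates, and the radius are hypothetical and do not reproduce the specific scheme of [68].

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # planar coordinates in km (assumption)


def outside_exclusion_zone(primary: Point, radius_km: float,
                           secondaries: Iterable[Point]) -> List[Point]:
    """Return the secondary users strictly outside the circular exclusion zone."""
    return [s for s in secondaries if math.dist(primary, s) > radius_km]


if __name__ == "__main__":
    primary_user = (0.0, 0.0)
    candidates = [(3.0, 4.0), (6.0, 0.0), (1.0, 1.0)]
    # Only users farther than 5 km from the primary user may transmit.
    print(outside_exclusion_zone(primary_user, 5.0, candidates))
```

A user exactly on the boundary is treated as inside the zone here, which is the conservative choice for protecting the primary user.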
3 Affordability
In this section, we will discuss relevant literature works, available data, future expectations, and our own opinions about the affordability (i.e., the economic feasibility from the users’, governments’, and operators’ perspectives) of the solutions and technologies discussed in the previous section.
3.1 Aerial Solutions
Authors in [23] proposed an architecture to expand the coverage of cellular BSs through untethered drones by first forming a network of sites interconnected through optical fibers. To reduce the dependence of drones on batteries, they suggest using solar panels. They also devise algorithms to facilitate the cost-effective deployment of the solution. On top of this, given the rapid growth of the drone market, it is expected that the overall costs of deployment will decrease considerably in the next few years. Already today, in contrast to the tens of thousands of US dollars needed to build and install a cell tower, we can roughly estimate the cost of a commercial drone capable of functioning as a BS at just USD 7200, where 80% of the cost corresponds to the drone itself and its battery, while the remaining part is due to the power station and the flight controller (12% and 8%, respectively).3 However, it is worth highlighting that, on the technical side, the overall capacity of the network would be strongly affected by the limited number of antennas that each drone can carry, but at the same time, better channel conditions would be achieved. While the failure of projects such as Facebook’s Aquila can be traced back to safety concerns and control issues [31], Alphabet recently cancelled Project Loon, an experiment to provide Internet to remote regions through balloons wandering in the stratosphere at an altitude of 18–25 km [81], because of commercial viability issues (as already mentioned in Sect. 2.3.3). Analysts speculate that the decision to cancel the project comes in light of LEO-based connectivity solutions popularized by the likes of Starlink and OneWeb.
3 A typical example is the DJI Matrice 300 RTK drone, with power station BS60 and flight controller N3.
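The cost split quoted for a drone-mounted BS can be turned into absolute figures with a few lines of Python. The sketch below only uses the total and the percentages stated in the text; the dictionary keys are descriptive labels chosen for this illustration.

```python
TOTAL_COST_USD = 7200

# Percentage split as stated in the text.
cost_shares_percent = {
    "drone and battery": 80,
    "power station": 12,
    "flight controller": 8,
}


def cost_breakdown(total_usd: float, shares_percent: dict) -> dict:
    """Convert percentage shares into absolute costs in USD."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("shares must add up to 100%")
    return {item: total_usd * pct / 100 for item, pct in shares_percent.items()}


if __name__ == "__main__":
    for item, usd in cost_breakdown(TOTAL_COST_USD, cost_shares_percent).items():
        print(f"{item:18s} USD {usd:,.0f}")
```

The drone and its battery thus account for USD 5760, leaving USD 864 for the power station and USD 576 for the flight controller.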
3.2 LEO Satellites
The three major LEO broadband satellite constellations are Kuiper, OneWeb, and Starlink, each having a total cost of ownership of roughly one billion USD, the majority of which can be attributed to setting up ground stations and launching the satellites into space. Considering the cost per capacity metric of these constellations [60], Starlink has a high cost per capacity of USD 300M per Tbps, whereas OneWeb only requires USD 1M per Tbps, owing to the lower density of its constellation. In contrast, Kuiper, whose density is on par with Starlink’s, requires USD 400M per Tbps. To put things in perspective, we also look at the capacity offered by these constellations: Starlink and Kuiper offer a capacity of 600 Mbps/km² and 210 Mbps/km², respectively, while OneWeb is limited to 6.5 Mbps/km². From a business standpoint, each constellation needs to bring down the cost per capacity to offer affordable connectivity solutions to the users. Moreover, the lifespan of these LEO satellites is relatively short compared to that of GEO satellites, which calls the sustainability of the projects into question and drives up the cost, as the satellites need to be periodically replenished by launching more into space. Currently, Starlink offers a beta subscription for USD 110 per month and a one-time equipment cost of USD 599 [79] for a download speed of roughly a hundred Mbps. While the costs may not be affordable for the majority of the people residing in developing countries (who are already in a geographically and economically disadvantaged condition), the equipment cost is expected to drop soon due to technological advancements and mass production [34].
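From the user’s perspective, the subscription prices quoted above translate into a cumulative cost that is easy to compute. The sketch below uses only the two Starlink figures given in the text (monthly fee and one-time equipment cost); taxes, shipping, and price changes are ignored, and the function name is illustrative.

```python
EQUIPMENT_COST_USD = 599  # one-time cost quoted in the text
MONTHLY_FEE_USD = 110     # beta subscription fee quoted in the text


def cumulative_user_cost(months: int,
                         equipment_usd: float = EQUIPMENT_COST_USD,
                         monthly_usd: float = MONTHLY_FEE_USD) -> float:
    """Total amount paid by one subscriber after the given number of months."""
    if months < 0:
        raise ValueError("months must be non-negative")
    return equipment_usd + monthly_usd * months


if __name__ == "__main__":
    for months in (1, 12, 24):
        print(f"after {months:2d} months: USD {cumulative_user_cost(months):,.0f}")
```

With these prices, the first year of service costs a subscriber USD 1919, which helps explain the affordability concern raised for developing countries.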
3.3 TVWS TV White Space (TVWS) exploits unused ultrahigh frequency (UHF) channels, saving spectrum licensing costs. UHF signals need less power to cover large distances, and blockages such as buildings and walls attenuate them much less than they attenuate higher-frequency signals. Moreover, setting up a TVWS network is cheaper than installing conventional cellular or fiber connections and requires minimal technical skills [70]. However, TVWS BS equipment costs seven times as much as long-range 5.8-GHz Wi-Fi access points, while TVWS subscriber units cost five times as much as Wi-Fi stations [65]. Higher towers are also required for TVWS to ensure that 36% of the first Fresnel zone is free of obstacles, which implies higher setup costs. Furthermore, commercial Wi-Fi devices consume one-third of the energy that TVWS equipment would; therefore, more research is needed to make TVWS devices energy efficient. Regarding maintenance costs, however, TVWS is comparable to Wi-Fi networks. The authors of [62] investigated the performance of a TVWS pilot network in a rural region of Northern Thailand. The results showed link throughput ranging from 4 to 13 Mbps, a maximum latency of 16 ms, and a jitter of 2.5 ms for typical packet sizes up to 1500 bytes. They conclude that this performance is suitable for voice, video, and real-time data transfers, and they also suggest using TVWS in dense urban areas for IoT networks. The authors of [35, 39] recently proposed using TVWS through HAPs to cover sparsely populated regions; through simulations, they determined an upper-bound practical capacity of 53 Mbps.
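The remark above that TVWS needs higher towers follows from the size of the first Fresnel zone, which grows as the carrier frequency decreases. The sketch below uses the standard first-Fresnel-zone radius formula, r = sqrt(λ·d1·d2/(d1 + d2)). The 600 MHz UHF carrier and the 10 km link length are illustrative assumptions, not values from the cited studies.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def first_fresnel_radius(freq_hz, d1_m, d2_m):
    """Radius (m) of the first Fresnel zone at a point located d1_m from one
    end of the link and d2_m from the other end."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return math.sqrt(wavelength * d1_m * d2_m / (d1_m + d2_m))


# Assumed example: a 10 km link, evaluated at its midpoint (largest radius).
LINK_LENGTH_M = 10_000.0
half = LINK_LENGTH_M / 2

r_uhf = first_fresnel_radius(600e6, half, half)   # assumed TVWS carrier
r_wifi = first_fresnel_radius(5.8e9, half, half)  # 5.8-GHz Wi-Fi

print(f"UHF 600 MHz: {r_uhf:.1f} m")
print(f"Wi-Fi 5.8 GHz: {r_wifi:.1f} m")
print(f"Ratio: {r_uhf / r_wifi:.2f}")
```

Under these assumptions the mid-path radius is roughly 35 m at 600 MHz against roughly 11 m at 5.8 GHz, so whatever fraction of the zone must be kept clear, the UHF link needs noticeably taller antenna supports.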
3.4 Citizens Broadband Radio Service (CBRS) Like TVWS, CBRS relies on the intelligent use of unlicensed spectrum. Developed in the USA, this paradigm aims to connect private cellular networks by exploiting as much as 150 MHz of coordinated spectrum (specifically, from 3.55 to 3.7 GHz, which corresponds to LTE band 48), as granted by the FCC. However, current devices do not support this technology; therefore, the commercial CBRS service is expected to use LTE (or even 5G) as the radio access method with a time-division duplex (TDD) scheme [84]. The main use cases for CBRS are (i) indoor localized broadband access, (ii) outdoor cellular-like broadband access across relatively large areas, and (iii) broadband service from a central transmitting tower to nearby buildings. Despite important advantages such as the low cost and the possibility of upgrading the subscription to enjoy better services [32], the main drawback of CBRS relates to its high-frequency band of operation, which strongly limits the coverage radius unless the transmit power is substantially increased. More detailed information on the CBRS framework, as well as on the licensed shared access (LSA) framework developed in Europe, can be found in [46].
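The coverage penalty of the CBRS band can be illustrated with the standard free-space path loss formula, FSPL(dB) = 20 log10(4πdf/c). This is only a sketch: free space is an idealization that ignores terrain and clutter, and the 600 MHz UHF reference carrier and the 1 km distance are assumptions chosen for comparison with TVWS.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB between isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


# CBRS band edges as given in the text.
CBRS_LOW_HZ, CBRS_HIGH_HZ = 3.55e9, 3.70e9
bandwidth_mhz = (CBRS_HIGH_HZ - CBRS_LOW_HZ) / 1e6
cbrs_center_hz = (CBRS_LOW_HZ + CBRS_HIGH_HZ) / 2

UHF_HZ = 600e6         # assumed TVWS reference carrier
DISTANCE_M = 1_000.0   # assumed link distance

extra_loss_db = fspl_db(DISTANCE_M, cbrs_center_hz) - fspl_db(DISTANCE_M, UHF_HZ)

print(f"CBRS bandwidth: {bandwidth_mhz:.0f} MHz")
print(f"Extra free-space loss of CBRS vs. UHF: {extra_loss_db:.1f} dB")
```

The difference depends only on the frequency ratio, not on the distance, and amounts to roughly 15.6 dB here. Under this idealized model, that is the additional link budget (for example, transmit power or antenna gain) CBRS would need to match the UHF coverage radius.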
3.5 Universal Broadband The United Nations (UN) Broadband Commission aspires for the international community to meet the goal of full global coverage by 2020. The authors of [61] recently investigated whether this goal is realistic and analyzed its financial implications for countries’ economies. They estimated the cost of meeting the target to be USD 2 trillion using 4G and 5G NSA, which is roughly equivalent to 0.7% of the annual GDP of developing countries over the upcoming decade. However, this estimate could be reduced drastically if government policies are designed carefully, making universal broadband solutions commercially viable; for example, governments could reduce radio spectrum pricing and make it easier for MNOs to share infrastructure (thereby allowing them to expand their footprint). In [12], the authors recently proposed the concept of basic broadband as a relaxed alternative to the stringent low-latency demands of universal broadband. Basic broadband involves the aggressive deployment of staging resources, that is, locally available storage and processing infrastructure that can be located within data centers, edge networks, or network endpoints. They argue that the low-latency and constant-availability requirements of universal broadband are obstacles to digital inclusion.
3.6 Universal Electrification To make ubiquitous connectivity a reality, countries should strive to electrify their rural areas in an affordable way. To reach islands and least developed countries, however, it is often necessary to lay hundreds or even thousands of kilometers of submarine optical fiber. For example, a consortium of companies including China Mobile International, Meta Connectivity, MTN GlobalConnect, Orange, STC, Telecom Egypt, Vodafone, and WIOCC is now working to deploy the longest cable system ever (within the year 2024) and bring high-quality broadband services to African countries by connecting them to Europe and the Middle East [2]. Returning to the universal electrification problem, another important issue is that the signals transmitted through submarine cables suffer from strong attenuation when they travel over very long distances. In this context, Meta Connectivity has collaborated with GEPS Techno to develop a prototype called Wavegem, an autonomous marine platform capable of harvesting both solar and sea energy, which makes it possible to amplify the signals’ power [1].
4 Conclusion In this chapter, we extensively discussed the problem of the digital divide between urban and exurban areas. In particular, we first reviewed what, in our opinion, are the most relevant platforms, projects, and technologies specifically designed for bridging this connectivity gap. While promising advancements in ground, air, and space networks are anticipated, our research suggests that integrating these three layers is the ideal solution to the problem. However, apart from the current techno-economic limitations (above all, the excessive costs of conventional ground infrastructure consisting of cell towers and underground optical fiber links, as well as the short endurance of aerial platforms), the integration of different paradigms certainly brings serious challenges: compatibility among the networks of different layers, high complexity of network management and operation, and the risk of competition for frequency resources. In our vision, future networks will bridge the existing digital divide on a global scale. Cross-layer communication will no longer be a serious issue, and space, air, and ground networks will be integrated seamlessly. Communications devices will reach regions at any altitude in the atmosphere, including high altitudes in the stratosphere where device deployment is challenging. Communications services will be ubiquitous and will provide high-capacity, high-speed, and reliable coverage in untraversed areas. Finally, it is worth noting that, beyond the context of rural communications, most of the paradigms and frameworks discussed here may also apply (with proper modifications) to other relevant scenarios, such as those requiring connectivity in underwater [6], maritime [8], aeronautical [16], and post-disaster [50] situations.
References 1. Marine renewables: Meta chooses GEPS Techno autonomous platform to boost its submarine cables, 2022. energyfacts. Accessed: Sep. 28, 2022. Available online: https:// www.energyfacts.eu/marine-renewables-meta-chooses-geps-techno-autonomous-platformto-boost-its-submarine-cables/. 2. “Longest Subsea Cable Ever Deployed”: 2Africa Cable Lands in Genoa, 2022. Offshore Engineer. Accessed: Sep. 28, 2022. Available online: https://www.oedigital.com/news/495831longest-subsea-cable-ever-deployed-2africa-cable-lands-in-genoa. 3. Modification of wind turbines to contain communication signal functionality, by T. M. Sievert (Nov. 21, 2006). Patent US 7,138,961 B2 [Online]. Available: https://patents.google.com/ patent/US20040232703A1/en. 4. A. KISHK, M., BADER, A., AND ALOUINI, M.-S. Aerial base station deployment in 6G cellular networks using tethered drones: The mobility and endurance tradeoff. IEEE Vehicular Technology Magazine 15, 4 (2020), 103–111. 5. ABBASI, O., AND YANIKOMEROGLU, H. UxNB-enabled cell-free massive MIMO with HAPSassisted sub-THz backhauling. arXiv preprint arXiv:2201.07379 (2022). 6. ALI, M. F., JAYAKODY, D. N. K., CHURSIN, Y. A., AFFES, S., AND DMITRY, S. Recent advances and future directions on underwater wireless communications. Archives of Computational Methods in Engineering 27, 5 (2020), 1379–1412. 7. ALLISS, R. J. Optimizing the performance of space to ground optical communications. In FreeSpace Laser Communications XXXI (2019), vol. 10910, SPIE, pp. 24–31. 8. ALQURASHI, F. S., TRICHILI, A., SAEED, N., OOI, B. S., AND ALOUINI, M.-S. Maritime communications: A survey on enabling technologies, opportunities, and challenges. IEEE Internet of Things Journal (2022), 1–1. 9. ALTAEROS. Products. Altaeros, Accessed: Sep. 28, 2022. Available online: https://www. altaeros.com/products/. 10. ANNIE PALMER. Amazon will open 172,000-square-foot Project Kuiper internet satellite factory, 2022. Accessed: Nov. 14, 2022. 
Available online: https://www.cnbc.com/2022/10/27/ amazon-to-open-kuiper-internet-satellite-factory.html. 11. ATA, Y., AND ALOUINI, M.-S. HAPS based FSO links performance analysis and improvement with adaptive optics correction. 12. BECK, M., AND MOORE, T. Is universal broadband service impossible? arXiv preprint arXiv:2204.11300 (2022). 13. BELMEKKI, B. E. Y., AND ALOUINI, M.-S. Unleashing the potential of networked tethered flying platforms: Prospects, challenges, and applications. IEEE Open Journal of Vehicular Technology 3 (2022), 278–320. 14. BEN YAHIA, O., ERDOGAN, E., AND KURT, G. K. HAPS-assisted hybrid RF-FSO multicast communications: Error and outage analysis. IEEE Transactions on Aerospace and Electronic Systems (2022).
15. BENARBIA, T., AND KYAMAKYA, K. A literature review of drone-based package delivery logistics systems and their implementation feasibility. Sustainability 14, 1 (2021), 360. 16. BILEN, T., AHMADI, H., CANBERK, B., AND DUONG, T. Q. Aeronautical networks for inflight connectivity: A tutorial of the state-of-the-art and survey of research challenges. IEEE Access (2022). 17. BONDALAPATI, P., TIWARI, A., SAHIN, M. E., TANG, Q., SARASWAT, S., SURYAKUMAR, V., YAZDAN, A., KUSUMA, J., AND DUBEY, A. Supercell: A wide-area coverage solution using high-gain, high-order sectorized antennas on tall towers. arXiv:2012.00161 (2020). 18. BROADBAND COMMISSION FOR SUSTAINABLE DEVELOPMENT. Achieving the 2025 Advocacy Targets, 2022. Accessed: Sep. 30, 2022. Available online: https://www. broadbandcommission.org/advocacy-targets/. 19. CALVO, R. M., POLIAK, J., SUROF, J., REEVES, A., RICHERZHAGEN, M., KELEMU, H. F., BARRIOS, R., CARRIZO, C., WOLF, R., REIN, F., ET AL. Optical technologies for very high throughput satellite communications. In Free-Space Laser Communications XXXI (2019), vol. 10910, SPIE, pp. 189–204. 20. CAO, X., LI, Y., XIONG, X., AND WANG, J. Dynamic routings in satellite networks: An overview. Sensors 22, 12 (2022), 4552. 21. CHANDRASEKHARAN, S., GOMEZ, K., AL-HOURANI, A., KANDEEPAN, S., RASHEED, T., GORATTI, L., REYNAUD, L., GRACE, D., BUCAILLE, I., WIRTH, T., AND ALLSOPP, S. Designing and implementing future aerial communication networks. IEEE Communications Magazine 54, 5 (2016), 26–34. 22. CHAOUB, A., GIORDANI, M., LALL, B., BHATIA, V., KLIKS, A., MENDES, L., RABIE, K., SAARNISAARI, H., SINGHAL, A., ZHANG, N., ET AL. 6G for bridging the digital divide: Wireless connectivity to remote areas. IEEE Wireless Communications 29, 1 (2021), 160–168. 23. CHIARAVIGLIO, L., AMOROSI, L., BLEFARI-MELAZZI, N., DELL’OLMO, P., MASTRO, A. L., NATALINO, C., AND MONTI, P. Minimum cost design of cellular networks in rural areas with UAVs, optical rings, solar panels, and batteries. 
IEEE Transactions on Green Communications and Networking 3, 4 (2019), 901–918. 24. COLDEWEY, D. Facebook permanently grounds its Aquila solar-powered internet plane, 2018. TechCrunch. Accessed: Nov. 13, 2022. Available online: https://techcrunch.com/2018/06/26/facebook-permanently-grounds-its-aquila-solar-powered-internet-plane/. 25. DEL PORTILLO, I., CAMERON, B. G., AND CRAWLEY, E. F. A technical comparison of three low Earth orbit satellite constellation systems to provide global broadband. Acta Astronautica 159 (2019), 123–135. 26. DU, K., MUJUMDAR, O., OZDEMIR, O., OZTURK, E., GUVENC, I., SICHITIU, M. L., DAI, H., AND BHUYAN, A. 60 GHz outdoor propagation measurements and analysis using Facebook Terragraph radios. In IEEE Radio and Wireless Symposium (RWS) (Las Vegas, NV, USA, 2022), pp. 156–159. 27. EREZ AVIV. Wireless Backhaul Spectrum- Everything You Need To Know in 2022, 2022. Accessed: Nov. 19, 2022. Available online: https://www.ceragon.com/blog/wireless-backhaulspectrum. 28. ERKMEN, B. Beaming broadband across the Congo River, 2021. Google X. Accessed: Sep. 24, 2022. Available online: https://x.company/blog/posts/taara-beaming-broadband-acrosscongo/. 29. FENCL, M., DOHNAL, M., VALTR, P., GRABNER, M., AND BAREŠ, V. Atmospheric observations with e-band microwave links – challenges and opportunities. Atmospheric Measurement Techniques 13, 12 (2020), 6559–6578. 30. GAO, W., AND SAHOO, A. Performance impact of coexistence groups in a GAA-GAA coexistence scheme in the CBRS band. IEEE Transactions on Cognitive Communications and Networking 7, 1 (2020), 184–196. 31. GLASER, A. Here’s why Facebook’s massive drone crashed in the Arizona desert, 2016. Vox. Accessed: Nov. 13, 2022. Available online: https://www.vox.com/2016/12/18/13998900/facebooks-drone-crash-aquila-arizona-structural-failure.
32. GOOGLE’S SPECTRUM ACCESS SYSTEM. Introduction to CBRS and Spectrum Sharing, 2022. Accessed: Nov. 17, 2022. Online video: https://www.youtube.com/watch?v=5SIawaRwuIE. 33. GUO, H., LI, J., LIU, J., TIAN, N., AND KATO, N. A survey on space-air-ground-sea integrated network security in 6G. IEEE Communications Surveys & Tutorials 24, 1 (2021), 53–87. 34. HERATH, H. Starlink: a solution to the digital connectivity divide in education in the global south. arXiv preprint arXiv:2110.09225 (2021). 35. HUSSIEN, H. M., KATZIS, K., MFUPE, L. P., AND BEKELE, E. T. Capacity, coverage and power profile performance evaluation of a novel rural broadband services exploiting TVWS from high altitude platform. IEEE Open Journal of the Computer Society 3 (2022), 86–95. 36. INMARSAT. Inmarsat ORCHESTRA Technology, 2022. Accessed: Nov. 14, 2022. Online video: https://www.youtube.com/watch?v=0RtyMtIzI6I. 37. ITU TELECOMMUNICATION DEVELOPMENT BUREAU. Measuring digital development: Facts and figures 2021. Tech. rep., Geneva, Switzerland, 2021. Accessed: Sep. 30, 2022. [Online]. Available: https://www.itu.int/en/ITU-D/Statistics/Pages/facts/default.aspx. 38. KARABULUT KURT, G., KHOSHKHOLGH, M. G., ALFATTANI, S., IBRAHIM, A., DARWISH, T. S. J., ALAM, M. S., YANIKOMEROGLU, H., AND YONGACOGLU, A. A vision and framework for the high altitude platform station (HAPS) networks of the future. IEEE Communications Surveys & Tutorials 23, 2 (2021), 729–779. 39. KATZIS, K., MFUPE, L., AND HUSSIEN, H. M. Opportunities and challenges of bridging the digital divide using 5G enabled high altitude platforms and TVWS spectrum. In Eighth International Conference on Communications and Networking (ComNet) (2020), IEEE, pp. 1– 7. 40. LAHMERI, M.-A., KISHK, M. A., AND ALOUINI, M.-S. Charging techniques for UAV-assisted data collection: Is laser power beaming the answer? IEEE Communications Magazine 60, 5 (2022), 50–56. 41. LAWLER, R. 
Alphabet’s Project Taara laser tech beamed 700TB of data across nearly 5 km, 2021. The Verge. Accessed: Sep. 24, 2022. Available online: https://www.theverge.com/2021/9/16/22677015/project-taara-fsoc-wireless-internet-kinshasa-congo-fiber. 42. LIQUIDE, A. Flying whales: a new take on the airship, Accessed: Nov. 6, 2022. Available online: https://www.airliquide.com/stories/innovation/flying-whales-new-take-airship. 43. LOBNER, P. Thales alenia space – stratobus, Accessed: Nov. 6, 2022. Available online: https://lynceans.org/wp-content/uploads/2021/04/Thales-Alenia-Space_Stratobus-converted.pdf. 44. MACEVIČIŪTĖ, E., AND WILSON, T. D. Digital means for reducing digital inequality: Literature review. Informing science: the international journal of an emerging transdiscipline 21 (2018), 269–287. 45. MAREK, S. Facebook says its fiber-spinning robot will dramatically reduce costs, 2021. Fierce Telecom. Accessed: Sep. 23, 2022. Available online: https://www.fiercetelecom.com/tech/facebook-says-its-fiber-spinning-robot-will-dramatically-reduce-costs. 46. MASSARO, M., AND BELTRÁN, F. Will 5G lead to more spectrum sharing? discussing recent developments of the LSA and the CBRS spectrum sharing frameworks. Telecommunications Policy 44, 7 (2020), 101973. 47. MATRACIA, M., KISHK, M. A., AND ALOUINI, M.-S. Coverage analysis for UAV-assisted cellular networks in rural areas. IEEE Open Journal of Vehicular Technology 2 (2021), 194–206. 48. MATRACIA, M., KISHK, M. A., AND ALOUINI, M.-S. Exploiting wind-turbine-mounted base stations to enhance rural connectivity. IEEE Communications Magazine 59, 12 (2021), 50–56. 49. MATRACIA, M., KISHK, M. A., AND ALOUINI, M.-S. On the topological aspects of UAV-assisted post-disaster wireless communication networks. IEEE Communications Magazine 59, 11 (2021), 59–64. 50. MATRACIA, M., SAEED, N., KISHK, M. A., AND ALOUINI, M.-S. Post-disaster communications: Enabling technologies, architectures, and open challenges.
IEEE Open Journal of the Communications Society 3 (2022), 1177–1205. 51. MCDOWELL, J. Starlink Launch Statistics, Planet4589. Retrieved 11 September 2022. Available online: https://planet4589.org/space/stats/star/starstats.html.
52. MOGILI, U. R., AND DEEPAK, B. Review on application of drone systems in precision agriculture. Procedia computer science 133 (2018), 502–509. 53. MOHNEY, D. “Micro GEO satellite builder Astranis raises $90 million, 2020. Space IT Bridge. Accessed: Oct. 12, 2022. Available online: https://www.spaceitbridge.com/micro-geosatellite-builder-astranis-raises-90-million.htm. 54. MOHNEY, D. OneWeb talks satellite broadband speeds, constellation configs, Space IT Bridge. Accessed: 14. Oct, 2022. Available online: https://www.spaceitbridge.com/onewebtalks-satellite-broadband-speeds-constellation-configs.htm. 55. MONTENEGRO, L. O., AND ARARAL, E. Can competition-enhancing regulation bridge the quality divide in Internet provision? Telecommunications Policy 44, 1 (2020), 101836. 56. MOZAFFARI, M., SAAD, W., BENNIS, M., NAM, Y.-H., AND DEBBAH, M. A tutorial on UAVs for wireless networks: Applications, challenges, and open problems. IEEE communications surveys & tutorials 21, 3 (2019), 2334–2360. 57. MULGAONKAR, D. R., SHARMA, D., MEHROTRA, R., AND VRIND, T. Advanced mechanisms for satellite and terrestrial co-existence in 26/28 GHz mmWave spectrum. In 2020 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS) (2020), IEEE, pp. 1–6. 58. NAIK, G., LIU, J., AND PARK, J.-M. J. Coexistence of wireless technologies in the 5 GHz bands: A survey of existing solutions and a roadmap for future research. IEEE Communications Surveys & Tutorials 20, 3 (2018), 1777–1798. 59. OECD. Understanding the digital divide. OECD Digital Economy Papers (2001. [Online]. http://www.oecd.org/dataoecd/38/57/1888451.pdf). 60. OSORO, B., AND OUGHTON, E. Universal broadband assessment of low earth orbit satellite constellations: Evaluating capacity, coverage, cost, and environmental emissions. SSRN (2022. Available online: https://ssrn.com/abstract=4178732). 61. OUGHTON, E. J., COMINI, N., FOSTER, V., AND HALL, J. W. 
Policy choices can help keep 4G and 5G universal broadband affordable. Technological Forecasting and Social Change 176 (2022), 121409. 62. PATI, B. M., TAPARUGSSANAGORN, A., LERTSINSRUBTAVEE, A., AND TANSAKUL, N. Performan study of television with spaces (TVWS) pilot network in Thailand. NBTC Journal 3, 3 (2019), 98–119. 63. POLAK, L., AND MILOS, J. Performance analysis of LoRa in the 2.4 GHz ISM band: coexistence issues with Wi-Fi. Telecommunication Systems 74, 3 (2020), 299–309. 64. POTTER, B., VALENTINO, G., YATES, L., BENZING, T., AND SALMAN, A. Environmental monitoring using a drone-enabled wireless sensor network. In Systems and Information Engineering Design Symposium (SIEDS) (2019), pp. 1–6. 65. QUISPE, M., OLIVARES, J., SAMANIEGO, J., AND MORÁN, R. Technical and economic analysis of TVWS and 5.8 GHz WiFi systems for rural areas. In XXIX International Conference on Electronics, Electrical Engineering and Computing (INTERCON) (2022), IEEE, pp. 1–4. 66. RAHMAN, A. U., FOURATI, F., NGO, K.-H., JINDAL, A., AND ALOUINI, M.-S. Network graph generation through adaptive clustering and infection dynamics: A step toward global connectivity. IEEE Communications Letters 26, 4 (2022), 783–787. 67. RAHMAN, A. U., KISHK, M. A., AND ALOUINI, M.-S. Improving spectral efficiency of wireless networks through democratic spectrum sharing. arXiv preprint arXiv:2111.10570 (2021). 68. RAHMAN, A. U., KISHK, M. A., AND ALOUINI, M.-S. A game-theoretic framework for coexistence of WiFi and cellular networks in the 6-GHz unlicensed spectrum. IEEE Transactions on Cognitive Communications and Networking (2022). 69. RAY, P. P. A review on 6G for space-air-ground integrated network: Key enablers, open challenges, and future direction. Journal of King Saud University-Computer and Information Sciences (2021). 70. ROWNEY, P. TV white space technology set to transform access and affordability in Malawi, 2017. Accessed: Sep. 29, 2022. [Online]. 
https://a4ai.org/news/tv-white-space-technologyset-to-transform-access-and-affordability-in-malawi/.
71. SAEED, N., ALMORAD, H., DAHROUJ, H., AL-NAFFOURI, T. Y., SHAMMA, J. S., AND ALOUINI, M.-S. Point-to-point communication in integrated satellite-aerial 6G networks: State-of-the-art and future challenges. IEEE Open Journal of the Communications Society (2021). 72. SALAMA, M. R., AND SRINIVAS, S. Collaborative truck multi-drone routing and scheduling problem: Package delivery with flexible launch and recovery sites. Transportation Research Part E: Logistics and Transportation Review 164 (2022), 102788. 73. SAMY, R., YANG, H.-C., RAKIA, T., AND ALOUINI, M.-S. Space-air-ground FSO networks for high-throughput satellite communications. IEEE Communications Magazine (2022). 74. SARICA, D. White wind turbine on green grass field. Pexels, 2020. https://www.pexels.com/ photo/white-wind-turbine-on-green-grass-field-6251771/. 75. SATHYA, V., KALA, S. M., ROCHMAN, M. I., GHOSH, M., AND ROY, S. Standardization advances for cellular and Wi-Fi coexistence in the unlicensed 5 and 6 GHz bands. GetMobile: Mobile Computing and Communications 24, 1 (2020), 5–15. 76. SERRANO, P., GRAMAGLIA, M., MANCINI, F., CHIARAVIGLIO, L., AND BIANCHI, G. Balloons in the sky: Unveiling the characteristics and trade-offs of the google loon service. IEEE Transactions on Mobile Computing (2021), 1–1. 77. SINGYA, P. K., MAKKI, B., D’ERRICO, A., AND ALOUINI, M.-S. Hybrid FSO/THz-based backhaul network for mmWave terrestrial communication. arXiv preprint:2204.08357 (2022). 78. SPACEX. Aerial view clouds Nasa satellite. Pexels, 2017. https://www.pexels.com/photo/ aerial-view-clouds-nasa-satellite-23781/. 79. STARLINK, 2022. Accessed: Sep. 30, 2022. https://www.starlink.com/. 80. SWAMINATHAN, R., SHARMA, S., VISHWAKARMA, N., AND MADHUKUMAR, A. HAPSbased relaying for integrated space–air–ground networks with hybrid FSO/RF communication: A performance analysis. IEEE Transactions on Aerospace and Electronic Systems 57, 3 (2021), 1581–1599. 81. SWINHOE, D. 
Alphabet shuts down Internet balloon subsidiary Project Loon. Data Center Dynamics, 2021. Accessed: Sep. 30, 2022. https://www.datacenterdynamics.com/en/news/ alphabet-shuts-down-internet-balloon-subsidiary-project-loon/. 82. TALGAT, A., KISHK, M. A., AND ALOUINI, M.-S. Stochastic geometry-based analysis of LEO satellite communication systems. IEEE Communications Letters 25, 8 (2021), 2458–2462. 83. TARANA, 2022. Accessed: Nov. 20, 2022. Available online: https://www.taranawireless.com/. 84. TELCOMA GLOBAL. CBRS, 2022. Accessed: Nov. 17, 2022. Online video: https://www. youtube.com/watch?v=g5Y9Ae2Ohj8. 85. TELESAT. Telesat Lightspeed, 2022. Accessed: Nov. 14, 2022. Available online: https://www. telesat.com/leo-satellites/. 86. TIWARI, A. SuperCell: Reaching new heights for wider connectivity, 2020. Facebook Engineering. Accessed: Sep. 23, 2022. Available online: https://engineering.fb.com/2020/12/03/ connectivity/supercell-reaching-new-heights-for-wider-connectivity/. 87. TRICHILI, A., RAGHEB, A., BRIANTCEV, D., ESMAIL, M. A., ALTAMIMI, M., ASHRY, I., OOI, B. S., ALSHEBEILI, S., AND ALOUINI, M.-S. Retrofitting FSO systems in existing RF infrastructure: A non-zero-sum game technology. IEEE Open Journal of the Communications Society 2 (2021), 2597–2615. 88. UNDSETH, M., JOLLY, C., AND OLIVARI, M. Space sustainability: The economics of space debris in perspective. 89. VEHICLES, H. A. Airlander 10, Accessed: Oct. 29, 2022. Available online: https://www. hybridairvehicles.com/our-aircraft/airlander-10/. 90. VINH, Q. N. Flying white drone. Pexels, 2017. https://www.pexels.com/photo/photo-of-aflying-white-drone-2164417/. 91. VOILA. A New Type of Satellite Constellation, Accessed: 14. Oct, 2022. Available online: https://www.leosat.com/to/technology. 92. WANG, R., KISHK, M. A., AND ALOUINI, M.-S. Stochastic geometry-based low latency routing in massive LEO satellite networks. IEEE Transactions on Aerospace and Electronic Systems (2022), 1–14.
93. WANG, R., KISHK, M. A., AND ALOUINI, M.-S. Ultra-dense LEO satellite-based communication systems: A novel modeling technique. IEEE Communications Magazine 60, 4 (2022), 25–31. 94. WANG, R., TALGAT, A., KISHK, M. A., AND ALOUINI, M.-S. Conditional contact angle distribution in LEO satellite-relayed transmission. IEEE Communications Letters (2022). 95. ZHANG, N., ZHANG, S., YANG, P., ALHUSSEIN, O., ZHUANG, W., AND SHEN, X. S. Software defined space-air-ground integrated vehicular networks: Challenges and solutions. IEEE Communications Magazine 55, 7 (2017), 101–109. 96. ZONG, P., AND KOHANI, S. Design of LEO constellations with inter-satellite connects based on the performance evaluation of the three constellations SpaceX, OneWeb and Telesat. Korean Journal of Remote Sensing 37, 1 (2021), 23–40.
Part II
Enabling Technologies for 6G Communications
AI-Native Air Interface Jiajia Guo, Chao-Kai Wen, and Shi Jin
J. Guo · S. Jin (✉) National Mobile Communications Research Laboratory, Southeast University, Nanjing, China e-mail: [email protected]; [email protected] C.-K. Wen Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_6
1 Introduction The fifth-generation (5G) mobile communications standard is being developed by the 3rd Generation Partnership Project (3GPP). The first release, Release 15 (Rel-15), was completed in 2018 [37]. The subsequent releases, Rel-16 and Rel-17, were frozen in 2020 and 2022, respectively [14, 38]. These three releases belong to the first evolution stage of 5G. Rel-18, which started in May 2022, marks the start of the second evolution stage of 5G, known as 5G-Advanced [7]. Researchers in industry and academia have also begun studying the visions, use cases, and potential key enabling technologies for sixth-generation (6G) cellular networks. These future generations are expected to adopt novel technologies, such as reconfigurable intelligent surfaces [31], integrated sensing and communications [30], terahertz communications [42], and artificial intelligence (AI) [29], to meet the high communication requirements of the future. Among these technologies, AI is expected to be a key and prominent technology in 5G-Advanced and 6G and has been studied since 3GPP Rel-17, specifically in the area of AI-enabled radio access network intelligence [39]. AI has a long history, which dates back to the 1956 Dartmouth conference [33]. In 2012, the AI-based AlexNet [25] achieved great success in the ILSVRC-2012 competition, which marked the beginning of a new era in the field of AI. The last decade has seen significant advancements in AI, particularly in the area of deep
learning (DL), which has achieved considerable success in various fields, such as computer vision, natural language processing, and medical image analysis [16, 43]. The main advantages of AI are as follows:
• Unlike conventional machine learning (ML) algorithms that rely on handcrafted feature extractors, DL-based AI algorithms use neural networks (NNs) to automate the feature extraction process through an end-to-end learning approach. Thus, NNs can learn knowledge from training data without human intervention.
• Most conventional methods rely on iterative algorithms that are computationally complex. Although the NNs in AI are also complex, their inference speed is high and can be accelerated with the use of powerful graphics processing units (GPUs) or other computational accelerators.
Inspired by the success of AI in other fields, researchers have also applied it to wireless communications, particularly in the physical layer [27], for tasks such as channel estimation [44], beamforming [23], receiver design [49], channel state information (CSI) feedback [48], and end-to-end communications [34]. The industry has also made significant advancements in AI-based air interface [21]. In 2021, 3GPP proposed a new study item, “Study on Artificial Intelligence (AI)/Machine Learning (ML) for NR Air Interface” [1], which demonstrates the potential for deploying AI-based air interfaces in 5G-Advanced and 6G. The AI-native air interface has become a popular topic in academia and industry. This chapter aims to provide a guide for researchers in the field of AI-native air interface by answering several key questions, including the following:
• How can AI be combined with the air interface, and what improvements can AI bring to air interface algorithms?
• What are the main directions in AI-native air interface?
To answer these questions, the chapter uses AI-based CSI feedback, which was selected as a use case by 3GPP [1], as an example. The chapter is organized as follows: Sect. 2 introduces the ways in which AI can be applied to the air interface, including single-module enhancement and end-to-end communications. Section 3 uses CSI feedback as an example and provides an overview of the recent progress in AI-native air interface. Finally, Sect. 4 concludes the chapter and highlights future directions in the field.
2 AI in Air Interface Two main approaches are proposed to introduce AI to the air interface, namely, end-to-end communications and single-module enhancement.
2.1 End-to-End Communications A conventional communication system, in its simplest form, consists of a transmitter, a channel, and a receiver. The transmitter and the receiver contain several manually designed processing blocks, such as source coding/decoding, modulation, and equalization. Each block is independently designed and optimized, and together they make up today’s efficient and stable communication systems. However, because the blocks are optimized separately, global optimality is difficult to guarantee in conventional communication systems. From the perspective of DL, the features and parameters of a DL model can be learned directly from data by optimizing an end-to-end loss function, without manual design. The autoencoder (AE), one of the primary unsupervised learning models, consists of an encoding network and a decoding network. The encoding network learns the implicit features of the input data, and the decoding network attempts to reconstruct the original input data. Researchers have proposed an NN-based end-to-end communication system built on an autoencoder [36]. Unlike conventional communication systems, the channel AE allows the joint optimization of the transmitter and receiver for any differentiable channel model and the global optimization of the entire system without any conventional signal processing module. The general structure of the channel AE is depicted in Fig. 1, where the encoding and decoding networks, which are represented by two separate deep neural networks, correspond to the transmitter and receiver in a traditional communication system, respectively. First, the transmitted symbol s is transformed by the encoding network into encoded data x, which is then sent over the channel. Upon the reception of y, the decoding network learns to reconstruct ŝ, i.e., the estimate of the original symbol. The channel AE is trained on measured or simulated datasets by optimizing a loss function that reflects the recovery accuracy.
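The signal flow described above (s to x, through the channel to y, then to ŝ) can be sketched as a single forward pass. This is a minimal illustration under strong assumptions, not the architecture of [36]: each network is reduced to one untrained linear layer, the channel is additive white Gaussian noise, and all names and sizes (M, N, the noise level) are hypothetical. Training, i.e., updating the weights by SGD on the loss, is omitted.

```python
import math
import random


def encode(message, enc_weights):
    """Encoder: a one-hot input times a weight matrix reduces to a row lookup.
    The output is scaled to an average power of 1 per channel use."""
    x = enc_weights[message]
    norm = math.sqrt(sum(v * v for v in x))
    scale = math.sqrt(len(x)) / norm
    return [v * scale for v in x]


def awgn_channel(x, noise_std, rng):
    """Channel: adds independent Gaussian noise to every channel use."""
    return [v + rng.gauss(0.0, noise_std) for v in x]


def decode(y, dec_weights):
    """Decoder: linear layer followed by a softmax over the M messages."""
    logits = [sum(w * v for w, v in zip(row, y)) for row in dec_weights]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, message):
    """End-to-end loss: negative log-probability of the transmitted message."""
    return -math.log(max(probs[message], 1e-12))


rng = random.Random(0)
M, N = 4, 2  # hypothetical sizes: 4 messages, 2 real-valued channel uses
enc_weights = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]
dec_weights = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]

s = 2                                        # transmitted message index
x = encode(s, enc_weights)                   # encoded data
y = awgn_channel(x, noise_std=0.1, rng=rng)  # received signal
p = decode(y, dec_weights)                   # posterior over messages
loss = cross_entropy(p, s)
s_hat = max(range(M), key=lambda m: p[m])    # estimate of s

print(f"loss = {loss:.3f}, s_hat = {s_hat}")
```

Because the weights are random, ŝ need not equal s here. The point is only that the loss is computed at the receiver output yet depends on the encoder weights through the channel, which is what end-to-end training exploits.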
The channel model is a crucial aspect of channel AE. The entire network is optimized using the stochastic gradient descent (SGD) algorithm, and the backpropagation (BP) algorithm is used to calculate the gradient. The gradient must pass through the channel when backpropagating from the decoding network to the encoding network, which means that the channel transfer function, $H$, must be differentiable. However, actual communication systems suffer from impairments, such as channel noise, time-varying environments, and hardware loss, which make an accurate representation of $H$ difficult. Simplifying the channel model may lead to misleading results in the trained system. This problem has two solutions [52]. One is to assume an imperfect channel model and design the compensation structure for the impairments in advance, i.e.,
Fig. 1 End-to-end communication systems based on autoencoder (encoding network, channel, decoding network)
J. Guo et al.
Fig. 2 General architecture of the RTN-based receiver (received signal, parameter estimation network, parameters, transformation layer, decoding network, recovered message)
the model-assumed channel AE [9, 12, 35, 36, 41, 45, 51]. The other is to optimize the AE network through an approximate method that does not use the channel transfer function, that is, the model-free channel AE [4, 40, 50].
For model-assumed channel AE, impairments such as additive noise, carrier frequency and phase offset, sampling frequency and time offset, and multipath fading must be considered. Receivers based on a radio transformer network (RTN) are often used to compensate for these impairments [52]. As shown in Fig. 2, the decoding network is preceded by a parameter estimation network and a transformation layer. The parameter estimation network learns the required parameters from the received signal, and the transformation layer transforms the signal according to the learned parameters to compensate for the impairments. Thus, the use of an RTN-based receiver helps improve the practicality and robustness of the model-assumed channel AE.
For model-free channel AE, the actual system is shown in Fig. 3a. Because the actual channel transfer function is unavailable when optimizing the network, a gradient generation network (GGN) can be used to approximate either the actual channel or the transmitter gradient and then propagate the gradient to the transmitter [52]. The method of approximating the actual channel is shown in Fig. 3b. The idea of this method is to generate a differentiable surrogate channel that approximates the distribution of the actual channel so that the channel AE can be optimized over the surrogate channel. This goal can be achieved by a generative adversarial network [50]. The method of approximating the transmitter gradient is shown in Fig. 3c. The idea of this method is to transmit messages through the
Fig. 3 (a) The actual system, in which a non-differentiable actual channel lies between the transmitter and the receiver; (b) the training strategy of approximating the actual channel based on GGN, using a differentiable NN-based channel generator; (c) the training strategy of approximating transmitter gradients based on GGN, in which an approximate transmitter gradient is obtained and used to update the transmitter parameters
actual channel and then optimize the transmitter on the basis of an approximate loss function gradient. This goal can be achieved by reinforcement learning [4] or simultaneous perturbation stochastic approximation [40]. The channel AE systems implemented by the above two approaches (model-assumed channel AE and model-free channel AE) can deal with channel impairments well, and their system performance is very competitive. In the future, channel AE is expected to become a widely used architecture in wireless communication systems.
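The idea of optimizing through a non-differentiable channel with an approximate gradient can be illustrated with simultaneous perturbation stochastic approximation. The sketch below is a generic SPSA loop under stated assumptions and is not the scheme of [40]: the function `black_box_loss` is a hypothetical stand-in for the loss measured after sending messages through the actual channel, and the step size, perturbation size, and parameter dimension are illustrative.

```python
import numpy as np

def spsa_gradient(loss_fn, theta, c, rng):
    """Estimate the gradient of a black-box loss from two loss evaluations."""
    # Random +/-1 (Rademacher) perturbation applied to all parameters at once.
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss_fn(theta + c * delta) - loss_fn(theta - c * delta)
    # Division by delta_i equals multiplication by delta_i for +/-1 entries.
    return diff / (2.0 * c) * delta

def black_box_loss(theta):
    """Hypothetical end-to-end loss; only its value is observable, not its gradient."""
    target = np.array([0.5, -1.0, 2.0, 0.0])
    return float(np.sum((theta - target) ** 2))

rng = np.random.default_rng(0)
theta = np.zeros(4)                       # hypothetical transmitter parameters
initial_loss = black_box_loss(theta)
for _ in range(200):
    grad_est = spsa_gradient(black_box_loss, theta, c=0.01, rng=rng)
    theta = theta - 0.05 * grad_est       # plain gradient step on the estimate
final_loss = black_box_loss(theta)
```

Each iteration needs only two loss measurements regardless of the number of parameters, which is what makes this family of methods attractive when the channel cannot be differentiated.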
2.2 Single-Module Enhancement
As previously mentioned, conventional communication systems are composed of several communication modules. Unlike end-to-end communications, which completely change the communication architecture, AI-enabled single-module enhancement uses AI techniques to enhance a certain communication module, thereby improving the system performance. In this section, we use CSI feedback as an example to demonstrate the use of AI in enhancing a single communication module. In massive MIMO systems, the accuracy of CSI plays a crucial role in achieving the gains of massive MIMO. High spatial resolution can only be achieved if the transmitter has accurate knowledge of the propagation environment. Specifically, in downlink data transmission, the base station must design an accurate precoding matrix on the basis of the downlink CSI. Thus, the accuracy of CSI greatly affects the gains that
Fig. 4 Illustration of TDD and FDD modes
can be achieved by massive MIMO technology. Massive MIMO systems primarily use two duplexing modes: time division duplexing (TDD) and frequency division duplexing (FDD). In the TDD mode, uplink and downlink transmissions occur on the same frequency band in different time slots. In this mode, the uplink and downlink channels within a coherence time block are assumed to have good reciprocity, meaning that the downlink channel can be inferred directly from the uplink one. Consequently, as shown in Fig. 4, the base station in the TDD mode does not need to perform further downlink channel estimation after the uplink channel estimation. In contrast, the FDD mode uses different frequency bands for uplink and downlink transmissions in the same time slot. The right sub-figure of Fig. 4 shows the downlink CSI acquisition and transmission process in the FDD mode. The base station first transmits a pilot signal to the user. Then, the user estimates the downlink CSI on the basis of the received pilot signal and feeds it back to the base station via the uplink control channel. The base station then designs the downlink transmission on the basis of the received downlink CSI [32]. Compared with MIMO systems in fourth-generation (4G) mobile communications, massive MIMO systems have a significantly increased number of antennas, which in turn increases the dimensionality of the CSI. Consequently, the feedback of the downlink CSI in massive MIMO systems occupies a large share of the uplink bandwidth resources, thus seriously affecting the overall system performance. In practical systems, the high requirements for CSI feedback can hinder the deployment of FDD massive MIMO in 5G. However, the FDD mode has certain advantages over the TDD mode.
First, the coverage area of TDD base stations is typically smaller than that of FDD base stations due to the shorter uplink transmitting time in the TDD mode. Second, the TDD mode tends to have stronger intra- and inter-system interferences than the FDD mode due to the co-channel transmission and reception. Furthermore, the CSI changes rapidly in high-speed scenarios, and channel aging is more severe in the TDD mode than in the FDD mode. Therefore, an efficient CSI
Fig. 5 Illustration of AE-based image compression and CSI feedback
feedback method must be developed to facilitate the deployment of the FDD mode in massive MIMO systems. In the past decade, AI-based AE has achieved great success in various fields, including image compression. AE-based image compression has entered the standardization stage, presenting improved compression and reconstruction accuracy compared with traditional algorithms [13]. Following this success, AI-based approaches were also applied to the CSI feedback problem in 2017, with the introduction of a novel architecture called CsiNet [48]. Simulation results show that the feedback accuracy and algorithm complexity of AI-based CSI feedback are superior to those of traditional feedback methods, such as compressive sensing (CS). Figure 5 illustrates the AE architecture for image compression and CSI feedback [19]. As shown in the first sub-figure, the AE consists of two core modules, namely, the encoder and the decoder. The encoder reduces the dimensionality of the input image and generates a low-dimensional latent space representation of the original image, while the decoder reconstructs the original image from this representation. The encoder and the decoder are trained end to end, and both the input and the target output of the AE are the original images during the training process. Therefore, AE can be regarded as a kind of unsupervised learning. Inspired by AE-based image compression, AE-based CSI feedback treats the CSI matrix as a special kind of “image.” Specifically, the encoder at the user side compresses the CSI “image” to obtain the codeword to be fed back and then transmits the codeword to the base station through the
Table 1 NMSE (dB) performance comparison of different CSI feedback algorithms [48]
Compression ratio   Method      Indoor    Outdoor
1/4                 LASSO       −7.59     −5.08
1/4                 BM3D-AMP    −4.33     −1.33
1/4                 TVAL3       −14.87    −6.90
1/4                 CsiNet      −17.36    −8.75
1/16                LASSO       −2.72     −1.01
1/16                BM3D-AMP    0.26      −0.55
1/16                TVAL3       −2.61     −0.43
1/16                CsiNet      −8.65     −4.51
1/32                LASSO       −1.03     −0.24
1/32                BM3D-AMP    24.72     22.66
1/32                TVAL3       −0.27     0.46
1/32                CsiNet      −6.24     −2.81
1/64                LASSO       −0.14     −0.06
1/64                BM3D-AMP    0.22      25.45
1/64                TVAL3       0.63      −0.76
1/64                CsiNet      −5.84     −1.93
uplink control channel. Once the feedback codeword is received, the base station immediately reconstructs the original downlink CSI through the decoder. Unlike traditional algorithms, which rely on certain channel distribution assumptions, such as sparsity in CS, AE-based CSI feedback does not require manually designed assumptions, and the compression and reconstruction modules are designed automatically through end-to-end learning from large amounts of CSI data. Table 1 shows the normalized mean-squared error (NMSE) performance of the AE-based CsiNet and CS algorithms, including LASSO, BM3D-AMP, and TVAL3. The AI-based CSI feedback outperforms the traditional algorithms by a large margin under all compression ratios. When the compression ratio is extremely low, such as 1/64, the CS-based methods cannot work at all, whereas the AI method can still work well. Moreover, the AE structure of the AI-enabled CSI feedback is based on NNs, which have a high degree of parallelism. Therefore, given the support of a powerful GPU and other accelerators, the inference speed of AE-based CSI feedback is extremely high: the compression and reconstruction operations can be completed within 1 ms, which is much faster than the codebook-based and CS-based feedback methods.
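For reference, the NMSE figure of merit reported in Table 1 can be computed as sketched below. The sketch assumes the common definition in which the squared reconstruction error of each CSI sample is normalized by that sample's energy, averaged over the test set, and expressed in dB; the array shapes and the function name `nmse_db` are illustrative.

```python
import numpy as np

def nmse_db(h_true, h_hat):
    """NMSE in dB between true and reconstructed CSI.

    The first axis indexes samples; all remaining axes belong to one CSI matrix.
    """
    axes = tuple(range(1, h_true.ndim))
    err = np.sum(np.abs(h_true - h_hat) ** 2, axis=axes)    # error energy per sample
    power = np.sum(np.abs(h_true) ** 2, axis=axes)          # CSI energy per sample
    return 10.0 * np.log10(np.mean(err / power))

# Toy check: a reconstruction that is uniformly 10% too small in amplitude
# has a normalised error of 0.1**2 = 0.01 for every sample, i.e. -20 dB.
rng = np.random.default_rng(0)
H = rng.normal(size=(100, 32, 32)) + 1j * rng.normal(size=(100, 32, 32))
H_hat = 0.9 * H
toy_nmse = nmse_db(H, H_hat)
```

A value of 0 dB means the reconstruction error is as large as the CSI itself, which is why the positive entries in Table 1 correspond to methods that have effectively failed.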
3 AI-Based CSI Feedback
Although AI-enabled algorithms have achieved significant success in recent years, AI is a relatively new topic in air interface, and many issues must still be addressed
before AI-enabled algorithms can be deployed in real systems. In this section, AI-enabled CSI feedback is used as an example to showcase the recent efforts of researchers in creating an AI-native air interface. The related works can be divided into two categories. Many works focus on further improving the CSI feedback performance by introducing novel NN architectures and incorporating expert knowledge. The remaining works mainly focus on tackling the key challenges during deployment, such as joint design. In this section, we discuss the novel NN architecture design, the expert knowledge exploitation, and the joint design.
3.1 Novel NN Architecture Design
The architecture of CsiNet is shown in Fig. 6. Specifically, the encoder consists of a convolutional layer and a fully connected layer. The convolutional layer uses $3 \times 3$ convolutional kernels to extract features from the input CSI matrix and generate CSI feature maps. The two-channel feature maps are then combined, reshaped into a vector, and input to the fully connected layer, which compresses them by reducing the number of neurons. The decoder at the base station consists of a fully connected layer, two RefineNet modules, and a convolutional layer. The first fully connected layer at the base station decompresses the received codeword into two feature maps with the same dimension as the encoder input. Each RefineNet module consists of three convolutional layers with $3 \times 3$ convolutional kernels that generate 8-, 16-, and 2-channel feature maps. The idea of the residual network [22] is introduced, and the outputs of the first and last layers are summed to form the entire module’s output, thereby effectively avoiding the vanishing gradient problem. The last convolutional layer uses a Sigmoid activation function to normalize the elements of the output CSI matrix. All the convolutional layers adopted in CsiNet are padded with zeros around the input so that the output maintains the dimensionality of the original input matrix. All the convolutional layers, except the last one, use a leaky rectified linear unit (LeakyReLU)
Fig. 6 Illustration of the CsiNet architecture [48]. “Conv” and “FC” denote the convolutional layer and fully connected layer, respectively
Fig. 7 Illustration of the CsiNet+ architecture [18]. “Conv” and “FC” denote the convolutional layer and fully connected layer, respectively
as the activation function, preceded by a batch normalization (BN) operation to reduce the training difficulty. The NN is trained end to end with the mean square error (MSE) as the loss function, and the NN parameters are updated using the Adam optimization algorithm.
The CSI in the angular-delay domain usually exhibits a block-sparse structure [11]. CsiNet and the CS-based feedback methods treat this sparsity as prior information. As previously mentioned, CsiNet’s encoder first uses a convolutional layer with $3 \times 3$ kernels to extract the features of the CSI matrix that exhibit block sparsity. In natural images, a $3 \times 3$ convolutional kernel is most commonly used to extract edge information within small perceptual fields. However, the visualization of the output of the first CNN layer in CsiNet shows that most of the output coefficients are nearly zero, similar to the input, and therefore carry little information. Specifically, if the perceptual field of the convolution operation lies within a large “blank” region, the nine covered coefficients are all zero and so is the output of the $3 \times 3$ convolution; such operations are redundant and only increase the complexity. Meanwhile, if the convolution operation is performed in a completely “non-blank” region, the sparsity is not expressed. The block sparsity of the CSI is thus not effectively reflected and utilized by the $3 \times 3$ convolution operation. In [5], a novel block activation unit with a block size of 6 was introduced to handle block-sparse vectors in wideband wireless communication systems. Reference [46] points out that increasing the perceptual field of the convolutional layers can significantly improve the super-resolution performance. Inspired by Bai et al. [5] and Wang et al. [46], two convolutional layers with a kernel size of $7 \times 7$ replace the original convolutional layer with a $3 \times 3$ kernel. As shown in Fig. 7, two cascaded $7 \times 7$ convolution operations correspond to a perceptual field size of $13 \times 13$. This larger kernel size and perceptual field allow the sparsity of the CSI to be captured effectively, given the low probability that the convolution operation falls entirely within a “blank” region. Similarly, on the decoder side of the network, the RefineNet module’s two $3 \times 3$ convolutional layers are replaced by two convolutional layers with kernel sizes of $7 \times 7$ and $5 \times 5$. This improved NN architecture is called “CsiNet-M1.” Meanwhile, the last convolutional layer at the decoder of CsiNet is eliminated, allowing the RefineNet module to continuously and directly learn the residuals
Table 2 NMSE (dB) performance of AI-enabled CSI feedback with different NN architectures
Scenario   Compression ratio   CsiNet    CsiNet-M1   CsiNet-M2   CsiNet+
Indoor     1/4                 −17.36    −20.80      −24.80      −27.37
Indoor     1/8                 −12.70    −14.52      −15.23      −18.29
Indoor     1/16                −8.65     −11.77      −12.21      −14.14
Indoor     1/32                −6.24     −8.75       −8.65       −10.43
Outdoor    1/4                 −8.75     −10.14      −10.78      −12.40
Outdoor    1/8                 −7.61     −8.11       −8.55       −8.72
Outdoor    1/16                −4.51     −4.99       −4.44       −5.73
Outdoor    1/32                −2.81     −1.87       −2.78       −3.40
between the reconstructed CSI and the original CSI. This approach improves the implementation of the residual learning concept [22] and is called “CsiNet-M2.” The CsiNet architecture that includes both of these improvements is referred to as CsiNet+. Table 2 shows the NMSE performance of the AI-enabled CSI feedback with different NN architectures, including CsiNet, CsiNet-M1, CsiNet-M2, and CsiNet+. The dataset is the same as that in [48], and more simulation details can be found in [18]. The modifications made to the NN architecture can significantly improve the feedback performance, particularly at high compression ratios. For example, in the indoor scenario, the respective performance gains of CsiNet-M1 and CsiNet-M2 over CsiNet are 3.44 and 7.44 dB at the compression ratio of 1/4. Furthermore, the combination of the two modifications in CsiNet+ yields even larger gains. These simulation results demonstrate the potential of NN architecture design and indicate that advanced NN architectures can provide further performance improvements. Therefore, careful NN architecture design is crucial before deployment in practical systems.
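Two pieces of arithmetic used in this section can be checked with a few lines of code: the perceptual (receptive) field of cascaded convolutions, and the codeword length implied by a compression ratio. The sketch assumes stride-1 convolutions without dilation and the $32 \times 32 \times 2$ CSI input shown in Figs. 6 and 7; the function names are hypothetical.

```python
def receptive_field(kernel_sizes):
    """Receptive field of cascaded stride-1, undilated convolutions.

    Each k x k layer widens the field by k - 1 in each spatial dimension.
    """
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

def codeword_length(csi_shape, compression_ratio):
    """Number of codeword elements after compressing a CSI tensor."""
    n = 1
    for d in csi_shape:
        n *= d
    return int(n * compression_ratio)

single_3x3 = receptive_field([3])         # original CsiNet encoder layer
cascaded_7x7 = receptive_field([7, 7])    # the two 7 x 7 layers of CsiNet-M1
m_quarter = codeword_length((32, 32, 2), 1 / 4)
m_sixty_fourth = codeword_length((32, 32, 2), 1 / 64)
```

With these assumptions, the two cascaded $7 \times 7$ layers cover a $13 \times 13$ window, and the 2048-element CSI input shrinks to 512 elements at a compression ratio of 1/4 and to 32 elements at 1/64.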
3.2 Expert Knowledge Exploitation
The CsiNet+ framework, as previously mentioned, views the CSI matrix as an “image” and enhances the accuracy of intelligent CSI feedback through the use of novel NN structures, stacked NNs, and various advanced training algorithms. However, unlike natural images, CSI is a representation of the propagation environment, reflecting the characteristics of signal propagation and containing a vast amount of physical information and knowledge. Therefore, instead of treating CSI simply as an image, uncovering and utilizing the physical characteristics of CSI are crucial to designing a framework that is specifically geared toward intelligent CSI feedback. The physical characteristics of CSI can be considered as expert knowledge, which is primarily expressed as the correlations of CSI across multiple domains. In this section, we use the correlation among the CSI of nearby users as an example to demonstrate how to exploit expert knowledge to improve the performance of CSI feedback.
Fig. 8 Illustration of the scatterers shared by nearby users
Connectivity density, which refers to the number of users that can be served by the network per unit area, is a crucial 5G metric. It is defined for a traffic model in which 32 bytes of packet data are transmitted to each user every 10 s. The requirements on transmission rate and latency are therefore very modest, as the metric is primarily intended for large-scale Internet of Things (IoT) applications, such as smart cities. The International Telecommunication Union (ITU) set the 5G connectivity density requirement at “one million connections” in a 1 km² area. Specifically, 5G must be able to support over one million users in a 1 km² area without compromising the aforementioned transmission requirements. According to the first international 6G white paper [26], the number of users per cubic meter may even increase to hundreds in the future 6G network. Such an extremely high connectivity density poses a great challenge to the spatial spectral efficiency of the communication system. From the perspective of CSI feedback, a highly accurate CSI is required to meet this challenge. In the case of beamforming, the beam must be highly precise and narrowly focused to avoid inter-user interference in a densely distributed scenario. In summary, the large increase in user density presents a significant challenge to CSI feedback but also provides additional opportunities. Given the high user density, the users are in close proximity to each other. In this scenario, nearby users may have similar propagation environments and share most of the scatterers, and their CSI is correlated. Such correlation can be fully exploited to reduce the feedback overhead of users or, at a given overhead, to improve the feedback accuracy. As illustrated in Fig. 8, two users are likely to share the same scatterer and similar deterministic multipath components if they are situated in close proximity to each other.
According to the theoretical analysis and practical measurements in [3, 8, 10, 15, 24], some CSI parameters of nearby users are the same or similar.
Fig. 9 Illustration of the CoCsiNet framework
The CSI magnitudes in the angular domain are highly correlated, but the correlation between the CSI phases in the angular domain is low. In [8], to take advantage of the shared sparse structure in the CSI vector, the CSI sparse vector is divided into two parts: shared and individual sparse representation vectors. Inspired by this work, this section separates the information of the CSI magnitude ($|H_i|$) in the angular domain of the $i$-th user into two parts: information shared with nearby users and individual information. To take advantage of the correlation among nearby users, the AE-based CSI feedback framework should be modified. The computational and storage capacity of the user is limited. Therefore, this work does not make any modifications to the encoder, and the proposed novel framework, CoCsiNet, only modifies the NN framework at the base station. According to the aforementioned division of information into two parts, the decoder at the base station consists of two types of modules, namely, shared and individual modules, which are used to reconstruct the shared and individual information, respectively. Figure 9 illustrates the proposed CoCsiNet framework, which comprises encoders at the user side and a decoder at the base station side. At the user side, the encoders of the two users generate bitstreams through NN layers followed by a binarization (or quantization) operation. Once the bitstreams are generated, the users feed them back to the base station via the uplink control channel. Then, the shared decoder at the base station reconstructs the CSI magnitude information shared by the nearby users from the bitstreams fed back by the two users. Meanwhile, the two individual decoders recover the CSI magnitude information unique to each of the two users. Finally, the final CSI
Table 3 NMSE (dB) performance comparison among CS, AE, and CoCsiNet (channel model: 3GPP_38.901_UMi_NLOS scenario [2])
BPD        0.05      0.1       0.2       0.3       0.4       0.5       0.6
CS         0.65      0.76      0.74      0.60      0.38      0.13      −0.12
AE         −8.71     −10.01    −11.36    −12.04    −13.00    −13.65    −14.10
CoCsiNet   −10.62    −11.31    −13.57    −14.98    −15.43    −15.82    −16.38
magnitude is reconstructed by two combination modules. The network architecture of the shared decoder is similar to that of the individual decoder, except that the input of the shared decoder is the feedback bitstreams of all users, while the input to each individual decoder is only the feedback bitstream of its corresponding user. The combination module consists of fully connected layers. The proposed CoCsiNet framework incorporates both explicit and implicit feedback cooperation. The explicit cooperation is the shared decoder at the base station, which uses the feedback from all users to recover the shared CSI information. Implicit cooperation is present in the compression process of the encoder. In the NN framework without cooperation, every user must redundantly feed back the shared CSI information. In the proposed CoCsiNet framework, the encoders of different users cooperatively feed back the shared CSI information. The details of the proposed CoCsiNet framework can be found in [20]. Table 3 shows the NMSE performance comparison of CS, AE, and CoCsiNet. The CSI dataset is generated by the QuaDRiGa software, and the channel model is 3GPP_38.901_UMi_NLOS [2]. Other simulation details can be found in [20]. In this part, bit per dimension (BPD) is used to represent the feedback overhead. The table indicates that the AE-based CSI feedback outperforms the traditional CS-based method by a large margin, thereby confirming the effectiveness of AI in CSI feedback. Additionally, the proposed CoCsiNet performs better than the AE without expert knowledge exploitation under all BPDs. This simulation result shows that the exploitation of expert knowledge can considerably improve the CSI feedback performance.
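The decoder-side data flow described above (one shared decoder fed with all users' bitstreams, one individual decoder per user, and one combination module per user) can be sketched as follows. This is not the implementation of [20]: the layer sizes `N` and `B`, the single dense layer per module, the tanh activation, and the random untrained weights are all hypothetical and serve only to show which inputs reach which module.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64   # length of each user's CSI magnitude vector (hypothetical)
B = 16   # feedback bits per user (hypothetical)

def dense(n_in, n_out):
    """Create one fully connected layer with random, untrained weights."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def forward(x, layer):
    """Apply a fully connected layer followed by a tanh activation."""
    W, b = layer
    return np.tanh(x @ W + b)

shared_decoder = dense(2 * B, N)                  # sees the bits of both users
individual_decoders = [dense(B, N), dense(B, N)]  # each sees one user's bits
combiners = [dense(2 * N, N), dense(2 * N, N)]    # fully connected combination modules

def cocsinet_decode(bits_user1, bits_user2):
    """Return the reconstructed CSI magnitudes of user 1 and user 2."""
    all_bits = np.concatenate([bits_user1, bits_user2])
    shared = forward(all_bits, shared_decoder)    # information shared by nearby users
    outputs = []
    for bits, dec, comb in zip((bits_user1, bits_user2),
                               individual_decoders, combiners):
        individual = forward(bits, dec)           # information unique to this user
        outputs.append(forward(np.concatenate([shared, individual]), comb))
    return outputs

bits1 = rng.integers(0, 2, size=B).astype(float)
bits2 = rng.integers(0, 2, size=B).astype(float)
mag1, mag2 = cocsinet_decode(bits1, bits2)
```

Because the shared decoder consumes both bitstreams, the reconstruction for user 1 depends on what user 2 feeds back, which is the explicit cooperation described in the text.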
3.3 Joint Design
The optimization goal of most existing CSI feedback methods is to reconstruct CSI as accurately as possible while disregarding the physical meaning of CSI. Specifically, most works aim to reduce the MSE between the reconstructed CSI and the original CSI, such as CsiNet [48]. However, the MSE is not always a reliable measure of signal fidelity [47]. In massive MIMO, the most crucial information in CSI is the angle and gain of each path. However, the MSE treats all information equally and sometimes preserves irrelevant information at the expense of important information. Consequently, situations where the performance of the
Fig. 10 Illustration of the CsiFBnet framework
overall communication system is poor despite a low MSE may exist. To evaluate the quality of the CSI reconstructed at the base station more accurately, the quality of the feedback can be assessed based on the impact of its accuracy on the subsequent communication modules, such as beamforming. In this section, the joint intelligent CSI feedback and beamforming design scheme is investigated, focusing on reducing the overhead and improving the final system performance through a joint design approach. In beamforming design, spectral efficiency is widely used to evaluate the beamforming performance, as seen in [6], which aims to maximize the spectral efficiency of the system. The system spectral efficiency can be expressed as

$$R = \log_2\left(1 + \frac{|\mathbf{h}^H \mathbf{w}|^2}{\sigma^2}\right), \tag{1}$$

where $\mathbf{h}$ represents the channel vector, $\mathbf{w}$ is the beamforming vector, and $\sigma^2$ denotes the variance of the zero-mean complex additive white Gaussian noise. The beamforming vector $\mathbf{w}$ can be written as

$$\mathbf{w} = \mathbf{v}_{\mathrm{RF}} v_{\mathrm{D}}, \tag{2}$$

where $\mathbf{v}_{\mathrm{RF}}$ and $v_{\mathrm{D}}$ represent the analog and digital precoders, respectively. Given a constant modulus constraint on the analog precoder, that is, $|[\mathbf{v}_{\mathrm{RF}}]_i|^2 = 1$, and a maximum transmit power constraint, that is, $|\mathbf{w}|^2 \le P$, the optimal $v_{\mathrm{D}}$ is $v_{\mathrm{D}} = \sqrt{P/N_t}$. Hence, the beamforming optimization problem of spectral efficiency turns into the optimization of $\mathbf{v}_{\mathrm{RF}}$:

$$\underset{\mathbf{v}_{\mathrm{RF}}}{\text{maximize}} \quad \log_2\left(1 + \frac{\rho}{N_t}\left\|\mathbf{h}^H \mathbf{v}_{\mathrm{RF}}\right\|^2\right), \tag{3}$$

where $\rho = P/\sigma^2$ denotes the downlink SNR. As shown in Fig. 10, the proposed joint CSI feedback and beamforming design framework, that is, CsiFBnet, contains three main modules: encoder, quantizer, and
decoder. The encoder and decoder modules consist of fully connected layers. The first two fully connected layers of the encoder are used to extract CSI features, and the last one compresses the original downlink CSI by reducing the number of neurons. In the conventional feedback strategy, the base station first reconstructs the downlink CSI on the basis of the feedback codewords and then designs the beamforming vector according to the reconstructed downlink CSI. However, the decoder of CsiFBnet is different from that adopted in the conventional feedback strategy. The CsiFBnet framework directly generates the analog beamforming vector, $\mathbf{v}_{\mathrm{RF}}$, on the basis of the feedback codewords. As required in Eq. (3), the analog beamforming vector, $\mathbf{v}_{\mathrm{RF}}$, must satisfy the constant modulus constraint. However, the output of a fully connected layer cannot satisfy this requirement regardless of the activation function used. Inspired by Lin and Zhu [28], an extra Lambda layer is added after the last fully connected layer of the decoder. Let $\boldsymbol{\theta}_{\mathrm{RF}}$ represent the output of the last fully connected layer; it physically corresponds to the phase of the analog beamforming vector. The output of the Lambda layer is defined as

$$\mathbf{v}_{\mathrm{RF}} = \exp(j \cdot \boldsymbol{\theta}_{\mathrm{RF}}) = \cos(\boldsymbol{\theta}_{\mathrm{RF}}) + j \cdot \sin(\boldsymbol{\theta}_{\mathrm{RF}}), \tag{4}$$

where $j = \sqrt{-1}$. Therefore, the entire CsiFBnet framework can be formulated as

$$\mathbf{v}_{\mathrm{RF}} = \exp\left(j \cdot f_{\mathrm{de},s}\left(Q\left(f_{\mathrm{en},s}(\mathbf{h}, \Theta_{\mathrm{en},s})\right), \Theta_{\mathrm{de},s}\right)\right), \tag{5}$$

where $f_{\mathrm{en},s}(\cdot)$ and $f_{\mathrm{de},s}(\cdot)$ respectively represent the CSI compression (encoder) and beamforming generation (decoder) processes, $\Theta_{\mathrm{en},s}$ and $\Theta_{\mathrm{de},s}$ denote the parameters of the corresponding NNs, and $Q(\cdot)$ represents the quantization operation. To maximize the rate in Eq. (1), the loss function is defined as

$$\mathrm{Loss}_s = -\left\|\mathbf{h}^H \exp\left(j \cdot f_{\mathrm{de},s}\left(Q\left(f_{\mathrm{en},s}(\mathbf{h}, \Theta_{\mathrm{en},s})\right), \Theta_{\mathrm{de},s}\right)\right)\right\|^2. \tag{6}$$
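The role of the Lambda layer and of the rate-oriented loss can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the CsiFBnet implementation of [17]: the decoder output is replaced by hand-picked phase vectors, the loss is taken as the negative squared beamforming gain so that it matches the objective of Eq. (3), and the values of `n_t` and `rho` are arbitrary.

```python
import numpy as np

def lambda_layer(theta_rf):
    """Map real phases to a unit-modulus analog beamforming vector, as in Eq. (4)."""
    return np.exp(1j * theta_rf)

def spectral_efficiency(h, v_rf, rho):
    """Rate of Eq. (3): log2(1 + rho / N_t * |h^H v_RF|^2)."""
    n_t = h.shape[0]
    gain = np.abs(np.vdot(h, v_rf)) ** 2   # np.vdot conjugates its first argument
    return np.log2(1.0 + rho / n_t * gain)

def loss(h, theta_rf):
    """Negative squared beamforming gain (assumed form of the training loss)."""
    return -np.abs(np.vdot(h, lambda_layer(theta_rf))) ** 2

rng = np.random.default_rng(0)
n_t = 32                                   # transmit antennas (illustrative)
rho = 10.0                                 # downlink SNR on a linear scale (illustrative)
h = (rng.normal(size=n_t) + 1j * rng.normal(size=n_t)) / np.sqrt(2)

theta_random = rng.uniform(-np.pi, np.pi, size=n_t)   # stand-in for an untrained decoder
theta_matched = np.angle(h)                           # phases aligned with the channel

rate_random = spectral_efficiency(h, lambda_layer(theta_random), rho)
rate_matched = spectral_efficiency(h, lambda_layer(theta_matched), rho)
```

Any real-valued phase vector yields entries of modulus one after the Lambda layer, so the constant modulus constraint holds by construction. With perfectly aligned phases the gain reaches the square of the sum of the channel magnitudes, which is the upper limit that a trained decoder working from a few feedback bits can only approach.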
Unlike the traditional AE-based CSI feedback methods, the training of CsiFBnet does not require labeled data, making it an unsupervised learning process. Other information about the proposed CsiFBnet framework can be found in [17]. As seen in Fig. 11, the proposed CsiFBnet framework demonstrates superior spectral efficiency over the baseline algorithms. The simulation utilized the 3GPP spatial channel model with $N_t$ transmit antennas at the base station and a single receiving antenna at the user side. Other details on the simulation parameters and setup can be found in [17]. A comparison was made between CsiFBnet and three other methods: Baseline-1, Baseline-2, and Baseline-3. Baseline-1 uses an AE to feed back the downlink CSI and traditional beamforming to design beamforming vectors, Baseline-2 uses CS to compress and reconstruct the downlink CSI and traditional beamforming to design beamforming vectors, and Baseline-3 uses an
Fig. 11 Spectral efficiency performance comparison among CsiFBnet and baseline algorithms (horizontal axis: feedback bits; vertical axis: spectral efficiency in bits/Hz/s; curves: CsiFBnet, Baseline-1, Baseline-2, and Baseline-3, each for Nt = 32 and Nt = 64)
AE to compress and reconstruct the downlink CSI and neural networks to design beamforming vectors. Baseline-2 performs the worst because of the low-quality CSI achieved by CS. Our results show that Baseline-1 outperforms Baseline-3, which contradicts the simulation results reported in [28]. In [28], the estimated CSI quality was found to be low, and the gap between conventional and AI-based algorithms decreased as the estimated CSI quality improved. However, the beamforming design in [28] assumes a perfect CSI. Therefore, when using a perfect CSI, the AI-based algorithm outperforms the conventional algorithm in [28]. As a result, Baseline-1 performs better than Baseline-3. Additionally, CsiFBnet consistently outperforms the baseline algorithms. When the feedback overhead is limited, the performance gap between CsiFBnet and the baseline algorithms is small. For example, when using 4 feedback bits and 64 antennas, CsiFBnet improves the performance by only 0.14 bits/Hz/s compared with Baseline-1. This is because, when the number of feedback bits is low (e.g., 4), only a limited number of beamforming vectors can be selected using a codebook-based approach, resulting in poor performance regardless of the algorithm used. The performance gain of CsiFBnet increases with the number of feedback bits because the feedback NN in the baseline algorithms cannot selectively feed back useful information, whereas CsiFBnet only feeds back information that is useful for the beamforming design. Overall, these simulation results demonstrate the considerable potential of the CsiFBnet framework.
4 Conclusion and Future Work

This chapter provides an introduction to the background of AI-native air interface and highlights two representative examples of AI-enabled works in the field: end-to-end communication and single-module enhancement. Using AI-enabled CSI feedback as an example, the chapter also presents recent developments in the field. Despite the notable successes and potential of AI in air interface, several challenges must still be addressed to fully realize its potential in 6G. These challenges include the following:

• The performance of AI-enabled algorithms should be evaluated in the real world, as current evaluations are based on simulations.
• AI-enabled algorithms face generalizability challenges from the mismatch between the training and inference dataset distributions. This concern highlights the need to develop robust algorithms and carefully design online training to improve generalizability.
• The limitations of computational power in baseband processors and the increased complexity of AI-enabled algorithms pose a challenge to meeting the real-time requirements of air interface algorithms and call for reducing the complexity without sacrificing performance.
• Currently, the impact of AI on standardization, particularly considering the need for dataset collection and online training, is not considered in conventional 3GPP standards. New protocols for these and other considerations will need to be developed to support AI-enabled algorithms.

Despite these challenges, AI will clearly play a crucial role in the air interface of 6G. These challenges can be addressed by introducing novel AI techniques and incorporating expert knowledge in the field of communication.

Acknowledgments This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grants 62261160576 and 61921004.
References

1. 3GPP RP-213599: New SI: Study on artificial intelligence (AI)/Machine Learning (ML) for NR air interface. Tech. rep., Moderator (Qualcomm) (2021). https://www.3gpp.org/ftp/tsg_ran/TSG_RAN/TSGR_94e/Docs/RP-213599.zip. Accessed on Apr. 22, 2023
2. 3GPP TR 38.901: Study on channel model for frequencies from 0.5 to 100 GHz (release 16). Tech. rep. https://www.etsi.org/deliver/etsi_tr/138900_138999/138901/16.01.00_60/tr_138901v160100p.pdf. Accessed on Apr. 22, 2023
3. Algans, A., Pedersen, K., Mogensen, P.: Experimental analysis of the joint statistical properties of azimuth spread, delay spread, and shadow fading. IEEE Journal on Selected Areas in Communications 20(3), 523–531 (2002). https://doi.org/10.1109/49.995511
4. Aoudia, F.A., Hoydis, J.: Model-free training of end-to-end communication systems. IEEE Journal on Selected Areas in Communications 37(11), 2503–2516 (2019). https://doi.org/10.1109/JSAC.2019.2933891
5. Bai, Y., Ai, B., Chen, W.: Deep learning based fast multiuser detection for massive machine-type communication. In: 2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall), pp. 1–5 (2019). https://doi.org/10.1109/VTCFall.2019.8891371
6. Bhagavatula, R., Heath, R.W.: Adaptive limited feedback for sum-rate maximizing beamforming in cooperative multicell systems. IEEE Transactions on Signal Processing 59(2), 800–811 (2011). https://doi.org/10.1109/TSP.2010.2090346
7. Chen, W., Montojo, J., Lee, J., Shafi, M., Kim, Y.: The standardization of 5G-Advanced in 3GPP. IEEE Communications Magazine 60(11), 98–104 (2022). https://doi.org/10.1109/MCOM.005.2200074
8. Dai, J., Liu, A., Lau, V.K.N.: Joint channel estimation and user grouping for massive MIMO systems. IEEE Transactions on Signal Processing 67(3), 622–637 (2019). https://doi.org/10.1109/TSP.2018.2883852
9. Dorner, S., Cammerer, S., Hoydis, J., Brink, S.t.: Deep learning based communication over the air. IEEE Journal of Selected Topics in Signal Processing 12(1), 132–143 (2018). https://doi.org/10.1109/JSTSP.2017.2784180
10. Du, X., Sabharwal, A.: Massive MIMO channels with inter-user angle correlation: Open-access dataset, analysis and measurement-based validation. IEEE Transactions on Vehicular Technology 71(2), 1602–1616 (2022). https://doi.org/10.1109/TVT.2021.3131606
11. Eldar, Y.C., Bolcskei, H.: Block-sparsity: Coherence and efficient recovery. In: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2885–2888 (2009). https://doi.org/10.1109/ICASSP.2009.4960226
12. Felix, A., Cammerer, S., Dorner, S., Hoydis, J., Ten Brink, S.: OFDM-autoencoder for end-to-end learning of communications systems. In: 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 1–5 (2018). https://doi.org/10.1109/SPAWC.2018.8445920
13. Foessel, S., Ascenso, J., da Silva Cruz, L.A., Ebrahimi, T., Lemieux, P.A., Pagliari, C., Pinheiro, A.M.G., Sneyers, J., Temmermanns, F.: JPEG status and progress report 2022. SMPTE Motion Imaging Journal 131(8), 111–119 (2022). https://doi.org/10.5594/JMI.2022.3190917
14. Ghosh, A., Maeder, A., Baker, M., Chandramouli, D.: 5G evolution: A view on 5G cellular technology beyond 3GPP release 15. IEEE Access 7, 127639–127651 (2019). https://doi.org/10.1109/ACCESS.2019.2939938
15. Gudmundson, M.: Correlation model for shadow fading in mobile radio systems. Electronics Letters 27(23), 2145–2146 (1991). https://doi.org/10.1049/el:19911328
16. Guo, J., Du, H., Zhu, J., Yan, T., Qiu, B.: Relative location prediction in CT scan images using convolutional neural networks. Computer Methods and Programs in Biomedicine 160, 43–49 (2018). https://doi.org/10.1016/j.cmpb.2018.03.025
17. Guo, J., Wen, C.K., Jin, S.: Deep learning-based CSI feedback for beamforming in single- and multi-cell massive MIMO systems. IEEE Journal on Selected Areas in Communications 39(7), 1872–1884 (2021). https://doi.org/10.1109/JSAC.2020.3041397
18. Guo, J., Wen, C.K., Jin, S., Li, G.Y.: Convolutional neural network-based multiple-rate compressive sensing for massive MIMO CSI feedback: Design, simulation, and analysis. IEEE Transactions on Wireless Communications 19(4), 2827–2840 (2020). https://doi.org/10.1109/TWC.2020.2968430
19. Guo, J., Wen, C.K., Jin, S., Li, G.Y.: Overview of deep learning-based CSI feedback in massive MIMO systems. IEEE Transactions on Communications 70(12), 8017–8045 (2022). https://doi.org/10.1109/TCOMM.2022.3217777
20. Guo, J., Yang, X., Wen, C.K., Jin, S., Li, G.Y.: Deep learning–based CSI feedback and cooperative recovery in massive MIMO. arXiv preprint arXiv:2003.03303 (2020). https://arxiv.org/abs/2003.03303
21. Han, S., Xie, T., I, C.L., Chai, L., Liu, Z., Yuan, Y., Cui, C.: Artificial-intelligence-enabled air interface for 6G: Solutions, challenges, and standardization impacts. IEEE Communications Magazine 58(10), 73–79 (2020). https://doi.org/10.1109/MCOM.001.2000218
22. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
23. Huang, H., Peng, Y., Yang, J., Xia, W., Gui, G.: Fast beamforming design via deep learning. IEEE Transactions on Vehicular Technology 69(1), 1065–1069 (2020). https://doi.org/10.1109/TVT.2019.2949122
24. Kaltenberger, F., Gesbert, D., Knopp, R., Kountouris, M.: Correlation and capacity of measured multi-user MIMO channels. In: 2008 IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1–5 (2008). https://doi.org/10.1109/PIMRC.2008.4699493
25. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Communications of the ACM 60(6), 84–90 (2017). https://doi.org/10.1145/3065386
26. Latva-aho, M., Leppänen, K., Clazzer, F., Munari, A.: Key drivers and research challenges for 6G ubiquitous wireless intelligence (2020)
27. Letaief, K.B., Chen, W., Shi, Y., Zhang, J., Zhang, Y.J.A.: The roadmap to 6G: AI empowered wireless networks. IEEE Communications Magazine 57(8), 84–90 (2019). https://doi.org/10.1109/MCOM.2019.1900271
28. Lin, T., Zhu, Y.: Beamforming design for large-scale antenna arrays using deep learning. IEEE Wireless Communications Letters 9(1), 103–107 (2020). https://doi.org/10.1109/LWC.2019.2943466
29. Lin, X., Chen, M., Rydén, H., Jeong, J., Lee, H., Sundberg, M., Timo, R., Razaghi, H.S., Poor, H.V.: Fueling the next quantum leap in cellular networks: Embracing AI in 5G evolution towards 6G. arXiv preprint arXiv:2111.10663 (2021)
30. Liu, F., Cui, Y., Masouros, C., Xu, J., Han, T.X., Eldar, Y.C., Buzzi, S.: Integrated sensing and communications: Toward dual-functional wireless networks for 6G and beyond. IEEE Journal on Selected Areas in Communications 40(6), 1728–1767 (2022). https://doi.org/10.1109/JSAC.2022.3156632
31. Liu, R., Alexandropoulos, G.C., Wu, Q., Jian, M., Liu, Y.: How can reconfigurable intelligent surfaces drive 5G-Advanced wireless networks: A standardization perspective. In: 2022 IEEE/CIC International Conference on Communications in China (ICCC Workshops), pp. 221–226 (2022). https://doi.org/10.1109/ICCCWorkshops55477.2022.9896658
32. Lu, L., Li, G.Y., Swindlehurst, A.L., Ashikhmin, A., Zhang, R.: An overview of massive MIMO: Benefits and challenges. IEEE Journal of Selected Topics in Signal Processing 8(5), 742–758 (2014). https://doi.org/10.1109/JSTSP.2014.2317671
33. McCarthy, J., Minsky, M.L., Rochester, N., Shannon, C.E.: A proposal for the Dartmouth summer research project on artificial intelligence, August 31, 1955. AI Magazine 27(4), 12–12 (2006). https://doi.org/10.1609/aimag.v27i4.1904
34. O’Shea, T., Hoydis, J.: An introduction to deep learning for the physical layer. IEEE Transactions on Cognitive Communications and Networking 3(4), 563–575 (2017). https://doi.org/10.1109/TCCN.2017.2758370
35. O’Shea, T.J., Erpek, T., Clancy, T.C.: Deep learning based MIMO communications. CoRR (2017). http://arxiv.org/abs/1707.07980
36. O’Shea, T.J., Karra, K., Clancy, T.C.: Learning to communicate: Channel auto-encoders, domain specific regularizers, and attention. In: 2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pp. 223–228 (2016). https://doi.org/10.1109/ISSPIT.2016.7886039
37. Patzold, M.: 5G is coming around the corner [mobile radio]. IEEE Vehicular Technology Magazine 14(1), 4–10 (2019). https://doi.org/10.1109/MVT.2018.2884042
38. Peisa, J., Persson, P., Parkvall, S., Dahlman, E., Grøvlen, A., Hoymann, C., Gerstenberger, D.: 5G evolution: 3GPP releases 16 & 17 overview. Ericsson Technology Review 2020(2), 2–13 (2020). https://doi.org/10.23919/ETR.2020.9904659
39. Rahman, I., Razavi, S.M., Liberg, O., Hoymann, C., Wiemann, H., Tidestav, C., Schliwa-Bertling, P., Persson, P., Gerstenberger, D.: 5G evolution toward 5G Advanced: An overview of 3GPP releases 17 and 18. Ericsson Technology Review 2021(14), 2–12 (2021). https://doi.org/10.23919/ETR.2021.9904665
40. Raj, V., Kalyani, S.: Backpropagating through the air: Deep learning at physical layer without channel models. IEEE Communications Letters 22(11), 2278–2281 (2018). https://doi.org/10.1109/LCOMM.2018.2868103
41. Raj, V., Kalyani, S.: Design of communication systems using deep learning: A variational inference perspective. IEEE Transactions on Cognitive Communications and Networking 6(4), 1320–1334 (2020). https://doi.org/10.1109/TCCN.2020.2985371
42. Rappaport, T.S., Xing, Y., Kanhere, O., Ju, S., Madanayake, A., Mandal, S., Alkhateeb, A., Trichopoulos, G.C.: Wireless communications and applications above 100 GHz: Opportunities and challenges for 6G and beyond. IEEE Access 7, 78729–78757 (2019). https://doi.org/10.1109/ACCESS.2019.2921522
43. Schmidhuber, J.: Deep learning in neural networks: An overview. Neural Networks 61, 85–117 (2015). https://doi.org/10.1016/j.neunet.2014.09.003
44. Soltani, M., Pourahmadi, V., Mirzaei, A., Sheikhzadeh, H.: Deep learning-based channel estimation. IEEE Communications Letters 23(4), 652–655 (2019). https://doi.org/10.1109/LCOMM.2019.2898944
45. Stark, M., Ait Aoudia, F., Hoydis, J.: Joint learning of geometric and probabilistic constellation shaping. In: 2019 IEEE Globecom Workshops (GC Wkshps), pp. 1–6 (2019). https://doi.org/10.1109/GCWkshps45667.2019.9024567
46. Wang, Q., Fan, H., Cong, Y., Tang, Y.: Large receptive field convolutional neural network for image super-resolution. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 958–962 (2017). https://doi.org/10.1109/ICIP.2017.8296423
47. Wang, Z., Bovik, A.C.: Mean squared error: Love it or leave it? A new look at signal fidelity measures. IEEE Signal Processing Magazine 26(1), 98–117 (2009). https://doi.org/10.1109/MSP.2008.930649
48. Wen, C.K., Shih, W.T., Jin, S.: Deep learning for massive MIMO CSI feedback. IEEE Wireless Communications Letters 7(5), 748–751 (2018). https://doi.org/10.1109/LWC.2018.2818160
49. Ye, H., Li, G.Y., Juang, B.H.: Power of deep learning for channel estimation and signal detection in OFDM systems. IEEE Wireless Communications Letters 7(1), 114–117 (2018). https://doi.org/10.1109/LWC.2017.2757490
50. Ye, H., Liang, L., Li, G.Y., Juang, B.H.: Deep learning-based end-to-end wireless communication systems with conditional GANs as unknown channels. IEEE Transactions on Wireless Communications 19(5), 3133–3143 (2020). https://doi.org/10.1109/TWC.2020.2970707
51. Zhang, H., Lan, M., Huang, J., Huang, C., Cui, S.: Noncoherent energy-modulated massive SIMO in multipath channels: A machine learning approach. IEEE Internet of Things Journal 7(9), 8263–8270 (2020). https://doi.org/10.1109/JIOT.2020.2989078
52. Zou, C., Yang, F., Song, J., Han, Z.: Channel autoencoder for wireless communication: State of the art, challenges, and trends. IEEE Communications Magazine 59(5), 136–142 (2021). https://doi.org/10.1109/MCOM.001.2000804
Waveform and Modulation Design of Terahertz Communications

Yongzhi Wu and Chong Han

Y. Wu · C. Han, Shanghai Jiao Tong University, Shanghai, China. e-mail: [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_7

1 Introduction

Terahertz (THz) band (0.1–10 THz) communication is envisioned as one of the key enabling technologies in sixth-generation (6G) wireless systems to satisfy the revolutionary enhancement of data transmission rate and connectivity density. On the one hand, the peak data rate is expected to reach terabit-per-second (Tbps) within the next decade [1]. On the other hand, thanks to the spread of the Internet of Things (IoT) paradigm, an ultra-massive number of devices, e.g., on a scale of tens of billions, would be connected [2]. Millimeter-wave (mmWave) communications (30–300 GHz) below 100 GHz have been officially adopted in recent 5G cellular systems [3], but it is still difficult for mmWave systems to support Tbps data rates, because the total contiguous available bandwidth below 100 GHz is less than 10 GHz. Following the trend toward higher carrier frequencies, the Federal Communications Commission has recently opened up a new category of experimental licenses for 6G communication systems or beyond between 95 GHz and 3 THz [4]. Besides, 275–450 GHz has been identified by the World Radiocommunication Conference 2019 (WRC-19) for land mobile and fixed service applications [5]. The THz band reveals its potential as a key wireless technology to fulfill the future demands of 6G wireless systems, thanks to four strengths: (1) tens and up to hundreds of GHz of contiguous bandwidth, (2) picosecond-level symbol duration, (3) integration of thousands of submillimeter-long antennas, and (4) ease of coexistence with other regulated and standardized spectrum [6]. As shown in Fig. 1, the THz band, known as the THz gap for many years, has traditionally been one of the least explored frequency bands in the electromagnetic (EM) spectrum, mostly owing to the lack of efficient THz transceivers and antennas. Nevertheless, major progress over the last ten years has enabled practical THz communication systems and is closing this gap [7–9].

Fig. 1 The spectrum and wavelength of the THz band

The THz spectrum can resolve the spectrum scarcity problem and tremendously enhance current wireless system capacity [10]. Various promising applications are envisaged, such as Tbps WLAN systems (Tera-WiFi) for the metaverse, Tbps Internet of Things (Tera-IoT) in wireless data centers, Tbps integrated access backhaul (Tera-IAB) wireless networks, and ultra-broadband THz space communications (Tera-SpaceCom). Besides these macro-/micro-scale applications, the THz band can be utilized for wireless connections in nanomachine networks, to enable wireless networks-on-chip communications (WiNoC) and the Internet of Nano-Things (IoNT), motivated by the state-of-the-art nanoscale transceivers and antennas that oscillate in the THz band.

Despite the great promise of THz networks, stringent challenges are encountered in the THz band as a result of the distinctive features of THz wave propagation. First, compared to the lower-frequency bands, including the ultrahigh-frequency (UHF) and mmWave bands, the THz band experiences much higher free-space path loss, which increases quadratically with frequency. Even a one-meter communication distance at 0.67 THz leads to a free-space path loss close to 90 dB. In this case, directional antennas with high antenna gains, enabled by beamforming and ultra-massive multiple-input multiple-output (UM-MIMO) techniques, are used to compensate for the severe path loss [11]. Second, atmospheric effects may attenuate THz wave propagation and cause molecular absorption loss, which is particularly pronounced at some THz frequencies and is not observed at lower-frequency bands [12]. Third, reflection and scattering losses in the THz band become stronger [13], which decreases the number of dominant rays, since the signal power of a THz wave becomes very weak when it is reflected or scattered two or more times. Furthermore, THz waves nearly lose the ability to pass through blockages due to the very high penetration loss. Fourth, as the carrier frequency increases, the Doppler shift, which is proportional to the carrier frequency, becomes larger and thus causes stronger Doppler effects.
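As a quick check of the quadratic frequency dependence noted above, the following minimal sketch evaluates the Friis free-space path loss, assuming isotropic antennas and ignoring molecular absorption. The function name `fspl_db` and the example values are illustrative choices, not part of the chapter.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB (Friis): 20*log10(4*pi*d*f/c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C)


if __name__ == "__main__":
    # One meter at 0.67 THz, the case quoted in the text.
    print(f"FSPL at 1 m, 0.67 THz: {fspl_db(1.0, 0.67e12):.2f} dB")
    # Doubling the carrier frequency quadruples the loss in linear scale.
    delta = fspl_db(1.0, 0.6e12) - fspl_db(1.0, 0.3e12)
    print(f"Extra loss when the frequency doubles: {delta:.2f} dB")
```

Under these assumptions, the loss at 1 m and 0.67 THz comes out just below 90 dB, and each doubling of the carrier frequency adds about 6 dB.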
In order to make full use of channel uniqueness and unleash the THz band’s true potential, designing efficient THz-specific waveform and modulation schemes has been viewed as a key step in physical-layer THz communication system design. However, while a variety of waveforms have been proposed for previous generations of systems in the microwave band, whether they can serve as candidate THz waveforms is still an open issue. Due to the peculiarities of the THz channel and transceivers, it is of great significance to develop novel paradigms of waveform and modulation design in the THz band.
2 THz Waveform and Modulation Strategy

In this section, we investigate the THz waveform and modulation strategy to take advantage of the THz spectrum features. On the one hand, the optimized strategy for THz modulation schemes and baseband waveform design should be tailored for specific THz communication channels to make the best use of the abundant bandwidth. On the other hand, since the transceiver impairments increase drastically at higher frequencies, specific waveform design can mitigate the limitations in THz sources and receivers.
2.1 THz Channel Features

The following features of the THz channel should be considered in the waveform and modulation designs. As the free-space propagation loss increases quadratically with frequency, it becomes much stronger in the THz band than in the microwave band. In this case, directional antennas are used to provide high gains and compensate for the severe path loss. As a result, with the decrease of the beamwidth toward the communication receiver, the delay spread is reduced and the coherence bandwidth is increased. Meanwhile, the reflection and scattering losses of a THz ray depend on the angle of incidence and usually result in a strong power loss of a non-line-of-sight (NLoS) path, as well as a decrease in the number of dominant rays with non-negligible power. Hence, the energy of the received THz signal may be concentrated in the LoS ray and several specular reflected rays, which also reduces the root-mean-square (RMS) delay spread. In an indoor environment, a delay spread of 11 ns was measured in a wideband channel measurement campaign between 130 GHz and 143 GHz [14]. Thus, the coherence bandwidth for the multipath propagation can reach hundreds of MHz. This may have a significant influence on the selection of the THz waveform.

The THz wave propagation may be attenuated by atmospheric effects, which result in molecular absorption loss [15]. Due to the existence of molecular absorption peaks, the THz band is divided into THz windows with different widths, which also depend on the transmission distance, as illustrated in Fig. 2.
Fig. 2 Path loss in dB at different transmission distances (path loss [dB] versus frequency [THz] for d = 0.1, 1, 10, and 100 m, with spectral windows w1–w10 marked)
Hence, an adaptive modulation scheme needs to be adopted to improve the transmission range or maximize the data rate. Besides, the ultra-broad bandwidth of an individual spectral window can reach tens or even hundreds of GHz. If we consider current narrowband systems, then the resulting number of subbands will become quite large, which results in a very complicated system design. This motivates multi-wideband transmission, in which each spectral window is divided into a set of sub-windows [16]. In this case, the interference, including the inter-symbol and inter-band interference of the multiband systems, also needs to be investigated thoroughly.

Toward higher carrier frequencies, the Doppler spread effect becomes more severe in the THz band, especially in high-mobility scenarios [17]. Fast time-varying channels with high Doppler spread may cause inter-carrier interference (ICI) in the time-frequency domain. Waveforms that work in the time-frequency domain, such as orthogonal frequency division multiplexing (OFDM), have difficulty equalizing multiple Doppler shifts along each path. As a result, the link performance would be seriously degraded if the current waveform and numerology were maintained. In addition, with high connectivity density and diverse applications, the THz waveform design is supposed to become more flexible to satisfy various services with different requirements, including but not limited to communication as well as sensing capabilities. Therefore, a mixture of different waveforms and numerologies might be the trend in future systems.

Compared with the mmWave band, many effects become more severe in the THz band and impose stricter requirements on THz waveform and modulation design, such as the stronger Doppler effect and propagation loss. Furthermore, some fundamental channel features in the THz band can be summarized as follows:

• The channel sparsity in the THz band caused by the stronger power loss of NLoS paths may result in ultrashort delay spread and a unique multipath fading distribution.
• The atmospheric effects attenuate the THz wave propagation, which leads to the existence of molecular absorption peaks and the division of the band into spectral windows with different bandwidths.
• With the short wavelength in the THz band, the communication distance might be shorter than the Rayleigh distance, such that spherical-wave propagation needs to be considered.

In summary, THz waveform and modulation should be properly designed to take advantage of the above channel features in the THz band. The goal of the THz waveform and modulation strategy is to improve the key performance metrics for point-to-point communications, including the data rate and the bit error rate (BER), which represent the effectiveness and reliability of communication systems. In addition, the spectral efficiency, the latency, and the connectivity are also significant metrics in next-generation wireless systems, which need to be taken into account in THz waveform design.
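To give a feel for two of the quantities mentioned in this subsection, the sketch below computes the maximum Doppler shift, using the standard relation f_D = v f_c / c, and the Rayleigh distance, using the common definition 2D²/λ for an aperture of size D. The function names, the 30 m/s speed, and the 10 cm aperture are hypothetical values chosen only for illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def max_doppler_shift_hz(speed_mps: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_D = v * f_c / c for relative speed v."""
    return speed_mps * carrier_hz / C


def rayleigh_distance_m(aperture_m: float, carrier_hz: float) -> float:
    """Rayleigh (far-field boundary) distance 2 * D^2 / lambda."""
    wavelength_m = C / carrier_hz
    return 2.0 * aperture_m ** 2 / wavelength_m


if __name__ == "__main__":
    speed = 30.0  # m/s, hypothetical vehicular speed
    for fc in (3e9, 30e9, 300e9):
        print(f"f_c = {fc / 1e9:5.0f} GHz: f_D = {max_doppler_shift_hz(speed, fc):9.1f} Hz")
    # Hypothetical 10 cm array aperture at 300 GHz.
    print(f"Rayleigh distance: {rayleigh_distance_m(0.1, 300e9):.1f} m")
```

With these example numbers, the Doppler shift at 300 GHz is one hundred times that at 3 GHz, and a 10 cm aperture at 300 GHz already has a Rayleigh distance of roughly 20 m, so short indoor links can fall inside the near field.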
2.2 THz Transceiver Features

With the growing complexity of wireless communication transceivers and the trend of moving up the carrier frequencies, overall system performance has become much more sensitive to radio frequency (RF) analog front-end impairments. In particular, from the perspective of the THz transceiver, the performance metrics of the supported waveforms, such as the power amplifier (PA) efficiency, the phase noise (PN) robustness, and the power leakage, need to be considered.

Because PAs saturate, an input much larger than the nominal value causes nonlinear distortion at the output. When the maximum possible output is limited by the saturation power $P_{\mathrm{sat}}$, the input power must be backed off so that the PA operates in the linear region. The average output power of the PA is given by $P\,[\mathrm{dBm}] = P_{\mathrm{sat}}\,[\mathrm{dBm}] - P_{\mathrm{bo}}\,[\mathrm{dB}]$, where $P_{\mathrm{bo}}$ denotes the amount of PA power backoff. A large backoff from the peak power of the PA can result in low transmit power and low power efficiency, which is measured by the ratio of transmit power to PA power consumption [19]. This backoff is essentially proportional to the peak-to-average power ratio (PAPR) of the transmit signal. In order to maximize the transmit power and power efficiency of the transmitter, the key step is to decrease the power backoff by reducing the PAPR. As shown in Fig. 3, the saturated output power rapidly decreases as the carrier frequency increases, e.g., the saturation power in the 100 GHz band is at least 10 dB lower than that in the 10 GHz band [18]. Thus, the PA efficiency of transmitters is more sensitive to the PAPR in the THz band. Lower PAPR should be targeted to provide higher coverage and promote efficient THz communications.
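The backoff relation above is simple arithmetic in the dB domain. The following minimal sketch illustrates it with hypothetical numbers (a 10 dBm saturation power and backoffs of 9 dB and 4 dB); the function names are illustrative and the values are not taken from the chapter or from Fig. 3.

```python
def average_output_power_dbm(p_sat_dbm: float, backoff_db: float) -> float:
    """Average PA output power: P [dBm] = P_sat [dBm] - P_bo [dB]."""
    if backoff_db < 0:
        raise ValueError("backoff must be non-negative")
    return p_sat_dbm - backoff_db


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


if __name__ == "__main__":
    p_sat = 10.0  # dBm, hypothetical saturation power of a sub-THz PA
    high_papr = average_output_power_dbm(p_sat, 9.0)  # waveform needing 9 dB backoff
    low_papr = average_output_power_dbm(p_sat, 4.0)   # waveform needing 4 dB backoff
    gain = dbm_to_mw(low_papr) / dbm_to_mw(high_papr)
    print(f"{high_papr:.1f} dBm vs {low_papr:.1f} dBm, i.e. {gain:.2f}x more transmit power")
```

In this example, reducing the required backoff by 5 dB raises the average transmit power by a factor of about 3.2 for the same PA, which is why low-PAPR waveforms are attractive when the saturation power is scarce.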
Fig. 3 Saturated output power of the PAs versus the carrier frequency [18]

When the complex baseband signal is up-converted to a carrier frequency in the RF transmitter and the passband signal is down-converted to baseband in the receiver, phase noise is introduced by the non-ideal local oscillator. The PN results in a random phase rotation of the time domain signal, which causes inter-carrier interference in the frequency domain. Since the PN increases by 6 dB for every doubling of the carrier frequency [20], evaluating the increased PN distortion becomes more important for THz waveform design.

The frequency-dependent I/Q imbalance (IQI) is a dominant hardware impairment in THz transceivers operating over ultra-wide bandwidths. The I/Q imbalance stems from the unavoidable amplitude and phase differences between the physical analog in-phase (I) and quadrature (Q) signal paths, owing to the finite tolerances of the capacitors and resistors used in the implementation of the analog RF front-end components [21]. To date, only a few studies have investigated the wideband IQI model in the THz band, e.g., [22] on channel estimation and equalization for THz transceivers with wideband IQI.

Another important factor for the waveform design is the out-of-band power leakage. Large out-of-band power can incur adjacent channel interference (ACI). In this case, a guard band is required to reduce the effect of ACI, which decreases the spectral efficiency.

In the spatial domain, large-scale antenna arrays are used to provide high antenna gain and compensate for the severe path loss in the mmWave and THz bands. Herein, a beam squint effect arises in wideband UM-MIMO beamforming. Since most of the existing hybrid beamforming studies utilize phase shifters that apply the same weight at all frequencies, the beam directions at frequencies away from the center frequency deviate from the target direction. As a result, the array gain is substantially reduced, which affects the communication performance.

Last but not least, to support wireless links with ultrafast data rates, THz transceivers need to be well designed in terms of implementation complexity. Since complex transceivers with Tbps digital processors are still challenging, low-complexity signal processing methods are preferred, especially for receiver processing, including channel parameter estimation and data detection.
The algorithms should be designed with logarithmic or linear time complexity with respect to the number of subcarriers and symbols.

To summarize, an efficient design of waveform and modulation in the THz band needs to take into account the coherence bandwidth, molecular absorption effects, and Doppler effects of the THz channel, as well as the PA efficiency, robustness against RF impairments, out-of-band emission, beam squint effects, and implementation complexity from the THz transceiver perspective.
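The beam squint effect described in this subsection can be illustrated numerically. The sketch below assumes a uniform linear array with half-wavelength spacing at the center frequency and frequency-flat phase shifters steered to a target angle; these modeling choices, the function names, and the example values (256 elements, 300 GHz center frequency, 60 degree steering) are assumptions for illustration only.

```python
import cmath
import math


def normalized_array_gain(num_antennas: int, steer_deg: float,
                          freq_hz: float, center_freq_hz: float) -> float:
    """Normalized gain toward the steering angle at frequency freq_hz.

    The phase shifters are set once for center_freq_hz, so at other
    frequencies a residual phase ramp remains across the elements.
    """
    residual = math.pi * math.sin(math.radians(steer_deg)) * (1.0 - freq_hz / center_freq_hz)
    total = sum(cmath.exp(1j * n * residual) for n in range(num_antennas))
    return abs(total) / num_antennas


def squinted_angle_deg(steer_deg: float, freq_hz: float, center_freq_hz: float) -> float:
    """Direction in which the beam actually peaks at freq_hz."""
    s = (center_freq_hz / freq_hz) * math.sin(math.radians(steer_deg))
    if abs(s) > 1.0:
        raise ValueError("no real beam direction at this frequency")
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    fc, n_ant, steer = 300e9, 256, 60.0  # hypothetical system parameters
    for f in (300e9, 302e9, 310e9):
        gain = normalized_array_gain(n_ant, steer, f, fc)
        peak = squinted_angle_deg(steer, f, fc)
        print(f"f = {f / 1e9:.0f} GHz: gain toward target = {gain:.3f}, beam peak at {peak:.2f} deg")
```

Under these assumptions, the gain toward the target is 1 at the center frequency and drops sharply toward the band edge, while the beam peak drifts away from the intended direction by a few degrees.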
3 Waveform Design

For carrier-based modulations in digital communications, two concepts dominate waveform design and application, i.e., single-carrier waveforms and multi-carrier waveforms. In a single-carrier communication system, the transmit symbols are pulse-shaped by a transmit filter in the transmitter. However, the complexity of the equalizer is a concern, as the inter-symbol interference increases with the data rate. To overcome the frequency selectivity of the wideband channel experienced by single-carrier transmission, multiple carriers can be used for high-rate data transmission.
3.1 OFDM

A popular type of multi-carrier waveform is orthogonal frequency division multiplexing (OFDM) modulation, as adopted in the 5G new radio (NR) downlink, which can achieve high spectral efficiency for time-invariant frequency-selective channels. OFDM is well known to be highly spectral-efficient and robust to frequency-selective channels, which is vital to meet extreme data rate requirements [23]. The baseband signal of OFDM is represented as

$$ s(t) = \frac{1}{\sqrt{M}} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} X[m,n]\, \mathrm{rect}(t - nT_o)\, e^{j 2\pi m \Delta f (t - T_{cp} - nT_o)}, \tag{1} $$

where $M$ and $N$ denote the number of subcarriers and the number of symbols, respectively, $X[m,n]$ describes the information symbols in the time-frequency domain, $\Delta f$ represents the subcarrier spacing, $T_o = T_{cp} + T$ refers to the total symbol duration, $T = 1/\Delta f$ denotes the original symbol duration, $T_{cp}$ stands for the cyclic prefix (CP) duration, and $\mathrm{rect}(t)$ is a rectangular pulse function that equals 1 for $0 < t < T_o$ and 0 otherwise. The principal advantages of OFDM in wireless communication systems can be summarized as follows:

• For a given channel delay spread, OFDM introduces a cyclic prefix to avoid inter-symbol interference (ISI) and uses a frequency domain equalizer to enable the transmission of large-bandwidth signals over wireless channels. The receiver complexity of OFDM is much lower than that of a single-carrier system with a time domain equalizer.
• Capacity can be significantly increased by adapting the data rate per subcarrier according to the signal-to-noise ratio (SNR) of the individual subcarrier, also known as the water-filling algorithm for frequency domain link adaptation.

Since the antenna directivity increases and the number of dominant rays reduces in the THz band, the coherence bandwidth of the THz channel becomes much larger. In this case, the THz channels become flat and the robustness of OFDM to frequency-selective channels is less of an advantage. Meanwhile, an alternative approach to overcoming the effects of a multipath channel is to use the single-carrier modulation with frequency domain equalization (SC-FDE) waveform with a cyclic prefix. SC-FDE is also able to address the ISI and frequency-selectivity problem by means of FDE and DFT/IDFT [24]. Discrete Fourier transform spread OFDM (DFT-s-OFDM), which is an extension of SC-FDE to accommodate multiuser access, can be considered as a potential candidate THz waveform.

Frequency selectivity, however, can still arise due to frequency-dependent molecular absorption losses, frequency-dependent receiver characteristics, and the existence of several multipath components in indoor sub-THz systems [25], especially when the bandwidth of the THz transmission window is much larger than the coherence bandwidth. THz OFDM with hybrid beamforming and MIMO techniques can be used for THz communications over frequency-selective channels [26]. With OFDM, different carriers experience beam squint at THz frequencies, which can be mitigated by utilizing delay Vandermonde matrices [27].

Besides the challenges from the THz channel, OFDM suffers from a number of hardware limitations at THz transceivers. The transmit signals in an OFDM system can have high peak values in the time domain since many subcarrier components are added via an IDFT operation.
Thus, OFDM has a high PAPR and cannot support powerefficient transmission from devices at sub-THz frequencies. Moreover, OFDM suffers from a high out-of-band emission. Since each subcarrier signal is timelimited for each symbol, there exists an out-of-band radiation in the power spectrum of OFDM, which causes a non-negligible adjacent channel interference (ACI). A guard band at outer subcarriers is required, which results in the decrease of the spectral efficiency. The power leakage at both ends of the transmission band in an OFDM system is approximately 20 dB higher than that of a single-carrier system [28].
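The PAPR gap between a multi-carrier signal and a DFT-spread signal can be checked numerically. The following is a minimal sketch rather than the simulation setup of any cited work: the block sizes, the QPSK mapping, the localized subcarrier mapping, and the 4x oversampling are illustrative assumptions, and all function names are hypothetical.

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of one complex baseband block, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())

def qpsk(rng, n):
    """n unit-energy QPSK symbols."""
    bits = rng.integers(0, 2, size=(2, n))
    return ((1 - 2 * bits[0]) + 1j * (1 - 2 * bits[1])) / np.sqrt(2)

def ofdm_block(symbols, n_fft):
    """Place the symbols on contiguous subcarriers and take a zero-padded (oversampled) IDFT."""
    grid = np.zeros(n_fft, dtype=complex)
    grid[: symbols.size] = symbols
    return np.fft.ifft(grid)

def dft_s_ofdm_block(symbols, n_fft):
    """Same mapping as ofdm_block, preceded by a unitary DFT precoding of the symbols."""
    return ofdm_block(np.fft.fft(symbols) / np.sqrt(symbols.size), n_fft)

def mean_papr_db(modulator, n_sc=64, n_fft=256, n_blocks=1000, seed=0):
    """Average PAPR over random QPSK blocks (all sizes are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    values = [papr_db(modulator(qpsk(rng, n_sc), n_fft)) for _ in range(n_blocks)]
    return float(np.mean(values))

if __name__ == "__main__":
    print(f"OFDM       mean PAPR: {mean_papr_db(ofdm_block):.2f} dB")
    print(f"DFT-s-OFDM mean PAPR: {mean_papr_db(dft_s_ofdm_block):.2f} dB")
```

The oversampled IDFT approximates the peaks of the continuous-time signal; without oversampling, the DFT precoding and the IDFT would cancel exactly and the comparison would be overly optimistic for DFT-s-OFDM.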
3.2 DFT-s-OFDM

In the THz band, DFT-s-OFDM and its variants are regarded as more promising candidate waveforms than OFDM. Single-carrier waveforms, including DFT-s-OFDM, are preferred by THz communication systems due to their low PAPR. Although not as low as that of pure single-carrier waveforms, the PAPR of DFT-s-OFDM is lower than that of OFDM [19].

Classical OFDM and DFT-s-OFDM systems require a cyclic prefix for each symbol, and the length of the cyclic prefix is usually set longer than the delay spread of the communication channel in order to remove the ISI effect. However, in the THz band, the delay spread might fluctuate substantially. For example, when the signal power of a long NLoS path becomes too weak to influence the received signal, the path can be ignored, which results in a shorter delay spread. In this case, a short guard interval can be used to reduce the overhead and improve the spectral efficiency. Nevertheless, the insertion of a CP is not flexible, since varying its length changes the symbol duration and leads to a non-fixed frame structure, which makes various settings incompatible. A flexible guard interval (FGI) scheme can be designed by modifying the data blocks into a combination of a data part and a guard interval (GI) part. By flexibly adjusting the number of data symbols and the length of the GI, the scheme can satisfy different guard interval length requirements under various channel delay spreads. By designing different guard interval types and using frequency domain windowing, more flexible DFT-s-OFDM schemes have been developed, such as zero-tail DFT-s-OFDM (ZT-DFT-s-OFDM) and unique word DFT-spread windowed OFDM (UW-DFT-S-W-OFDM) [29].

Another factor to consider is that phase noise (PN) robustness becomes crucial for all link types at higher frequencies. While an OFDM system can be made robust to phase noise by a proper choice of subcarrier spacing [23], single-carrier waveforms still outperform OFDM when handling the PN. The estimation of PN is convenient for single-carrier waveforms because the phase rotation can be easily estimated from the detected data symbols in the time domain [30]. Toward sub-THz communications, DFT-s-OFDM provides consistently better link performance under phase noise than OFDM [31].
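The flexible guard interval idea can be illustrated with a zero-tail block: zeros appended to the data before the DFT precoding reappear as a low-power tail of the time-domain symbol, while the symbol length stays fixed. The sketch below is an assumption-laden illustration (QPSK data, localized mapping, 4x oversampling, hypothetical function names), not the scheme of [29].

```python
import numpy as np

def zt_dft_s_ofdm_block(data, n_guard, n_fft):
    """DFT-s-OFDM block whose last n_guard pre-DFT entries are zeros (internal guard interval)."""
    block = np.concatenate([data, np.zeros(n_guard, dtype=complex)])
    spread = np.fft.fft(block) / np.sqrt(block.size)   # DFT precoding
    grid = np.zeros(n_fft, dtype=complex)
    grid[: block.size] = spread                        # localized subcarrier mapping
    return np.fft.ifft(grid)                           # time-domain symbol of fixed length n_fft

def tail_to_body_db(n_guard, n_block=64, n_fft=256, n_trials=200, seed=1):
    """Average power of the guard tail relative to the data part, in dB."""
    rng = np.random.default_rng(seed)
    tail_len = n_guard * n_fft // n_block
    tail, body = 0.0, 0.0
    for _ in range(n_trials):
        phases = np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=n_block - n_guard)
        y = zt_dft_s_ofdm_block(np.exp(1j * phases), n_guard, n_fft)
        tail += np.mean(np.abs(y[-tail_len:]) ** 2)
        body += np.mean(np.abs(y[:-tail_len]) ** 2)
    return 10 * np.log10(tail / body)

if __name__ == "__main__":
    for n_guard in (4, 8, 16):
        fraction = (64 - n_guard) / 64
        print(f"guard={n_guard:2d}  data fraction={fraction:.3f}  "
              f"tail/body={tail_to_body_db(n_guard):.1f} dB")
```

Changing `n_guard` trades data symbols for guard length inside the block, so the symbol duration and frame structure are unchanged. The tail is low-power rather than exactly zero because of the interpolation performed by the oversampled IDFT.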
3.3 OTFS

As the carrier frequency increases, the Doppler shift, which is proportional to the carrier frequency, becomes larger and thus causes a stronger Doppler effect. For instance, the Doppler shift is enlarged by 100 times when the frequency is increased from 3 GHz to 0.3 THz. The severe Doppler spread destroys the orthogonality of subcarriers in OFDM and causes inter-carrier interference (ICI) in the frequency domain, and thus the Doppler shift is difficult to estimate and equalize in OFDM systems. Recently, orthogonal time frequency space (OTFS) modulation has been proposed to deal with high Doppler spread in doubly selective channels [32]. OTFS modulates information in the delay-Doppler (DD) domain and conveniently accommodates the channel dynamics, which gives it significant advantages over OFDM in the presence of severe Doppler effects [33]. A time-varying multipath channel can be transformed into a near-stationary channel in the delay-Doppler domain [34].
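As a worked example of this scaling, consider the maximum Doppler shift $f_D = v f_c / c$ for a relative speed $v$ and carrier frequency $f_c$. The speed of 150 km/h used below is only an illustrative assumption, and $c \approx 3 \times 10^8$ m/s:

$$
f_D = \frac{v}{c} f_c, \qquad v = 150~\text{km/h} \approx 41.7~\text{m/s}: \qquad
f_D\big|_{f_c = 3~\text{GHz}} \approx 417~\text{Hz}, \qquad
f_D\big|_{f_c = 0.3~\text{THz}} \approx 41.7~\text{kHz}.
$$

The ratio of the two shifts equals the ratio of the carrier frequencies, i.e., 100, independent of the assumed speed.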
At the OTFS transceiver, after the modulated symbols are mapped to the delay-Doppler domain, an orthogonal 2D precoding, i.e., the inverse symplectic finite Fourier transform (ISFFT), transforms the DD domain signal into the time-frequency domain, which is expressed as

$$
X[m, n] = \frac{1}{\sqrt{MN}} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} x[l, k]\, e^{j 2\pi \left( \frac{nk}{N} - \frac{ml}{M} \right)}, \tag{2}
$$
where $x[l, k]$ denotes the modulated signal in the DD domain. Then, a multi-carrier modulator, such as the Heisenberg transform or OFDM, is employed in each time slot to further transform the time-frequency domain signal to the time domain before it is transmitted over the channel [33]. While OFDM requires a CP for each symbol, namely, $N$ CPs for a frame, OTFS inserts only one CP in a frame to guard against inter-frame interference, which significantly reduces the CP overhead. At the receiver side, a cascade of the Wigner transform and the symplectic finite Fourier transform (SFFT) transforms the received signal back into the DD domain.

The time domain transmit samples of OTFS are equivalent to the output of an inverse DFT (IDFT) of the information symbols in the delay-Doppler domain [35], so OTFS behaves like OFDM with a relatively high PAPR value. For a small symbol number $N$, the PAPR of OTFS is lower than that of OFDM but still not satisfactory for THz power amplifiers (PAs). Thus, since OTFS has a similar PAPR problem to OFDM, a PAPR reduction scheme is required to improve the THz PA efficiency.

Some studies on OTFS modulation assume that the delay and Doppler shifts of the channel impulse response are integer multiples of the delay and Doppler resolutions, respectively. The presence of fractional Doppler may cause channel spreading across the Doppler indices and break the channel sparsity in the delay-Doppler domain [33]. A Dolph-Chebyshev (DC) window design is proposed in [36] to suppress the channel spreading and improve the channel estimation accuracy. Furthermore, to achieve high-resolution channel estimation in THz wireless systems, both fractional delay and fractional Doppler need to be considered.

Despite its promising communication abilities, the implementation complexity of OTFS is a pivotal issue, specifically the complexity of OTFS channel estimation and data detection. The computational complexity requirement is more stringent in the THz band in order to realize high-speed baseband digital processing. Low-complexity channel estimation and detection of OTFS signals have been studied in [32, 37–46]. An embedded pilot (EP)-based OTFS frame structure is proposed in [37], where guard symbols are arranged to protect the pilot symbol from interference by data symbols. In this case, the insertion of guard symbols results in non-negligible delay-Doppler resource overhead and reduces the spectral efficiency. Superimposed pilot designs are developed in [38, 39], but [38] still uses an EP frame to estimate the delay and Doppler parameters and [39] does not consider fractional delay and Doppler. Channel estimation approaches for MIMO-OTFS systems are developed in [40, 41], but they add a cyclic prefix for each symbol and only consider
integer delay. Message passing-based methods are proposed to estimate channel parameters and design a data detector in [37, 42, 47]. A variational Bayes approach is developed in [43] to reduce the complexity of the OTFS receiver. Nevertheless, when the information symbols are transformed with a DFT precoding rather than being directly mapped onto the delay-Doppler domain, these approaches based on the posterior probability are not applicable. Low-complexity minimum mean square error (MMSE) channel equalizers for OTFS are proposed in [44, 45], which exploit the block-circulant property of the channel matrix, but this property no longer holds when non-integer delay and Doppler are considered.

In summary, to satisfy the requirements of THz communications, OTFS needs to overcome several problems, including PAPR, pilot design with reduced overhead, and low-complexity channel estimation and data detection with fractional delay and Doppler.
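The ISFFT in (2) amounts to a DFT along the delay axis and an inverse DFT along the Doppler axis, so it can be computed with standard FFT routines. The following sketch assumes the delay-Doppler grid is stored as an $M \times N$ array indexed by $[l, k]$; the function names are hypothetical, and the direct double sum is included only to check the FFT-based version against (2).

```python
import numpy as np

def isfft(x_dd):
    """Delay-Doppler grid x[l, k] (M x N) -> time-frequency grid X[m, n], following (2)."""
    M, N = x_dd.shape
    # DFT along delay (l -> m, negative exponent), IDFT along Doppler (k -> n, positive exponent).
    return np.fft.ifft(np.fft.fft(x_dd, axis=0), axis=1) * np.sqrt(N / M)

def sfft(X_tf):
    """Inverse of isfft: time-frequency grid X[m, n] -> delay-Doppler grid x[l, k]."""
    M, N = X_tf.shape
    return np.fft.fft(np.fft.ifft(X_tf, axis=0), axis=1) * np.sqrt(M / N)

def isfft_direct(x_dd):
    """Literal evaluation of the double sum in (2); slow, used only as a reference."""
    M, N = x_dd.shape
    l = np.arange(M)[:, None]
    k = np.arange(N)[None, :]
    X = np.zeros((M, N), dtype=complex)
    for m in range(M):
        for n in range(N):
            phase = np.exp(2j * np.pi * (n * k / N - m * l / M))
            X[m, n] = np.sum(x_dd * phase) / np.sqrt(M * N)
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))
    print("matches (2):     ", np.allclose(isfft(x), isfft_direct(x)))
    print("SFFT inverts it: ", np.allclose(sfft(isfft(x)), x))
```

With the $1/\sqrt{MN}$ normalization the transform is unitary, so the signal energy is the same in the delay-Doppler and time-frequency domains.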
3.4 DFT-s-OTFS

By applying a DFT precoding operation along the Doppler axis before mapping the information symbols onto the delay-Doppler plane, the DFT spread OTFS (DFT-s-OTFS) waveform is proposed in [48] to reduce the PAPR of OTFS. In this way, DFT-s-OTFS has almost the same PAPR as DFT-s-OFDM. Nevertheless, the traditional embedded pilot arrangement of OTFS does not work for DFT-s-OTFS, since no guard region can be arranged around the pilot symbol. A superimposed pilot scheme is developed to overcome this problem by placing the nonzero pilot symbol onto one data symbol in the delay-Doppler domain [49]. Since there is no dedicated grid for the pilot arrangement, the superimposed pilot design can reduce the pilot overhead and improve the spectral efficiency of DFT-s-OTFS.

The DFT-s-OTFS receiver estimates the channel parameters based on the received signal and the knowledge of the transmit pilot. At the beginning, the superimposed pilots are used to obtain coarse estimates of the channel parameters by treating the received data symbols as interference. The initial estimates are used to equalize the received data signal and obtain approximately recovered data symbols. With the detected data and the superimposed pilots, more accurate channel estimation can be realized to further refine the data detection performance. To estimate the delay and Doppler parameters, a two-phase estimation method can be developed, in which the first phase performs an on-grid search for coarse estimation and the second phase conducts an off-grid search to refine the result. Meanwhile, channel equalization can be implemented with iterative least squares (LS) algorithms: since the channel matrix with fractional delay and Doppler contains partial Fourier matrices, the fast Fourier transform (FFT) algorithm can be employed to reduce the complexity.
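The effect of the Doppler-axis DFT precoding on PAPR can be sketched in discrete time. The example below makes several simplifying assumptions that are not taken from [48]: a full-size unitary $N$-point DFT along the Doppler axis, rectangular pulses (so the Heisenberg transform reduces to a per-symbol IDFT), critical sampling, QPSK data, and no CP; the function names are hypothetical.

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a complex sample vector, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())

def otfs_time_samples(x_dd):
    """ISFFT as in (2), then a per-symbol M-point IDFT (rectangular-pulse assumption)."""
    M, N = x_dd.shape
    X_tf = np.fft.ifft(np.fft.fft(x_dd, axis=0), axis=1) * np.sqrt(N / M)
    s = np.fft.ifft(X_tf, axis=0) * np.sqrt(M)   # column n holds the M samples of symbol n
    return s.T.reshape(-1)                       # serialize symbol by symbol (CP omitted)

def dft_s_otfs_time_samples(data):
    """Unitary N-point DFT along the Doppler axis (axis 1) before the OTFS modulator."""
    N = data.shape[1]
    return otfs_time_samples(np.fft.fft(data, axis=1) / np.sqrt(N))

def random_qpsk(M, N, seed=0):
    """M x N grid of unit-modulus QPSK symbols."""
    rng = np.random.default_rng(seed)
    return np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=(M, N))))

if __name__ == "__main__":
    data = random_qpsk(64, 16)
    print(f"OTFS       PAPR: {papr_db(otfs_time_samples(data)):.2f} dB")
    print(f"DFT-s-OTFS PAPR: {papr_db(dft_s_otfs_time_samples(data)):.2f} dB")
```

Under these idealized assumptions the precoding exactly cancels the Doppler-axis IDFT inside the OTFS modulator, so the critically sampled DFT-s-OTFS samples are the constant-modulus QPSK symbols themselves. For an oversampled or continuous-time signal the gain is smaller than in this check.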
Fig. 4 Carrier-based waveform transceiver block diagrams: (a) CP-OFDM, (b) CP-DFT-s-OFDM, (c) FGI-DFT-s-OFDM, (d) OTFS, (e) DFT-s-OTFS
3.5 Waveform Comparison and Discussion

In Fig. 4, the transceiver block diagrams of the abovementioned waveforms, including CP-OFDM, CP-DFT-s-OFDM, FGI-DFT-s-OFDM, OTFS, and DFT-s-OTFS, are described in terms of the transmitter and receiver implementation modules. In OFDM, the baseband discrete-time transmit signal is generated by performing an IDFT on the data symbols. In CP-DFT-s-OFDM, data symbols are first spread by DFT precoding and the output is mapped onto the OFDM subcarriers. FGI-DFT-s-OFDM inserts a guard interval with flexible length into the data blocks before DFT precoding and removes the CP. OTFS modulates the data symbols in the DD domain and introduces a 2D ISFFT operation before conducting time-frequency domain modulation. In DFT-s-OTFS, data symbols are first spread by DFT precoding before the data signal is mapped onto the DD domain.

The PAPR values for OFDM, DFT-s-OFDM, OTFS, and DFT-s-OTFS are studied in [48]. As shown in Fig. 5, in contrast with OTFS, the PAPR of the DFT-s-OTFS transmit signal is reduced by approximately 3 dB. Meanwhile, DFT-s-OTFS has a similar PAPR to DFT-s-OFDM.

Fig. 5 PAPR comparison (Pr(PAPR > PAPR0) versus PAPR0 in dB) for the continuous-time baseband signals of OFDM, DFT-s-OFDM, OTFS, and DFT-s-OTFS when $M = 64$, $N = 16$

In Fig. 6, the BER performance of OFDM, DFT-s-OFDM, OTFS, and DFT-s-OTFS is evaluated under high-mobility channels in the THz band. The carrier frequency is set to 0.3 THz. The subcarrier spacing is 1.92 MHz. The number of subcarriers is 64 and the number of symbols is 16. The 4-QAM modulation scheme is employed. The simulation results indicate that OTFS and DFT-s-OTFS can achieve much better BER performance than OFDM and DFT-s-OFDM under severe Doppler effects, since the ICI problem in OFDM is not handled.

Fig. 6 BER performance comparison (BER versus SNR in dB) with different waveforms: OFDM, CP-DFT-s-OFDM, OTFS, and DFT-s-OTFS. The number of paths is 3. The maximum velocity is set to 150 km/h

Table 1 presents a summary of the aforementioned waveforms’ performance under various metrics [50]. In summary, the selection of the THz waveform depends on the scenario. In low-mobility channels, OFDM is still an excellent candidate waveform in the THz band and achieves a high data rate, since it offers good compatibility with UM-MIMO, flexible multiuser scheduling, and resource allocation. In high-mobility scenarios, OTFS and DFT-s-OTFS provide stronger robustness to severe Doppler spread by exploiting the channel sparsity of the delay-Doppler domain, at the price of increased detection complexity. For energy-constrained links, DFT-s-OFDM and DFT-s-OTFS are expected to realize high energy efficiency due to their low PAPR characteristics.
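The Fig. 6 settings also indicate why fractional Doppler matters in this regime. The sketch below derives the delay-Doppler grid resolutions from the stated parameters; it assumes a symbol duration of $T = 1/\Delta f$ with the CP ignored, so that the Doppler resolution is $1/(NT)$, and the class and method names are hypothetical.

```python
from dataclasses import dataclass

C = 299_792_458.0  # speed of light in m/s

@dataclass(frozen=True)
class OtfsGrid:
    carrier_hz: float
    subcarrier_spacing_hz: float
    n_subcarriers: int   # M
    n_symbols: int       # N

    @property
    def bandwidth_hz(self):
        return self.n_subcarriers * self.subcarrier_spacing_hz

    @property
    def delay_resolution_s(self):
        return 1.0 / self.bandwidth_hz

    @property
    def doppler_resolution_hz(self):
        # 1 / (N * T) with T = 1 / subcarrier spacing; the CP is ignored (assumption).
        return self.subcarrier_spacing_hz / self.n_symbols

    def doppler_shift_hz(self, speed_kmh):
        return speed_kmh / 3.6 * self.carrier_hz / C

    def doppler_index(self, speed_kmh):
        """Maximum Doppler shift expressed in units of the Doppler resolution."""
        return self.doppler_shift_hz(speed_kmh) / self.doppler_resolution_hz

if __name__ == "__main__":
    grid = OtfsGrid(carrier_hz=0.3e12, subcarrier_spacing_hz=1.92e6,
                    n_subcarriers=64, n_symbols=16)
    print(f"bandwidth          : {grid.bandwidth_hz / 1e6:.2f} MHz")
    print(f"delay resolution   : {grid.delay_resolution_s * 1e9:.2f} ns")
    print(f"Doppler resolution : {grid.doppler_resolution_hz / 1e3:.1f} kHz")
    print(f"Doppler at 150 km/h: {grid.doppler_shift_hz(150) / 1e3:.2f} kHz")
    print(f"Doppler index      : {grid.doppler_index(150):.3f}")
```

Under these assumptions the maximum Doppler shift at 150 km/h (about 41.7 kHz) is smaller than one Doppler bin (120 kHz), so every path falls at a non-integer Doppler index, which is the fractional Doppler case.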
Table 1 Performance evaluation metrics for different THz waveforms

| Waveform properties | OFDM | DFT-s-OFDM | FGI-DFT-s-OFDM | OTFS | DFT-s-OTFS |
|---|---|---|---|---|---|
| PAPR | High | Low | Low | High | Low |
| Guard interval overhead | Each symbol uses a CP | Each symbol uses a CP | Tunable | Each frame uses a CP | Each frame uses a CP |
| Pilot overhead (slow fading) | Low | Low | Low | Medium | Medium |
| Pilot overhead (fast fading) | High | High | High | Medium | Medium |
| Out-of-band emission (rectangular pulse) | High | High | Medium | Medium | Medium |
| Out-of-band emission (raised-cosine pulse) | Medium | Low | Low | Low | Low |
| Robustness to phase noise effect | Low | High | High | Medium | High |
| Detection complexity | Low | Low | Low | High | High |
| Robustness to Doppler effect | Low | Low | Low | High | High |
| Extension to MIMO | Extendable | Extendable | Extendable | Extendable | Extendable |

4 Modulation Design

4.1 Traditional Modulation

The first IEEE 802.15.3d standardization efforts for sub-THz communications toward 6G were reported in [51], where the THz single-carrier mode supports six traditional modulations. The first four are phase-shift keying modulations: binary (BPSK), quadrature (QPSK), 8-phase (8-PSK), and 8-phase asymmetric (8-APSK). The mode also supports quadrature amplitude modulation with 16 and 64 constellation points: 16-QAM and 64-QAM. This physical-layer mode primarily targets bandwidth-oriented use cases, such as wireless fronthaul/backhaul and additional links in the data center [51]. In addition, the $\pi/2$-BPSK modulation, as a special constellation-rotated BPSK modulation, exhibits a much smaller PAPR of the transmitted signal than QPSK and higher-order modulations do [52].
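One common way to see why $\pi/2$-BPSK is PAPR-friendly is to look at the phase transitions between consecutive symbols: plain BPSK can jump by $\pi$, which drives the pulse-shaped envelope through zero, whereas $\pi/2$-BPSK always moves by $\pm\pi/2$. The sketch below uses a generic textbook definition (rotate the $n$th BPSK symbol by $n\pi/2$), which is an assumption and not necessarily the exact mapping of [52] or of any standard; the function names are hypothetical.

```python
import numpy as np

def bpsk(bits):
    """Map bit 0 -> +1 and bit 1 -> -1."""
    return (1 - 2 * np.asarray(bits)).astype(complex)

def pi2_bpsk(bits):
    """Rotate the n-th BPSK symbol by n * pi / 2."""
    symbols = bpsk(bits)
    n = np.arange(symbols.size)
    return symbols * np.exp(1j * np.pi / 2 * n)

def phase_steps(symbols):
    """Phase change between consecutive symbols, wrapped to [-pi, pi]."""
    return np.angle(symbols[1:] * np.conj(symbols[:-1]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bits = rng.integers(0, 2, size=16)
    print("BPSK steps / pi     :", np.round(np.abs(phase_steps(bpsk(bits))) / np.pi, 2))
    print("pi/2-BPSK steps / pi:", np.round(np.abs(phase_steps(pi2_bpsk(bits))) / np.pi, 2))
```

In this sketch every $\pi/2$-BPSK transition has magnitude $\pi/2$, so the trajectory between symbols never crosses the origin.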
4.2 Ultrashort-Pulse-Based Modulation

Although continuous carrier-based transmission can be supported in the sub-THz range, carrier-based transmission is challenging at high THz frequencies. The lack of nanoscale transceivers able to generate a carrier signal at such frequencies limits the feasibility of carrier-based modulations. Pulse-based modulations have been widely used in very high-speed communication systems, such as ultra-wideband (UWB) systems [53]. A pulse-based modulation and channel access scheme for nanonetworks in the THz band is proposed in [54], which transmits one-hundred-femtosecond-long pulses by following an asymmetric on-off keying modulation spread in time (TS-OOK). It is tailored to the expected capabilities of THz band signal generators and detectors and exploits the peculiarities of the THz channel. A logical “1” is transmitted by using a one-hundred-femtosecond-long pulse and a logical “0” is transmitted as silence, i.e., the nano-device remains silent when a logical zero is transmitted. Meanwhile, the time between transmissions is fixed and much longer than the pulse duration. As a result, pulse-based THz communications can achieve a Tb/s data rate in nanonetwork scenarios [54].
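The TS-OOK symbol structure can be sketched as a sampled envelope: a short pulse for a “1”, silence for a “0”, and a fixed symbol interval that is a multiple of the pulse duration. In the example below, the spreading factor `beta`, the Gaussian pulse shape, and the sampling density are hypothetical illustration values, not parameters taken from [54].

```python
import numpy as np

def ts_ook(bits, t_pulse=100e-15, beta=100, samples_per_pulse=8):
    """Sampled TS-OOK envelope: a pulse for '1', silence for '0', one symbol every beta * t_pulse."""
    bits = np.asarray(bits)
    slot = beta * samples_per_pulse                      # samples per symbol interval
    t = (np.arange(samples_per_pulse) + 0.5) / samples_per_pulse - 0.5
    pulse = np.exp(-0.5 * (t / 0.2) ** 2)                # hypothetical Gaussian pulse shape
    x = np.zeros(bits.size * slot)
    for i in np.flatnonzero(bits):
        x[i * slot : i * slot + samples_per_pulse] = pulse
    dt = t_pulse / samples_per_pulse                     # sampling interval in seconds
    return x, dt

def symbol_rate_hz(t_pulse=100e-15, beta=100):
    """Symbols per second of a single TS-OOK stream."""
    return 1.0 / (beta * t_pulse)

if __name__ == "__main__":
    x, dt = ts_ook([1, 0, 1, 1, 0])
    print(f"samples: {x.size}, duration: {x.size * dt * 1e12:.1f} ps")
    print(f"fraction of time a pulse is on air: {np.count_nonzero(x) / x.size:.4f}")
    print(f"single-stream symbol rate: {symbol_rate_hz() / 1e9:.0f} Gsym/s")
```

Because the transmitter is silent for most of each symbol interval, the single-stream symbol rate is $1/(\beta T_p)$, and the choice of $\beta$ trades that rate against the idle time left between pulses.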
4.3 Distance-Adaptive Modulation

From the communications perspective, the molecular absorption effect in the THz band brings about additional attenuation and noise. A widely adopted strategy to counteract the molecular absorption effect of the THz spectrum is to avoid transmitting signals under path loss peaks [2]. Between such peaks, multiple distance-dependent spectral windows emerge in the THz band, where the molecular absorption is negligibly weak. Modulation schemes can be optimized to use these fragmented THz spectral windows to mitigate the absorption effect and turn it into an advantage. The very strong relationship between the distance and the spectral windows at THz frequencies motivates the development of distance-adaptive communication techniques [16]. Each spectral window has an ultrabroad bandwidth ranging from
multi-gigahertz to THz, which can be divided into narrower but still broadband sub-windows to allow parallel multi-wideband transmissions. For the transmission on the $u$th sub-window, the baseband signal consists of different symbols. The $i$th information symbol is represented by $N_f^u$ frames with one pulse in each frame, and the baseband signal is expressed as

$$
x_u(t) = \sqrt{P_u} \sum_{i} a_u^{(i)} \sum_{m=0}^{N_f^u - 1} p_u^{(i,m)}\, g\!\left(t - i N_f^u T_f - m T_f - c_u^{(i,m)} T_p\right), \tag{3}
$$

where $T_f$ and $T_p$ denote the time duration of one frame and one pulse, with their ratio defining the number of pulse positions in a frame, i.e., $N_p = T_f / T_p$. Additionally, $P_u$ stands for the allocated power in the $u$th sub-window and $N_f^u$ is the number of frames used to represent one information symbol. In the signal model with polarity randomization and time hopping, $a_u^{(i)} \in \{+1, -1\}$ is the $i$th binary information symbol, $p_u^{(i,m)}$ denotes the random polarity code that takes $\pm 1$ with equal probability, and $c_u^{(i,m)}$ refers to the time-hopping code of the $m$th frame, which takes values in $\{0, 1, \cdots, N_p - 1\}$ with equal probability. $g(t)$ is the transmitted wideband pulse with duration $T_p$ and unit energy. An optimization framework can be used to solve for the multi-wideband waveform design parameters, namely the transmit power and the sub-window rate (i.e., the number of frames for each symbol). The communication distance can be effectively extended as the transmit power and the number of frames increase, at the cost of higher power consumption and a lower rate.

Besides improving the achievable data rates of individual users, the very large available bandwidth can be leveraged to efficiently accommodate multiple users. As the distance decreases, the path loss drops, and hence the received power and the usable bandwidth of a spectral window increase. Thus, the strategic spectrum utilization principle for one spectral window operates as follows. First, the center sub-windows are allocated to the long-distance links. Then, the side sub-windows are allocated to the short-distance transmissions. The resource allocation model that incorporates the aforementioned spectrum allocation principle is developed in [55] to enable multiple ultrahigh-speed links.

To maximize the channel utilization, a hierarchical bandwidth modulation scheme is proposed in [56] to cope with the distance-dependent bandwidth of the THz channel.
The fundamental idea in this case is to embed multiple binary information streams on the same carrier signal by adapting the modulation order and manipulating the symbol time. Specifically, for users over short distances, in which the available bandwidth is larger and the path loss is much lower, symbol duration can be made shorter than that for users over longer distances. Users are able to recover different information based on their perceived SNR. As a result, users at different distances can utilize a different bandwidth.
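The multi-wideband signal model in (3) can be sketched in discrete time for a single sub-window. The example assumes a unit-energy rectangular pulse for $g(t)$ and draws the polarity and time-hopping codes from a NumPy random generator; these choices, the parameter values, and the function name are illustrative assumptions and not the design of [16].

```python
import numpy as np

def multi_wideband_signal(symbols, n_frames, n_pos, p_u, t_p, samples_per_pulse, rng):
    """Sampled x_u(t) of (3) for one sub-window, with an assumed rectangular unit-energy g(t)."""
    dt = t_p / samples_per_pulse
    g = np.full(samples_per_pulse, 1.0 / np.sqrt(t_p))     # integral of g^2 over T_p equals 1
    frame = n_pos * samples_per_pulse                       # samples per frame, T_f = N_p * T_p
    x = np.zeros(len(symbols) * n_frames * frame)
    for i, a in enumerate(symbols):                         # a plays the role of a_u^(i)
        polarity = rng.choice([-1, 1], size=n_frames)       # p_u^(i,m)
        hop = rng.integers(0, n_pos, size=n_frames)         # c_u^(i,m)
        for m in range(n_frames):
            start = (i * n_frames + m) * frame + hop[m] * samples_per_pulse
            x[start : start + samples_per_pulse] += np.sqrt(p_u) * a * polarity[m] * g
    return x, dt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    symbols = [+1, -1, +1]
    x, dt = multi_wideband_signal(symbols, n_frames=4, n_pos=8, p_u=2.0,
                                  t_p=1e-12, samples_per_pulse=4, rng=rng)
    energy = np.sum(x ** 2) * dt
    print(f"signal energy: {energy:.3f} (P_u * symbols * frames = {2.0 * 3 * 4:.3f})")
```

Each information symbol is repeated over `n_frames` pulses, so increasing the number of frames raises the energy collected per symbol while lowering the symbol rate, which is the power/rate trade-off described for the sub-window design.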
4.4 Modulation for THz Integrated Sensing and Communication

Following the trend of sensing and connecting all things in 6G communications, THz wireless systems are expected to simultaneously transmit billions of data streams and sense the environment or human activity, namely, THz integrated sensing and communication (ISAC) [57, 58]. With multiple objectives including target sensing and information transmission, a THz ISAC system that transmits a common designed signal faces different requirements on the modulation design. Typical sensing signals include simple unmodulated single-carrier, pulse, or frequency-modulated continuous-wave signals, while communication waveforms contain unmodulated (pilots/training sequences) and modulated data symbols [59]. Based on communication waveforms, a number of THz ISAC systems have been proposed with corresponding sensing algorithms and display promising communication and sensing abilities, including DFT-s-OFDM [60] and DFT-s-OTFS [49]. A data-embedded multi-sub-band quasi-perfect (MS-QP) waveform is proposed for THz ISAC systems, which enables simultaneous high-resolution sensing and high-rate communication without mutual interference [61]. This sensing-centric modulation design achieves ultrahigh-resolution sensing with only cost-efficient A/Ds of low sampling rate, at the cost of sacrificing spectral efficiency. To realize mutually enhanced performance, further investigations on novel modulation schemes are needed for THz ISAC.
5 Conclusion

In this chapter, we have discussed several waveform and modulation schemes for THz wireless communications. Among them, DFT-s-OFDM combines the simplicity of OFDM processing with the benefits of single-carrier transmission. Meanwhile, the flexible guard interval scheme, which adjusts the length of the guard interval according to the delay spread of the THz channel, is able to reduce the CP overhead and improve the data rate. In high-mobility scenarios, OTFS/DFT-s-OTFS can provide stronger robustness to Doppler spread than OFDM/DFT-s-OFDM, at the cost of higher equalization and detection complexity, which can be reduced by designing low-complexity channel estimation and data detection methods. For efficient THz communications, a pulse-based modulation scheme is attractive for its simplicity and benefits in time/frequency resolution. Moreover, distance-adaptive modulations can turn atmospheric loss into an opportunity by maximizing the utilization of the distance-dependent THz bandwidth.
References

1. I. F. Akyildiz, C. Han, and S. Nie, “Combating the distance problem in the millimeter wave and terahertz frequency bands,” IEEE Communications Magazine, vol. 56, no. 6, pp. 102–108, 2018.
2. I. F. Akyildiz, J. M. Jornet, and C. Han, “Terahertz band: Next frontier for wireless communications,” Physical Communication, vol. 12, pp. 16–32, 2014.
3. T. S. Rappaport et al., “Wireless communications and applications above 100 GHz: Opportunities and challenges for 6G and beyond,” IEEE Access, vol. 7, pp. 78729–78757, 2019.
4. Federal Communications Commission, “FCC takes steps to open spectrum horizons for new services and technologies.” [Online]. Available: https://docs.fcc.gov/public/attachments/DOC-356588A1.pdf
5. WRC-19, “World radiocommunication conference 2019 (WRC-19) final acts.” [Online]. Available: https://www.itu.int/dms_pub/itu-r/opb/act/R-ACT-WRC.14-2019-PDF-E.pdf
6. I. F. Akyildiz, C. Han, Z. Hu, S. Nie, and J. M. Jornet, “Terahertz band communication: An old problem revisited and research directions for the next decade,” IEEE Transactions on Communications, vol. 70, no. 6, pp. 4250–4285, 2022.
7. T. W. Crowe, W. R. Deal, M. Schröter, C.-K. Clive Tzuang, and K. Wu, “Terahertz RF electronics and system integration,” Proceedings of the IEEE, vol. 105, no. 6, pp. 985–989, 2017.
8. K. M. S. Huq, S. A. Busari, J. Rodriguez, V. Frascolla, W. Bazzi, and D. C. Sicker, “Terahertz-enabled wireless system for beyond-5G ultra-fast networks: A brief survey,” IEEE Network, vol. 33, no. 4, pp. 89–95, 2019.
9. H.-J. Song and N. Lee, “Terahertz communications: Challenges in the next decade,” IEEE Transactions on Terahertz Science and Technology, vol. 12, no. 2, pp. 105–117, 2022.
10. V. Petrov, J. Kokkoniemi, D. Moltchanov, J. Lehtomaki, Y. Koucheryavy, and M. Juntti, “Last meter indoor terahertz wireless access: Performance insights and implementation roadmap,” IEEE Communications Magazine, vol. 56, no. 6, pp. 158–165, 2018.
11. C. Han and Y. Chen, “Propagation modeling for wireless communications in the terahertz band,” IEEE Communications Magazine, vol. 56, no. 6, pp. 96–101, 2018.
12. J. M. Jornet and I. F. Akyildiz, “Channel modeling and capacity analysis for electromagnetic wireless nanonetworks in the terahertz band,” IEEE Transactions on Wireless Communications, vol. 10, no. 10, pp. 3211–3221, 2011.
13. C. Han, A. O. Bicen, and I. F. Akyildiz, “Multi-ray channel modeling and wideband characterization for wireless communications in the terahertz band,” IEEE Transactions on Wireless Communications, vol. 14, no. 5, pp. 2402–2412, 2015.
14. Y. Chen, Y. Li, C. Han, Z. Yu, and G. Wang, “Channel measurement and ray-tracing-statistical hybrid modeling for low-terahertz indoor communications,” IEEE Transactions on Wireless Communications, vol. 20, no. 12, pp. 8163–8176, 2021.
15. C. Han, W. Gao, N. Yang, and J. M. Jornet, “Molecular absorption effect: A double-edged sword of terahertz communications,” IEEE Wireless Communications, to appear, 2022. Early access, https://doi.org/10.1109/MWC.016.2100704
16. C. Han, A. O. Bicen, and I. F. Akyildiz, “Multi-wideband waveform design for distance-adaptive wireless communications in the terahertz band,” IEEE Transactions on Signal Processing, vol. 64, no. 4, pp. 910–922, 2016.
17. Y. Chen and C. Han, “Time-varying channel modeling for low-terahertz urban vehicle-to-infrastructure communications,” in Proc. of IEEE Global Communications Conference (GLOBECOM), 2019.
18. H. Wang et al., “Power amplifiers performance survey 2000-present.” [Online]. Available: https://gems.ece.gatech.edu/PA_survey.html
19. H. G. Myung, J. Lim, and D. J. Goodman, “Single carrier FDMA for uplink wireless transmission,” IEEE Vehicular Technology Magazine, vol. 1, no. 3, pp. 30–38, 2006.
20. E. Dahlman, S. Parkvall, and J. Sköld, “5G NR: the next generation wireless access technology.” Academic Press, 2018, pp. 419–441.
21. A.-A. A. Boulogeorgos, V. M. Kapinas, R. Schober, and G. K. Karagiannidis, “I/Q-imbalance self-interference coordination,” IEEE Transactions on Wireless Communications, vol. 15, no. 6, pp. 4157–4170, 2016.
22. Z. Sha and Z. Wang, “Channel estimation and equalization for terahertz receiver with RF impairments,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 6, pp. 1621–1635, 2021.
23. A. A. Zaidi, R. Baldemair, H. Tullberg, H. Bjorkegren, L. Sundstrom, J. Medbo, C. Kilinc, and I. Da Silva, “Waveform and numerology to support 5G services and requirements,” IEEE Communications Magazine, vol. 54, no. 11, pp. 90–98, 2016.
24. H. G. Myung and D. J. Goodman, Single carrier FDMA: a new air interface for long term evolution. John Wiley & Sons, 2008, vol. 8.
25. H. Sarieddeen, M.-S. Alouini, and T. Y. Al-Naffouri, “An overview of signal processing techniques for terahertz communications,” Proceedings of the IEEE, vol. 109, no. 10, pp. 1628–1665, 2021.
26. H. Yuan, N. Yang, K. Yang, C. Han, and J. An, “Hybrid beamforming for terahertz multi-carrier systems over frequency selective fading,” IEEE Transactions on Communications, vol. 68, no. 10, pp. 6186–6199, Oct. 2020. https://doi.org/10.1109/TCOMM.2020.3008699
27. S. M. Perera, A. Madanayake, and R. J. Cintra, “Radix-2 self-recursive sparse factorizations of delay Vandermonde matrices for wideband multi-beam antenna arrays,” IEEE Access, vol. 8, pp. 25498–25508, 2020.
28. D. Falconer, S. L. Ariyavisitakul, A. Benyamin-Seeyar, and B. Eidson, “Frequency domain equalization for single-carrier broadband wireless systems,” IEEE Communications Magazine, vol. 40, no. 4, pp. 58–66, 2002.
29. A. Sahin, R. Yang, E. Bala, M. C. Beluri, and R. L. Olesen, “Flexible DFT-s-OFDM: Solutions and challenges,” IEEE Communications Magazine, vol. 54, no. 11, pp. 106–112, 2016.
30. X. Cheng, N. Lou, and B. Yuan, “Iterative decision-aided compensation of phase noise in millimeter-wave SC-FDE systems,” IEEE Communications Letters, vol. 20, no. 5, pp. 1030–1033, 2016.
31. O. Tervo, T. Levanen, K. Pajukoski, J. Hulkkonen, P. Wainio, and M. Valkama, “5G new radio evolution towards sub-THz communications,” in Proc. of 2nd 6G Wireless Summit (6G SUMMIT), 2020.
32. P. Raviteja, K. T. Phan, Y. Hong, and E. Viterbo, “Interference cancellation and iterative detection for orthogonal time frequency space modulation,” IEEE Transactions on Wireless Communications, vol. 17, no. 10, pp. 6501–6515, 2018.
33. Z. Wei, W. Yuan, S. Li, J. Yuan, G. Bharatula, R. Hadani, and L. Hanzo, “Orthogonal time-frequency space modulation: A promising next-generation waveform,” IEEE Wireless Communications, vol. 28, no. 4, pp. 136–144, 2021.
34. G. D. Surabhi, R. M. Augustine, and A. Chockalingam, “On the diversity of uncoded OTFS modulation in doubly-dispersive channels,” IEEE Transactions on Wireless Communications, vol. 18, no. 6, pp. 3049–3063, 2019.
35. ——, “Peak-to-average power ratio of OTFS modulation,” IEEE Communications Letters, vol. 23, no. 6, pp. 999–1002, 2019.
36. Z. Wei, W. Yuan, S. Li, J. Yuan, and D. W. K. Ng, “Transmitter and receiver window designs for orthogonal time-frequency space modulation,” IEEE Transactions on Communications, vol. 69, no. 4, pp. 2207–2223, 2021.
37. P. Raviteja, K. T. Phan, and Y. Hong, “Embedded pilot-aided channel estimation for OTFS in delay–Doppler channels,” IEEE Transactions on Vehicular Technology, vol. 68, no. 5, pp. 4906–4917, 2019.
38. H. B. Mishra, P. Singh, A. K. Prasad, and R. Budhiraja, “OTFS channel estimation and data detection designs with superimposed pilots,” IEEE Transactions on Wireless Communications, vol. 21, no. 4, pp. 2258–2274, 2022.
39. W. Yuan, S. Li, Z. Wei, J. Yuan, and D. W. K. Ng, “Data-aided channel estimation for OTFS systems with a superimposed pilot and data transmission scheme,” IEEE Wireless Communications Letters, vol. 10, no. 9, pp. 1954–1958, 2021.
40. M. Li, S. Zhang, F. Gao, P. Fan, and O. A. Dobre, “A new path division multiple access for the massive MIMO-OTFS networks,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 4, pp. 903–918, 2021.
41. Y. Liu, S. Zhang, F. Gao, J. Ma, and X. Wang, “Uplink-aided high mobility downlink channel estimation over massive MIMO-OTFS system,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 9, pp. 1994–2009, 2020.
42. F. Liu, Z. Yuan, Q. Guo, Z. Wang, and P. Sun, “Message passing-based structured sparse signal recovery for estimation of OTFS channels with fractional Doppler shifts,” IEEE Transactions on Wireless Communications, vol. 20, no. 12, pp. 7773–7785, 2021.
43. W. Yuan, Z. Wei, J. Yuan, and D. W. K. Ng, “A simple variational Bayes detector for orthogonal time frequency space (OTFS) modulation,” IEEE Transactions on Vehicular Technology, vol. 69, no. 7, pp. 7976–7980, 2020.
44. S. Tiwari, S. S. Das, and V. Rangamgari, “Low complexity LMMSE receiver for OTFS,” IEEE Communications Letters, vol. 23, no. 12, pp. 2205–2209, 2019.
45. G. D. Surabhi and A. Chockalingam, “Low-complexity linear equalization for OTFS modulation,” IEEE Communications Letters, vol. 24, no. 2, pp. 330–334, 2020.
46. H. Qu, G. Liu, L. Zhang, S. Wen, and M. A. Imran, “Low-complexity symbol detection and interference cancellation for OTFS system,” IEEE Transactions on Communications, vol. 69, no. 3, pp. 1524–1537, 2021.
47. L. Gaudio, M. Kobayashi, G. Caire, and G. Colavolpe, “On the effectiveness of OTFS for joint radar parameter estimation and communication,” IEEE Transactions on Wireless Communications, vol. 19, no. 9, pp. 5951–5965, 2020.
48. Y. Wu, C. Han, and Z. Chen, “An energy-efficient DFT-spread orthogonal time frequency space system for terahertz integrated sensing and communication,” in Proc. of IEEE International Conference on Communications (ICC), 2022.
49. Y. Wu, C. Han, and Z. Chen, “DFT-spread orthogonal time frequency space system with superimposed pilots for terahertz integrated sensing and communication,” IEEE Transactions on Wireless Communications, 2023. Early access, https://doi.org/10.1109/TWC.2023.3250267
50. S. Tarboush, H. Sarieddeen, M.-S. Alouini, and T. Y. Al-Naffouri, “Single- versus multicarrier terahertz-band communications: A comparative study,” IEEE Open Journal of the Communications Society, vol. 3, pp. 1466–1486, 2022.
51. V. Petrov, T. Kurner, and I. Hosako, “IEEE 802.15.3d: First standardization efforts for sub-terahertz band communications toward 6G,” IEEE Communications Magazine, vol. 58, no. 11, pp. 28–33, 2020.
52. J. Kim, Y. H. Yun, C. Kim, and J. H. Cho, “Minimization of PAPR for DFT-spread OFDM with BPSK symbols,” IEEE Transactions on Vehicular Technology, vol. 67, no. 12, pp. 11746–11758, 2018.
53. M. Win and R. Scholtz, “Ultra-wide bandwidth time-hopping spread-spectrum impulse radio for wireless multiple-access communications,” IEEE Transactions on Communications, vol. 48, no. 4, pp. 679–689, 2000.
54. J. M. Jornet and I. F. Akyildiz, “Femtosecond-long pulse-based modulation for terahertz band communication in nanonetworks,” IEEE Transactions on Communications, vol. 62, no. 5, pp. 1742–1754, 2014.
55. C. Han and I. F. Akyildiz, “Distance-aware bandwidth-adaptive resource allocation for wireless systems in the terahertz band,” IEEE Transactions on Terahertz Science and Technology, vol. 6, no. 4, pp. 541–553, 2016.
56. Z. Hossain and J. M. Jornet, “Hierarchical bandwidth modulation for ultra-broadband terahertz communications,” in Proc. of IEEE International Conference on Communications (ICC), 2019.
57. C. Han, Y. Wu, Z. Chen, Y. Chen, and G. Wang, “THz ISAC: A physical-layer perspective of terahertz integrated sensing and communication,” IEEE Communications Magazine, to appear, 2022. Available: https://arxiv.org/abs/2209.03145
58. Z. Chen, C. Han, Y. Wu, L. Li, C. Huang, Z. Zhang, G. Wang, and W. Tong, “Terahertz wireless communications for 2030 and beyond: A cutting-edge frontier,” IEEE Communications Magazine, vol. 59, no. 11, pp. 66–72, 2021.
59. J. A. Zhang, M. L. Rahman, K. Wu, X. Huang, Y. J. Guo, S. Chen, and J. Yuan, “Enabling joint communication and radar sensing in mobile networks—a survey,” IEEE Communications Surveys & Tutorials, vol. 24, no. 1, pp. 306–345, 2022.
60. Y. Wu, F. Lemic, C. Han, and Z. Chen, “Sensing integrated DFT-spread OFDM waveform and deep learning-powered receiver design for terahertz integrated sensing and communication systems,” IEEE Transactions on Communications, vol. 71, no. 1, pp. 595–610, 2023. 61. T. Mao, J. Chen, Q. Wang, C. Han, Z. Wang, and G. K. Karagiannidis, “Waveform design for joint sensing and communications in millimeter-wave and low terahertz bands,” IEEE Transactions on Communications, vol. 70, no. 10, pp. 7023–7039, 2022.
OTFS and Delay-Doppler Domain Modulation: Signal Detection and Channel Estimation Qinghua Guo, Zhengdao Yuan, Fei Liu, and Jinhong Yuan
Q. Guo (✉), University of Wollongong, Wollongong, NSW, Australia
Z. Yuan, Open University of Henan, Zhengzhou, China
F. Liu, Zhengzhou University, Zhengzhou, China
J. Yuan, University of New South Wales, Sydney, NSW, Australia
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_8

1 Introduction

Sixth-generation (6G) wireless networks are expected to support ubiquitous connectivity for a variety of mobile terminals, ranging from autonomous cars to unmanned aerial vehicles (UAVs), low-earth-orbit (LEO) satellites, and high-speed trains. Providing reliable communications in such high-mobility environments has been recognized as a critical challenge in 6G. The ultrahigh data rate requirements of panoramic and holographic video streaming push wireless networks toward higher-frequency bands, such as the millimeter-wave (mmWave) bands, where a large amount of unused spectrum is available. In general, wireless communications in high-mobility scenarios at high carrier frequencies are extremely challenging due to hostile channel variations. Recently, increasing research attention has been dedicated to designing new modulation waveforms and schemes for high-mobility communications in next-generation wireless networks.

High-mobility communications operating at high carrier frequencies suffer from severe Doppler spreads, mainly caused by the relative motion between the transmitter, receiver, and scatterers. Orthogonal frequency-division multiplexing (OFDM) modulation, which has been widely used in both the fourth-generation (4G) and the emerging fifth-generation (5G) cellular systems, suffers in high-mobility scenarios because the OFDM waveform is impaired by severe inter-carrier interference (ICI), which is aggravated by the fact that the highest and lowest subcarriers exhibit rather different normalized Doppler shifts. The recently proposed orthogonal time-frequency space (OTFS) modulation scheme [1, 2] has emerged as a promising candidate for high-mobility communications. OTFS is a two-dimensional (2D) modulation scheme that modulates information in the delay-Doppler (DD) domain rather than in the time-frequency (TF) domain used by classic OFDM modulation. OTFS aims to provide strong delay and Doppler resilience while enjoying the potential of full diversity [1] for supporting reliable communications. OTFS modulation exploits a 2D quasi-time-invariant channel in the DD domain rather than a time-variant channel in the TF domain. Given that most existing wireless system designs have been conceived for low-mobility and low-carrier-frequency scenarios, OTFS introduces new critical challenges in transceiver design. To unleash the full potential of OTFS, challenging fundamental research problems have to be addressed, including channel estimation and signal detection.
2 OTFS Modulation and Demodulation

Provided that the wireless channels are time-invariant or have a long coherence time, they can be modeled as a linear time-invariant (LTI) system characterized by the channel impulse response (CIR). The presence of multiple scatterers leads to multipath propagation, but the CIR remains time-invariant. In this case, a one-dimensional (1D) CIR in the delay domain, $h(\tau)$, is sufficient for characterizing the time-dispersive channel. The Fourier transform (FT) of this CIR is a frequency-selective channel transfer function (CTF). However, the assumption of LTI CIRs may no longer be valid as user mobility and carrier frequency increase. Therefore, in high-mobility scenarios, the linear time-variant (LTV) channel model [3] has attracted considerable attention. LTV channels give rise to frequency shifts due to the Doppler effect, yielding a spectrally smeared version of the transmitted signal; that is, they are frequency-dispersive. In practice, LTV channels in high-mobility scenarios are often doubly dispersive due to the joint presence of multipath propagation and Doppler effects, so the transmitted signals suffer from dispersion in both the time domain (TD) and the frequency domain (FD). In such scenarios, each tap of the CIR is time-dependent, which results in a 2D CIR $h(t, \tau)$ in the time-delay domain. Applying the FT to $h(t, \tau)$ with respect to $t$ yields the DD domain channel (spreading function) $h(\tau, \nu)$, given as

$$h(\tau,\nu)=\sum_{i=1}^{P} h_i\,\delta(\tau-\tau_i)\,\delta(\nu-\nu_i), \tag{1}$$
where $\delta(\cdot)$ is the Dirac delta function, $P$ is the number of resolvable propagation paths, and $h_i$, $\tau_i$, and $\nu_i$ represent the gain, delay shift, and Doppler shift associated with the $i$th path, respectively. The DD domain channel $h(\tau, \nu)$ characterizes the intensity of scatterers having a propagation delay of $\tau$ and a Doppler frequency shift of $\nu$, which directly captures the underlying physics of radio propagation in high-mobility environments. More importantly, the LTV channel in the DD domain exhibits the beneficial features of separability, stability, compactness, and possibly sparsity, which can be exploited to facilitate efficient channel estimation and data detection. Note that parameterizing a channel by delay and Doppler is not new; this approach has been widely adopted in radar and sonar.

The main benefit of the OTFS waveform is DD domain multiplexing. Classic modulation techniques multiplex data in the time domain (TD) or FD, which may not work well in the face of severe Doppler spread. For example, OFDM modulation efficiently transforms a frequency-selective fading channel into multiple parallel frequency-flat subchannels, allowing low-complexity single-tap FD equalization for transmission over LTI channels. However, the orthogonality of the OFDM waveform erodes in high-mobility scenarios. In contrast to the existing 1D TD or FD modulation techniques, OTFS is a 2D modulation technique in which the data symbols are multiplexed in the DD domain and each symbol is spread across the entire TF domain. This leads to the potential of attaining the maximum achievable diversity for transmission over doubly dispersive channels, provided that each TD and FD sample experiences independent fading. The maximum attainable diversity order is determined by the number of independently faded resolvable paths in the DD domain.

In OTFS, the information symbols are carried in the DD domain. As shown in Fig. 1, we use $M$ and $N$ to denote the numbers of delay grid points and Doppler grid points of the DD plane, respectively. The DD domain resolution along the delay dimension is inversely proportional to the system bandwidth $M\Delta f$, and along the Doppler dimension it is inversely proportional to the time duration $NT$ occupied by an OTFS frame [4].¹ Conveying the information in the DD domain is beneficial for exploiting the underlying physical propagation properties.

Fig. 1 Illustration of OTFS DD and TF planes, where $N = 3$ and $M = 6$ (the DD plane, with delay and Doppler axes, is related by the ISFFT-based mapping to the TF plane (MC, OFDM), with time and frequency axes)

We note that conventional wireless communication designs treat channel fading as an inevitable deleterious effect, aiming to combat or exploit it while ignoring its basic underlying causes. In more detail, the channel impairments of propagation delay and Doppler frequency shift are modeled as a pair of operations imposed by wireless channels on the transmitted waveform. Furthermore, fading is viewed as a phenomenon that manifests itself at the channel's output as the combined destructive effect of this pair of fundamental channel-induced phenomena. In contrast, OTFS generates a family of signals that is closed under arbitrary combinations of time delays and frequency shifts. In other words, upon transmitting a signal in this family, the received signal remains within the family under arbitrary channel impairments. This property provides an opportunity to exploit the interference pattern in the DD domain, which represents the channel impairments. Since the time and frequency shifts are separated in the DD domain, there is a potential to provide full time and frequency diversity.

Fig. 2 OTFS modulation and demodulation [5] (DD domain symbols $x[k, l]$ → ISFFT → $X[n, m]$ → Heisenberg transform → $s(t)$ → channel $h(\tau, \nu)$ → $r(t)$ → Wigner transform → $Y[n, m]$ → SFFT → $y[k, l]$; the Heisenberg and Wigner transforms operate in the time-frequency domain, the ISFFT input and SFFT output lie in the delay-Doppler domain)

The OTFS modulation and demodulation are shown in Fig. 2. They are implemented with the 2D inverse symplectic finite Fourier transform (ISFFT) at the transmitter and the symplectic finite Fourier transform (SFFT) at the receiver [6, 7]. A (coded) bit sequence is mapped to symbols $\{x[k, l],\ k = 0, \ldots, N-1,\ l = 0, \ldots, M-1\}$ in the DD domain, where $x[k, l] \in \mathcal{A} = \{\alpha_1, \ldots, \alpha_{|\mathcal{A}|}\}$ with $|\mathcal{A}|$ being the cardinality of $\mathcal{A}$, and $l$ and $k$ denote the indices of the delay and Doppler shifts, respectively. As shown in Fig. 2, the ISFFT is performed to convert the symbols to signals in the time-frequency (TF) domain, i.e.,

$$X_{\mathrm{tf}}[n,m]=\frac{1}{\sqrt{MN}}\sum_{k=0}^{N-1}\sum_{l=0}^{M-1}x[k,l]\,e^{j2\pi\left(\frac{nk}{N}-\frac{ml}{M}\right)}. \tag{2}$$
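To make the resolution statement above concrete, the following minimal sketch computes the delay and Doppler resolutions of the DD grid, using $T = 1/\Delta f$ as adopted in this chapter. The numerology ($M = 512$, $N = 128$, $\Delta f = 15$ kHz), the 500 km/h speed, and the 4 GHz carrier are hypothetical values chosen only for illustration, and the function names are not from the chapter.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def dd_resolution(M, N, delta_f):
    """Return (delay resolution in s, Doppler resolution in Hz) of the DD grid."""
    T = 1.0 / delta_f                 # symbol duration, T = 1/delta_f
    delay_res = 1.0 / (M * delta_f)   # inverse of the bandwidth M*delta_f
    doppler_res = 1.0 / (N * T)       # inverse of the frame duration N*T
    return delay_res, doppler_res


def max_doppler(speed_mps, carrier_hz):
    """Maximum Doppler shift v*f_c/c for a given relative speed."""
    return speed_mps * carrier_hz / SPEED_OF_LIGHT


# Hypothetical numerology, chosen only for illustration.
M, N, delta_f = 512, 128, 15e3
delay_res, doppler_res = dd_resolution(M, N, delta_f)

# Hypothetical mobility scenario: 500 km/h at a 4 GHz carrier.
nu_max = max_doppler(500 / 3.6, 4e9)
cells = nu_max / doppler_res          # Doppler shift measured in grid cells

print(f"delay resolution   : {delay_res * 1e9:.1f} ns")
print(f"Doppler resolution : {doppler_res:.2f} Hz")
print(f"max Doppler shift  : {nu_max:.1f} Hz ({cells:.2f} grid cells)")
```

Under these assumed numbers the maximum Doppler shift spans a non-integer number of Doppler cells, which is the situation that motivates modeling a fractional Doppler component.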
¹ Note that a practical signal or pulse needs to satisfy the Heisenberg uncertainty principle.

Then the signals $\{X_{\mathrm{tf}}[n, m]\}$ in the TF domain are converted to a continuous-time waveform $s(t)$ using the Heisenberg transform with a transmit waveform $g_{\mathrm{tx}}(t)$, i.e.,

$$s(t)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X_{\mathrm{tf}}[n,m]\,g_{\mathrm{tx}}(t-nT)\,e^{j2\pi m\Delta f(t-nT)}, \tag{3}$$

where $\Delta f$ is the subcarrier spacing and $T = 1/\Delta f$. The signal $s(t)$ is then transmitted through a time-varying channel, and the received signal in the time domain can be expressed as

$$r(t)=\iint h(\tau,\nu)\,s(t-\tau)\,e^{j2\pi\nu(t-\tau)}\,d\tau\,d\nu. \tag{4}$$
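As a brief worked step (a sketch added for illustration, not part of the original derivation), substituting the $P$-path channel model (1) into (4) and applying the sifting property of the Dirac delta shows that the received signal is a sum of delayed and Doppler-shifted copies of $s(t)$. Noise is omitted, as in (4).

```latex
\begin{aligned}
r(t) &= \iint \sum_{i=1}^{P} h_i\,\delta(\tau-\tau_i)\,\delta(\nu-\nu_i)\,
        s(t-\tau)\,e^{j2\pi\nu(t-\tau)}\,d\tau\,d\nu \\
     &= \sum_{i=1}^{P} h_i\, s(t-\tau_i)\, e^{j2\pi\nu_i(t-\tau_i)} .
\end{aligned}
```

Each path therefore contributes only three parameters $(h_i, \tau_i, \nu_i)$, which is the compactness that DD domain channel estimation exploits.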
In the DD domain channel model (1), the delay and Doppler shift taps for the $i$th path are given by

$$\tau_i=\frac{l_i}{M\Delta f}, \tag{5}$$

$$\nu_i=\frac{k_i+\kappa_i}{NT}, \tag{6}$$
where $0 \le l_i \le l_{\max}$ and $-k_{\max} \le k_i \le k_{\max}$ are the delay index and Doppler index of the $i$th path; $l_{\max}$ and $k_{\max}$ represent the largest indices of the delay taps and Doppler taps, respectively; and $\kappa_i \in (-0.5, 0.5]$ is the fractional Doppler associated with the $i$th path. Here we assume a wideband system with sufficient delay resolution, so that fractional delay shifts need not be considered, consistent with the integer delay index in (5). At the receiver side, a receive waveform $g_{\mathrm{rx}}(t)$ is used to transform the received signal $r(t)$ to the TF domain, i.e.,

$$Y(t,f)=\int g_{\mathrm{rx}}^{*}(t'-t)\,r(t')\,e^{-j2\pi f(t'-t)}\,dt', \tag{7}$$
which is then sampled at $t = nT$ and $f = m\Delta f$, yielding $Y[n, m]$. Finally, an SFFT is applied to $\{Y[n, m]\}$ to obtain the signal $y[k, l]$ in the DD domain, i.e.,

$$y[k,l]=\frac{1}{\sqrt{MN}}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}Y[n,m]\,e^{-j2\pi\left(\frac{nk}{N}-\frac{ml}{M}\right)}. \tag{8}$$
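The transform pair in (2) and (8) can be sketched with standard FFT routines. The snippet below is a minimal illustration, assuming the DD symbols are stored as an $N \times M$ array indexed as `x[k, l]`; the function names `isfft`, `sfft`, and `isfft_direct` are chosen here for illustration and are not taken from the chapter. It checks the FFT-based ISFFT against a direct evaluation of the double sum in (2) and confirms that the SFFT inverts it.

```python
import numpy as np


def isfft(x_dd):
    """Eq. (2): DD-domain x[k, l] (shape N x M) -> TF-domain X[n, m]."""
    # exp(+j*2*pi*n*k/N) summed over k is an inverse DFT along axis 0 (Doppler);
    # exp(-j*2*pi*m*l/M) summed over l is a forward DFT along axis 1 (delay).
    # norm="ortho" supplies the 1/sqrt(N) and 1/sqrt(M) scale factors.
    return np.fft.fft(np.fft.ifft(x_dd, axis=0, norm="ortho"), axis=1, norm="ortho")


def sfft(y_tf):
    """Eq. (8): TF-domain Y[n, m] (shape N x M) -> DD-domain y[k, l]."""
    return np.fft.ifft(np.fft.fft(y_tf, axis=0, norm="ortho"), axis=1, norm="ortho")


def isfft_direct(x_dd):
    """Direct evaluation of the double sum in Eq. (2), for checking only."""
    N, M = x_dd.shape
    k = np.arange(N)[:, None]
    l = np.arange(M)[None, :]
    X = np.zeros((N, M), dtype=complex)
    for n in range(N):
        for m in range(M):
            phase = np.exp(1j * 2 * np.pi * (n * k / N - m * l / M))
            X[n, m] = np.sum(x_dd * phase) / np.sqrt(M * N)
    return X


# Random QPSK symbols on the N = 3, M = 6 grid of Fig. 1.
rng = np.random.default_rng(0)
N, M = 3, 6
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
x = rng.choice(qpsk, size=(N, M))

X = isfft(x)
err_direct = np.max(np.abs(X - isfft_direct(x)))
err_roundtrip = np.max(np.abs(sfft(X) - x))
print("max deviation from direct sum:", err_direct)
print("max round-trip error         :", err_roundtrip)
```

With the orthonormal scaling the pair is unitary, so the symbol energy is the same in the DD and TF domains.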
If the transmit waveform $g_{\mathrm{tx}}(t)$ and receive waveform $g_{\mathrm{rx}}(t)$ satisfy the biorthogonal property [6], the channel input-output relationship in the DD domain can be expressed as [5, 8]

$$y[k,l]=\sum_{i=1}^{P}\sum_{q=-N_i}^{N_i}h_i\,\frac{1-e^{-j2\pi(-q-\kappa_i)}}{N-Ne^{-j\frac{2\pi}{N}(-q-\kappa_i)}}\,e^{-j2\pi\frac{l_i(k_i+\kappa_i)}{MN}}\,x\big[[k-k_i+q]_N,\,[l-l_i]_M\big]+\omega[k,l], \tag{9}$$
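where $N_i$ > $\lambda$ as [41]

$$\mathbf{G}(\mathbf{p}_r)=-j\eta\,\frac{e^{-j\frac{2\pi}{\lambda}\|\mathbf{p}_r\|}}{2\lambda\|\mathbf{p}_r\|}\left(\mathbf{I}_3-\hat{\mathbf{p}}_r\hat{\mathbf{p}}_r^{\mathrm{H}}\right) \tag{A.52}$$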
with $\hat{\mathbf{p}}_r = \mathbf{p}_r / \|\mathbf{p}_r\|$, and $\eta$ is the impedance of free space. This approximation is tight when the receiver is beyond the reactive near-field of the transmitter. It was assumed in the lemma that only the $Y$-direction of $\mathbf{J}(\mathbf{p}_t)$ is excited at the transmitter, and thus we have that $\mathbf{J}(\mathbf{p}_t) = J_y(\mathbf{p}_t)\hat{\mathbf{u}}_y$. The electric field reduces to

$$\mathbf{E}(\mathbf{p}_t,\mathbf{p}_r)=\mathbf{G}_y(\mathbf{p}_r-\mathbf{p}_t)\,J_y(\mathbf{p}_t), \tag{A.53}$$
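where $\mathbf{G}_y(\mathbf{p}_r - \mathbf{p}_t) = \mathbf{G}(\mathbf{p}_r - \mathbf{p}_t)\hat{\mathbf{u}}_y$ is the second column of the Green function in (A.52). The complex-valued channel coefficient $e(\mathbf{p}_t, \mathbf{p}_r)$ from the considered transmitting surface located in $\mathbf{p}_t$ to the receive point $\mathbf{p}_r$ in the $XY$-plane can be divided into its amplitude and phase as

$$e(\mathbf{p}_t,\mathbf{p}_r)=|e(\mathbf{p}_t,\mathbf{p}_r)|\,e^{-j\frac{2\pi}{\lambda}\|\mathbf{p}_r-\mathbf{p}_t\|}. \tag{A.54}$$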
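It follows from [22, Eqs. (16) and (19)] that

$$|e(\mathbf{p}_t,\mathbf{p}_r)|^2=\underbrace{\frac{4}{\eta^2}A_t\|\mathbf{G}_y(\mathbf{p}_r-\mathbf{p}_t)\|^2}_{\text{Power gain}}\;\underbrace{\frac{(\mathbf{p}_r-\mathbf{p}_t)^{\mathrm{T}}\hat{\mathbf{u}}_z}{\|\mathbf{p}_r-\mathbf{p}_t\|}}_{\text{Projection on the } Z \text{ direction}}=\frac{1}{4\pi}\,\frac{d\left((x_r-x_t)^2+d^2\right)}{\left((x_r-x_t)^2+(y_r-y_t)^2+d^2\right)^{5/2}}. \tag{A.55}$$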
As indicated on the first row, this is the channel gain in the $Z$-direction (i.e., perpendicularly to the array), where $\frac{\mathbf{p}_r - \mathbf{p}_t}{\|\mathbf{p}_r - \mathbf{p}_t\|}$ is the pointing direction of the electric field and $A_t = \lambda^2/(4\pi)$ is the area of an isotropic antenna. The considered antenna is assumed to have the dimensions $a \times a$ in the $XY$-plane, around $\mathbf{p}_n = (x_n, y_n, 0)$, and thus the channel is

$$h_n(\mathbf{p}_t)=\frac{1}{a}\int_{x_n-a/2}^{x_n+a/2}\int_{y_n-a/2}^{y_n+a/2}e(\mathbf{p}_t,\mathbf{p}_r)\,\partial x_r\,\partial y_r. \tag{A.56}$$

The channel gain can be computed as $|h_n(\mathbf{p}_t)|^2$ and is given in (6).
A.2 Proof of Theorem 1

We can compute an upper bound on the channel gain expression in (6) as

$$|h_n(\mathbf{p}_t)|^2=\left|\frac{1}{a}\int_{x_n-a/2}^{x_n+a/2}\int_{y_n-a/2}^{y_n+a/2}e(\mathbf{p}_t,\mathbf{p}_r)\,\partial x_r\,\partial y_r\right|^2\le\int_{x_n-a/2}^{x_n+a/2}\int_{y_n-a/2}^{y_n+a/2}|e(\mathbf{p}_t,\mathbf{p}_r)|^2\,\partial x_r\,\partial y_r=\zeta_{\mathbf{p}_t,\mathbf{p}_n}, \tag{A.57}$$

using the Cauchy–Schwarz inequality $\left|\iint e(\mathbf{p}_t,\mathbf{p}_r)\,dx\,dy\right|^2 \le \iint |e(\mathbf{p}_t,\mathbf{p}_r)|^2\,dx\,dy \cdot \iint 1\,dx\,dy$. To compute $\zeta_{\mathbf{p}_t,\mathbf{p}_n}$ in (A.57) in closed form, we need to solve the integral

$$\begin{aligned}\zeta_{\mathbf{p}_t,\mathbf{p}_n}&=\frac{1}{4\pi}\int_{x_n-a/2}^{x_n+a/2}\int_{y_n-a/2}^{y_n+a/2}\frac{d\left((x_r-x_t)^2+d^2\right)}{\left((x_r-x_t)^2+(y_r-y_t)^2+d^2\right)^{5/2}}\,\partial x_r\,\partial y_r\\&=\int_{x_n-a/2}^{x_n+a/2}\int_{y_n-a/2}^{y_n+a/2}\underbrace{\frac{1}{4\pi\left((x_r-x_t)^2+(y_r-y_t)^2+d^2\right)}}_{\text{Free-space pathloss}}\times\underbrace{\frac{d}{\sqrt{(x_r-x_t)^2+(y_r-y_t)^2+d^2}}}_{\text{Reduction in effective area from directivity}}\\&\qquad\times\underbrace{\frac{(x_r-x_t)^2+d^2}{(x_r-x_t)^2+(y_r-y_t)^2+d^2}}_{\text{Polarization loss factor}}\,\partial x_r\,\partial y_r.\end{aligned} \tag{A.58}$$

The contributions of the three fundamental properties when operating in the near-field of the array (i.e., the variations in distances to the antennas, in the effective antenna areas, and in the polarization losses) are stated explicitly in this expression. The rest of the proof follows from computing the integral in (A.58), and the details can be found in [21, App. A].
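The bound in (A.57) can be checked numerically. The sketch below approximates the integrals in (A.56) and (A.58) with a midpoint rule, under stated assumptions: the transmitter sits at perpendicular distance $d$ from the array plane with in-plane coordinates $(x_t, y_t)$, the wavelength and antenna side length are hypothetical values, and the far-field reference $a^2/(4\pi d^2)$ is the standard free-space expression for an antenna of area $a^2$ rather than a formula quoted from this appendix. The function name `antenna_gains` is illustrative.

```python
import numpy as np


def antenna_gains(xt, yt, d, xn, yn, a, wavelength, pts=200):
    """Midpoint-rule estimates of |h_n|^2 from (A.56) and zeta from (A.58)."""
    step = a / pts
    offsets = (np.arange(pts) + 0.5) * step - a / 2      # cell midpoints
    xr, yr = np.meshgrid(xn + offsets, yn + offsets, indexing="ij")
    dx2 = (xr - xt) ** 2
    r2 = dx2 + (yr - yt) ** 2 + d ** 2                   # squared distance
    amp2 = d * (dx2 + d ** 2) / (4 * np.pi * r2 ** 2.5)  # |e|^2 from (A.55)
    # Complex coefficient with the phase of (A.54).
    e = np.sqrt(amp2) * np.exp(-2j * np.pi * np.sqrt(r2) / wavelength)
    h_gain = np.abs(e.sum() * step ** 2 / a) ** 2        # |h_n|^2, (A.56)
    zeta = amp2.sum() * step ** 2                        # upper bound, (A.57)
    return h_gain, zeta


# Hypothetical setup: 3 GHz-like wavelength, antenna side of a quarter wavelength.
wavelength = 0.1
a = wavelength / 4
results = {}
for d in (0.05, 10.0):
    h_gain, zeta = antenna_gains(0.0, 0.0, d, 0.0, 0.0, a, wavelength)
    beta = a ** 2 / (4 * np.pi * d ** 2)   # assumed far-field reference gain
    results[d] = (h_gain, zeta, beta)
    print(f"d = {d:5.2f} m: |h_n|^2/beta = {h_gain / beta:.6f}, zeta/beta = {zeta / beta:.6f}")
```

In this setup both ratios approach one at the larger distance, while at the shorter distance the three loss factors in (A.58) pull the gain below the far-field reference.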
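A.3 Proof of Corollary 3

When $d\cos(\varphi) \gg \sqrt{2NA}$, it follows that $B + 1 \approx 1$ and $2B + 1 \approx 1$. We can then utilize the fact that $\tan^{-1}(x) \approx x$ for $x \approx 0$ to approximate (22) as

$$\xi_{d,\varphi,N}\approx\sum_{i=1}^{2}\frac{B+(-1)^i\sqrt{B}\tan(\varphi)}{2\pi\sqrt{\tan^2(\varphi)+1+2(-1)^i\sqrt{B}\tan(\varphi)}}. \tag{A.59}$$

Furthermore, we can utilize the fact that $\sqrt{1+x} \approx 1 + x/2$ for $x \approx 0$ to approximate the denominator of (A.59) and obtain

$$\begin{aligned}\xi_{d,\varphi,N}&\approx\sum_{i=1}^{2}\frac{B+(-1)^i\sqrt{B}\tan(\varphi)}{2\pi\sqrt{1+\tan^2(\varphi)}\left(1+(-1)^i\frac{\sqrt{B}\tan(\varphi)}{1+\tan^2(\varphi)}\right)}\\&=\frac{2B-2B\frac{\tan^2(\varphi)}{1+\tan^2(\varphi)}}{2\pi\sqrt{1+\tan^2(\varphi)}\left(1-\frac{\sqrt{B}\tan(\varphi)}{1+\tan^2(\varphi)}\right)\left(1+\frac{\sqrt{B}\tan(\varphi)}{1+\tan^2(\varphi)}\right)}\\&\approx\frac{B}{\pi\left(1+\tan^2(\varphi)\right)^{3/2}}=N\underbrace{\beta_{d\cos(\varphi)}\cos^3(\varphi)}_{=\zeta_{d,\varphi}},\end{aligned} \tag{A.60}$$

where we simplified the expression by writing the two fractions as a single fraction, then utilized that $1 - (-1)^i\frac{\sqrt{B}\tan(\varphi)}{1+\tan^2(\varphi)} \approx 1$, and finally that $1 + \tan^2(\varphi) = 1/\cos^2(\varphi)$.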
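The approximation chain from (A.59) to (A.60) can be checked numerically. The sketch below treats $B$ simply as a small positive number (its definition lies outside this appendix) and compares the two-term sum in (A.59) with the closed form at the end of (A.60); the function names and the test angle are illustrative assumptions.

```python
import math


def xi_sum(B, phi):
    """Two-term sum on the right-hand side of (A.59)."""
    t = math.tan(phi)
    total = 0.0
    for i in (1, 2):
        sign = (-1) ** i
        num = B + sign * math.sqrt(B) * t
        den = 2 * math.pi * math.sqrt(t * t + 1 + 2 * sign * math.sqrt(B) * t)
        total += num / den
    return total


def xi_far_field(B, phi):
    """Closed form B / (pi * (1 + tan^2(phi))^(3/2)) at the end of (A.60)."""
    return B / (math.pi * (1 + math.tan(phi) ** 2) ** 1.5)


phi = math.pi / 6   # hypothetical angle of 30 degrees
rel_errors = []
for B in (1e-2, 1e-3, 1e-4):
    exact = xi_sum(B, phi)
    approx = xi_far_field(B, phi)
    rel_errors.append(abs(exact - approx) / exact)
    print(f"B = {B:.0e}: sum = {exact:.6e}, closed form = {approx:.6e}, "
          f"relative error = {rel_errors[-1]:.2e}")
```

The relative error shrinks roughly in proportion to $B$, which matches the first-order expansions used in the proof.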
A.4 Proof of Corollary 5

The upper bound is obtained by applying the Cauchy–Schwarz inequality to the numerator of (37) as $\left|\int_{S_n} E(x, y)\,dx\,dy\right|^2 \le \int_{S_n} |E(x, y)|^2\,dx\,dy \int_{S_n} 1\,dx\,dy$. This results in

$$G_{\mathrm{array}}\le\frac{\sum_{n=1}^{N}\int_{S_n}|E(x,y)|^2\,dx\,dy\;A}{NA\int_{S}|E(x,y)|^2\,dx\,dy}=\frac{\int_{-\sqrt{NA}/2}^{\sqrt{NA}/2}\int_{-\sqrt{NA}/2}^{\sqrt{NA}/2}|E(x,y)|^2\,dx\,dy}{N\int_{-\sqrt{A}/2}^{\sqrt{A}/2}\int_{-\sqrt{A}/2}^{\sqrt{A}/2}|E(x,y)|^2\,dx\,dy}. \tag{A.61}$$

The remaining integrals are of the same kind as in Theorem 1 and equal $\alpha_{d,N}$ and $\alpha_{d,1}$ from Corollary 1, respectively.
A.5 Abbreviations and Notations Table A.1 provides the list of abbreviations and notations used throughout this chapter along with their respective descriptions.
Table A.1 List of abbreviations and mathematical notations

Abbreviation/Symbol   Description
ELAA                  Extremely Large Aperture Array
LoS                   Line of Sight
mMIMO                 Massive Multiple Input Multiple Output
SNR                   Signal-to-Noise Ratio
SE                    Spectral Efficiency
MF                    Matched Filtering
DF                    Depth of Focus
BD                    Beam Depth
ZF                    Zero-Forcing
DoF                   Degrees of Freedom
$A$                   Received antenna area
$d$                   Distance from the transmitter to the center of the receiver
$P_{\mathrm{tx}}$     Transmit power
$P_{\mathrm{rx}}$     Receive power
$\beta_d$             Far-field channel gain
$\lambda$             Wavelength
$f$                   Carrier frequency
$N$                   Number of antennas
$\mathbf{p}_t$        Coordinate of the transmitter
$\mathbf{p}_n$        Coordinate of the nth receive antenna
$a$                   Side length of the receive antenna
$h_n$                 Channel gain to the nth receive antenna
$\zeta$               Upper bound on the channel gain to individual antennas
$\alpha$              Upper bound on the total channel gain to the antenna array
$\varphi$             Angle from the transmitter to the antenna array
$\mathbf{h}$          Channel vector to the antenna array
$\phi_n$              Phase shift of the channel to the nth receive antenna
$s$                   Unit-norm information signal
$n$                   Independent noise at a receive antenna
$\mathbf{n}$          Independent noise vector of the receive antenna array
$\sigma^2$            Noise variance
$\mathbf{v}$          Receive combining vector
$\mathbf{w}$          Transmit precoding vector
$D$                   Maximum length of a receive antenna
$d'$                  Distance from the transmitter to the receiver's edge
$W$                   Maximum length of the antenna array
$E_0$                 Electric intensity
$G$                   Antenna gain
$\eta$                Impedance of free space
$K$                   Number of multiplexed users
References 1. E. Björnson, L. Sanguinetti, H. Wymeersch, J. Hoydis, and T. L. Marzetta. Massive MIMO is a reality—What is next? Five promising research directions for antenna arrays. Digital Signal Processing, 94:3–20, 2019. 2. E. Björnson, J. Hoydis, and L. Sanguinetti. Massive MIMO networks: Spectral, energy, and hardware efficiency. Foundations and Trends® in Signal Processing, 11(3–4):154–655, 2017. 3. T. L. Marzetta. Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Transactions on Wireless Communications, 9(11):3590–3600, 2010. 4. E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta. Massive MIMO for next generation wireless systems. IEEE Communications Magazine, 52(2):186–195, 2014. 5. I. F. Akyildiz and J. F. Jornet. Realizing ultra-massive MIMO (1024 × 1024) communication in the (0.06–10) terahertz band. Nano Communication Networks, 8:46–54, 2016. 6. A. Faisal, H. Sarieddeen, H. Dahrouj, T. Y. Al-Naffouri, and M-S. Alouini. Ultra-massive MIMO systems at terahertz bands: Prospects and challenges. IEEE Vehicular Technology Magazine, 15(4):33–42, 2020. 7. V. Jamali, A. M. Tulino, G. Fischer, and R. R. Müller. Intelligent surface-aided transmitter architectures for millimeter-wave ultra massive MIMO systems. IEEE Open Journal of the Communications Society, 2:144–167, 2020. 8. A. Pizzo, T. L. Marzetta, and L. Sanguinetti. Spatially-stationary model for holographic MIMO small-scale fading. IEEE Journal on Selected Areas in Communications, 38(9):1964–1979, 2020. 9. C. Huang, S. Hu, G. C. Alexandropoulos, A. Zappone, R. Zhang, M. Di Renzo, and M. Debbah. Holographic MIMO surfaces for 6G wireless networks: Opportunities, challenges, and trends. IEEE Wireless Communications, 27(5):118–125, 2020. 10. D. Dardari and N. Decarli. Holographic communication using intelligent surfaces. IEEE Communications Magazine, 59(6):35–41, 2021. 11. A. Pizzo, L. Sanguinetti, and T. L. Marzetta. 
Fourier plane-wave series expansion for holographic MIMO communications. IEEE Transactions on Wireless Communications, early access. 12. B. P. Horváth B. T. Csathó and P. Horváth. Modeling the near-field of extremely large aperture arrays in massive MIMO systems. Infocommunications Journal, XII(3):39–46, 2020. 13. K. T. Selvan and R. Janaswamy. Fraunhofer and Fresnel distances: Unified derivation for aperture antennas. IEEE Antennas and Propagation Magazine, 59(4):12–15, 2017. 14. S. Hu, F. Rusek, and O. Edfors. Beyond massive MIMO: The potential of data transmission with large intelligent surfaces. IEEE Transactions on Signal Processing, 66(10):2746–2758, 2018. 15. E. Björnson and L. Sanguinetti. Demystifying the power scaling law of intelligent reflecting surfaces and metasurfaces. IEEE International Workshop on Computational Advances in MultiSensor Adaptive Processing (CAMSAP), pages 549–553, December 2019. 16. J. C. B. Garcia, A. Sibille, and M. Kamoun. Reconfigurable intelligent surfaces: Bridging the gap between scattering and reflection. IEEE Journal on Selected Areas in Communications, 38(11):2538–2547, 2020. 17. W. Tang, M. Z. Chen, X. Chen, J. Y. Dai, Y. Han, M. Di Renzo, Y. Zeng, S. Jin, Q. Cheng, and T. J. Cui. Wireless communications with reconfigurable intelligent surface: Path loss modeling and experimental measurement. IEEE Transactions on Wireless Communications, 20(1):421– 439, 2021. 18. S. W. Ellingson. Path loss in reconfigurable intelligent surface-enabled channels. IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pages 829–835, September 2021. 19. E. Björnson, Ö. T. Demir, and L. Sanguinetti. A primer on near-field beamforming for arrays and reconfigurable intelligent surfaces. Asilomar Conference on Signals, Systems, and Computers, pages 105–112, November 2021.
20. H. T. Friis. A note on a simple transmission formula. IRE, 34(5):254–256, 1946. 21. E. Björnson and L. Sanguinetti. Power scaling laws and near-field behaviors of massive MIMO and intelligent reflecting surfaces. IEEE Open Journal of the Communications Society, 1:1306– 1324, 2020. 22. D. Dardari. Communicating with large intelligent surfaces: Fundamental limits and models. IEEE Journal on Selected Areas in Communications, 38(11):2526–2537, 2020. 23. E. Telatar. Capacity of multi-antenna Gaussian channels. European Transactions on Telecommunications, 10(6):585–595, 1999. 24. Ngo H. Q., E. G. Larsson, and T. L. Marzetta. Energy and spectral efficiency of very large multiuser MIMO systems. IEEE Transactions on Communications, 61(4):1436–1449, 2013. 25. J. Hoydis, S. Ten Brink, and M. Debbah. Massive MIMO in the UL/DL of cellular networks: How many antennas do we need? IEEE Journal on Selected Areas in Communications, 31(2):160–171, 2013. 26. B. Friedlander. Localization of signals in the near-field of an antenna array. IEEE Transactions on Signal Processing, 67(15):3885–3893, 2019. 27. D. K. Cheng. On the simulation of Fraunhofer radiation patterns in the Fresnel region. IRE Transactions on Antennas and Propagation, 5(4):399–402, 1957. 28. J. Sherman. Properties of focused apertures in the Fresnel region. IRE Transactions on Antennas and Propagation, 10(4):399–408, 1962. 29. J. D. Kraus and R. J Marhefka. Antenna for all applications. McGraw-Hill, 2002. 30. C. Polk. Optical Fresnel-zone gain of a rectangular aperture. IRE Transactions on Antennas and Propagation, 4(1):65–69, 1956. 31. A. Kay. Near-field gain of aperture antennas. IRE Transactions on Antennas and Propagation, 8(6):586–593, 1960. 32. R. Hansen. Focal region characteristics of focused array antennas. IRE Transactions on Antennas and Propagation, 33(12):1328–1337, 1985. 33. P. Nepa and A. Buffi. Near-field-focused microwave antennas: Near-field shaping and implementation. 
IEEE Antennas and Propagation Magazine, 59(3):42–53, 2017. 34. A. Lozano, A.M. Tulino, and S. Verdú. High-SNR power offset in multiantenna communication. IEEE Transactions on Information Theory, 51(12):4134–4151, 2005. 35. A. Pizzo, T. L. Marzetta, and L. Sanguinetti. Degrees of freedom of holographic MIMO channels. In IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pages 1–5, May 2020. 36. X. Wei and L. Dai. Channel estimation for extremely large-scale massive MIMO: Far-field, near-field, or hybrid-field? IEEE Communications Letters, 26(1):177–181, 2022. 37. A. Amiri, M. Angjelichinoski, E. de Carvalho, and R. W. Heath. Extremely large aperture massive MIMO: Low complexity receiver architectures. In IEEE Global Communications Conference Workshops (GLOBECOM Workshops), 2018. 38. H. Zhang, N. Shlezinger, F. Guidi, D. Dardari, M. F. Imani, and Y. C. Eldar. Beam focusing for near-field multi-user MIMO communications. IEEE Transactions on Wireless Communications, 21(9):7476–7490, 2022. 39. Ö. T. Demir, E. Björnson, and L. Sanguinetti. Channel modeling and channel estimation for holographic massive MIMO with planar arrays. IEEE Wireless Communications Letters, 11(5):997–1001, 2022. 40. Z. Dong and Y. Zeng. Near-field spatial correlation for extremely large-scale array communications. IEEE Communications Letters, 26(7):1534–1538, 2022. 41. A. S. Y. Poon, R. W. Brodersen, and D. N. C. Tse. Degrees of freedom in multiple-antenna channels: A signal space approach. IEEE Transactions on Information Theory, 51(2):523–536, 2005.
Orbital-Angular-Momentum-Embedded Massive-MIMO: Achieving Multiplicative Spectrum Efficiency for 6G Communications Wenchi Cheng, Liping Liang, Haiyue Jing, Hailin Zhang, and Zan Li
W. Cheng (✉) · L. Liang · H. Jing · H. Zhang · Z. Li, State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, China
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_13

1 Introduction

Millimeter-wave (mmWave) communications, with frequencies between 30 and 300 GHz, are regarded as a powerful technique for sixth-generation (6G) wireless communication networks. Owing to the large available bandwidth and small wavelength, mmWave communications can potentially achieve high capacity compared with traditional radio wireless communications [1]. Massive multiple-input multiple-output (massive-MIMO) can be easily implemented in mmWave communications because mmWave antennas can be made very small [2, 3]. However, although mmWave communications can increase the capacity, they do so by resorting to larger bandwidth, which does not increase the spectrum efficiency (SE) of wireless communications. In fact, as traditional plane-electromagnetic (PE) wave-based wireless communications become more and more mature, it is now very hard to significantly increase the SE of traditional PE wave-based wireless communications (which rely on linear momentum) to meet the rapidly increasing SE demands of extremely high data traffic and a tremendous number of users [4].

Fortunately, the electromagnetic wave not only has the characteristic of linear momentum, which has been studied for over a century, but also possesses angular momentum, which has attracted much attention during the past decade [5–10]. The orbital angular momentum (OAM), which is another important property of the electromagnetic wave [5–7], has not been well studied yet. An electromagnetic wave carrying OAM has helical phase fronts described by an exponential function of the product of the OAM state/mode, the azimuthal angle, and the imaginary unit $j$. The OAM-based vortex wave has different topological charges, which are independent of and orthogonal to each other, opening a new way to significantly increase the SE of wireless communications using different OAM modes [11–17].

Although OAM-based vortex radio transmission can increase the SE of wireless communications, some researchers have shown that communication over the orthogonal states of OAM is a subset of the solutions offered by MIMO communications, thus offering no SE gain compared with traditional massive-MIMO communications [8, 9, 18]. However, noticing that the array elements of one OAM generation plate are fed with the same input signal [8], the OAM signal can be generated within one antenna that has several array elements but only one radio-frequency (RF) chain [19]. Thus, OAM communications can be implemented within one array-elements-based antenna, where the distances among array elements are not strictly constrained. On the other hand, to achieve the maximum SE for massive-MIMO communications, the distances among antennas in a massive-MIMO system are required to be larger than half of the carrier wavelength so as to achieve optimal multiplexing. Given these different spacing requirements for optimal OAM and massive-MIMO communications, it is clear that OAM and massive-MIMO cannot be treated as entirely equivalent. Although many research works on OAM transmission currently consider only the line-of-sight scenario, OAM transmission can be used in multipath scenarios when the phase change can be estimated, and some works consider OAM transmission in multipath environments [20, 21]. Moreover, OAM and massive-MIMO do not conflict with each other [22]. Thus, a natural question is: can we obtain the SE gain of OAM while also earning the SE gain offered by traditional multiplexing massive-MIMO communications?
In this chapter, we solve the above-mentioned problem and give the answer YES. Some academic researchers have studied how to combine OAM and MIMO [22– 24]. The authors of [22] proposed space-division demultiplexing in the OAM-based MIMO radio system and compared the capacities between the conventional MIMO system and OAM-based MIMO system. The authors of [23] proposed plane spiral OAM-based MIMO system and compared with conventional MIMO. In addition, the authors of [24] proposed an analog eigenmode transmission technique based on the theory of OAM for short-range communications. We build up the OAM-embedded massive-MIMO (OEM) communication model, where the MIMO communications are performed among different uniform circular array (UCA) antennas and the OAM communications are performed among the array elements corresponding to each UCA. Based on the OEM communication model, we design the parabolic antenna to converge the OAM signal. Then, we develop the mode-decomposition and multiplexing-detection schemes to obtain the signal corresponding to each OAM mode of each transmit UCA antenna. We also develop the OEM-water-filling power allocation policy to obtain the maximum average multiplicative SE gain for joint OAM and multiplexing massive-MIMO communications. We conduct extensive simulations to validate and evaluate our developed schemes, showing that our developed convergent OEM communications
Orbital-Angular-Momentum-Embedded Massive-MIMO: Achieving. . .
can significantly increase the average SE as compared with traditional multiplexing massive-MIMO communications. The rest of this chapter is organized as follows. Section 2 gives the OEM communication model. Section 3 designs the parabolic antenna to converge the OAM wave, proposes the mode-decomposition and multiplexing-detection schemes to obtain the signal on each OAM mode of each transmit UCA antenna, develops the OEM-water-filling power allocation policy to maximize the average SE of OEM communications, and answers the question when the SE of OEM communication is larger than the SE of traditional massive-MIMO communication. Section 4 evaluates our developed schemes and compares the SEs of OEM and traditional multiplexing massive-MIMO communications. The chapter concludes with Sect. 5.
2 OAM-Embedded-MIMO Communication Model

The OEM communication model is shown in Fig. 1, where $N$ and $M$ UCAs [25] are placed equidistantly around the perimeters of the OEM transmit circle and the OEM receive circle, respectively. We denote by $r_1$ the radius of the OEM transmit circle and the OEM receive circle, $r_2$ the radius of the UCAs [26], $\varphi$ the included angle between the normal line of the OEM transmit circle and the line from the center of the OEM transmit circle to the center of the OEM receive circle, and $\theta$ the included angle between the x-axis and the projection of the line from the center of the OEM transmit circle to the center of the OEM receive circle on the plane spanned by the x-axis and y-axis.

Fig. 1 The OAM-embedded-MIMO communication model
W. Cheng et al.
Fig. 2 The geometry of each array element
Each UCA at the transmitter is equipped with $U$ array elements, indexed from $u = 1$ to $u = U$, which are fed with the same input signal but with a successive delay from array element to array element such that, after a full turn, the phase has been incremented by an integer multiple $l$ of $2\pi$, where $l$ is the OAM mode number. Each UCA at the receiver is equipped with $V$ array elements, indexed from $v = 1$ to $v = V$. The OAM modes from $U/2$ to $U - 1$ are equivalent to the OAM modes from $-U/2$ to $-1$. Thus, to simplify the following equations, the OAM mode is expressed as $0 \le l \le U - 1$. All array elements within one transmit or receive UCA antenna share the same RF chain. Thus, although it is equipped with multiple array elements, one transmit or receive UCA antenna is actually a single antenna [19]. The $N$ UCAs at the transmitter and the $M$ UCAs at the receiver can be treated as $N$ transmit and $M$ receive antennas, forming an $N$-transmit $M$-receive MIMO communication model. All UCAs are embedded in the MIMO system, which constitutes the OEM communication model. The OAM signal can be generated within one antenna that has several array elements but only one RF chain [19]. Thus, OAM communications can be implemented within one array-element-based antenna, where the distances among array elements are not strictly constrained. Each UCA is one antenna in the massive-MIMO system. For each UCA antenna, we design the array and signal feeding to guarantee that the UCA antenna can generate OAM modes and form vortex waves to transmit signals. The geometry of each array element is shown in Fig. 2, where $W_I$ represents the width of the array elements, $L_E$ denotes the length of the array elements, and $L_F$ is the distance of the feed source from the bottom of the array element. The array element, shown in yellow, is placed on the substrate, shown in blue. Each array element is excited by the coaxial feed method.
All patch elements are fed by signals with the same amplitude and different phases. The phase difference between adjacent array elements is $2\pi l/U$, where $l$ is the index of the OAM mode. For simultaneous transmission of multiple OAM modes, the signals are linearly added and then fed to the array elements accordingly. In addition, the $U$ array elements are placed equidistantly around the perimeter of each transmit UCA antenna. We set the distance between the centers of two adjacent array elements to one carrier wavelength. We denote by $W_I$ the width of the array elements, $L_E$ the length of the array elements, and $\varepsilon_r$ the relative permittivity of the substrate. Then, we can calculate the width of the array elements as follows [27–29]:

$$W_I = \frac{\lambda}{2}\left(\frac{\varepsilon_r + 1}{2}\right)^{-\frac{1}{2}}, \tag{1}$$
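where $\lambda$ denotes the carrier wavelength. Then, we can calculate the effective permittivity of the substrate, denoted by $\varepsilon_e$, as follows:

$$\varepsilon_e = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2}\left(1 + 12\frac{T_H}{W_I}\right)^{-\frac{1}{2}}, \tag{2}$$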
where $T_H$ is the thickness of the substrate, which is often set to a fixed value. Then, the guided wavelength in the substrate, denoted by $\lambda_e$, can be obtained from

$$\lambda_e = \frac{\lambda}{\sqrt{\varepsilon_e}} \tag{3}$$
and $L_E$ can be derived as follows:

$$L_E = \frac{\lambda_e}{2} - 2\Delta L_E, \tag{4}$$
where $\Delta L_E$ is the equivalent length of the radiation gap and can be calculated as follows:

$$\Delta L_E = 0.412\,T_H\,\frac{(\varepsilon_e + 0.3)(W_I/T_H + 0.264)}{(\varepsilon_e - 0.258)(W_I/T_H + 0.8)}. \tag{5}$$
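The chain of Eqs. (1)–(5) can be evaluated directly. The following is a minimal sketch, not the authors' design code: the function name `patch_dimensions` and the rounded speed of light `c = 3e8` m/s are assumptions made for illustration, while the inputs (35 GHz carrier, relative permittivity 2.2, substrate thickness 0.245 mm) are the values quoted in the performance evaluation section.

```python
import math


def patch_dimensions(freq_hz, eps_r, t_h, c=3.0e8):
    """Evaluate Eqs. (1)-(5) for one rectangular patch (all lengths in meters).

    `c` is a rounded speed of light; this value is an assumption of the sketch.
    """
    lam = c / freq_hz                                        # carrier wavelength
    w_i = (lam / 2.0) * ((eps_r + 1.0) / 2.0) ** -0.5        # Eq. (1): patch width
    eps_e = ((eps_r + 1.0) / 2.0
             + (eps_r - 1.0) / 2.0 * (1.0 + 12.0 * t_h / w_i) ** -0.5)  # Eq. (2)
    lam_e = lam / math.sqrt(eps_e)                           # Eq. (3): guided wavelength
    ratio = w_i / t_h
    delta_l = (0.412 * t_h * (eps_e + 0.3) * (ratio + 0.264)
               / ((eps_e - 0.258) * (ratio + 0.8)))          # Eq. (5): radiation-gap length
    l_e = lam_e / 2.0 - 2.0 * delta_l                        # Eq. (4): patch length
    return {"lambda": lam, "W_I": w_i, "eps_e": eps_e,
            "lambda_e": lam_e, "delta_L_E": delta_l, "L_E": l_e}


# Inputs taken from the evaluation section: 35 GHz, eps_r = 2.2, T_H = 0.245 mm.
dims = patch_dimensions(35e9, 2.2, 0.245e-3)
for name, value in dims.items():
    print(name, value)
```

With these inputs, Eq. (1) gives a width close to the 3.388 mm used in the evaluations. The other quantities depend on design choices such as tuning in the electromagnetic simulator, so the sketch should be read as a first-pass estimate only.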
The feed source is located at a distance $L_F$ from the bottom of each array element and $W_I/2$ from the left side of the array element. We denote by $Z_w$ the wall impedance of the array element, which can be calculated as follows [27]:

$$Z_w = \frac{1}{0.00836\,\dfrac{W_I}{\lambda} + j\,0.01668\,\dfrac{\Delta L_E}{T_H}\,\dfrac{W_I}{\lambda}\,\varepsilon_e}. \tag{6}$$
We denote by $Z_0$ the characteristic impedance of each array element. Then, the input admittance at the feed source, denoted by $Y_1$, can be obtained as follows:

$$Y_1 = \frac{1}{Z_0}\left(\frac{Z_0\cos\psi L_F + jZ_w\sin\psi L_F}{Z_w\cos\psi L_F + jZ_0\sin\psi L_F} + \frac{Z_0\cos\psi(L_E - L_F) + jZ_w\sin\psi(L_E - L_F)}{Z_w\cos\psi(L_E - L_F) + jZ_0\sin\psi(L_E - L_F)}\right), \tag{7}$$
where $\psi$ is a constant representing the phase of the medium. Then, the input impedance, denoted by $Z_{in}$, for each array element can be obtained as follows:

$$Z_{in} = \frac{1}{Y_1} + j\,\frac{377}{\sqrt{\varepsilon_r}}\tan\left(\frac{2\pi T_H}{\lambda}\right). \tag{8}$$
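As a result, when the input impedance of each array element is equal to 50 Ω, we can obtain $L_F$ as follows:

$$L_F = \frac{L_E}{2}\left(1 - \frac{1}{\sqrt{\xi_{re}}}\right), \tag{9}$$

where

$$\xi_{re} = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2}\left(1 + 12\frac{T_H}{L_E}\right)^{-\frac{1}{2}}. \tag{10}$$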
3 Achieving Maximum Multiplicative Spectrum Efficiency for OEM Communications

In this section, we propose the mode-decomposition and multiplexing-detection schemes as well as the optimal power allocation policy to maximize the multiplicative SE for convergent OEM communications. First, we design the parabolic-antenna-based converging method for the OAM wave to enable efficient use of high-order OAM modes. Second, we propose the mode-decomposition and multiplexing-detection schemes to resolve the transmit signal on each OAM mode of each transmit antenna. Third, we develop the OEM-water-filling power allocation policy to maximize the multiplicative SE for OEM communications. Finally, we answer the question of when the OEM communications should be used.
3.1 Parabolic-Antenna-Based OAM Wave Converging

Since one UCA consisting of $U$ array elements can generate $U$ OAM modes, according to the principle of OAM communications [8, 30, 31], we give the signal at the $u$th ($1 \le u \le U$) array element on the $n$th ($1 \le n \le N$) transmit UCA antenna for the total $U$ OAM modes, denoted by $x_{n,u}$, as follows:

$$x_{n,u} = \sum_{l=0}^{U-1}\frac{1}{\sqrt{U}}\,s_{n,l}\,e^{j\phi_u l} = \sum_{l=0}^{U-1}\frac{1}{\sqrt{U}}\,s_{n,l}\,e^{j\frac{2\pi(u-1)}{U}l}, \tag{11}$$
where $0 \le l \le U - 1$ is the OAM mode number, $\phi_u$ is the azimuthal angle (defined as the angular position on a plane perpendicular to the axis of propagation) corresponding to the $u$th array element, and $s_{n,l}$ is the signal on the $l$th OAM mode of the $n$th transmit UCA antenna. The emitted signal, denoted by $x_{n,l}$, of the $l$th mode from the $n$th transmit UCA antenna can be treated as a continuous signal and can be expressed as follows:

$$x_{n,l} = s_{n,l}\,e^{j\phi l}, \tag{12}$$
where $0 \le \phi < 2\pi$ is the continuous-phase parameter. For the $n$th transmit UCA antenna and the $m$th receive UCA antenna, we can derive the received signal, denoted by $r_{mn,v,l}$, for the $l$th OAM mode on the $v$th receive array element as follows:

$$r_{mn,v,l} = \sum_{u=1}^{U} h_{mn,vu}\,x_{n,l}, \tag{13}$$
where $h_{mn,vu}$ denotes the channel amplitude gain corresponding to the $l$th OAM mode from the $u$th transmit array element to the $v$th receive array element. The expression for $h_{mn,vu}$ is given as follows:

$$h_{mn,vu} = \frac{\beta\lambda\,e^{-j\frac{2\pi}{\lambda}\left|\vec{d}_{mn} - \vec{r}_u + \vec{g}_v\right|}}{4\pi\sqrt{U}\,\left|\vec{d}_{mn} - \vec{r}_u + \vec{g}_v\right|}, \tag{14}$$
where $\beta$ denotes all relevant constants such as attenuation and phase rotation caused by the antennas and their patterns on both sides, $\vec{r}_u$ and $\vec{g}_v$ are the vectors from the center of the transmit UCA antenna to the $u$th transmit array element and from the center of the receive UCA antenna to the $v$th receive array element, respectively, $\vec{d}_{mn}$ is the vector from the center of the $n$th transmit UCA antenna to the center of the $m$th receive UCA antenna, and $|\vec{d}_{mn} - \vec{r}_u + \vec{g}_v|$ represents the distance between the $u$th transmit array element on the $n$th transmit UCA antenna and the $v$th receive array element on the $m$th receive UCA antenna. Because $|\vec{d}_{mn}| \gg |\vec{r}_u|$ and $|\vec{d}_{mn}| \gg |\vec{g}_v|$, we approximate $|\vec{d}_{mn} - \vec{r}_u + \vec{g}_v|$ in the denominator of Eq. (14) by $|\vec{d}_{mn}|$ and $|\vec{d}_{mn} - \vec{r}_u + \vec{g}_v|$ in the numerator of Eq. (14) by $|\vec{d}_{mn}| - \vec{d}_{mn}\cdot\vec{r}_u/|\vec{d}_{mn}|$. We denote $d_{mn} = |\vec{d}_{mn}|$. Thus, we can rewrite $h_{mn,vu}$ as follows:

$$h_{mn,vu} \approx \frac{\beta\lambda}{4\pi\sqrt{U}\,d_{mn}}\exp\left[-j\frac{2\pi}{\lambda}\left(d_{mn} - \frac{\vec{d}_{mn}\cdot\vec{r}_u}{d_{mn}}\right)\right]. \tag{15}$$
Based on Eqs. (13) and (15), we can derive the equivalent channel amplitude gain, denoted by $h_{mn,vu,l}$, for the $l$th OAM mode corresponding to the $u$th transmit array element and the $v$th receive array element as follows:

$$h_{mn,vu,l} = \sum_{u=1}^{U}\frac{1}{\sqrt{U}}\,e^{j\frac{2\pi(u-1)}{U}l}\,h_{mn,vu} \approx \frac{\beta\lambda\,e^{-j\frac{2\pi}{\lambda}d_{mn}}}{4\pi d_{mn}U}\sum_{u=1}^{U} e^{j\frac{2\pi(u-1)}{U}l}\,e^{j\frac{2\pi\,\vec{d}_{mn}\cdot\vec{r}_u}{\lambda d_{mn}}}, \tag{16}$$
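where

$$\vec{d}_{mn} = \left(d_{mn}\sin\varphi\cos\theta,\; d_{mn}\sin\varphi\sin\theta,\; d_{mn}\cos\varphi\right) \tag{17}$$

and

$$\vec{r}_u = \left(r_2\cos\left(\frac{2\pi(u-1)}{U}\right),\; r_2\sin\left(\frac{2\pi(u-1)}{U}\right)\right). \tag{18}$$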
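Thus, $h_{mn,vu,l}$ can be obtained as follows:

$$h_{mn,vu,l} = \frac{\beta\lambda\,e^{-j\frac{2\pi}{\lambda}d_{mn}}}{4\pi\sqrt{U}\,d_{mn}}\sum_{u=1}^{U} e^{j\frac{2\pi(u-1)}{U}l}\,e^{j\frac{2\pi}{\lambda}r_2\cos\left[\frac{2\pi(u-1)}{U}-\theta\right]\sin\varphi}. \tag{19}$$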
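When $U$ is sufficiently large, we denote $\theta' = 2\pi(u-1)/U - \theta$, and we have

$$\frac{e^{-j\theta l}\,j^{l}}{U}\sum_{u=1}^{U} e^{j\frac{2\pi(u-1)}{U}l}\,e^{j\frac{2\pi}{\lambda}r_2\cos\left[\frac{2\pi(u-1)}{U}-\theta\right]\sin\varphi} \approx \frac{j^{l}}{2\pi}\int_{0}^{2\pi} e^{j\theta' l}\,e^{j\frac{2\pi}{\lambda}r_2\sin\varphi\cos\theta'}\,d\theta' = J_l\left(\frac{2\pi}{\lambda}r_2\sin\varphi\right), \tag{20}$$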
where

$$J_l(\alpha) = \frac{j^{l}}{2\pi}\int_{0}^{2\pi} e^{jl\tau}\,e^{j\alpha\cos\tau}\,d\tau \tag{21}$$
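is the $l$-order Bessel function [32]. Then, based on Eq. (20), we can rewrite $h_{mn,vu,l}$ as follows:

$$h_{mn,vu,l} \approx \frac{\beta\lambda\,e^{-j\frac{2\pi}{\lambda}d_{mn}}\sqrt{U}\,e^{j\theta l}\,j^{-l}}{4\pi d_{mn}}\,J_l\left(\frac{2\pi}{\lambda}r_2\sin\varphi\right). \tag{22}$$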
Equation (22) shows that the channel amplitude gains corresponding to the $l$th OAM mode from the $u$th array element on the $n$th transmit UCA antenna to the $v$th array element on the $m$th receive UCA antenna are the same for $1 \le u \le U$ and $1 \le v \le V$. Thus, the UCA-based OAM signal can be treated as being sent from the center of the transmit UCA antenna and received at the center of the receive UCA antenna. Then, we have the channel amplitude gain, denoted by $h'_{mn,l}$, for the $l$th mode corresponding to the $n$th transmit UCA antenna and the $m$th receive UCA antenna as follows:

$$h'_{mn,l} = \frac{\beta\lambda\,e^{-j\frac{2\pi}{\lambda}d_{mn}}\sqrt{U}\,e^{j\theta l}\,j^{-l}}{4\pi d_{mn}}\,J_l\left(\frac{2\pi}{\lambda}r_2\sin\varphi\right). \tag{23}$$
Observing Eq. (23), we find that the transmit UCA antenna makes the OAM signal behave as if it passed through a Bessel-form channel. According to the characteristics of the Bessel function, $h'_{mn,l}$ severely decreases as $l$ increases, resulting in a very small received signal-to-noise ratio (SNR) for high-order OAM communications (those with a relatively large OAM mode number). This is undesirable since we aim to obtain the maximum SE combined from all orthogonal OAM modes. In fact, it has been shown that the electromagnetic wave with OAM is centrally hollow and divergent [33]. Also, as the OAM mode number increases, the corresponding electromagnetic wave becomes even more divergent. Therefore, if the OAM mode number is relatively large, it is impossible to achieve high SE for OAM communications since the received SNR corresponding to the large OAM mode number is very small as compared with PE-wave-based communications. To significantly increase the received SNR for each mode of OAM communications, the electromagnetic waves corresponding to all OAM modes need to be made convergent. Generally, electromagnetic waves can be converged using the backscatter of a parabolic antenna, the refraction of a lens antenna, diffraction-free super-surface material [34], etc. These schemes can be treated as applying a pre-distortion before transmission. Here, we propose the parabolic-antenna-based converging method to achieve high SE for high-order OAM communications. Each UCA needs to be equipped with one parabolic antenna to converge the OAM signal. The center of the UCA is placed at the focus of the parabolic antenna with the array elements facing the parabolic antenna. One example of the joint UCA and parabolic antenna model is shown in Fig. 3, which is a profile of the joint transmit UCA antenna and parabolic antenna model designed in the HFSS software. To design the parabolic antenna [35, 36], we denote by $F$ and $D$ the focal length and the diameter of the aperture, respectively.
Given the gain, denoted by $G$, of the parabolic antenna and the efficiency of the aperture's utilization, denoted by $\eta$, we can derive the diameter of the aperture as follows:

$$D = \frac{\lambda\sqrt{G\eta}}{\pi}. \tag{24}$$
Fig. 3 The designed parabolic antenna for converging OAM waves
Then, we can obtain the focal length as follows:

$$F = \kappa D, \tag{25}$$
where $\kappa$ is the ratio of focal length to aperture diameter of the parabolic antenna. The value of $\kappa$ is often chosen between 0.25 and 0.5. Then, the equation for the parabolic antenna can be derived as follows:

$$Z = \frac{X^2 + Y^2}{4F}, \tag{26}$$
where $X$, $Y$, and $Z$ are the x-axis, y-axis, and z-axis coordinates of the parabolic antenna, respectively. After converging by the parabolic antenna, the impact of the term $J_l(2\pi r_2\sin\varphi/\lambda)$ can be greatly reduced. Then, the equivalent channel amplitude gain, denoted by $h_{mn,l}$, for the $l$th mode corresponding to the $n$th transmit UCA antenna and the $m$th receive UCA antenna can be modeled as follows:

$$h_{mn,l} = A_l\,h'_{mn,l}, \tag{27}$$
where $A_l$ denotes the amplitude gain on the $l$th OAM mode caused by the backscatter of the parabolic antenna. We assume that the parabolic antenna can reduce the term $J_l(2\pi r_2\sin\varphi/\lambda)$ to $\delta_0$, where $\delta_0$ is a constant very close to 1 [27]. Thus, we have $A_l = 1/J_l(2\pi r_2\sin\varphi/\lambda)$.
3.2 Mode-Decomposition and Multiplexing-Detection Schemes

At the receiver, the convergent vortex signal received at the $m$th receive UCA antenna, denoted by $r_{m,v}$, can be obtained as follows:

$$r_{m,v} = \sum_{l=0}^{U-1}\sum_{n=1}^{N} h_{mn,l}\,s_{n,l}. \tag{28}$$
The convergent vortex signal can be spatially sampled at each array element of the receive UCA antenna. Then, after the sampling, the signal received at the $v$th array element of the $m$th receive UCA antenna, denoted by $y_{m,v}$, can be derived as follows:

$$y_{m,v} = \sum_{l=0}^{U-1}\sum_{n=1}^{N} A_l\sum_{u=1}^{U}\frac{1}{\sqrt{U}}\,h_{mn,uv}\,s_{n,l}\,e^{j\frac{2\pi(v-1)}{V}l} + w_{m,v}, \tag{29}$$
where $w_{m,v}$ denotes the received noise at the $v$th array element of the $m$th receive UCA antenna. To obtain the received signal on the $l_0$th ($0 \le l_0 \le U - 1$) mode sent from all transmit UCA antennas, we multiply $y_{m,v}$ by $\exp\{-j[2\pi(v-1)l_0]/V\}$, and we have the received signal, denoted by $y_{m,v,l_0}$, on the $l_0$th mode corresponding to the $v$th array element of the $m$th receive UCA antenna as follows:

$$\begin{aligned} y_{m,v,l_0} &= y_{m,v}\,e^{-j\frac{2\pi(v-1)}{V}l_0} \\ &= \sum_{n=1}^{N}\sum_{l=0,\,l\neq l_0}^{U-1} A_l\sum_{u=1}^{U}\frac{1}{\sqrt{U}}\,h_{mn,uv}\,s_{n,l}\,e^{j\frac{2\pi(v-1)}{V}(l-l_0)} \\ &\quad + \sum_{n=1}^{N} A_{l_0}\sum_{u=1}^{U}\frac{1}{\sqrt{U}}\,h_{mn,uv}\,s_{n,l_0} + w_{m,v}\,e^{-j\frac{2\pi(v-1)}{V}l_0}. \end{aligned} \tag{30}$$
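Observing Eq. (15), we can find that $h_{mn,uv}$ is independent of $v$. Thus, the terms $h_{mn,uv}s_{n,l}/\sqrt{U}$ that multiply $e^{j\frac{2\pi(v-1)}{V}(l-l_0)}$ in Eq. (30) are the same for different $v$, and the terms with $l \neq l_0$ cancel when summed over $v$ because $\sum_{v=1}^{V} e^{j\frac{2\pi(v-1)}{V}(l-l_0)} = 0$ whenever $l - l_0$ is not an integer multiple of $V$. Then, the received signal, denoted by $y_{m,l_0}$, on the $l_0$th mode of the $m$th receive UCA antenna can be derived as follows:

$$y_{m,l_0} = \sum_{v=1}^{V} y_{m,v,l_0} = \sum_{v=1}^{V}\sum_{n=1}^{N} A_{l_0}\sum_{u=1}^{U}\frac{1}{\sqrt{U}}\,h_{mn,uv}\,s_{n,l_0} + \tilde{w}_{m,l_0}, \tag{31}$$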
where $\tilde{w}_{m,l_0}$ denotes the received noise on the $l_0$th mode corresponding to the $m$th receive UCA antenna. Now, we have obtained the estimated decomposed signal, specified in Eq. (31), for all OAM modes ($0 \le l_0 \le U - 1$). It is clear that $y_{m,l_0}$ follows the standard form of the received signal in MIMO communications. We denote by $\boldsymbol{y}_l = [y_{1,l}, y_{2,l}, \cdots, y_{M,l}]^T$ and $\boldsymbol{w}_l = [\tilde{w}_{1,l}, \tilde{w}_{2,l}, \cdots, \tilde{w}_{M,l}]^T$ the received signal and noise, respectively, corresponding to the $l$th mode, where $(\cdot)^T$ represents the transpose operation. Using the zero-forcing detection scheme, the estimate of the transmit signal, denoted by $\hat{\boldsymbol{s}}_l$, can be derived as follows:

$$\hat{\boldsymbol{s}}_l = (\boldsymbol{H}_l^H\boldsymbol{H}_l)^{-1}\boldsymbol{H}_l^H\boldsymbol{y}_l = \boldsymbol{s}_l + (\boldsymbol{H}_l^H\boldsymbol{H}_l)^{-1}\boldsymbol{H}_l^H\boldsymbol{w}_l, \tag{32}$$
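where $(\cdot)^H$ denotes the conjugate transpose operation and

$$\boldsymbol{H}_l = V\begin{bmatrix} h_{11,l} & h_{12,l} & \cdots & h_{1N,l} \\ h_{21,l} & h_{22,l} & \cdots & h_{2N,l} \\ \vdots & \vdots & \ddots & \vdots \\ h_{M1,l} & h_{M2,l} & \cdots & h_{MN,l} \end{bmatrix} \tag{33}$$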
is the amplitude gain matrix of all channels on the $l$th mode. Then, the received SNR, denoted by $\mathrm{SNR}_l$, for the $l$th mode can be derived as follows:

$$\mathrm{SNR}_l = \frac{|\boldsymbol{s}_l|^2}{\sigma_l^2\left|(\boldsymbol{H}_l^H\boldsymbol{H}_l)^{-1}\boldsymbol{H}_l^H\right|^2}, \tag{34}$$

where $\sigma_l^2$ denotes the variance of the received noise corresponding to the $l$th mode.
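The mode-decomposition step of Eqs. (29)–(31) is, in essence, a discrete Fourier transform across the receive array elements. The following sketch illustrates this for a single transmit and receive UCA pair under simplifying assumptions that are not part of the chapter: the channel is noise-free, each mode sees one scalar gain `mode_gains[l]` (a stand-in for $A_l h'_{mn,l}$), and the result is normalized by $1/V$, whereas the chapter keeps the factor $V$ inside $\boldsymbol{H}_l$ in Eq. (33). The function names are hypothetical.

```python
import cmath
import math


def sample_vortex_signal(mode_symbols, mode_gains, num_rx_elements):
    """Noise-free samples y_v = sum_l g_l * s_l * exp(j*2*pi*v*l/V), v = 0..V-1
    (the index v here plays the role of v-1 in Eq. (29))."""
    V = num_rx_elements
    return [
        sum(g * s * cmath.exp(2j * math.pi * v * l / V)
            for l, (s, g) in enumerate(zip(mode_symbols, mode_gains)))
        for v in range(V)
    ]


def decompose_mode(samples, l0):
    """Multiply by exp(-j*2*pi*v*l0/V) and sum over v, as in Eqs. (30)-(31),
    then divide by V (a normalization chosen for this sketch)."""
    V = len(samples)
    return sum(y * cmath.exp(-2j * math.pi * v * l0 / V)
               for v, y in enumerate(samples)) / V


# Hypothetical example: four OAM modes carrying QPSK-like symbols.
symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
gains = [1.0, 0.8, 0.5, 0.3]          # illustrative per-mode gains, not from the chapter
samples = sample_vortex_signal(symbols, gains, num_rx_elements=4)
recovered = [decompose_mode(samples, l0) for l0 in range(4)]
for l0, value in enumerate(recovered):
    print(l0, value)
```

Each recovered value equals the per-mode gain times the transmitted symbol, because the cross-mode terms sum to zero over the receive elements. In the multi-antenna case the per-mode outputs of all $M$ receive UCAs are stacked into $\boldsymbol{y}_l$ and passed to the zero-forcing detector of Eq. (32).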
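3.3 OEM-Water-Filling Power Allocation Policy

For the OEM communications, the instantaneous SE, denoted by $C_{OEM}$, can be derived as follows:

$$C_{OEM} = \sum_{l=0}^{U-1}\log_2\det\left(\boldsymbol{I} + \frac{|\boldsymbol{s}_l|^2}{\sigma_l^2\left|(\boldsymbol{H}_l^H\boldsymbol{H}_l)^{-1}\boldsymbol{H}_l^H\right|^2}\right) = \sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)}\log_2\left(1 + P_{i,l}\gamma_{i,l}\right), \tag{35}$$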
where $\boldsymbol{I}$ is the identity matrix, $\gamma_{i,l} = \overline{P}/[\sigma_l^2|(\boldsymbol{H}_l^H\boldsymbol{H}_l)^{-1}\boldsymbol{H}_l^H|^2]$ represents the received SNR corresponding to matrix $\boldsymbol{H}_l$ with $\overline{P}$ representing the upper bound of the total power, $\mathrm{rank}(\boldsymbol{H}_l)$ denotes the rank of $\boldsymbol{H}_l$, and $P_{i,l}$ denotes the power allocation policy. We aim to maximize the SE for OEM communications. Thus, we formulate the average SE maximization problem, denoted by $\boldsymbol{P1}$, as follows:

$$\boldsymbol{P1}:\ \max\ \mathbb{E}_{\gamma}\left[\sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)}\log_2\left(1 + P_{i,l}\gamma_{i,l}\right)\right] \tag{36}$$

$$\text{s.t.}:\ 1).\ \mathbb{E}_{\gamma}\left[\sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)} P_{i,l}\right] \le \overline{P}; \tag{37}$$

$$2).\ P_{i,l} \ge 0,\ \forall l \in [0, U-1],\ \forall i \in [1, \mathrm{rank}(\boldsymbol{H}_l)], \tag{38}$$
where

$$\boldsymbol{\gamma} = \begin{bmatrix} \gamma_{1,0} & \gamma_{1,1} & \cdots & \gamma_{1,U-1} \\ \gamma_{2,0} & \gamma_{2,1} & \cdots & \gamma_{2,U-1} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{\mathrm{rank}(\boldsymbol{H}_l),0} & \gamma_{\mathrm{rank}(\boldsymbol{H}_l),1} & \cdots & \gamma_{\mathrm{rank}(\boldsymbol{H}_l),U-1} \end{bmatrix} \tag{39}$$

is the instantaneous received SNR for all channels of the OEM communications and $\mathbb{E}_{\gamma}\{\cdot\}$ is the expectation operation with respect to $\boldsymbol{\gamma}$. We assume that the channels corresponding to all modes of all UCAs follow the Rayleigh distribution, where the probability density function (PDF), denoted by $p(\gamma_{i,l})$, is given by $p(\gamma_{i,l}) = \exp(-\gamma_{i,l}/\overline{\gamma}_{i,l})/\overline{\gamma}_{i,l}$ and $\overline{\gamma}_{i,l}$ denotes the average received SNR. It is clear that P1 is a strictly convex optimization problem [37]. To solve P1, we construct the Lagrangian function for P1, denoted by $J$, as follows:
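
$$J = \mathbb{E}_{\gamma}\left[\sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)}\log_2\left(1 + P_{i,l}\gamma_{i,l}\right)\right] + \sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)}\sum_{l=0}^{U-1}\epsilon_{i,l}P_{i,l} - \mu\left(\mathbb{E}_{\gamma}\left[\sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)} P_{i,l}\right] - \overline{P}\right), \tag{40}$$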
where $\mu \ge 0$ and $\epsilon_{i,l} \ge 0$ ($1 \le i \le \mathrm{rank}(\boldsymbol{H}_l)$, $0 \le l \le U - 1$) are the Lagrangian multipliers associated with the constraints specified by Eqs. (37) and (38), respectively. Taking the derivative of $J$ with respect to $P_{i,l}$ and setting the derivative equal to zero, we can obtain

$$\frac{\partial J}{\partial P_{i,l}} = \frac{1}{\log 2}\,\frac{\gamma_{i,l}}{1 + P_{i,l}\gamma_{i,l}}\,p_r(\gamma) - \mu\,p_r(\gamma) + \epsilon_{i,l} = 0, \tag{41}$$
where $p_r(\gamma)$ is the probability density function corresponding to the channel. According to the principle of complementary slackness, we have $\epsilon_{i,l}P_{i,l} = 0$ for $\forall l \in [0, U-1]$ and $\forall i \in [1, \mathrm{rank}(\boldsymbol{H}_l)]$. Correspondingly, we consider two different cases as follows:

Case A: The inequality $P_{i,l} > 0$ holds for $\forall l \in [0, U-1]$ and $\forall i \in [1, \mathrm{rank}(\boldsymbol{H}_l)]$. In this case, all channels corresponding to all OAM modes of all transmit UCA antennas are assigned non-zero power for data transmission. Thus, based on complementary slackness, we have $\epsilon_{i,l} = 0$ for $\forall l \in [0, U-1]$ and $\forall i \in [1, \mathrm{rank}(\boldsymbol{H}_l)]$. Then, Eq. (41) can be reduced to

$$\frac{1}{\log 2}\,\frac{\gamma_{i,l}}{1 + P_{i,l}\gamma_{i,l}} - \mu^* = 0, \tag{42}$$
where $\mu^*$ is the optimal Lagrangian multiplier corresponding to $\mu$. Solving Eq. (42), we can obtain the optimal power allocation policy for Case A as follows:

$$P_{i,l} = \frac{1}{\mu^*\log 2} - \frac{1}{\gamma_{i,l}},\quad \text{for } 1 \le i \le \mathrm{rank}(\boldsymbol{H}_l) \text{ and } 0 \le l \le U - 1, \tag{43}$$
where the optimal value for $\mu^*$ can be numerically obtained by substituting $P_{i,l}$ into

$$\mathbb{E}_{\gamma}\left[\sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)} P_{i,l}\right] = \overline{P}. \tag{44}$$
We define

$$\mathcal{N}_1 = \left\{(i,l)\ \middle|\ 1 \le i \le \mathrm{rank}(\boldsymbol{H}_l),\ 0 \le l \le U - 1\right\}, \tag{45}$$
and we also define $\mathcal{N}_2$ as the index set that satisfies the strict inequality as follows:

$$\mathcal{N}_2 \triangleq \left\{(i,l) \in \mathcal{N}_1\ \middle|\ \frac{1}{\mu^*\log 2} - \frac{1}{\gamma_{i,l}} > 0\right\}. \tag{46}$$

Then, Eq. (43) is the optimal solution only if $\mathcal{N}_2 = \mathcal{N}_1$. Otherwise, if $\mathcal{N}_2 \subset \mathcal{N}_1$, we need to consider the following case.

Case B: There exist some $P_{i,l}$ such that $P_{i,l} = 0$. If $\mathcal{N}_2 \subset \mathcal{N}_1$, there certainly exist some $P_{i,l}$ such that $P_{i,l} = 0$. This means that some OAM modes (and possibly some transmit UCAs) are not assigned power and are not used for data transmission. We give the following lemma.

Lemma 1 If $(i,l) \notin \mathcal{N}_2$, then $P_{i,l} = 0$.
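Proof We need to show that for any non-empty subset $\mathcal{C} \subseteq \overline{\mathcal{N}}_2$, where $\overline{\mathcal{N}}_2$ denotes the complement of $\mathcal{N}_2$ in $\mathcal{N}_1$, there is no power allocation policy $P_{i,l}$ such that $P_{i,l} > 0$ for all $(i,l) \in (\mathcal{C} \cup \mathcal{N}_2)$. If $\mathcal{C} = \overline{\mathcal{N}}_2$, we already know that there is no such power allocation policy. Otherwise, if $\mathcal{C} \subset \overline{\mathcal{N}}_2$, we suppose that there exists such a power allocation policy so that it can be expressed as

$$P_{i,l} = \begin{cases} \dfrac{1}{\mu^*\log 2} - \dfrac{1}{\gamma_{i,l}}, & (i,l) \in (\mathcal{C} \cup \mathcal{N}_2); \\ 0, & \text{otherwise.} \end{cases} \tag{47}$$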
According to Eq. (47), we can select an element $(i', l') \in \mathcal{C}$, which satisfies $P_{i',l'} > 0$. Thus, we can obtain

$$\gamma_{i',l'} > \mu^*\log 2. \tag{48}$$

On the other hand, according to the definition of $\mathcal{N}_2$ specified in Eq. (46) and the fact that $(i', l')$ lies outside $\mathcal{N}_2$, we know

$$\gamma_{i',l'} < \mu^*\log 2, \tag{49}$$
which contradicts Eq. (48). Therefore, such a power allocation policy does not exist and Lemma 1 follows. □

Following the same procedure as that used in Case A, if the strict inequality $P_{i,l} > 0$ holds for all $(i,l) \in \mathcal{N}_2$, we can obtain the optimal power allocation policy as follows:

$$P_{i,l} = \begin{cases} \dfrac{1}{\mu^*\log 2} - \dfrac{1}{\gamma_{i,l}}, & (i,l) \in \mathcal{N}_2; \\ 0, & \text{otherwise.} \end{cases} \tag{50}$$
Otherwise, if not all $(i,l) \in \mathcal{N}_2$ satisfy the strict inequality $P_{i,l} > 0$, we need to further divide $\mathcal{N}_2$ and repeat the procedure. In summary, the optimal power allocation policy, called the OEM-water-filling power allocation policy, is given by Algorithm 1. To show the execution procedure of our developed OEM-water-filling power allocation policy, we demonstrate a particular case with $i = 1$ and $l \in \{0, 1\}$ for OEM communications. Using Algorithm 1, we can see that the optimal power allocation policy partitions the SNR plane $(\gamma_{1,0}, \gamma_{1,1})$ into four exclusive regions by the lines shown in Fig. 4. If $(\gamma_{1,0}, \gamma_{1,1})$ falls into region $R_1$, both subchannels corresponding to $l = 0$ and $l = 1$ will be assigned power for data transmission. The boundaries of region $R_1$ are determined by $\gamma_{1,0} = \mu^*\log 2$ and $\gamma_{1,1} = \mu^*\log 2$, which are obtained by solving the boundary condition $\mathcal{N}_2 = \mathcal{N}_1$. On the other hand, if $(\gamma_{1,0}, \gamma_{1,1})$ falls into either $R_2$ or $R_3$, then only one of the subchannels will be
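Algorithm 1 The OEM-water-filling power allocation policy
1) Initialization:
   a) Obtain $\mathcal{N}_2$ by Eq. (46).
   b) $k = 2$.
2) While ($\mathcal{N}_k \neq \mathcal{N}_{k-1}$):
   a) $\mathcal{N}_{k+1} = \left\{(i,l) \in \mathcal{N}_k\ \middle|\ \frac{1}{\mu^*\log 2} - \frac{1}{\gamma_{i,l}} > 0\right\}$.
   b) $k = k + 1$.
3) Obtain the optimal power allocation policy:
   a) Denote $\mathcal{N}^* = \mathcal{N}_k$.
   b) $P_{i,l} = \frac{1}{\mu^*\log 2} - \frac{1}{\gamma_{i,l}}$ if $(i,l) \in \mathcal{N}^*$; $P_{i,l} = 0$ otherwise.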
Fig. 4 The regions for the OEM-water-filling power allocation policy corresponding to the two-OAM-mode case ($l = 0$ and $l = 1$)
assigned with power. Otherwise, if $(\gamma_{1,0}, \gamma_{1,1})$ belongs to region $R_4$, no power is allocated to any subchannel and the system will be in an outage state. Substituting the OEM-water-filling power allocation policy into Eq. (35), we can obtain the maximum average SE, denoted by $C^*_{OEM}$, for OEM communications as follows:

$$C^*_{OEM} = \mathbb{E}_{\gamma}\left[\sum_{l=0}^{U-1}\sum_{i=1}^{\mathrm{rank}(\boldsymbol{H}_l)}\log_2\left(\frac{\gamma_{i,l}}{\mu^*\log 2}\right)\right]. \tag{51}$$
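The iterative set-shrinking idea behind Algorithm 1 can be illustrated with a simplified, per-realization variant. The sketch below is not the chapter's algorithm: it assumes the total power constraint is applied to one channel realization (rather than as an average over fading as in Eq. (37)), it recomputes the water level at every pass, and it flattens the index pairs $(i,l)$ into a single list of SNRs. The function names are hypothetical, and `level` plays the role of $1/(\mu^*\log 2)$.

```python
import math


def oem_water_filling(gammas, total_power):
    """Water-filling over subchannels with SNRs `gammas` (one entry per (i, l) pair).

    Returns (powers, level). Subchannels whose tentative power is not strictly
    positive are dropped and the level is recomputed until the active set is stable.
    """
    active = [i for i, g in enumerate(gammas) if g > 0.0]
    level = 0.0
    while active:
        # Water level such that the tentative powers on the active set sum to total_power.
        level = (total_power + sum(1.0 / gammas[i] for i in active)) / len(active)
        kept = [i for i in active if level - 1.0 / gammas[i] > 0.0]
        if len(kept) == len(active):
            break
        active = kept
    powers = [0.0] * len(gammas)
    for i in active:
        powers[i] = level - 1.0 / gammas[i]
    return powers, level


def spectral_efficiency(gammas, powers):
    """Sum of log2(1 + P * gamma) over all subchannels, as in Eq. (35)."""
    return sum(math.log2(1.0 + p * g) for p, g in zip(powers, gammas))


# Hypothetical SNRs for three subchannels and a unit power budget.
example_gammas = [10.0, 1.0, 0.01]
example_powers, example_level = oem_water_filling(example_gammas, 1.0)
print(example_powers, example_level)
print(spectral_efficiency(example_gammas, example_powers))
```

In this example the weakest subchannel is dropped in the first pass and the remaining two share the power budget. On the active subchannels $1 + P\gamma$ equals the water level times $\gamma$, which matches the form of the logarithm in Eq. (51).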
To compare the SE of OEM communications with that of the traditional massive-MIMO communications, we also give the maximum average SE, denoted by $C^*_{MIMO}$, for the traditional massive-MIMO communications as follows:

$$C^*_{MIMO} = \sum_{i=1}^{\mathrm{rank}(\widetilde{\boldsymbol{H}})}\log_2\left(1 + \widetilde{P}_i\,\widetilde{\gamma}_i\right), \tag{52}$$

where $\mathrm{rank}(\widetilde{\boldsymbol{H}})$ denotes the rank corresponding to the MIMO communication channel matrix $\widetilde{\boldsymbol{H}}$, $\widetilde{P}_i$ is the optimal power allocation [38] for MIMO communications, and $\widetilde{\gamma}_i$ is the received SNR corresponding to matrix $\widetilde{\boldsymbol{H}}$. For OEM communications, each UCA consists of $U$ array elements at the transmitter and $V$ array elements at the receiver, respectively. However, all array elements within one transmit or receive UCA antenna share the same RF chain. Thus, each UCA antenna is actually a single antenna. Based on Eqs. (51) and (52), we can calculate the average SEs of OEM and traditional massive-MIMO communications under the same number of transmit and receive antennas. When the distance between neighboring antennas in a UCA is less than $\lambda/2$ for massive-MIMO communications, mutual coupling between the two neighboring antennas arises, thus decreasing the SE of the massive-MIMO system. However, when the distance between neighboring array elements in a UCA is less than $\lambda/2$ for OEM communications, the SE can still be calculated by Eq. (51). Thus, OEM can be used when the distance between neighboring antennas in a UCA is less than $\lambda/2$.
3.4 When Do We Use the OEM Communications?

The traditional massive-MIMO requires that the distance between two adjacent antennas be no less than half of the carrier wavelength [39, 40], which limits the placement of multiple antennas in MIMO systems. However, there is no minimum distance restriction on OAM vortex communications, thus opening a new way to achieve very high SE.
As for our proposed OEM model, if the distance between two adjacent UCA array elements is less than half of the carrier wavelength, the UCA-based OAM vortex communications can be embedded into the massive-MIMO system, forming the OEM communication to achieve larger SE than the traditional massive-MIMO system. We denote by $d_a(r_1, N)$ and $d_e(r_2, U)$ the distances between the centers of two adjacent UCAs and between the centers of two adjacent array elements, respectively. Then, we have

$$\begin{cases} d_a(r_1, N) = 2r_1\sin\left(\dfrac{\pi}{N}\right); \\ d_e(r_2, U) = 2r_2\sin\left(\dfrac{\pi}{U}\right). \end{cases} \tag{53}$$
Based on Eq. (53), we can analyze two scenarios as follows:

Scenario I: $d_a(r_1, N) > \lambda/2$ and $d_e(r_2, U) \le \lambda/2$. Under this scenario, the OAM signal can be generated by every UCA antenna. However, multiple antennas cannot be placed at the position of a UCA due to the half-wavelength distance restriction. The OEM communications can significantly increase the SE as compared with the traditional massive-MIMO communications. For fixed $r_1$, $r_2$, $U$, and $N$, we can obtain the wavelength region for applying the OEM communications as follows:

$$4r_2\sin\left(\frac{\pi}{U}\right) \le \lambda < 4r_1\sin\left(\frac{\pi}{N}\right). \tag{54}$$

Scenario II: $d_a(r_1, N) > \lambda/2$ and $d_e(r_2, U) > \lambda/2$. In this case, we do not need to use OEM communications since the traditional massive-MIMO has achieved the maximum SE.
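The two scenarios can be checked mechanically from Eqs. (53) and (54). The sketch below is only an illustration: the function names and the example radii are hypothetical, and the 35 GHz carrier is borrowed from the evaluation section with the speed of light rounded to $3\times10^8$ m/s.

```python
import math


def adjacent_uca_distance(r1, num_ucas):
    """d_a(r1, N) = 2 * r1 * sin(pi / N), first line of Eq. (53)."""
    return 2.0 * r1 * math.sin(math.pi / num_ucas)


def adjacent_element_distance(r2, num_elements):
    """d_e(r2, U) = 2 * r2 * sin(pi / U), second line of Eq. (53)."""
    return 2.0 * r2 * math.sin(math.pi / num_elements)


def oem_wavelength_window(r1, num_ucas, r2, num_elements):
    """Lower (inclusive) and upper (exclusive) wavelength bounds of Eq. (54)."""
    return (4.0 * r2 * math.sin(math.pi / num_elements),
            4.0 * r1 * math.sin(math.pi / num_ucas))


def classify_scenario(wavelength, r1, num_ucas, r2, num_elements):
    """Return "I", "II", or "neither" according to the two scenarios in the text."""
    d_a = adjacent_uca_distance(r1, num_ucas)
    d_e = adjacent_element_distance(r2, num_elements)
    half = wavelength / 2.0
    if d_a > half and d_e <= half:
        return "I"
    if d_a > half and d_e > half:
        return "II"
    return "neither"


# Hypothetical geometry: 8 UCAs on a 10 cm circle, 16 elements per UCA.
wavelength = 3.0e8 / 35e9
print(classify_scenario(wavelength, 0.1, 8, 0.005, 16))   # small UCA radius
print(classify_scenario(wavelength, 0.1, 8, 0.02, 16))    # larger UCA radius
print(oem_wavelength_window(0.1, 8, 0.005, 16))
```

With the smaller UCA radius the element spacing falls below half a wavelength and the geometry lands in Scenario I. Enlarging the UCA radius moves it to Scenario II, where the text argues that OEM is not needed.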
4 Performance Evaluations

In this section, we evaluate the performance of our designed parabolic antenna converging and developed OEM-water-filling power allocation policy for achieving the maximum multiplicative SE in OEM communications. Throughout our evaluations, we set the carrier frequency to 35 GHz and the angle $\varphi$ to $30^\circ$. We also set $L_E = 2.947$ mm, $L_F = 1.912$ mm, and $W_I = 3.388$ mm. We use HFSS to evaluate the converging and MATLAB to evaluate the multiplicative SE. The circular array consists of sixteen identical rectangular patches (array elements), which are excited by the coaxial feed method [27]. The UCA is designed using an F4B substrate with a relative permittivity of 2.2 and a thickness of 0.245 mm [27]. We also set $G = 36$ dB, $\eta = 50\%$, and $\kappa = 0.4$. Using HFSS, we evaluate the design of the UCA antenna and the parabolic antenna. The results obtained verify that the OAM signals are divergent when
transmitted by the UCA antenna and convergent after using the parabolic antenna. The MATLAB results show that OEM communications can obtain the multiplicative SE. Figure 5 shows the directional diagrams of different OAM modes ($l = 0, 1, 2, 3$) for non-convergent and convergent OEM waves with $N = 1$ and $M = 1$, where we set $U = V = 16$. As expected, the maximum gain of the non-convergent OEM wave decreases as the OAM mode number increases. The array mainly radiates along the z-axis direction without a central hollow when the mode number is $l = 0$. This is because the OAM wave with $l = 0$ is the traditional PE wave. The central hollow becomes larger as the order of OAM increases. The radius of the central hollow is significantly reduced for the convergent OAM waves, and the radiated power concentrates on the main lobe with little power allocated to the side lobes after converging. Figure 6 shows the unconverged and converged OAM beams for OAM modes 0, 1, 2, and 3. The OAM beams are reflected by the parabolic antenna. Figure 7 depicts our developed OEM-water-filling power allocation policy versus the instantaneous received SNR in OEM communications, where we set $N = M = 8$ ($U = V = 2$), $N = M = 16$ ($U = V = 4$), and $N = M = 32$ ($U = V = 8$), respectively. The average transmit power is given by $\overline{P} = 0.2$ W. As shown in Fig. 7, before converging, the allocated power severely decreases as the order of OAM increases. Little power is allocated to the high-order OAM modes in OEM communications. This is because the average received SNRs on high-order OAM modes are very small, which is consistent with the property of Bessel functions. After converging, almost all OAM modes can be allocated part of the power. Thus, for $U$ array elements, the generated $U$ OAM modes can be used to increase the SE for OEM communications.
Figure 8 shows our developed OEM-water-filling power allocation versus the instantaneous received SNR, where we set $N = M = 32$ ($U = V = 4$ and $U = V = 2$, respectively) and $N = M = 16$ ($U = V = 4$ and $U = V = 2$, respectively) for OEM communications. As shown in Fig. 8, the power allocated by the OEM-water-filling policy increases as the instantaneous received SNR increases until it reaches a maximum value and then remains constant. The allocated power for $U = V = 4$ OEM communications is less than that for the corresponding $U = V = 2$ OEM communications. This is because the power is generally divided evenly among the different OAM modes. The allocated power for $N = M = 32$ ($U = V = 2$) OEM communications and for $N = M = 16$ ($U = V = 4$) OEM communications is the same. This is because both of them have the same number of orthogonal channels. Figure 9 depicts the average SEs of non-convergent OEM communications and traditional massive-MIMO communications, where we set $U = V = 2$ and $M = N = 8$, 16, 32, and 64, respectively. Observing the curves in Fig. 9, all SEs for non-convergent OEM and traditional massive-MIMO increase as the average SNR increases. More importantly, when equipped with the same numbers of transmit UCAs and receive UCAs, the non-convergent OEM and the traditional massive-MIMO obtain the same SE, as verified for the $M = N = 8$, 16, 32, and 64 cases in Fig. 9. The reason why the SEs of non-convergent OEM and traditional massive-MIMO are the same is that much power is allocated to the low-order OAM mode, while the high-order
Fig. 5 The directional diagram for non-convergent and convergent OEM waves with $N = 1$ and $M = 1$. (a) Mode 0 (non-convergent OEM, $l = 0$). (b) Mode 0 (convergent OEM, $l = 0$). (c) Mode 1 (non-convergent OEM, $l = 1$). (d) Mode 1 (convergent OEM, $l = 1$). (e) Mode 2 (non-convergent OEM, $l = 2$). (f) Mode 2 (convergent OEM, $l = 2$). (g) Mode 3 (non-convergent OEM, $l = 3$). (h) Mode 3 (convergent OEM, $l = 3$)
Orbital-Angular-Momentum-Embedded Massive-MIMO: Achieving. . .
Fig. 6 The E field of unconverged and converged OAM beams observed from the vertical direction. (a) The E field of unconverged OAM beams (vertical direction). (b) The E field of converged OAM beams (vertical direction)
OAM mode is allocated almost zero power. Therefore, it is necessary to converge the OEM waves so that the high-order OAM modes can be used efficiently to achieve the maximum SE for OEM communications.

Figure 10 compares the average SEs of convergent OEM and traditional massive-MIMO communications, where we set N = M = 32 (U = V = 8 and U = V = 4, respectively) for OEM communication, N = M = 16 (U = V = 4) for OEM communication, and N = M = 32 for massive-MIMO communication. As shown in Fig. 10, the OEM communications significantly increase the average SE compared with the traditional massive-MIMO communications. The N = M = 16 (U = V = 4), N = M = 32 (U = V = 4), and N = M = 32 (U = V = 8) OEM communications achieve 2, 4, and 8 times the SE, respectively, of the N = M = 32 massive-MIMO communication. This is because the OEM communication obtains not only the SE of traditional massive-MIMO communication but also the SE of embedded OAM communication. The joint OAM and MIMO communications offer a multiplicative SE, that is, the SE of traditional massive-MIMO communication times the SE of embedded OAM communication.
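The 2, 4, and 8 times figures can be checked with simple arithmetic, under an assumption inferred from the text rather than stated in it: an OEM link with N = M and U = V offers N · U orthogonal channels, and the SE scales roughly with the number of usable channels. The helper name `orthogonal_channels` is hypothetical.

```python
def orthogonal_channels(n_antennas, modes_per_uca=1):
    # Assumption inferred from the text: N * U orthogonal channels for OEM;
    # traditional massive-MIMO corresponds to modes_per_uca = 1.
    return n_antennas * modes_per_uca


baseline = orthogonal_channels(32)  # N = M = 32 massive-MIMO
ratios = {
    (n, u): orthogonal_channels(n, u) / baseline
    for n, u in [(16, 4), (32, 4), (32, 8)]
}
# The two Fig. 8 configurations said to share the same number of channels.
same_count = orthogonal_channels(32, 2) == orthogonal_channels(16, 4)
```

Under this counting, the three OEM configurations offer 2, 4, and 8 times as many channels as the baseline, which matches the reported SE multiples.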
5 Conclusions
We proposed the framework for OEM communications to obtain multiplicative spectrum efficiency gains for joint OAM and massive-MIMO mmWave wireless communications. We designed the parabolic antenna to converge the OAM waves.
Fig. 7 The three cases of instantaneous OEM-water-filling power allocations for non-convergent and convergent OEM communications. (a) Non-convergent OEM, N = M = 8 (U = V = 2). (b) Convergent OEM, N = M = 8 (U = V = 2). (c) Non-convergent OEM, N = M = 16 (U = V = 4). (d) Convergent OEM, N = M = 16 (U = V = 4). (e) Non-convergent OEM, N = M = 32 (U = V = 8). (f) Convergent OEM, N = M = 32 (U = V = 8)
Then, we developed the joint mode-decomposition and multiplexing-detection schemes to decompose and detect the transmit signal on each OAM mode of each transmit UCA antenna. We also developed the OEM-water-filling power allocation policy to achieve the maximum multiplicative spectrum efficiency for OEM communications. The simulation results validated the parabolic-antenna-based converging, the joint mode-decomposition and multiplexing-detection schemes, and the OEM-water-filling policy. The OEM communications can obtain the multiplicative spectrum efficiency gain for joint OAM and massive-MIMO communications, thus achieving much larger spectrum efficiency than the traditional massive-MIMO mmWave communications.
Fig. 8 The OEM-water-filling power allocation versus instantaneous received SNR
Fig. 9 The average spectrum efficiencies of non-convergent OEM and traditional massive-MIMO
Fig. 10 The average spectrum efficiency versus the average SNR
Integrated Sensing and Communications for Emerging Applications in 6G Wireless Networks
Zhen Du and Fan Liu
1 Backgrounds and Motivations
Radio detection and ranging (radar) and wireless communications, two of the most popular applications of modern electromagnetism, are evolving toward dual-functional wireless networks [1]. For many decades, however, they were developed separately. First, the design criteria of radar and communications are distinctly different. The basic functionalities of radar include target detection, parameter estimation, trajectory tracking, and imaging. In contrast, wireless communications aim to transfer useful information effectively between two or more nodes and to improve the channel capacity by exploiting modulation/demodulation and channel coding/decoding technologies. Second, their operating frequency bands generally did not overlap, so they caused little mutual interference. The remainder of this section gives brief historical views of radar and communications and then elaborates on the general motivations of integrated sensing and communications (ISAC) in next-generation (6G) wireless networks.
Z. Du
Nanjing University of Information Science and Technology, Nanjing, China
F. Liu
Southern University of Science and Technology, Shenzhen, China
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_14
1.1 Evolution Path of Radar
The term “radar” first appeared in the US army in 1939. Early radar systems relied on mechanical scanning to search for and track military targets. To overcome the primary drawbacks of mechanical scanning, such as poor flexibility and weak anti-interference capability, the phased-array radar was invented in the 1950s; it can be deployed on ground-based, sea-based, airborne, and space-based platforms. The phased-array radar can generate multiple beams simultaneously in order to track multiple targets [2]. Afterward, the multi-input multi-output (MIMO) array was introduced into the radar community [3]. Compared with the phased-array radar, the MIMO radar has more degrees of freedom (DoFs) and better sensing performance, at the cost of higher hardware complexity. In view of this, the concept of phased-MIMO radar was further proposed to achieve a tradeoff between the conventional MIMO radar and the phased-array radar [4–6], wherein a small number of radio frequency (RF) chains are connected to multiple antennas through a well-designed low-cost phase-shifter network.
1.2 Evolution Path of Communications
Inspired by the multi-antenna radar system, the first patent on MIMO communications was granted in 1994 [7], which directly promoted the development of 3G, 4G, and 5G wireless networks [8, 9]. In 2010, the massive MIMO system was first proposed; it has become the core technology of 5G wireless networks and is also expected to be a key enabling technology for the upcoming 6G era [10]. Moreover, the feasibility of using millimeter wave (mmWave) bands for communications has been validated [11]. In contrast to the congested spectrum in the sub-6 GHz band, the mmWave band provides a remarkably larger bandwidth for wireless communications, thereby significantly improving the overall channel capacity. In addition, antenna arrays operating in mmWave bands can be made smaller, which helps miniaturize the hardware platform. On the other hand, the mmWave signal suffers larger path loss than the conventional sub-6 GHz signal because of its higher frequency. Fortunately, combining mmWave with a massive MIMO array compensates for the severe path loss through the significantly increased array gain. On top of the above, the hybrid analog-digital (HAD) [12–14] structure was proposed to reduce the high cost and hardware complexity of the massive MIMO array, wherein fewer digital RF chains are connected to massive antennas through a well-designed analog phase-shifter network. In essence, the HAD communication structure is the same as that of the phased-MIMO radar, which further lays a solid foundation for ISAC.
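The path-loss and array-gain argument above can be made concrete with a back-of-the-envelope sketch. It assumes the idealized Friis free-space model with isotropic elements and perfect coherent combining; the 3.5 GHz and 28 GHz carriers, the 100 m distance, and the function names are illustrative choices, not values from the chapter.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def fspl_db(distance_m, freq_hz):
    """Free-space path loss (Friis, isotropic antennas) in dB."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / C)


def array_gain_db(n_antennas):
    """Ideal coherent array gain of an n-element array in dB."""
    return 10 * math.log10(n_antennas)


# Extra loss of a 28 GHz link relative to a 3.5 GHz link over the same 100 m.
extra_loss_db = fspl_db(100, 28e9) - fspl_db(100, 3.5e9)
# Ideal gain of a 64-element array, for comparison with the extra loss.
gain_64_db = array_gain_db(64)
```

Under these assumptions, the eightfold increase in carrier frequency costs about 18 dB, which the ideal gain of a 64-element array offsets.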
1.3 General Motivations of ISAC
Current commercial communication systems principally rely on the sub-6 GHz bands, with signaling bandwidths of 20 MHz to 100 MHz. This significantly restricts the potential of using wireless signaling for sensing, because the achievable range resolution is only on the order of meters. Meanwhile, the sub-6 GHz bands, together with conventional radar bands such as the L band (1–2 GHz), S band (2–4 GHz), and C band (4–8 GHz), form a congested spectrum shared among military/civilian radars and cellular/WLAN networks, as summarized in Table 1. Moreover, both radar and communications are developing toward miniaturized antennas and large bandwidths, which urges the exploration of the mmWave band (30–300 GHz) and even the Terahertz band (0.1–10 THz). Compared with the conventional sub-6 GHz bands, higher-frequency bands help achieve more accurate localization, higher data rates, lower latency, and smaller equipment volume. As an example, Fig. 1 shows an ISAC-THz prototype realized by Huawei Technologies [15], which is able to support millimeter-level imaging and a 240 Gbps data rate.

Table 1 Typical frequency bands and applications of radar and communications
Frequency bands | Radar applications | Communication applications
L band (1–2 GHz) | Air traffic control | FDD-LTE cellular network
S band (2–4 GHz) | Airborne early warning radar | TDD-LTE cellular network
C band (4–8 GHz) | Weather radar | WLAN networks
MmWave band (30–300 GHz) | Autonomous driving | Cellular vehicle-to-everything
Terahertz band (0.1–10 THz) | Through-wall imaging | Terahertz communications

Fig. 1 Huawei ISAC-THz prototype, which is based on 140 GHz carrier frequency, 8 GHz bandwidth, and MIMO array of 4 transmit antennas and 16 receive antennas [15]

In view of the intertwined evolution paths of sensing and communications, ISAC has attracted considerable attention in both academic and industrial communities on account of the technological trends and commercial requirements of future 6G wireless networks, which are summarized in the following.
• Technological trends: Radio sensing and communications are evolving toward miniaturization thanks to larger antenna arrays operated in higher-frequency bands (e.g., mmWave/THz bands). In such frequency bands, existing radar and communication systems have increasingly similar hardware architectures, channel characteristics, and signal processing methodologies. Therefore, merging sensing and communications in a single system is expected to considerably reduce the equipment size, spectral and power consumption, and hardware and signal processing costs. Beyond a simple combination of functionalities, interrelated sensing and communications may also be co-designed to assist each other in a collaborative manner. In other words, sensing can be aided by communications, and communications can also be assisted by sensing.
• Commercial requirements: Current 5G networks are incapable of supporting the active sensing mode, which greatly limits numerous emerging applications that require contactless sensing. Typical examples include vehicle-to-infrastructure, perceptive mobile network, and WLAN sensing, which will be discussed in Sect. 3. To that end, future 6G wireless networks need to jointly design sensing and communication functionalities in a single system. In addition, ubiquitous sensing may become a basic service in future 6G wireless networks, providing localization, mapping, and imaging in many environment-aware scenarios. Finally, merging sensing and communications naturally avoids adverse competition for spectral bands, which directly generates significant economic benefits.
Given the above two perspectives, ISAC has been envisioned as an important enabler in the next-generation wireless networks, which bridges the physical and digital worlds.
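The bandwidth argument of this subsection can be quantified with the classical radar range-resolution relation, ΔR = c / (2B), a standard textbook formula that the chapter itself does not write out. The sketch below evaluates it for the sub-6 GHz bandwidths mentioned above and for the 8 GHz bandwidth quoted in the Fig. 1 caption; the function name is hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def range_resolution_m(bandwidth_hz):
    """Classical radar range resolution c / (2B), in meters."""
    return C / (2.0 * bandwidth_hz)


resolutions = {
    "20 MHz (sub-6 GHz)": range_resolution_m(20e6),    # about 7.5 m
    "100 MHz (sub-6 GHz)": range_resolution_m(100e6),  # about 1.5 m
    "8 GHz (THz prototype)": range_resolution_m(8e9),  # about 1.9 cm
}
```

Sub-6 GHz bandwidths therefore give meter-scale range resolution, whereas an 8 GHz bandwidth gives centimeter-scale range resolution. Imaging resolution depends on further factors, such as the aperture, that this formula does not capture.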
In this chapter, we divide the fundamental structure of ISAC into the four layers shown in Fig. 2: spectral coexistence at the bottom, co-located hardware and joint signaling and processing in the middle, and mutual assistance between sensing and communications at the top. Based on such a pyramid structure, two potential benefits of ISAC can be identified [1, 16], namely, (i) the integration gain, attained by sharing radio resources between dual-functional sensing and communications to reduce the duplication of transmissions, devices, and infrastructure, and (ii) the coordination gain, attained from the mutual assistance between sensing and communications. In the remainder of this chapter, we introduce the fundamental theory of ISAC waveform design and emerging applications of ISAC.

Fig. 2 The pyramid structure of ISAC. Level 1: Spectral Coexistence (Radar & Communications Coexistence). Level 2: Co-located Hardware (Multi-functional RF Front-End). Level 3: Joint Signaling and Processing (Joint Waveform Design & Signal Processing). Level 4: S&C Mutual Assistance (Communications-Assisted Sensing & Sensing-Assisted Communications)
2 ISAC Waveform Design
2.1 Non-overlapped Resource Allocation
ISAC can be implemented by scheduling sensing and communication functionalities in orthogonal wireless resources (e.g., time, frequency, spatial, and coding resources), yielding four categories of resource allocation schemes, i.e., time-division ISAC [17, 18], frequency-division ISAC [19], spatial-division ISAC [20, 21], and code-division ISAC [22, 23]. For example, a time-division ISAC waveform resource allocation is shown in Fig. 3, where the chirp signal and the OFDM signal are transmitted in turn to implement sensing and communications, respectively. With non-overlapped resource allocation, zero mutual interference between sensing and communications is achieved at the price of poor spectral and energy efficiencies. Beyond sharing the hardware platform for dual-functional sensing and communications, the integration gain can be further improved by devising unified resource allocation schemes that improve the temporal, spectral, spatial, and code efficiencies.
Fig. 3 Sensing and communications in a time-division manner (time-frequency diagram of alternating chirp and OFDM segments; labeled quantities: bandwidth, sensing pulse width, data block length, and sensing pulse repetition interval)
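The time-division scheme of Fig. 3 can be sketched as a frame builder that places one chirp sensing pulse and then the OFDM data blocks within a pulse repetition interval. This is a toy baseband illustration only: the sample counts, bandwidth, QPSK payload, and function names are all hypothetical, and no cyclic prefix or guard interval is modeled.

```python
import cmath
import math


def chirp(n_samples, bandwidth_hz, duration_s):
    """Baseband linear chirp sweeping from -B/2 to +B/2 over the pulse."""
    rate = bandwidth_hz / duration_s  # chirp slope in Hz/s
    ts = [(i / n_samples - 0.5) * duration_s for i in range(n_samples)]
    return [cmath.exp(1j * math.pi * rate * t * t) for t in ts]


def ofdm_symbol(symbols):
    """One OFDM symbol: inverse DFT of the subcarrier symbols."""
    n = len(symbols)
    out = []
    for i in range(n):
        acc = sum(s * cmath.exp(2j * math.pi * k * i / n)
                  for k, s in enumerate(symbols))
        out.append(acc / n)
    return out


def time_division_frame(data_blocks, n_chirp=64, bandwidth_hz=1e6, pulse_s=64e-6):
    """Tag and concatenate one sensing chirp and the OFDM blocks of one PRI."""
    frame = [("sensing", x) for x in chirp(n_chirp, bandwidth_hz, pulse_s)]
    for block in data_blocks:
        frame += [("comms", x) for x in ofdm_symbol(block)]
    return frame


qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
frame = time_division_frame([qpsk * 4, qpsk * 4])  # two 16-subcarrier blocks
```

Because the two segments never overlap in time, they do not interfere, but each functionality only uses part of the frame, which is the efficiency penalty noted above.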
2.2 Unified Resource Allocation
To further improve the resource efficiency, a more advanced approach to implementing ISAC is to exploit a unified waveform that realizes sensing and communications simultaneously, e.g., by directly using the communication signal for sensing, by inserting communication symbols into a conventional radar signal, or by devising a dedicated waveform with joint functionalities.
2.2.1 Communication-Centric
Current communication-centric ISAC waveforms mainly rely on the orthogonal frequency division multiplexing (OFDM) format, which is the base signal of 4G and 5G wireless networks. The OFDM signal has the following desirable attributes.
• Transmission of multiple data streams in parallel
• Easy synthesis of a large-bandwidth signal
• A thumbtack-shaped ambiguity function with high range and Doppler resolutions
• A discrete Fourier transform (DFT) structure that is readily generated by fast algorithms, i.e., the fast Fourier transform (FFT) and inverse FFT (IFFT)
Given the above characteristics, OFDM has become the most popular unified ISAC waveform and has been applied in a variety of sensing areas such as detection [24], tracking [25], and synthetic aperture radar (SAR) imaging [26]. Similar multi-carrier waveforms, such as OFDM-Chirp [27, 28], multicarrier complementary phase-coded (MCPC) [29], and orthogonal time frequency space (OTFS) modulation [30] signals, have also been extensively studied as candidates for ISAC realization. Below, we mainly discuss these waveforms from the perspective of parameter estimation (distance, velocity, angle of arrival, etc.).

Using the OFDM communication signal for delay-Doppler estimation was first proposed in [31], where the corresponding universal software radio peripheral (USRP)-based experimental prototype was also constructed. Thanks to the DFT structure of OFDM, the receiver can perform pulse compression in the delay-Doppler domain via FFT/IFFT, after canceling the symbols through an element-wise division of the received echo matrix by the transmitted symbol matrix. However, the OFDM echo suffers from intercarrier interference (ICI) because each subcarrier experiences a different Doppler shift. Although, as stated in [32], the ICI can be approximately ignored if the Doppler shift is smaller than a tenth of the subcarrier bandwidth, this constraint does not always hold and may not be beneficial to an ISAC OFDM parametrization. To that end, a Doppler correction method named all-cell Doppler correction (ACDC) was proposed in [33], in which FFT/IFFT is applied after the Doppler correction for accurate pulse compression. Nevertheless, ACDC requires the OFDM symbol matrix to be rank one, which considerably limits the data rate. In [34], the radar delay-Doppler estimation is formulated as a joint carrier frequency offset (CFO) and channel estimation problem via the amplitude and phase estimation (APES) spatial filtering approach, which removes the rank-one constraint.

In addition to OFDM signaling, some researchers have explored other unified waveforms for ISAC realization. For instance, OFDM-Chirp has been considered as a candidate in [27], thanks to its lower peak-to-average power ratio (PAPR). The ISAC signal processing flow chart for OFDM and OFDM-Chirp is shown in Fig. 4, where the principal difference between OFDM and OFDM-Chirp lies in the frequency conversion process, i.e., the input of the local oscillator. Here, f_c represents the carrier frequency, while K denotes the subchirp rate in each hopping interval [27]. A USRP-based OFDM-Chirp testbed was also set up in [28] to verify the basic functionalities of joint sensing and communications. The main difficulty in applying OFDM-Chirp to ISAC lies in the higher hardware complexity and cost of signal generation in the local oscillator.

Fig. 4 ISAC signal processing flow chart for OFDM/OFDM-Chirp

Another unified waveform is the well-known MCPC signal [29, 35], which is still based on the OFDM format, with additional phase coding such as P3/P4 codes. MCPC offers advantages such as better resolution, good interference rejection, and a low probability of intercept (LPI). Moreover, compared with conventional OFDM, MCPC has higher range resolution for the same pulse width, better Doppler tolerance for the same bandwidth, and a lower sidelobe level of the autocorrelation function.

The recently proposed OTFS modulation [30] accommodates the channel dynamics by modulating data in the delay-Doppler (DD) domain rather than in the time-frequency domain used by conventional OFDM modulation, thereby providing strong Doppler resilience to support more reliable communications in high-mobility environments [36]. Besides, the OTFS signal can also be used for sensing targets [37, 38]; it retains the inherent advantages of multi-carrier modulation and offers additional estimation benefits when the Doppler spread of the echoes is severe [37]. Therefore, OTFS constitutes a promising waveform candidate for ISAC realization in 6G wireless networks.
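The OFDM-based delay-Doppler processing described earlier in this subsection (symbol cancellation by element-wise division, followed by DFT-type processing) can be sketched as follows. This is a minimal, noise-free illustration, not the algorithm of [31], [33], or [34]: it assumes a single point target whose delay and Doppler fall exactly on the grid, it neglects ICI, and it uses a direct DFT sum instead of FFT/IFFT for brevity. All names and sizes are hypothetical.

```python
import cmath
import math
import random


def simulate_echo(tx, delay_bin, doppler_bin):
    """Single point target: linear phase across subcarriers and symbols."""
    n_sc, n_sym = len(tx), len(tx[0])
    return [
        [tx[n][m] * cmath.exp(2j * math.pi * (m * doppler_bin / n_sym
                                              - n * delay_bin / n_sc))
         for m in range(n_sym)]
        for n in range(n_sc)
    ]


def delay_doppler_map(tx, rx):
    """Magnitude of the 2-D transform of the symbol-free matrix rx / tx."""
    n_sc, n_sym = len(tx), len(tx[0])
    # Element-wise division removes the random communication symbols.
    g = [[rx[n][m] / tx[n][m] for m in range(n_sym)] for n in range(n_sc)]
    # Inverse DFT over subcarriers resolves delay; DFT over symbols, Doppler.
    return [
        [abs(sum(g[n][m] * cmath.exp(2j * math.pi * (n * k / n_sc - m * l / n_sym))
                 for n in range(n_sc) for m in range(n_sym)))
         for l in range(n_sym)]
        for k in range(n_sc)
    ]


random.seed(0)
QPSK = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
tx = [[random.choice(QPSK) for _ in range(8)] for _ in range(16)]  # 16 x 8 grid
rx = simulate_echo(tx, delay_bin=5, doppler_bin=2)
dd = delay_doppler_map(tx, rx)
peak = max(((k, l) for k in range(16) for l in range(8)),
           key=lambda kl: dd[kl[0]][kl[1]])
```

The peak of the map lands at the simulated delay and Doppler bins regardless of the random payload, which is why the communication signal itself can serve as the sensing waveform in this idealized setting.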
2.2.2 Sensing-Centric
Conventional radar signals cannot convey data because, unlike random communication signals, they consist of periodic pulses. Radar signals can nevertheless be enabled to communicate by embedding communication symbols into existing radar formats, e.g., through time-frequency domain embedding [39, 40], spatial domain embedding [41–43], and index modulation [44].

Time-Frequency Domain Embedding
Early sensing-centric information embedding schemes operate on the chirp signal in the time-frequency domain [39, 40]. Chirp signals have superior sensing properties such as constant envelope, large time-bandwidth product, and high Doppler tolerance, as illustrated in Fig. 5. On top of this, the communication functionality can be attached to the chirp signal by adjusting its amplitude, initial phase, carrier frequency, or chirp slope [1]. For this reason, different modulation formats, e.g., amplitude-shift keying (ASK), frequency-shift keying (FSK), and phase-shift keying (PSK), can be used straightforwardly.

Spatial Domain Embedding
The high DoFs and waveform diversity of the MIMO radar make it possible to allocate radar and communication resources in the spatial domain by designing a proper beamforming matrix. A classical approach is to represent communication data with different sidelobe levels of the radar beampattern, with the mainlobe dedicated to radar sensing [41–43]. This methodology has two key deficiencies. First, despite the almost lossless sensing performance in the surveillance area of the mainlobe, the communication rate, which is on the order of the pulse repetition frequency (PRF), is relatively low and unable to support high-rate data transmission. Second, although the mainlobe is nearly identical to that of the desired radar beampattern, the sensing performance does suffer a loss because of the varying sidelobe levels, which leads to a higher probability of false alarm.
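As a minimal sketch of time-frequency domain embedding, the example below attaches one QPSK symbol to the initial phase of each chirp pulse and recovers it by correlating with the reference chirp. It is an idealized, noise-free illustration with hypothetical parameters and function names, not a scheme taken from [39, 40].

```python
import cmath
import math

QPSK_PHASES = [math.pi / 4, 3 * math.pi / 4, -3 * math.pi / 4, -math.pi / 4]


def chirp_pulse(n_samples, time_bandwidth, phase=0.0):
    """Unit-amplitude linear chirp; `phase` is the embedded initial phase."""
    out = []
    for i in range(n_samples):
        t = i / n_samples - 0.5  # normalized time in [-0.5, 0.5)
        out.append(cmath.exp(1j * (math.pi * time_bandwidth * t * t + phase)))
    return out


def embed(symbol_indices, n_samples=128, time_bandwidth=32):
    """One chirp pulse per symbol, so the symbol rate equals the pulse rate."""
    return [chirp_pulse(n_samples, time_bandwidth, QPSK_PHASES[s])
            for s in symbol_indices]


def detect(pulses, n_samples=128, time_bandwidth=32):
    """Correlate with the reference chirp and pick the nearest QPSK phase."""
    ref = chirp_pulse(n_samples, time_bandwidth)
    decided = []
    for pulse in pulses:
        corr = sum(x * r.conjugate() for x, r in zip(pulse, ref))
        unit = corr / abs(corr)
        decided.append(min(range(4),
                           key=lambda s: abs(cmath.exp(1j * QPSK_PHASES[s]) - unit)))
    return decided


sent = [0, 3, 1, 2, 2, 0]
pulses = embed(sent)
received = detect(pulses)
```

A constant phase offset leaves the envelope of each pulse unchanged, so the constant-envelope property of the chirp is preserved, while only one symbol is conveyed per pulse.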
Fig. 5 Chirp signal: (a) in time domain, (b) in frequency domain, (c) in time-frequency domain, and (d) ambiguity function in delay-Doppler domain
2.2.3 Joint Design
Fig. 6 A conceptual diagram of performance tradeoff between sensing and communications via joint design
Sensing metric
As introduced above, communication-centric and sensing-centric schemes can realize nearly lossless communication/sensing performance, while the corresponding sensing/communication performance would be strictly limited. For example, communication symbols are usually inserted based on radar inter-pulse modulation, arising the extremely low data rate in the slow time domain. Moreover, the sidelobe modulation-based ISAC scheme can only support line-of-sight (LoS) communication. Besides, they cannot flexibly formulate a scalable tradeoff between sensing and communications such as that shown in Fig. 6. Given these drawbacks, a more promising method arises by jointly designing ISAC system, while the waveform design does not have to rely on existing radar or communication waveforms. With this method, the symbols are represented on the basis of the radar intra-pulse modulation in the fast time domain, while the communication channel is not necessarily restricted to the LoS path. That is to say, jointly designed ISAC waveform is capable of providing extra DoFs and flexibility and thereby offering the potential of improving sensing and communication capabilities simultaneously, relative to existing communication-centric and sensing-centric schemes. To that end, ISAC beamforming is considered for MIMO radar and multiuser MIMO (MU-MIMO) communication system in the non-line-of-sight (NLoS) channel, wherein the communication signal itself is directly leveraged for sensing [20]. In such a system, a single communication symbol is represented by a fast time sample for supporting the high-rate transmission. Designing an ISAC beamforming matrix yields a beampattern which can support the sensing functionality and simultaneously satisfy several signal-to-interference-plus-noise-ratio (SINR) constraints of downlink users. 
This topic has been extended in [45] by further considering constant envelope and similarity constraints on the radar waveform, which reveals the tradeoff between the radar detection probability and the average achievable rate. Although the imposed constant envelope constraint yields a non-convex optimization problem, the globally optimal solution can be obtained from the corresponding Karush-Kuhn-Tucker (KKT) equations [45]. More literature on jointly designed ISAC systems can be found in [46–48].
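To make the two quantities in the joint design concrete, the following minimal sketch evaluates the transmit beampattern and the per-user downlink SINR for a given beamforming matrix. It is an illustration under stated assumptions rather than the optimization of [20]: the half-wavelength uniform linear array, the Rayleigh channel, the dimensions, and all function names are hypothetical, and a zero-forcing beamformer stands in for an optimized ISAC beamformer.

```python
import numpy as np

def steering_vector(n_antennas, theta_rad):
    """Steering vector of a half-wavelength uniform linear array (assumed geometry)."""
    n = np.arange(n_antennas)
    return np.exp(1j * np.pi * n * np.sin(theta_rad))

def beampattern(W, thetas_rad):
    """Transmit beampattern P(theta) = a(theta)^H (W W^H) a(theta)."""
    R = W @ W.conj().T  # covariance of the transmitted signal for unit-power symbols
    A = np.stack([steering_vector(W.shape[0], t) for t in thetas_rad], axis=1)
    return np.real(np.sum(A.conj() * (R @ A), axis=0))

def downlink_sinr(H, W, noise_power):
    """Per-user SINR. Row k of H is user k's channel; column k of W is its beamformer."""
    G = np.abs(H @ W) ** 2          # G[k, j] = |h_k^T w_j|^2
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    return signal / (interference + noise_power)

# Hypothetical setup: 16 transmit antennas, 4 single-antenna users, Rayleigh channel.
rng = np.random.default_rng(0)
n_tx, n_users, noise_power = 16, 4, 0.1
H = (rng.standard_normal((n_users, n_tx))
     + 1j * rng.standard_normal((n_users, n_tx))) / np.sqrt(2)

# Zero-forcing beamformer, used only as a stand-in for an optimized ISAC beamformer.
W = H.conj().T @ np.linalg.inv(H @ H.conj().T)

sinr = downlink_sinr(H, W, noise_power)
pattern = beampattern(W, np.deg2rad(np.linspace(-90, 90, 181)))
```

A joint design would search over W so that `pattern` approaches a desired sensing beampattern while every entry of `sinr` stays above its threshold; the zero-forcing choice here simply removes inter-user interference so that the SINR reduces to the inverse noise power.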
3 Emerging Applications

3.1 Intelligent Connected Vehicles: Sensing-Assisted Communication

Intelligent connected vehicles play an important role in constructing intelligent transportation systems, which require frequent information exchange among vehicles and infrastructure [49]. Current cellular vehicle-to-everything (C-V2X) schemes such as LTE-V2X support localization accuracy on the order of 10 meters and latency on the order of 1 second [50]. However, future intelligent connected vehicles demand centimeter-level localization accuracy and millisecond-level latency to ensure the safety of autonomous vehicles [51]. Thanks to 5G technologies, especially massive MIMO and mmWave, high-accuracy localization and low-latency services become feasible for the following reasons. First, a massive MIMO array generates pencil-like beams with high array gain, which compensates for the mmWave path loss and greatly improves the angular resolution and communication rate. Second, the large amount of spectrum available in the mmWave band benefits both range resolution and channel capacity. In addition, the sparse mmWave channel involves far fewer NLoS components and much less clutter than the conventional sub-6 GHz band. Given these favorable properties, the system is expected to provide sensing and communication functionalities simultaneously, and ISAC has therefore been considered for V2X networks.

As shown in Fig. 7, the early ISAC scheme in V2X networks exploits extra radar sensors deployed on the roadside unit (RSU), so that the dedicated sensing module can aid communication in terms of beam training, beam alignment, and beam tracking [52]. To verify the performance benefits, experiments were carried out in [53], where the beam training overhead was tested on a phased-array-based analog mmWave platform with 64 antennas at the RSU and 16 antennas at the vehicle. Specifically, communication protocol-based beam training [54] needs 1024 beam pairs to establish and retain a suitable beam pair. With the aid of Global Navigation Satellite System (GNSS) signaling, only 475 beam pairs are necessary, and with the assistance of a radar sensor the number can be further reduced to 32. Although reducing the training overhead yields a coordination gain, the extra hardware cost of the sensing system results in a loss of integration gain.

Fig. 7 A scenario diagram of the extra radar-aided beam tracking (diagram labels: radar beam, communication beam)

We further note that the transmitted communication signals are reflected by the vehicles, so that the vehicles’ state information is contained in the echo signals received by the RSU, as shown in Fig. 8. By collecting the echoes, the RSU can recover the vehicles’ state information and track the vehicles with classic Kalman filtering [55]. In this manner, the RSU attains both the coordination gain and the integration gain. Pioneering works such as [56, 57] construct the framework of sensing-assisted beam tracking in V2I networks, where the induced prediction functionality enables a predictive beamformer. In general, ISAC beam tracking has the following advantages.

Fig. 8 A scenario diagram of the general ISAC beam tracking in V2I networks

• No dedicated beam training/tracking pilots or uplink feedback are needed. According to the 5G New Radio (NR) protocol [58], the overhead of channel state information reference signals (CSI-RS) and uplink feedback can be saved, which improves the spectral efficiency compared with conventional communication-based beam training/tracking. Note that the demodulation reference signals (DMRS) are still necessary for coherent demodulation. A comparison of frame structures is shown in Fig. 9.
Fig. 9 Frame structures of the conventional beam training/tracking method and sensing-assisted beam tracking (panel titles: Conventional Beam Training or Beam Tracking; Sensing Assisted Beam Tracking. Legend: beam training, uplink feedback, communication-only beam, ISAC beam, demodulation reference signal, data block, ISAC block, predicted angle)
• Significant matched-filtering gain. The entire data block is exploited for sensing the vehicles and transmitting data simultaneously, so the matched-filtering gain is much larger than that of conventional communication-based beam training/tracking, which employs only a limited number of pilot symbols.
• Additional DoFs for more accurate localization. Conventional communication-based beam training/tracking utilizes only angle information, obtained through beam sweeping and refinement. In contrast, ISAC beam tracking estimates the angle and the distance simultaneously, which helps localize targets more precisely.

Although the ISAC beam tracking approaches developed in [56, 57] have shown great potential, the assumption of point-like vehicles is impractical, especially with a massive MIMO array in the mmWave band. Specifically, the vehicle appears as an extended target in both the angle and the distance domains, owing to the high angular and range resolutions provided by the large number of transmit antennas and the rich spectral resources. The approaches in [56, 57] therefore cannot handle beam tracking well in this case, since the position of the communication receiver mounted on the vehicle may not be precisely localized. To solve this problem, accurate modeling of the extended vehicle target is necessary, which has been considered in [59]. In brief, with the help of a dynamic beamwidth, the beampattern is adjusted in real time such that the transmit beam always covers the entire vehicle. The position of the communication receiver can then be inferred from the resolved scatterers, which are estimated by matched filtering and multiple signal classification (MUSIC). In addition, the authors of [59] proposed a more advanced scheme that alternately transmits a wide beam with dynamic beamwidth and a narrow beam formed with the massive MIMO configuration. In this manner, the average achievable rate can be significantly improved. Moreover, the inherent tradeoff in this block-splitting scheme is revealed by a convex optimization model. The detailed flow chart of this ISAC-alternant beam (ISAC-AB) scheme is shown in Fig. 10. Typical beam tracking schemes in V2I networks are summarized in Table 2 with their key features, advantages, and drawbacks.
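The predict-then-steer idea behind sensing-assisted beam tracking can be sketched with a plain linear Kalman filter. This is a simplified illustration, not the filter of [56, 57, 59]: the constant angular-rate motion model, the direct angle measurement, the noise levels, and all names are assumptions made for the example.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter; also returns the prediction."""
    x_pred = F @ x                       # state prediction
    P_pred = F @ P @ F.T + Q             # covariance prediction
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new, x_pred

# Hypothetical scenario: state = [angle (rad), angular rate (rad/s)].
dt, n_steps, meas_std = 0.02, 200, 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])    # constant angular-rate motion model
H = np.array([[1.0, 0.0]])               # only the angle is measured from the echo
Q = 1e-8 * np.eye(2)
R = np.array([[meas_std ** 2]])

rng = np.random.default_rng(1)
true_angle = 0.3 + 0.05 * dt * np.arange(n_steps)
measurements = true_angle + meas_std * rng.standard_normal(n_steps)

x, P = np.array([measurements[0], 0.0]), np.eye(2)
estimates, predictions = [], []
for z in measurements[1:]:
    x, P, x_pred = kalman_step(x, P, np.array([z]), F, H, Q, R)
    estimates.append(x[0])
    predictions.append(x_pred[0])        # angle the next beam would be steered toward

estimates = np.array(estimates)
filter_rmse = np.sqrt(np.mean((estimates[100:] - true_angle[101:]) ** 2))
meas_rmse = np.sqrt(np.mean((measurements[101:] - true_angle[101:]) ** 2))
```

In each block the RSU would steer the beam toward the predicted angle before the echo of that block is processed, which is why no dedicated training pilots are needed in this simplified picture.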
Fig. 10 The flow chart of the ISAC-AB scheme (diagram labels: sensing channel, K scatterers, communication channel, communication receiver; processing steps: matched filtering, MUSIC, estimation, inferring the position of the communication receiver, Kalman filtering, prediction; outputs: estimated values for the ISAC block, predicted values for the data block)
Table 2 Typical schemes for beam tracking in V2I networks
Scheme | Key features | Advantages | Drawbacks
Communication beam training and tracking [54] | Based on communication protocol | Low complexity, high accuracy in low-mobility scenario | High pilot overheads, failed tracking in high-mobility scenario
Radar-aided beam tracking [52, 53] | With extra radar sensor | Low pilot overheads | Higher hardware cost
ISAC beam tracking [56, 57, 59] | Using unified waveform for communication assisted by sensing | With prediction functionality, low pilot overheads, reliable tracking in high-mobility scenario | Full-duplex and self-interference cancellation techniques are necessary [60, 61]
It is worth mentioning that some important challenges remain unsolved.
• First, the above literature only considers straight roads, while real road trajectories may be irregular. More recently, researchers have studied ISAC beam tracking under complex road geometry in V2I networks, where the road is expressed in a curvilinear coordinate system and the corresponding state evolution model is constructed [62].
• Second, multi-target beam tracking is still a challenging issue, which necessitates a data association strategy. Despite the pencil-sharp beam, multiple vehicles may be illuminated simultaneously when they are far away from the RSU.
• Third, more advanced tracking algorithms may be better suited to non-linear and non-Gaussian models, for which the Kalman filter cannot achieve the optimal minimum mean square error (MMSE) performance. For this reason, other filters such as the particle filter are potentially promising solutions.
• Fourth, the percentage of overhead that ISAC beam tracking can save should be analyzed under the practical NR protocol, according to the specific scenarios, requirements, and system parameter settings.
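As a rough illustration of the kind of overhead accounting raised in the fourth point, the beam pair counts quoted earlier in this section from the experiments in [53] can be turned into relative reductions. The sketch only compares the number of beam pairs; it does not model NR signaling, and the dictionary keys are labels chosen for this example. The baseline of 1024 equals 64 × 16, which would match an exhaustive sweep with one beam per antenna at each end, although that reading is an assumption.

```python
# Beam pair counts quoted in the text for the 64-antenna RSU / 16-antenna vehicle platform.
beam_pairs = {
    "protocol-based training": 1024,
    "GNSS-aided": 475,
    "radar-aided": 32,
}

baseline = beam_pairs["protocol-based training"]
reduction_percent = {
    name: 100.0 * (1.0 - count / baseline) for name, count in beam_pairs.items()
}

for name, value in reduction_percent.items():
    print(f"{name}: {beam_pairs[name]} beam pairs, {value:.1f}% fewer than the baseline")
```

Under this simple count, GNSS assistance removes roughly half of the beam pairs and the radar sensor removes all but about 3% of them; a protocol-level analysis would additionally have to account for feedback and reference-signal resources.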
Z. Du and F. Liu
3.2 Perceptive Mobile Network: Communication-Assisted Sensing

The sensing functionality is expected to be integrated into the future perceptive mobile network [63–65], i.e., a general cellular network with sensing capability, such that sensing services like localization, recognition, and imaging can be supported. In the following, we show that the perceptive mobile network can be implemented by exploiting the NR waveform and the NR network architecture for sensing.
3.2.1 NR Waveform Sensing
Note that the sensing receiver needs a known reference signal to recover the target information from the echoes. For mono-static sensing, this is feasible since the transmitter and the receiver are deployed in the same device, e.g., the base station (BS). However, a full-duplex mode is indispensable because of the resulting self-interference. Although the conventional pulsed radar mechanism can avoid self-interference, it is impractical for the BS in terms of spectral efficiency, since the communication signal is transmitted continuously. For passive bi-static sensing, an extra reference channel in addition to the surveillance channel is necessary to acquire an estimate of the reference signal, which may cause considerable performance loss. Such a methodology is similar to the “sensing-after-decoding” strategy [66].

According to the 5G NR protocol [58], each frame consists of several subframes and carries a pilot payload and a data payload, both of which can be utilized for sensing. First, exploiting the data payload for sensing requires an element-wise division to fully remove the data symbols from the reflected OFDM echoes, such that coherent accumulation can be achieved with the FFT/IFFT. This has been discussed in Sect. 2.2; the major challenge is that the ICI destroys the orthogonal DFT structure of the OFDM echoes. Second, the pilot payload, which makes no contribution to data transmission, can also be utilized for sensing since its signal format is known to both the BS and the user end. On the basis of 3GPP Technical Specification 38.211, Release 15 [67], the pilot payload, consisting of demodulation reference signals (DMRS), CSI reference signals (CSI-RS), and uplink sounding reference signals (SRS), can be utilized to improve the general sensing performance.
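The element-wise division followed by FFT/IFFT processing can be sketched as follows. This is a minimal illustration under idealized assumptions: a single point target lying exactly on the delay-Doppler grid, unit-modulus QPSK data, no ICI, and hypothetical frame dimensions and names.

```python
import numpy as np

def ofdm_delay_doppler_map(tx_grid, rx_grid):
    """Delay-Doppler map from (n_subcarriers, n_symbols) frequency-domain grids."""
    channel = rx_grid / tx_grid                    # element-wise division removes the data
    delay_profile = np.fft.ifft(channel, axis=0)   # IFFT across subcarriers -> delay bins
    return np.fft.fft(delay_profile, axis=1)       # FFT across OFDM symbols -> Doppler bins

# Hypothetical frame: 64 subcarriers, 32 OFDM symbols, unit-modulus QPSK data.
rng = np.random.default_rng(2)
n_sc, n_sym = 64, 32
symbol_index = rng.integers(0, 4, size=(n_sc, n_sym))
tx_grid = np.exp(1j * (np.pi / 4 + np.pi / 2 * symbol_index))

# Single point target placed exactly on delay bin 5 and Doppler bin 3 (no ICI modeled).
delay_bin, doppler_bin = 5, 3
n = np.arange(n_sc)[:, None]
m = np.arange(n_sym)[None, :]
target = (np.exp(-2j * np.pi * n * delay_bin / n_sc)
          * np.exp(2j * np.pi * m * doppler_bin / n_sym))
noise = 0.1 * (rng.standard_normal((n_sc, n_sym))
               + 1j * rng.standard_normal((n_sc, n_sym)))
rx_grid = tx_grid * target + noise

dd_map = np.abs(ofdm_delay_doppler_map(tx_grid, rx_grid))
peak = np.unravel_index(np.argmax(dd_map), dd_map.shape)
```

The location of `peak` gives the delay and Doppler bins of the target. With subcarrier spacing Δf, delay bin l corresponds to a delay of l/(N Δf) for N subcarriers, so a wider occupied bandwidth gives finer range resolution.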
3.2.2 NR Network Architecture for Sensing
The network topology is a crucial issue in ISAC system design. Networked sensing has been frequently discussed in the areas of wireless sensor networks [68], multi-static radar [69], and distributed MIMO radar [70]. Relative to a single sensor node, the sensing performance can be significantly improved by jointly processing the raw data or the processed sensing parameters from multiple sensing nodes in a data fusion unit, owing to the waveform diversity and additional DoFs. Overall, jointly processing the data gathered from different nodes leads to better sensing performance than simply fusing estimates obtained at each node independently, at the cost of much higher complexity. This cost becomes more severe as antenna arrays grow massive. Fortunately, the cloud radio access network (C-RAN) architecture [71], which can exploit dense cell infrastructure to construct a perceptive mobile network, supports a flexible and reconfigurable framework that enables several sensing modes, such as downlink mono-static sensing, downlink bi-static sensing, uplink bi-static sensing, and uplink distributed sensing. In such a C-RAN enabled network architecture, shown in Fig. 11, a pool of baseband units (BBUs) acts as the data fusion unit, while massive remote radio units (RRUs) are leveraged as active or passive radar sensors [63]. We also highlight that self-interference cancellation in mono-static sensing with C-RAN is a challenging problem, since the transmit waveform is continuous while the transmitter and the receiver of a mono-static node are collocated. Fortunately, current technology can achieve more than 100 dB of self-interference suppression [72] through the joint use of array isolation, radio frequency (RF) cancelers, and digital cancelers.

Fig. 11 C-RAN architecture for networked sensing (diagram labels: BBU pool, fronthaul, RRU 1, RRU 2, RRU 3, clutter, downlink mono-static sensing, downlink passive sensing, uplink sensing)
3.3 WLAN Sensing

Recently, there have been new advances in industrial standards relevant to ISAC systems. In 2020, IEEE 802.11 set up a new official Task Group, IEEE 802.11bf [73], which is devoted to enhancing sensing capabilities based on existing WiFi standards. WiFi-based sensing is a key enabling technology for the smart home, since it can accurately acquire human features even under NLoS conditions, as shown in Fig. 12. In addition, WiFi-based approaches offer ubiquitous availability and cost efficiency [74]. Owing to these advantages over vision-based sensing, WiFi-based sensing has attracted considerable attention in recent years.

Fig. 12 The through-wall behavior sensing based on WiFi measurement (diagram labels: WiFi AP, wall, human)

Despite the popularity of GNSS technology in outdoor environments, its performance degrades significantly for indoor localization. In contrast, a WiFi-based positioning system (WPS) can handle this problem at low cost [75]. In a typical implementation, the wireless router collects the signal transmitted by the user end and infers its position from signal features such as the time of arrival (ToA), the angle of arrival (AoA), and the received signal strength (RSS).

Behavior recognition is also a promising application of WLAN sensing [76]. In general, customized devices receive the signal that is transmitted by the WiFi access point (WiFi AP) and reflected by the human body. This in effect constitutes a bi-static sensing configuration, like a passive radar. Typically, software-defined radio (SDR) systems are chosen as the customized devices to acquire raw data. Customized signal processing is then used to extract the CSI and recognize human micro-motion [77, 78]. This enables great potential in residential healthcare, such as elderly fall detection [79, 80], breathing detection [76], and heartbeat detection [81], together with gesture recognition [82], etc.

The applications of WLAN sensing are not limited to the indoor smart home. In outdoor environments, for instance, WiFi-based inverse synthetic aperture radar (ISAR) for high-resolution profiling of slowly moving vehicles has been tested [83], in both the active mono-static and the passive bi-static sensing cases. In such cases, Doppler separation is ineffective in extracting the slowly moving targets from the strong surrounding clutter. Therefore, the key problem is to devise appropriate clutter cancellation schemes.
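The ToA-based positioning mentioned in this section can be illustrated with a linearized least-squares trilateration. The sketch assumes several synchronized access points at known positions and noiseless ToA measurements; the room layout, the function name `toa_position`, and the solution method are choices made for this example and are not taken from [75].

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def toa_position(anchors, toas):
    """Least-squares 2-D position from ToAs measured at known anchor positions."""
    d = C * np.asarray(toas)                 # ToA -> distance to each anchor
    a0, d0 = anchors[0], d[0]
    # Subtracting the first range equation from the others gives a linear system A p = b.
    A = 2.0 * (anchors[1:] - a0)
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)) - (d[1:] ** 2 - d0 ** 2)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p

# Hypothetical room: four access points at the corners of a 10 m x 8 m area.
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 8.0], [0.0, 8.0]])
true_position = np.array([3.0, 4.0])
toas = np.linalg.norm(anchors - true_position, axis=1) / C

estimate = toa_position(anchors, toas)
```

With noisy ToAs the same linear system yields a least-squares estimate whose accuracy depends on the timing error and the anchor geometry; AoA and RSS measurements would lead to different measurement equations.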
4 Conclusion

In this chapter, we have reviewed the basic motivations of ISAC, which stem from the similar frequency bands, channel characteristics, signal processing algorithms, and hardware architectures of radar and communications, as well as the resulting coordination gain and integration gain. Waveform design, the core of the ISAC system, has been discussed in detail from two perspectives, i.e., non-overlapped resource allocation and unified resource allocation. In addition, three emerging applications of ISAC technology have been introduced, which exhibit mutual assistance between sensing and communications, i.e., intelligent connected vehicles, the perceptive mobile network, and WLAN sensing. ISAC is widely recognized as one of the enabling technologies of the 6G era. Beyond that, we believe that ISAC can bridge the physical and digital worlds in future 6G wireless networks, where everything sensed can be intelligently connected.
References

1. F. Liu, Y. Cui, C. Masouros, J. Xu, T. X. Han, Y. C. Eldar, and S. Buzzi, “Integrated sensing and communications: Towards dual-functional wireless networks for 6G and beyond,” IEEE Journal on Selected Areas in Communications, 2022.
2. H. Griffiths, “The Mammut phased array radar: Compulsive hoarding,” in 2019 International Radar Conference (RADAR). IEEE, 2019, pp. 1–5.
3. E. Fishler, A. Haimovich, R. Blum, D. Chizhik, L. Cimini, and R. Valenzuela, “MIMO radar: An idea whose time has come,” in Proceedings of the 2004 IEEE Radar Conference (IEEE Cat. No. 04CH37509). IEEE, 2004, pp. 71–78.
4. J. P. Browning, D. R. Fuhrmann, and M. Rangaswamy, “A hybrid MIMO phased-array concept for arbitrary spatial beampattern synthesis,” in 2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop. IEEE, 2009, pp. 446–450.
5. D. R. Fuhrmann, J. P. Browning, and M. Rangaswamy, “Signaling strategies for the hybrid MIMO phased-array radar,” IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 1, pp. 66–78, 2010.
6. A. Hassanien and S. A. Vorobyov, “Phased-MIMO radar: A tradeoff between phased-array and MIMO radars,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3137–3151, 2010.
7. A. J. Paulraj and T. Kailath, “Increasing capacity in wireless broadcast systems using distributed transmission/directional reception (DTDR),” Patent, 1994.
8. G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Personal Communications, vol. 6, no. 3, pp. 311–335, 1998.
9. E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, 1999.
10. T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600, 2010.
11. T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, “Millimeter wave mobile communications for 5G cellular: It will work!” IEEE Access, vol. 1, pp. 335–349, 2013.
12. X. Zhang, A. F. Molisch, and S.-Y. Kung, “Variable-phase-shift-based RF-baseband codesign for MIMO antenna selection,” IEEE Transactions on Signal Processing, vol. 53, no. 11, pp. 4091–4103, 2005.
13. O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Transactions on Wireless Communications, vol. 13, no. 3, pp. 1499–1513, 2014.
14. T. Gong, N. Shlezinger, S. S. Ioushua, M. Namer, Z. Yang, and Y. C. Eldar, “RF chain reduction for MIMO systems: A hardware prototype,” IEEE Systems Journal, vol. 14, no. 4, pp. 5296–5307, 2020.
15. Huawei Technologies, 2021. [Online]. Available: https://www.huawei.com/cn/technologyinsights/future-technologies/6g-isac-th
16. Y. Cui, F. Liu, X. Jing, and J. Mu, “Integrating sensing and communications for ubiquitous IoT: Applications, trends, and challenges,” IEEE Network, vol. 35, no. 5, pp. 158–167, 2021.
17. L. Han and K. Wu, “Joint wireless communication and radar sensing systems–state of the art and future prospects,” IET Microwaves, Antennas & Propagation, vol. 7, no. 11, pp. 876–885, 2013.
18. Q. Zhang, H. Sun, X. Gao, X. Wang, and Z. Feng, “Time-division ISAC enabled connected automated vehicles cooperation algorithm design and performance evaluation,” IEEE Journal on Selected Areas in Communications, 2022.
19. V. Winkler and J. Detlefsen, “Automotive 24 GHz pulse radar extended by a DQPSK communication channel,” in 2007 European Radar Conference. IEEE, 2007, pp. 138–141.
20. F. Liu, C. Masouros, A. Li, H. Sun, and L. Hanzo, “MU-MIMO communications with MIMO radar: From co-existence to joint transmission,” IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2755–2770, 2018.
21. R. Fu, S. Mulleti, T. Huang, Y. Liu, and Y. C. Eldar, “Hardware prototype demonstration of a cognitive radar with sparse array antennas,” Electronics Letters, vol. 56, no. 22, pp. 1210–1212, 2020.
22. M. Jamil, H.-J. Zepernick, and M. I. Pettersson, “On integrated radar and communication systems using Oppermann sequences,” in MILCOM 2008-2008 IEEE Military Communications Conference. IEEE, 2008, pp. 1–6.
23. X. Chen, Z. Feng, Z. Wei, P. Zhang, and X. Yuan, “Code-division OFDM joint communication and sensing system for 6G machine-type communication,” IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12093–12105, 2021.
24. S. Sen and A. Nehorai, “Adaptive OFDM radar for target detection in multipath scenarios,” IEEE Transactions on Signal Processing, vol. 59, no. 1, pp. 78–90, 2010.
25. S. Sen and A. Nehorai, “OFDM MIMO radar with mutual-information waveform design for low-grazing angle tracking,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3152–3162, 2010.
26. T. Zhang and X.-G. Xia, “OFDM synthetic aperture radar imaging with sufficient cyclic prefix,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 1, pp. 394–404, 2014.
27. M. Li, W.-Q. Wang, and Z. Zheng, “Communication-embedded OFDM chirp waveform for delay-Doppler radar,” IET Radar, Sonar & Navigation, vol. 12, no. 3, pp. 353–360, 2018.
28. W. Jia, W.-Q. Wang, Y. Hou, and S. Zhang, “Integrated communication and localization system with OFDM-chirp waveform,” IEEE Systems Journal, vol. 14, no. 2, pp. 2464–2472, 2019.
29. N. Levanon, “Multifrequency complementary phase-coded radar signal,” IEE Proceedings-Radar, Sonar and Navigation, vol. 147, no. 6, pp. 276–284, 2000.
30. R. Hadani, S. Rakib, M. Tsatsanis, A. Monk, A. J. Goldsmith, A. F. Molisch, and R. Calderbank, “Orthogonal time frequency space modulation,” in 2017 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2017, pp. 1–6.
31. C. Sturm and W. Wiesbeck, “Joint integration of digital beam-forming radar with communication,” in 2009 IET International Radar Conference. IET, 2009, pp. 1–4.
32. C. Sturm and W. Wiesbeck, “Waveform design and signal processing aspects for fusion of wireless communications and radar sensing,” Proceedings of the IEEE, vol. 99, no. 7, pp. 1236–1259, 2011.
33. G. Hakobyan and B. Yang, “A novel intercarrier-interference free signal processing scheme for OFDM radar,” IEEE Transactions on Vehicular Technology, vol. 67, no. 6, pp. 5158–5167, 2017.
34. M. F. Keskin, H. Wymeersch, and V. Koivunen, “MIMO-OFDM joint radar-communications: Is ICI friend or foe?” IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 6, pp. 1393–1408, 2021.
35. M. Bică and V. Koivunen, “Generalized multicarrier radar: Models and performance,” IEEE Transactions on Signal Processing, vol. 64, no. 17, pp. 4389–4402, 2016.
36. Z. Wei, W. Yuan, S. Li, J. Yuan, G. Bharatula, R. Hadani, and L. Hanzo, “Orthogonal time-frequency space modulation: A promising next-generation waveform,” IEEE Wireless Communications, vol. 28, no. 4, pp. 136–144, 2021.
37. P. Raviteja, K. T. Phan, Y. Hong, and E. Viterbo, “Orthogonal time frequency space (OTFS) modulation based radar system,” in 2019 IEEE Radar Conference (RadarConf). IEEE, 2019, pp. 1–6.
38. L. Gaudio, M. Kobayashi, G. Caire, and G. Colavolpe, “On the effectiveness of OTFS for joint radar parameter estimation and communication,” IEEE Transactions on Wireless Communications, vol. 19, no. 9, pp. 5951–5965, 2020.
39. M. Roberton and E. Brown, “Integrated radar and communications based on chirped spread-spectrum techniques,” in IEEE MTT-S International Microwave Symposium Digest, 2003, vol. 1. IEEE, 2003, pp. 611–614.
40. G. N. Saddik, R. S. Singh, and E. R. Brown, “Ultra-wideband multifunctional communications/radar system,” IEEE Transactions on Microwave Theory and Techniques, vol. 55, no. 7, pp. 1431–1437, 2007.
41. A. Hassanien, M. G. Amin, Y. D. Zhang, and F. Ahmad, “Dual-function radar-communications: Information embedding using sidelobe control and waveform diversity,” IEEE Transactions on Signal Processing, vol. 64, no. 8, pp. 2168–2181, 2015.
42. A. Hassanien, M. G. Amin, Y. D. Zhang, F. Ahmad, and B. Himed, “Non-coherent PSK-based dual-function radar-communication systems,” in 2016 IEEE Radar Conference (RadarConf). IEEE, 2016, pp. 1–6.
43. E. BouDaher, A. Hassanien, E. Aboutanios, and M. G. Amin, “Towards a dual-function MIMO radar-communication system,” in 2016 IEEE Radar Conference (RadarConf). IEEE, 2016, pp. 1–6.
44. T. Huang, N. Shlezinger, X. Xu, Y. Liu, and Y. C. Eldar, “MAJoRCom: A dual-function radar communication system using index modulation,” IEEE Transactions on Signal Processing, vol. 68, pp. 3423–3438, 2020.
45. F. Liu, L. Zhou, C. Masouros, A. Li, W. Luo, and A. Petropulu, “Toward dual-functional radar-communication systems: Optimal waveform design,” IEEE Transactions on Signal Processing, vol. 66, no. 16, pp. 4264–4279, 2018.
46. X. Liu, T. Huang, N. Shlezinger, Y. Liu, J. Zhou, and Y. C. Eldar, “Joint transmit beamforming for multiuser MIMO communications and MIMO radar,” IEEE Transactions on Signal Processing, vol. 68, pp. 3929–3944, 2020.
47. F. Liu, C. Masouros, A. Li, H. Sun, and L. Hanzo, “MU-MIMO communications with MIMO radar: From co-existence to joint transmission,” IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2755–2770, 2018.
48. L. Chen, F. Liu, W. Wang, and C. Masouros, “Joint radar-communication transmission: A generalized Pareto optimization framework,” IEEE Transactions on Signal Processing, vol. 69, pp. 2752–2765, 2021.
49. G. Dimitrakopoulos and P. Demestichas, “Intelligent transportation systems,” IEEE Vehicular Technology Magazine, vol. 5, no. 1, pp. 77–84, 2010.
50. Z. Xiao and Y. Zeng, “An overview on integrated localization and communication towards 6G,” Science China Information Sciences, vol. 65, no. 3, pp. 1–46, 2022.
51. H. Wymeersch, G. Seco-Granados, G. Destino, D. Dardari, and F. Tufvesson, “5G mmWave positioning for vehicular networks,” IEEE Wireless Communications, vol. 24, no. 6, pp. 80–86, 2017.
52. N. Gonzalez-Prelcic, R. Méndez-Rial, and R. W. Heath, “Radar aided beam alignment in mmWave V2I communications supporting antenna diversity,” in 2016 Information Theory and Applications Workshop (ITA). IEEE, 2016, pp. 1–7.
53. A. Ali, N. Gonzalez-Prelcic, R. W. Heath, and A. Ghosh, “Leveraging sensing at the infrastructure for mmWave communication,” IEEE Communications Magazine, vol. 58, no. 7, pp. 84–89, 2020.
54. D. Zhu, J. Choi, and R. W. Heath, “Auxiliary beam pair enabled AoD and AoA estimation in closed-loop large-scale millimeter-wave MIMO systems,” IEEE Transactions on Wireless Communications, vol. 16, no. 7, pp. 4770–4785, 2017.
55. R. Faragher, “Understanding the basis of the Kalman filter via a simple and intuitive derivation [lecture notes],” IEEE Signal Processing Magazine, vol. 29, no. 5, pp. 128–132, 2012.
56. F. Liu, W. Yuan, C. Masouros, and J. Yuan, “Radar-assisted predictive beamforming for vehicular links: Communication served by sensing,” IEEE Transactions on Wireless Communications, vol. 19, no. 11, pp. 7704–7719, 2020.
57. W. Yuan, F. Liu, C. Masouros, J. Yuan, D. W. K. Ng, and N. González-Prelcic, “Bayesian predictive beamforming for vehicular networks: A low-overhead joint radar-communication approach,” IEEE Transactions on Wireless Communications, vol. 20, no. 3, pp. 1442–1456, 2020.
58. E. Dahlman, S. Parkvall, and J. Skold, 5G NR: The next generation wireless access technology. Academic Press, 2020.
59. Z. Du, F. Liu, W. Yuan, C. Masouros, Z. Zhang, and G. Caire, “Integrated sensing and communications for V2I networks: Dynamic predictive beamforming for extended vehicle targets,” vol. 22, no. 6, pp. 3612–3627, 2023.
60. A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman, “In-band full-duplex wireless: Challenges and opportunities,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 9, pp. 1637–1652, 2014.
61. Z. Xiao and Y. Zeng, “Waveform design and performance analysis for full-duplex integrated sensing and communication,” IEEE Journal on Selected Areas in Communications, vol. 40, no. 6, pp. 1823–1837, 2022.
62. X. Meng, F. Liu, C. Masouros, W. Yuan, Q. Zhang, and Z. Feng, “Vehicular connectivity on complex trajectories: Roadway-geometry aware ISAC beam-tracking,” arXiv preprint arXiv:2205.11749, 2022.
63. A. Zhang, M. L. Rahman, X. Huang, Y. J. Guo, S. Chen, and R. W. Heath, “Perceptive mobile networks: Cellular networks with radio vision via joint communication and radar sensing,” IEEE Vehicular Technology Magazine, vol. 16, no. 2, pp. 20–30, 2020.
64. M. L. Rahman, J. A. Zhang, X. Huang, Y. J. Guo, and R. W. Heath, “Framework for a perceptive mobile network using joint communication and radar sensing,” IEEE Transactions on Aerospace and Electronic Systems, vol. 56, no. 3, pp. 1926–1941, 2019.
65. Z. Ni, J. A. Zhang, X. Huang, K. Yang, and J. Yuan, “Uplink sensing in perceptive mobile networks with asynchronous transceivers,” IEEE Transactions on Signal Processing, vol. 69, pp. 1287–1300, 2021.
66. N. Mehrotra and A. Sabharwal, “On the degrees of freedom region for simultaneous imaging & uplink communication,” IEEE Journal on Selected Areas in Communications, vol. 40, no. 6, pp. 1768–1779, 2022.
67. X. Lin, J. Li, R. Baldemair, J.-F. T. Cheng, S. Parkvall, D. C. Larsson, H. Koorapaty, M. Frenne, S. Falahati, A. Grovlen et al., “5G new radio: Unveiling the essentials of the next generation wireless access technology,” IEEE Communications Standards Magazine, vol. 3, no. 3, pp. 30–37, 2019.
68. Y. Shen and M. Z. Win, “Fundamental limits of wideband localization—Part I: A general framework,” IEEE Transactions on Information Theory, vol. 56, no. 10, pp. 4956–4980, 2010.
69. S. Kay, “Waveform design for multistatic radar detection,” IEEE Transactions on Aerospace and Electronic Systems, vol. 45, no. 3, pp. 1153–1166, 2009.
70. A. M. Haimovich, R. S. Blum, and L. J. Cimini, “MIMO radar with widely separated antennas,” IEEE Signal Processing Magazine, vol. 25, no. 1, pp. 116–129, 2007.
71. M. Peng, Y. Sun, X. Li, Z. Mao, and C. Wang, “Recent advances in cloud radio access networks: System architectures, key techniques, and open issues,” IEEE Communications Surveys & Tutorials, vol. 18, no. 3, pp. 2282–2308, 2016.
72. C. B. Barneto, T. Riihonen, M. Turunen, L. Anttila, M. Fleischer, K. Stadius, J. Ryynänen, and M. Valkama, “Full-duplex OFDM radar with LTE and 5G NR waveforms: Challenges, solutions, and measurements,” IEEE Transactions on Microwave Theory and Techniques, vol. 67, no. 10, pp. 4042–4054, 2019.
73. F. Restuccia, “IEEE 802.11 bf: Toward ubiquitous Wi-Fi sensing,” arXiv preprint arXiv:2103.14918, 2021.
74. Y. He, Y. Chen, Y. Hu, and B. Zeng, “WiFi vision: Sensing, recognition, and detection with commodity MIMO-OFDM WiFi,” IEEE Internet of Things Journal, vol. 7, no. 9, pp. 8296–8317, 2020.
75. C. Yang and H.-R. Shao, “WiFi-based indoor positioning,” IEEE Communications Magazine, vol. 53, no. 3, pp. 150–157, 2015.
76. B. Tan, Q. Chen, K. Chetty, K. Woodbridge, W. Li, and R. Piechocki, “Exploiting WiFi channel state information for residential healthcare informatics,” IEEE Communications Magazine, vol. 56, no. 5, pp. 130–137, 2018.
77. V. C. Chen, F. Li, S.-S. Ho, and H. Wechsler, “Micro-Doppler effect in radar: phenomenon, model, and simulation study,” IEEE Transactions on Aerospace and Electronic Systems, vol. 42, no. 1, pp. 2–21, 2006.
78. Y. Kim and H. Ling, “Human activity classification based on micro-Doppler signatures using a support vector machine,” IEEE Transactions on Geoscience and Remote Sensing, vol. 47, no. 5, pp. 1328–1337, 2009.
79. M. G. Amin, Y. D. Zhang, F. Ahmad, and K. D. Ho, “Radar signal processing for elderly fall detection: The future for in-home monitoring,” IEEE Signal Processing Magazine, vol. 33, no. 2, pp. 71–80, 2016.
80. H. Wang, D. Zhang, Y. Wang, J. Ma, Y. Wang, and S. Li, “RT-Fall: A real-time and contactless fall detection system with commodity WiFi devices,” IEEE Transactions on Mobile Computing, vol. 16, no. 2, pp. 511–526, 2016.
81. X. Wang, C. Yang, and S. Mao, “PhaseBeat: Exploiting CSI phase data for vital sign monitoring with commodity WiFi devices,” in 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). IEEE, 2017, pp. 1230–1239.
82. H. Abdelnasser, M. Youssef, and K. A. Harras, “WiGest: A ubiquitous WiFi-based gesture recognition system,” in 2015 IEEE Conference on Computer Communications (INFOCOM). IEEE, 2015, pp. 1472–1480.
83. F. Colone, D. Pastina, P. Falcone, and P. Lombardo, “WiFi-based passive ISAR for high-resolution cross-range profiling of moving targets,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 6, pp. 3486–3501, 2013.
Reconfigurable Intelligent Surfaces Toward 6G: From Reflection Only to Simultaneous Transmission and Reflection (STAR) Xidong Mu, Jiaqi Xu, and Yuanwei Liu
1 Introduction
The sixth-generation (6G) wireless communication system is expected to be transformative: it shifts the focus from the rate-centric enhanced mobile broadband (eMBB), ultra-reliable low-latency communication (URLLC), and massive machine-type communication (mMTC) services, which were the design targets of the fifth-generation (5G) communication systems, to the 6G goals of data-driven, instantaneous, ultra-massive, and ubiquitous wireless connectivity, as well as connected intelligence [1, 2]. Massive multiple-input multiple-output (MIMO) technology, which equips base stations (BSs) with an array of active antennas, improves the spectrum efficiency of next-generation systems. A related concept, namely, reconfigurable intelligent surfaces (RISs), comprises an array of reflecting elements for reconfiguring the propagation of the incident signals. Thus, RISs can be regarded as a pathway toward massive MIMO 2.0 [3]. Owing to their capability of proactively modifying the wireless communication environment, RISs have become a focal point in wireless communications research for mitigating a wide range of challenges encountered in diverse wireless networks [4, 5]. The advantages of RISs are listed as follows:
• Easy to deploy: An RIS is a passive two-dimensional surface made of electromagnetic (EM) material. Owing to their low cost, RISs can be deployed on many structures, including but not limited to building facades, indoor walls, aerial platforms, roadside billboards, highway poles, vehicle windows, and pedestrians' clothes.
X. Mu · J. Xu · Y. Liu (O) School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK e-mail: [email protected]; [email protected]; [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_15
• Spectrum efficiency enhancement: An RIS can reconfigure the wireless propagation environment by compensating for the power loss over long distances. Virtual line-of-sight (LoS) links between BSs and mobile users can be formed by passively reflecting the impinging signals. The throughput enhancement becomes more significant when the LoS link between BSs and users is likely to be blocked, e.g., by high-rise buildings. Through the intelligent deployment and design of RISs, a software-defined wireless environment may be constructed, which, in turn, provides potential enhancements in the received signal-to-interference-plus-noise ratio (SINR).
• Environment friendly: In contrast to conventional relaying systems, e.g., amplify-and-forward (AF) and decode-and-forward (DF), an RIS is capable of amplifying and forwarding the incoming signal by controlling the phase shift of each reflecting element, without employing any power amplifier [6, 7]. Thus, deploying an RIS is more energy-efficient and environment-friendly than conventional AF and DF systems.
• Compatibility: RISs support full-duplex and full-band transmission, since they only passively modify the electromagnetic waves. Additionally, RIS-assisted wireless networks are compatible with the standards and hardware of existing wireless networks.
Owing to the aforementioned attractive characteristics, RISs are recognized as an effective solution for mitigating a wide range of challenges in commercial and civilian applications. Figure 1 illustrates the application of RISs in diverse wireless communication networks. As illustrated, an RIS is deployed for bypassing
Fig. 1 RIS enhanced 6G cellular networks
the obstacles between BSs and users. Thus, the service quality in heterogeneous networks and the latency performance in mobile edge computing (MEC) networks are improved [8]. On the other hand, an RIS can act as a signal reflection hub to support massive connectivity via interference mitigation in device-to-device (D2D) communication networks [9], or an RIS can cancel undesired signals through a smart design of the passive beamforming in the context of physical layer security (PLS) networks [10]. Additionally, an RIS can be deployed to strengthen the received signal power of cell-edge users and mitigate interference from neighboring cells [11], while the significant power loss over long distances can also be compensated in simultaneous wireless information and power transfer (SWIPT) networks [12].
This chapter provides an overview of RIS-assisted wireless communication toward 6G, with a particular focus on the new concept of simultaneously transmitting and reflecting RISs (STAR-RISs). The remainder of this chapter is organized as follows: Sect. 2 first introduces the basics of conventional RISs, which work in the reflection mode. Then, Sect. 3 presents the new STAR-RISs, which can achieve 360° coverage, and identifies the key differences between conventional reflecting-only RISs and STAR-RISs from different perspectives. Section 4 further discusses the possible hardware implementations and hardware modelling for STAR-RISs. Moreover, Sect. 5 presents the basic communication signal models and three practical operating protocols of STAR-RISs, as well as numerical case studies that reveal the superiority of STAR-RISs. Section 6 puts forward a range of promising application scenarios for integrating STAR-RISs into next-generation wireless networks. Finally, Sect. 7 concludes this chapter.
2 The Basics of RISs
An RIS is a two-dimensional (2D) material structure whose EM wave response is reconfigurable. In contrast to conventional wireless communications, deploying an RIS allows the channel between the transmitters and the receivers to be controlled and configured. In this section, we introduce the RIS basics, including reflection coefficient models, path loss models, and typical functions in communication design.
2.1 Reflection Coefficient Models
The EM characteristics of an RIS, such as the phase discontinuity, can be reconfigured by tuning the surface impedance through various mechanisms. Apart from electrical voltage, these mechanisms include thermal excitation, optical pumping, and physical stretching. Among them, electrical control is the most convenient choice, since the electrical voltage can be easily quantized and
Fig. 2 Schematic diagram of the varactor RIS
controlled by field-programmable gate array (FPGA) chips. The choice of RIS materials includes semiconductors [13] and graphene [14]. Regardless of the tuning mechanism, we focus our attention on patch-array-based RISs in the following text. The general geometric layout of this type of RIS can be modeled as a periodic (or, in the most general case, quasi-periodic) collection of unit cells integrated on a substrate. For ease of description, we further limit our discussion to RISs that are based on a local design, in which the cells do not interact with each other. A local design usually results in sub-optimal RISs. A comprehensive discussion of local and non-local designs can be found in [4]. To characterize the tunability of the RIS, the method of equivalent lumped-element circuits can be adopted. As shown in Fig. 2, the unit cell is equivalent to a lumped-element circuit with a load impedance $Z_l$. In particular, the equivalent load impedance can be tuned by changing the bias voltage of the varactor diode. When modelling patch-array RISs in wireless communication systems, we can characterize, under a local design, each unit cell through an equivalent reflection coefficient. For example, the reflection coefficient of the $m$-th cell can be modeled as

$$ r_m = \beta_m e^{j\theta_m}, \tag{1} $$
where $\beta_m$ and $\theta_m$ denote the amplitude response and the phase shift response, respectively, which generally satisfy $\beta_m \in [0, 1]$ and $\theta_m \in [0, 2\pi)$. In particular, the following three models of the amplitude response and the phase shift response are widely adopted in current research contributions.
• Continuous amplitude and phase shift: In this case, it is assumed that both the amplitude and the phase shift of each RIS element can be adjusted continuously, which results in the following feasible set:

$$ \mathcal{F}_1 = \left\{ \beta_m, \theta_m \mid \beta_m \in [0, 1],\ \theta_m \in [0, 2\pi) \right\}. \tag{2} $$

• Constant amplitude and continuous phase shift: In this case, it is assumed that the amplitude of each RIS element is fixed, e.g., $\beta_m = 1$, while its phase shift can be adjusted continuously. The corresponding feasible set is given by

$$ \mathcal{F}_2 = \left\{ \beta_m, \theta_m \mid \beta_m = 1,\ \theta_m \in [0, 2\pi) \right\}. \tag{3} $$

• Constant amplitude and discrete phase shift: In this case, it is assumed that the amplitude of each RIS element is fixed, e.g., $\beta_m = 1$, while its phase shift can only take values from a discrete set. The feasible set can be expressed as

$$ \mathcal{F}_3 = \left\{ \beta_m, \theta_m \mid \beta_m = 1,\ \theta_m \in \mathcal{D} \right\}, \tag{4} $$

where $\mathcal{D} = \left\{ 0, \frac{2\pi}{B}, \cdots, \frac{2\pi (B-1)}{B} \right\}$ and $B$ denotes the number of candidate phase shifts.
It is worth noting that, due to hardware constraints in practice, it is quite challenging to realize the continuous amplitude and phase shift control assumed in the first two models. However, the first two models can be used to characterize the theoretical performance upper bounds of RISs. Besides the above three models, there are also other signal models, namely, discrete amplitude and phase shift, where both $\beta_m$ and $\theta_m$ take discrete values, and coupled amplitude and phase shift, where $\beta_m$ and $\theta_m$ in (1) are usually not independent of each other, i.e., $\beta_m = f(\theta_m)$.
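The relation between the continuous-phase and discrete-phase models can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `reflection_coefficients` and `quantize_phase`, the element count, and the choice B = 4 are illustrative assumptions and do not come from the chapter. It builds unit-amplitude coefficients as in (1) and projects continuous phases onto the discrete set of B candidate values.

```python
import numpy as np

def reflection_coefficients(beta, theta):
    """Per-element coefficients r_m = beta_m * exp(j * theta_m), as in (1)."""
    beta = np.asarray(beta, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return beta * np.exp(1j * theta)

def quantize_phase(theta, B):
    """Map each phase to the nearest of the B values {0, 2*pi/B, ..., 2*pi*(B-1)/B}."""
    step = 2 * np.pi / B
    # Round to the nearest level; the modulo wraps the level B back to 0.
    idx = np.round(np.mod(theta, 2 * np.pi) / step).astype(int) % B
    return idx * step

rng = np.random.default_rng(0)
M = 8                                          # number of elements (illustrative)
theta_cont = rng.uniform(0, 2 * np.pi, M)      # continuous phases, unit amplitude
theta_disc = quantize_phase(theta_cont, B=4)   # projected onto the discrete set

r_cont = reflection_coefficients(np.ones(M), theta_cont)
r_disc = reflection_coefficients(np.ones(M), theta_disc)

# Wrapped phase error introduced by quantization; it is at most pi / B.
phase_err = np.abs(np.angle(np.exp(1j * (theta_cont - theta_disc))))
print(np.max(phase_err))
```

Nearest-level rounding is only one possible projection; it is used here because it bounds the per-element phase error by half the quantization step.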
2.2 Path Loss Models
Existing research contributions [4, 15] on the path loss models of RISs showed that the power scattered by an RIS is usually formulated in terms of an integral that accounts, by leveraging the Huygens principle, for the impact of the entire surface in the free-space scenario, where scattering, shadowing, and reflection are ignored. Closed-form expressions of the integral are, on the other hand, difficult to obtain, except for some asymptotic regimes, which correspond to viewing the RIS as electrically small or electrically large (with respect to the wavelength and the transmission distances). It is worth mentioning, in addition, that the path loss model depends on the particular phase gradient applied by the RIS. Notably, the scaling laws can differ depending on whether the RIS operates as an anomalous reflector or as a focusing lens. In the following, we briefly discuss two scaling laws that have recently been reported for RISs that operate as anomalous reflectors.
• Electrically Small RISs: In this asymptotic regime, the RIS is assumed to be relatively small in size compared with the transmission distances, so that it can be approximated as a small-size scatterer. In general, the path loss
scales with the reciprocal of the product of the distance between the transmitter and the center of the RIS and the distance between the center of the RIS and the receiver. In addition, the received power usually increases with the size of the RIS. The received power is usually maximized in the direction of anomalous reflection, where the path loss through the RIS follows the "product of distances" model, which can be formulated as

$$ L(d_{SR}, d_{RU}) \approx \lambda_S \left( d_{SR} d_{RU} \right)^{-1}, \tag{5} $$

where $\lambda_S$ denotes the coefficient of the electrically small scenario and $d_{SR}$ and $d_{RU}$ represent the distances of the source-RIS and RIS-user links, respectively. A detailed discussion is given in [15, Sections IV-A, IV-B, IV-C].
• Electrically Large RISs: In this asymptotic regime, the RIS is assumed to be large (ideally infinitely large) in size compared with the transmission distances and the wavelength, so that it can be approximated as a large flat mirror. Let us denote by $x_0$ the point of the RIS (if it exists) at which the first-order derivative of the total phase response of the combined incident signal, reflected signal, and surface reflection coefficient of the RIS is equal to zero. In general, the path loss asymptotically scales with the reciprocal of a weighted sum of the distance between the transmitter and $x_0$ and the distance between $x_0$ and the receiver. In addition, the received power does not depend on the size of the RIS, since the RIS is viewed as asymptotically infinite. This confirms that the power scaling law of the RIS is physically consistent: the received power does not grow to infinity as the size of the RIS goes to infinity, because the scaling law and the behavior of the RIS differ from those in the electrically small regime. In this case, the path loss through the RIS follows the "sum of distances" model, which can be approximated as

$$ L(d_{SR}, d_{RU}) \approx \lambda_L \left( d_{SR} + d_{RU} \right)^{-1}, \tag{6} $$

where $\lambda_L$ denotes the coefficient of the electrically large scenario. A detailed discussion is given in [15, Sections IV-A, IV-B, IV-C].
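To make the difference between the two scaling laws concrete, the sketch below evaluates (5) and (6) for a few link distances. It is only an illustration: the function names are hypothetical, and the coefficients $\lambda_S$ and $\lambda_L$ are set to 1 as placeholders, since their actual values depend on the RIS and are not specified here.

```python
def gain_small_ris(d_sr, d_ru, lam_s=1.0):
    """'Product of distances' scaling of (5): L ~ lam_s / (d_sr * d_ru)."""
    return lam_s / (d_sr * d_ru)

def gain_large_ris(d_sr, d_ru, lam_l=1.0):
    """'Sum of distances' scaling of (6): L ~ lam_l / (d_sr + d_ru)."""
    return lam_l / (d_sr + d_ru)

# Placeholder coefficients (lam_s = lam_l = 1) and equal link distances in meters.
for d in (10.0, 20.0, 40.0):
    print(d, gain_small_ris(d, d), gain_large_ris(d, d))
```

Under these assumptions, doubling both link distances reduces the product-of-distances expression by a factor of four but the sum-of-distances expression only by a factor of two.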
2.3 Typical Functions in Communication Designs
Let us consider a single-antenna BS that communicates with a single-antenna user with the aid of an RIS with $M$ elements. Here, let $h \in \mathbb{C}^{1 \times 1}$, $\mathbf{g} \in \mathbb{C}^{M \times 1}$, and $\mathbf{r} \in \mathbb{C}^{M \times 1}$ denote the channels of the BS-user, BS-RIS, and RIS-user links, respectively. $\mathbf{\Theta} = \mathrm{diag}\left( \beta_1 e^{j\theta_1}, \beta_2 e^{j\theta_2}, \ldots, \beta_M e^{j\theta_M} \right)$ denotes the reflection-coefficient matrix of the RIS, where $\{\beta_1, \beta_2, \ldots, \beta_M\}$ and $\{\theta_1, \theta_2, \ldots, \theta_M\}$ represent the amplitude responses and phase shift responses of all the RIS elements, respectively. By employing the smart radio environment provided by the RIS, the following two typical functions can be achieved in communication designs.
• RIS-Assisted Signal Enhancement Designs: By coherently combining the signals of the direct link and the reflection link, the signal received by the user can be significantly enhanced, which leads to the following optimization problem:

$$
\begin{aligned}
\max \quad & \left| \mathbf{r}^H \mathbf{\Theta} \mathbf{g} + h \right|^2 \\
\text{s.t.} \quad & \beta_1, \cdots, \beta_M \in [0, 1], \\
& \theta_1, \cdots, \theta_M \in [0, 2\pi).
\end{aligned} \tag{7}
$$

In order to further enhance the spectrum efficiency of RIS-assisted networks, multiple-antenna techniques can be employed at both the BS and the users.
• RIS-Assisted Signal Cancellation Designs: Another application of deploying the RIS is signal cancellation, where the reflected signals and the direct signals are destructively combined. The corresponding optimization problem can be formulated as follows:

$$
\begin{aligned}
\min \quad & \left| \mathbf{r}^H \mathbf{\Theta} \mathbf{g} + h \right|^2 \\
\text{s.t.} \quad & \beta_1, \cdots, \beta_M \in [0, 1], \\
& \theta_1, \cdots, \theta_M \in [0, 2\pi).
\end{aligned} \tag{8}
$$

By doing so, some promising applications can be realized, e.g., RIS-assisted interference cancellation and physical layer security.
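For this single-antenna setting, both problems admit simple phase-alignment solutions, which the following sketch illustrates. It assumes NumPy, randomly drawn channels (an illustrative choice, not a channel model from the chapter), and hypothetical helper names. For enhancement, every reflected term is co-phased with the direct link at full amplitude; for cancellation, the reflected terms are placed in anti-phase and scaled by a common amplitude so that their sum matches the magnitude of the direct link whenever the RIS is strong enough.

```python
import numpy as np

def cascaded_gains(r, g):
    """c_m = conj(r_m) * g_m, so that r^H Theta g = sum_m c_m * beta_m * exp(j * theta_m)."""
    return np.conj(r) * g

def enhance(h, r, g):
    """Co-phase every reflected term with the direct link h and use full amplitude."""
    c = cascaded_gains(r, g)
    theta = np.mod(np.angle(h) - np.angle(c), 2 * np.pi)
    return np.ones(len(c)), theta

def cancel(h, r, g):
    """Put every reflected term in anti-phase with h and scale a common amplitude
    so that the reflected sum equals |h| when that is achievable with beta <= 1."""
    c = cascaded_gains(r, g)
    theta = np.mod(np.angle(h) + np.pi - np.angle(c), 2 * np.pi)
    scale = min(1.0, abs(h) / np.sum(np.abs(c)))
    return np.full(len(c), scale), theta

def received_power(h, r, g, beta, theta):
    """Objective of (7) and (8): |r^H Theta g + h|^2."""
    Theta = np.diag(beta * np.exp(1j * theta))
    return abs(np.conj(r) @ Theta @ g + h) ** 2

rng = np.random.default_rng(1)
M = 16  # number of RIS elements (illustrative)
h = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
g = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
r = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

p_max = received_power(h, r, g, *enhance(h, r, g))
p_min = received_power(h, r, g, *cancel(h, r, g))
print(p_max, p_min)
```

With co-phasing, the maximum equals $(|h| + \sum_m |r_m^* g_m|)^2$. The common amplitude scaling in `cancel` is one of many feasible choices; any amplitudes whose weighted sum equals $|h|$ would cancel the direct link equally well.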
3 From Conventional Reflecting-Only RISs to STAR-RISs
Although some recent studies have considered both transmissive and reflective metasurfaces for wireless communications (as shown in Fig. 3a and b), the existing contributions mainly focus on RISs whose only function is to reflect an incident signal; hence, both the source and the destination have to be on the same side of the RIS [16, 17], i.e., within the same half-space of the smart radio environment (SRE). This topological constraint limits the flexibility of employing conventional RISs. To address this issue, this chapter introduces the new concept of simultaneously transmitting and reflecting RISs (STAR-RISs) [18], where the incident wireless signals can be reflected within the half-space of the SRE on the same side of the RIS, but can also be transmitted into the other side of the RIS, as shown in Fig. 3c. As a result, a full-space SRE can be created by STAR-RISs. The employment of STAR-RISs has the following advantages in wireless communication systems: (1) Thanks to their capability of simultaneously transmitting and reflecting the incident signals, the coverage of STAR-RISs is extended to the entire space, thus serving both half-spaces using a single RIS, which is not possible
Fig. 3 Illustration of three types of signal propagation: (a) full reflection (conventional reflecting-only RIS), (b) full transmission, (c) simultaneous transmission and reflection. In each panel, the surface separates the reflection space from the transmission space.
for conventional reflecting-only RISs. (2) STAR-RISs provide enhanced degrees-of-freedom (DoFs) for signal propagation manipulation, which significantly increases the design flexibility in satisfying stringent communication requirements. (3) Since STAR-RISs can be designed to be optically transparent [19], they are aesthetically pleasing and readily compatible with existing building structures, such as windows. Therefore, STAR-RISs will have no undesired aesthetic effect, which is of vital importance for practical implementations. In the following, we discuss the key differences between conventional reflecting-only RISs and the proposed STAR-RISs from the hardware design, physics principle, and communication system design perspectives, respectively. We highlight that STAR-RISs rely on substrates that are transparent at radio frequency and on elements that support magnetic currents. These structural properties allow STAR-RISs to achieve simultaneous and independent control of their transmission and reflection coefficients.
3.1 Hardware Design Differences
Reflecting-only RISs and STAR-RISs differ in terms of both their elements and their substrates. The following analogy illustrates the structural differences between them. For reflecting-only RISs, the reconfigurable elements on the substrate are like biscuits placed on a metal plate, as illustrated on the left side of Fig. 4, while, for STAR-RISs, the reconfigurable elements are like ice cubes in a glass of water, as illustrated on the right side of Fig. 4. To elaborate, the substrates of reflecting-only RISs are opaque for wireless signals at their operating frequency. The opaque substrate serves as a bed on which the tunable elements are integrated. It also prevents the wireless signals from penetrating the RIS, so that no energy is leaked into the space behind the RIS. By contrast, the substrates of STAR-RISs have to be transparent for wireless signals at their operating frequency. Naturally, for facilitating simultaneous transmission and reflection, STAR-RISs rely on a more complex design, since their elements have to
Fig. 4 Conceptual comparison between reflecting-only RISs and STAR-RISs
support both electric and magnetic currents. A promising practical hardware design is to employ equivalence-principle-inspired tunable metasurfaces [20], where each element is composed of a parallel resonant LC tank and small metallic loops to provide the required electric and magnetic surface reactance. Moreover, the electric and magnetic reactance of each element can be adjusted by applying different bias voltages to the integrated varactors, thus achieving independent control of transmission and reflection.
3.2 Physics Principle Differences
Again, compared to conventional reflecting-only RISs, STAR-RISs must have elements that support both electric polarization currents $\mathbf{J}_p$ and magnetization currents $\mathbf{J}_b$ [21, 22]. The physical principles behind STAR-RISs can be summarized in three steps, namely, induction, production, and radiation, as illustrated in Fig. 4.
• Firstly, the elements are polarized by the incident field. The patch elements of pure reflecting RISs only respond to the electric component of the incident field,
and a polarization density $\mathbf{P}$ is induced. By contrast, the elements of STAR-RISs respond to both the electric and magnetic components of the incident field. Hence, both a polarization density $\mathbf{P}$ and a magnetization density $\mathbf{M}$ are induced. The strengths of the polarization and magnetization densities depend on the electric susceptibility $\chi_e$ and the magnetic susceptibility $\chi_m$, respectively. The tunable parameters of the elements can be used to adjust the values of these susceptibilities to within a certain quantization error.
• Secondly, the oscillating polarization and magnetization densities produce time-varying electric polarization and magnetization currents on the surface, respectively.
• Lastly, these time-varying currents radiate both the transmitted and reflected fields back into free space, producing phase differences between the incident field and the transmitted or reflected fields.
As illustrated in Fig. 4, reflecting-only RISs having non-magnetic elements can only support surface electric polarization currents. If the elements consist of only single-layered metallic scatterers (not considering the substrate), the radiated fields on both sides of the RIS are identical [23]. Thus, this symmetry limitation of non-magnetic RISs prevents the independent control of the transmitted and reflected signals. On the contrary, by also supporting magnetic currents, STAR-RISs break this symmetry limitation and can achieve simultaneous control of both the transmitted and reflected signals [22]. As illustrated in Fig. 4, assuming the electric and magnetic susceptibilities of each element are constant, the magnetization density $\mathbf{M}$ introduces extra DoFs by enabling the independent adjustment of the phase shift for transmission.
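The symmetry argument above can be written out with a simple scalar bookkeeping, sketched below under stated assumptions. The incident field is normalized to 1; `s_e` denotes the complex amplitude of the field radiated by the electric currents, taken as identical on both sides, and `s_m` the amplitude radiated by the magnetic currents, taken with opposite signs on the two sides. This sign convention and the function names are assumptions made for illustration; they are not the chapter's hardware model.

```python
import numpy as np

def reflect_transmit(s_e, s_m=0.0):
    """Reflection and transmission coefficients of one element.

    s_e: field radiated by electric currents (same on both sides).
    s_m: field radiated by magnetic currents (opposite sign on the two sides;
         the sign convention is an assumption).
    The transmitted field is the incident field (normalized to 1) plus
    whatever the currents radiate into the transmission side.
    """
    R = s_e + s_m
    T = 1.0 + s_e - s_m
    return R, T

def required_currents(R, T):
    """Invert the relations above for a desired pair (R, T)."""
    s_e = (R + T - 1.0) / 2.0
    s_m = (R - T + 1.0) / 2.0
    return s_e, s_m

# Electric currents only: T - R equals 1 for any s_e, so R fixes T.
R_e, T_e = reflect_transmit(-0.3 + 0.2j)
print(T_e - R_e)

# A target with independently chosen phases needs a non-zero magnetic term.
R_target = np.sqrt(0.5) * np.exp(1j * np.pi / 3)
T_target = np.sqrt(0.5) * np.exp(-1j * np.pi / 4)
s_e, s_m = required_currents(R_target, T_target)
print(s_e, s_m)
```

In this bookkeeping, a non-magnetic surface is tied to the single constraint $T = 1 + R$, whereas adding the magnetic contribution gives two independent unknowns for the two coefficients.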
3.3 Communication System Design Differences
From the perspective of the communication system design, the benefits of supporting surface magnetic currents in STAR-RISs can be exploited as follows:
• Adjustable energy ratio: The amplitudes of the transmitted and reflected waves of each STAR-RIS element can be dynamically adjusted. Since the overall energies of the transmitted and reflected signals are determined by the magnitudes of the respective complex-valued sums of the contributions of all elements, the energy ratio between the transmitted and reflected signals can be conveniently controlled.
• Transmission and reflection beamforming: Again, the introduction of surface magnetic currents enables phase shift control for both the transmitted and reflected signals. As a result, STAR-RISs allow separate beamforming of the transmitted and reflected signals within the two half-spaces, thus improving the flexibility of communication system design.
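Both benefits can be illustrated with a small numerical sketch. It assumes, purely for illustration, that every element splits the incident energy with the same ratio, using a transmission amplitude $\sqrt{\beta_t}$ and a reflection amplitude $\sqrt{\beta_r}$ with $\beta_t + \beta_r = 1$ (a lossless element), and that the two sets of phase shifts can be chosen independently. The channel vectors, the function name, and the random channel draw are hypothetical.

```python
import numpy as np

def star_ris_powers(g, v_t, v_r, beta_t):
    """Received powers at a user in the transmission space and a user in the
    reflection space, with independent co-phasing on each side.

    g:    BS-to-surface channel (length M).
    v_t:  surface-to-user channel in the transmission space.
    v_r:  surface-to-user channel in the reflection space.
    beta_t: fraction of energy transmitted; 1 - beta_t is reflected
            (lossless-splitting assumption).
    """
    beta_r = 1.0 - beta_t
    c_t = np.conj(v_t) * g            # cascaded gains toward the transmission user
    c_r = np.conj(v_r) * g            # cascaded gains toward the reflection user
    phi_t = -np.angle(c_t)            # transmission phase shifts (co-phasing)
    phi_r = -np.angle(c_r)            # reflection phase shifts (co-phasing)
    y_t = np.sum(np.sqrt(beta_t) * np.exp(1j * phi_t) * c_t)
    y_r = np.sum(np.sqrt(beta_r) * np.exp(1j * phi_r) * c_r)
    return abs(y_t) ** 2, abs(y_r) ** 2

rng = np.random.default_rng(2)
M = 32  # number of elements (illustrative)
draw = lambda: (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
g, v_t, v_r = draw(), draw(), draw()

for beta_t in (0.0, 0.25, 0.5, 1.0):
    print(beta_t, star_ris_powers(g, v_t, v_r, beta_t))
```

Sweeping `beta_t` moves energy between the two half-spaces, while the two independent phase vectors keep each beam aligned with its own user.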
4 STAR-RISs: Hardware Implementations and Models In this section, we summarize four promising techniques that may be used for implementing STAR-RISs and three hardware models which characterize the tuning capability of STAR-RISs at different levels of accuracy.
4.1 Hardware Implementations for STAR-RISs
How the STAR concept can be implemented based on practical hardware designs is one of the most pivotal questions in the research of STAR-RISs. To address this issue, in this section, we survey and categorize the possible implementation options of STAR-RISs to achieve independent control of the reflected and transmitted signals. There are various tunable surface designs that are potential candidates for realizing STAR-RISs. In [24], the authors pointed out an intuitive difference between natural and artificial materials (RISs in general), namely, that natural materials exhibit uniform EM properties along their tangential directions, while artificial materials exhibit either a periodic or quasi-periodic nature. In terms of the periodic structure, we can loosely classify the hardware implementations of STAR-RISs into two categories, namely, patch-array-based implementations and metasurface-based implementations. As illustrated in Fig. 5, the patch-array-based implementations consist of periodic cells having sizes on the order of a few centimeters. Because of their relatively large sizes, each cell (patch) can be made tunable by incorporating either PIN diodes or delay lines. By contrast, the metasurface-based implementations have periodic cells on the order of a few millimeters, possibly micrometers or even molecular sizes. Hence, they require more sophisticated control of their EM properties, such as conductivity and permittivity. Below, we provide a brief overview of two patch-array-based implementations and two metasurface-based implementations. All of these hardware implementations either have successful prototypes or rely on strong theoretical evidence in support of their feasibility.

Fig. 5 Framework for analyzing STAR-RISs. (a) PIN diode empowered STAR-RISs [25], (b) antenna empowered STAR-RISs [26], (c) smart glass empowered STAR-RIS [19], (d) graphene empowered STAR-RISs. Panel labels: patch-array-based STAR-RISs (a, b); metasurface-based implementations (c, d); hardware models: the phase-shift model or the load impedance model, and the GSTC model.
4.1.1 Patch-Array-Based STAR-RISs
1. PIN Diode Empowered Implementations: For patch-array-based implementations, both the phase shift and the amplitude response can be tuned by applying different bias voltages to the positive-intrinsic-negative (PIN) diodes. In [25], the authors presented a STAR-RIS prototype relying on the PIN diode empowered implementation. This implementation is the most popular design for both RISs and STAR-RISs, since PIN diodes are low-cost and voltage-controlled. The drawback of this implementation is that, since each PIN diode only has two states, namely, "on" and "off," it can only support a finite-cardinality set of reflection and transmission coefficients. Moreover, for a given state of all the PIN diodes, the reflection and transmission coefficients are coupled. As a result, the PIN diode empowered implementation struggles to approximate independent control of both reflection and transmission unless a sufficiently high number of PIN diodes is used for each patch element.
2. Antenna Empowered Implementations: The concept of phased-array antennas may be readily extended to STAR-RISs with some minor modifications. As illustrated in Fig. 5b, according to [26], each antenna empowered STAR-RIS cell consists of two antenna elements, which are connected by a tunable delay line (waveguide). The antenna elements facing the incident wave operate similarly to reflecting-only RIS elements, but a certain fraction of the incident energy is transferred along the delay lines and re-radiated into the transmission space. The phase of the transmission coefficient of each cell is determined by the length of the delay line. Thus, the phase shifts of both the reflected and the transmitted signals can be independently adjusted. However, the drawback of this implementation is that the delay line may impose a considerable energy loss. If the desired phase shift of the transmitted signal is large, a longer delay line is needed, and the amplitude of the transmission coefficient is reduced. Thus, the amplitude and phase of the transmission coefficient are correlated in this implementation.
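The two limitations described above can be made concrete with a toy calculation. The sketch below is a deliberately simple illustration, not a model of the prototypes in [25] or [26]: it assumes a delay line with a hypothetical guided wavelength and a hypothetical attenuation constant in nepers per meter, and it counts the bias states available to an element with a given number of two-state PIN diodes.

```python
import numpy as np

def delay_line_transmission(length_m, guided_wavelength_m, atten_np_per_m):
    """Amplitude and phase delay contributed by a lossy delay line.

    The phase delay grows linearly with the line length, while the amplitude
    decays exponentially with it (both parameters are illustrative).
    """
    phase = np.mod(2 * np.pi * length_m / guided_wavelength_m, 2 * np.pi)
    amplitude = np.exp(-atten_np_per_m * length_m)
    return amplitude, phase

def pin_diode_states(num_diodes):
    """Number of distinct on/off configurations of an element with num_diodes diodes."""
    return 2 ** num_diodes

# Longer lines give larger phase delays but smaller transmission amplitudes.
for length in (0.0125, 0.025, 0.0375):
    print(length, delay_line_transmission(length, 0.05, 5.0))

# A few diodes per element give only a handful of coefficient pairs.
print([pin_diode_states(k) for k in (1, 2, 3)])
```

In this toy model, the amplitude of the transmission coefficient is a function of the line length alone, which is why it cannot be chosen independently of the phase.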
4.1.2 Metasurface-Based STAR-RISs
1. Smart Glass Empowered Implementations: A popular prototype of the metasurface-based STAR-RISs is the transparent dynamic metasurface designed by researchers at NTT DOCOMO, Japan (Fig. 5c). According to [19], the metasurface supports the manipulation of 28 GHz (5G) radio signals. It allows dynamic control of both the reflection and the transmission of the signal while maintaining the transparency of the window. By adjusting the width of the dielectric material and its distance between substrates, the smart glass can be switched into different modes, such as full penetration (transmission), full reflection, and partial reflection. DOCOMO revealed that they are working on more sophisticated tuning techniques and will use them in future trials. The main advantage of DOCOMO's smart glass is that, owing to its transparency in the visible light frequency range, it can be aesthetically integrated into buildings. However, its drawback is that it cannot dynamically reconfigure itself in the way that the PIN diode empowered implementations can. Moreover, adjusting the distance between substrates may affect the reflection coefficient of the entire surface instead of only reconfiguring a particular element.
2. Graphene Empowered Implementations: It has been widely reported that a single graphene layer has extraordinary properties, including a beneficial EM wave response that may also be used for building STAR-RISs. More importantly, to realize futuristic visions such as smart surface-assisted visible light wireless transmission and wearable skin-like smart surfaces, we might have to rely on graphene empowered technologies. Indeed, there are already experimental graphene-based RF devices [27]. Regarding reconfigurability, the reflection and transmission coefficients of a single layer of graphene can be tuned by adjusting its conductivity. Moreover, a periodic stack of graphene layers is capable of acting as a tunable spectrally selective mirror. In summary, for graphene empowered STAR-RISs, even the separation of a combined signal might become feasible based on the different carrier frequencies or polarizations. In a nutshell, graphene empowered implementations might open extraordinary possibilities for the design of smart radio environments once their fabrication process becomes more economical.
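How conductivity tuning changes the reflection and transmission of a single layer can be sketched with the textbook result for an infinitesimally thin conductive sheet in free space at normal incidence, $R = -\eta_0 \sigma_s / (2 + \eta_0 \sigma_s)$ and $T = 2 / (2 + \eta_0 \sigma_s)$, where $\sigma_s$ is the surface conductivity and $\eta_0$ the free-space wave impedance. The idealized sheet model and the conductivity values below are illustrative assumptions; they are not measured graphene data and not the model used in [27].

```python
import numpy as np

ETA0 = 376.730313668  # free-space wave impedance in ohms

def sheet_coefficients(sigma_s):
    """Normal-incidence reflection and transmission coefficients of a thin,
    free-standing conductive sheet with surface conductivity sigma_s (siemens)."""
    x = ETA0 * np.asarray(sigma_s, dtype=complex)
    R = -x / (2.0 + x)
    T = 2.0 / (2.0 + x)
    return R, T

# Illustrative surface conductivities (siemens), not measured graphene values.
for sigma in (0.0, 1e-3, 5e-3, 2e-2):
    R, T = sheet_coefficients(sigma)
    absorbed = 1.0 - abs(R) ** 2 - abs(T) ** 2
    print(sigma, abs(R) ** 2, abs(T) ** 2, absorbed)
```

In this idealized model, a single sheet that carries only electric currents always satisfies $T = 1 + R$, so one tunable conductivity moves the reflection and transmission coefficients together rather than independently.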
4.1.3 STAR-RIS Implementations and Operating Frequencies
Naturally, almost all designs can only operate as desired within a certain frequency range. This is because, in order for the STAR-RIS to apply the desired phase shifts and wave-front transformations to both the transmitted and reflected signals, the periodicity of the STAR-RIS has to match the wavelength of the wireless signal. For the patch-array-based implementations, the periodicity is usually chosen to be between $0.5\lambda$ and $0.7\lambda$, where $\lambda$ is the wavelength of the wireless signal [28]. According to this relationship, the patch-array-based STAR-RISs are suitable for assisting wireless communication up to a carrier frequency of 1 GHz. For wireless signals
412
X. Mu et al.
Table 1 Comparing different implementations and operating frequencies of STAR-RISs Implementation Operating methods frequency Patch-array based Low to high frequency (10 KHz up to 1 GHz) Metasurface based Super high frequency to visible light frequency
STAR-RIS prototypes PIN diode empowered Antenna empowered Smart glass empowered Graphene empowered
Tuning mechanism Bias voltages on PIN diodes Lengths of delay lines Distance between substrates Conductivity of graphene
Independent reflection/transmission control Difficult to achieve Can be achieved Theoretically achievable Can be achieved
having a higher frequency and for visible light communication, metasurface-based STAR-RISs are required. In Table 1, we summarize the operating frequency and tuning mechanism of each STAR-RIS implementation discussed. It is worth noting that the current development of STAR-RIS prototypes is still at an early stage. For future development of the hardware design and manufacture of STAR-RISs, we need to find solutions to make STAR-RISs more scalable while maintaining the tunability of each individual element. This can be achieved by leveraging the recent achievements in metasurface technology and nano-engineering [24].
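The periodicity rule quoted in this subsection (an element period between $0.5\lambda$ and $0.7\lambda$) can be turned into a quick sizing calculation. The following is a minimal sketch only; the function name `element_period_range` and the example carrier frequencies are illustrative choices, not part of the chapter.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def element_period_range(carrier_hz, lo=0.5, hi=0.7):
    """Return the (min, max) element periodicity in metres.

    lo and hi are the fractions of the wavelength quoted in the text
    (0.5 and 0.7); the helper itself is a hypothetical illustration.
    """
    wavelength = C / carrier_hz
    return lo * wavelength, hi * wavelength

if __name__ == "__main__":
    # Example carriers: 1 GHz (upper end quoted for patch arrays) and 28 GHz.
    for f in (1e9, 28e9):
        p_min, p_max = element_period_range(f)
        print(f"{f / 1e9:.0f} GHz: {p_min * 1e3:.2f} mm to {p_max * 1e3:.2f} mm")
```

The element period shrinks in proportion to the wavelength, which is one way to see why a different fabrication approach is needed as the carrier frequency increases.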
4.2 Hardware Models for STAR-RISs
As discussed above, the STAR-RIS implementations differ considerably in their tuning mechanisms. However, we require a unified technique for modelling the effect of these surfaces on the wireless signal. Explicitly, we have to find an accurate hardware model for characterizing the EM wave response of the STAR-RISs. From a physical perspective, modelling a smart surface is equivalent to studying the boundary conditions of the EM field at the surface. However, the interaction of an arbitrary field with the STAR-RIS is an intrinsically complex problem. All existing hardware models rely on certain levels of approximation based on their own assumptions. Our next objective is to present and compare these assumptions, as well as to reveal the physical abstractions behind each model.
4.2.1 The Phase Shift Model
Fig. 6 Conceptual comparison between different models for the patch-array-based STAR-RIS implementations, where $Y_m$ and $Z_m$ are the surface electric and magnetic impedances, $\chi_{ab}$ are the surface susceptibility dyadics, and $(E, H)$ are the electric and magnetic components of the corresponding EM fields. (a) The phase shift model. (b) The load impedance model. (c) The GSTC model

The phase shift model characterizes a smart surface using a collection of phase shift values or applying a specific phase shift as a function of the cell's position on the surface. This function is also often referred to as the phase profile, phase discontinuity, or phase shift matrix, depending on the context. As illustrated in Fig. 6a, the physical abstractions laying the foundation of the phase shift model are as follows: The STAR-RIS can be regarded as a periodic array of either metallic or dielectric particles. Regardless of the specific geometric and electromagnetic properties of these particles, the reflected or transmitted field radiated from the STAR-RIS can be characterized by the superposition of waves radiated from different particles, each having a phase delay induced by the corresponding particle. As a result, the only hardware features of this model are the positions of the particles, i.e., the STAR-RIS elements, and their corresponding phase shifts associated with the reflection and transmission, i.e., $\Delta\theta^R$ and $\Delta\theta^T$ in Fig. 6a. The phase shift model is widely adopted and convenient to use. However, it is an over-simplified representation of the actual physical process. As a result, it cannot accurately characterize either the energy flow at the surface or the non-local power transfer effects [24].
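The superposition picture behind the phase shift model can be illustrated with a few lines of Python. This is a minimal sketch under strong assumptions that are not in the chapter: a uniform linear array of identical, unit-amplitude elements, far-field observation, and a hypothetical helper `steering_phases` that picks the phase profile. The same routine could be applied separately to the reflection-side and transmission-side phase shifts.

```python
import cmath
import math

def far_field(phases, spacing_wl, angle_rad):
    """Superpose unit-amplitude waves radiated by each element.

    phases      per-element phase shifts in radians (the phase profile)
    spacing_wl  element spacing in wavelengths (assumed uniform)
    angle_rad   observation angle measured from the surface normal
    """
    k_d = 2 * math.pi * spacing_wl  # phase advance between neighbours
    return sum(
        cmath.exp(1j * (k_d * n * math.sin(angle_rad) + theta))
        for n, theta in enumerate(phases)
    )

def steering_phases(n_elem, spacing_wl, target_rad):
    """Hypothetical phase profile that aligns all waves toward target_rad."""
    k_d = 2 * math.pi * spacing_wl
    return [(-k_d * n * math.sin(target_rad)) % (2 * math.pi)
            for n in range(n_elem)]

if __name__ == "__main__":
    target = math.radians(30)
    profile = steering_phases(16, 0.5, target)
    print(abs(far_field(profile, 0.5, target)))  # coherent sum toward the target
    print(abs(far_field(profile, 0.5, 0.0)))     # much weaker toward broadside
```

Only element positions and phase shifts enter the calculation, which mirrors the statement that these are the only hardware features of the phase shift model.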
4.2.2 The Load Impedance Model
In the load impedance model, each element is modeled as a lumped circuit having surface-averaged electric and magnetic impedances of $Y_m$ and $Z_m$. As illustrated in Fig. 6b, the physical processes of wave reflection and transmission may be portrayed as follows: each element of the STAR-RIS is excited by the incident wave. After being excited, both electric and magnetic currents are induced, whose intensity depends on the effective voltage of the incident wave and the equivalent load impedance of the circuit element. Finally, the currents induced generate an EM field, which is radiated toward both sides of the STAR-RIS. As a result, the hardware features of the model are the position and load impedances of each passive element. The problem of determining the field radiated by the currents flowing through each element is left to the channel model. The load impedance model can be reduced to the phase shift model by incorporating some further idealized simplifying assumptions because both the reflection and transmission coefficients of the surface can be formulated as a function of the surface impedances. Specifically, the reflection and transmission coefficients of the mth element are defined as the ratio between electric fields, which may be represented by complex numbers [29]. The argument of the reflection and transmission coefficients for each element corresponds to the phase delay values in the phase shift hardware model. In light of this, the phase shift model can be regarded as a simplified version of the load impedance model.
4.2.3 The Generalized Sheet Transition Conditions Model
The generalized sheet transition conditions (GSTC) model [24] is the most general one of the three hardware models discussed because it is based on a continuous distribution of the electric and magnetic polarization densities of the surface, instead of relying on a finite number of impedance values. The GSTC model uses the electric and magnetic susceptibilities as a function of position on the surface for characterizing the smart surface. According to Maxwell's equations, the conventional boundary conditions at the surface describe the discontinuity of the EM field in terms of the surface electric and magnetic polarization densities. In the GSTC model, these surface electric and magnetic polarization densities are induced by the incident field and depend on the polarizability densities of the material. As illustrated in Fig. 6c, the reflected and transmitted fields can be formulated using only material-dependent surface susceptibility dyadics, $\chi_{ab}(x, y)$, as a function of the position on the STAR-RIS [24]. Moreover, the surface susceptibility dyadics are determined by the electric and magnetic polarizabilities of the scatterers, which are microscopic properties of the material. Thus, the GSTC model is capable of describing the metasurface-based STAR-RIS implementations relying on small periodic structures. At the same time, the GSTC model can also be used for modelling patch-array-based implementations by taking the surface average of the electric and magnetic polarizabilities within each element. In Table 2, we summarize the characteristics of the three STAR-RIS hardware models discussed. The phase shift and load impedance models best represent the patch-array-based implementations, while the GSTC model accurately mimics the metasurface-based implementations. At the time of writing, the existing papers on modelling and analyzing STAR-RISs have adopted only the phase shift model [5–8], which is a far-field, ray-optics approximation of the actual physical process.
Future directions for STAR-RIS channel hardware modelling include proposing more physics-compliant models exploiting the abovementioned load impedance and GSTC models. Moreover, based on different STAR-RIS hardware implementations, the correlation between the phase shifts of the reflected and transmitted signals should also be considered during hardware modelling.
Table 2 Comparing different hardware models for STAR-RISs
Hardware model | Properties used for modelling | Applies to | Advantages | Disadvantages
Phase shift model | Phase shift (delay) values | Patch-array-based STAR-RISs | Compact and easy to use | Oversimplified
Load impedance model | Surface averaged impedances | Patch-array-based STAR-RISs | Compact and accurate | Not general
GSTC model | Surface susceptibility dyadics | Metasurface-based STAR-RISs | General and accurate | Complicated
5 STAR-RISs: Signal Model and Operating Protocols
In this section, we introduce the basic signal model for STAR-RISs and then propose three practical operating protocols for integrating STAR-RISs into wireless communication systems while identifying their respective advantages and disadvantages.
5.1 Basic Signal Model
Figure 7 presents a schematic representation of the structure of STAR-RISs. Based on the field equivalence principle [5], as the STAR-RIS elements are excited by the incident signal, the transmitted and reflected signals can be equivalently treated as waves radiated from the time-varying surface equivalent electric currents $J_p$ and magnetic currents $J_b$ (also referred to as the bound currents). Within each element, the strengths and distribution of these surface equivalent currents are determined by the incident narrowband signal $s_m$ as well as the local surface averaged electric and magnetic impedances $Y_m$ and $Z_m$. Assume that the STAR-RIS produces both transmitted and reflected signals with the same polarization. At the mth element, these signals can be expressed as

$$s_m^T = T_m s_m, \quad s_m^R = R_m s_m, \tag{9}$$

where $T_m$ and $R_m$ are the transmission and reflection coefficients of the mth element, respectively. According to the law of energy conservation, for passive STAR-RIS elements, the following constraint on the local transmission and reflection coefficients must be satisfied:

$$|T_m|^2 + |R_m|^2 \le 1. \tag{10}$$
Fig. 7 Schematic illustration of the STAR-RIS
According to electromagnetic theory, the phase delays of both the transmitted and reflected fields are related to $Y_m$ and $Z_m$. In Fig. 7, the reconfigurability of the element is reflected in the change of the surface impedances, since the transmission and reflection coefficients of the mth element are related to the surface impedances as

$$T_m = \frac{2 - \eta_0 Y_m}{2 + \eta_0 Y_m} - R_m, \quad R_m = -\frac{2(\eta_0^2 Y_m - Z_m)}{(2 + \eta_0 Y_m)(2\eta_0 + Z_m)},$$

where $\eta_0$ is the impedance of free space [30]. From the perspective of metasurface design, supporting the magnetic currents is the key to achieving independent control of both the transmitted and reflected signals. According to [22], single-layered RISs with non-magnetic elements can only produce identical radiation on different sides, which is referred to as the symmetry limitation. By introducing the equivalent surface electric and magnetic currents into the model, the proposed hardware model is able to independently characterize the transmission and reflection of each element. To facilitate the design of the STAR-RISs in wireless communication systems, we rewrite these narrowband frequency-flat coefficients in the form of their amplitudes and phase shifts as follows:

$$T_m = \sqrt{\beta_m^T}\, e^{j\theta_m^T}, \quad R_m = \sqrt{\beta_m^R}\, e^{j\theta_m^R}, \tag{11}$$

where $\beta_m^T, \beta_m^R \in [0, 1]$ are real-valued coefficients satisfying $\beta_m^T + \beta_m^R \le 1$ and $\theta_m^T, \theta_m^R \in [0, 2\pi)$ are the phase shifts introduced by element m for the transmitted and reflected signals, $\forall m \in \{1, 2, \cdots, M\}$. Note that the amplitudes ($\beta_m^T$ and $\beta_m^R$) are always coupled due to the law of energy conservation. For the phase shifts (i.e., $\theta_m^T$ and $\theta_m^R$), there are two widely adopted models, namely, (1) the independent phase shift model, where $\theta_m^T$ and $\theta_m^R$ can be adjusted independently, and (2) the coupled phase shift model, where $\theta_m^T$ and $\theta_m^R$ have to satisfy the condition $|\theta_m^T - \theta_m^R| = \frac{\pi}{2}$ or $\frac{3\pi}{2}$. Note that by adjusting the amplitude coefficients for transmission and reflection, each STAR-RIS element can be operated in full transmission mode (referred to as T mode), full reflection mode (referred to as R mode), and simultaneous transmission and reflection mode (referred to as T&R mode). Based on the above basic signal model of STAR-RISs, in the following, we propose three practical protocols for operating STAR-RISs in wireless networks, namely, energy splitting (ES), mode switching (MS), and time switching (TS), as illustrated in Fig. 8.
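The mapping from surface impedances to the coefficients in (11) can be checked numerically. The sketch below is illustrative only: the function names and the example values of $Y_m$ and $Z_m$ are hypothetical, and $Y_m$ is assumed to carry units that make $\eta_0 Y_m$ dimensionless, as the formula requires. For the purely reactive (lossless) example chosen here, the result meets (10) with equality and the two phases differ by $\pi/2$, in line with the coupled phase shift model.

```python
import cmath
import math

ETA0 = 376.730313668  # impedance of free space in ohms

def coefficients(y_m, z_m, eta0=ETA0):
    """Return (T_m, R_m) from the surface impedances of one element.

    Units are an assumption: eta0 * y_m and z_m / eta0 must be dimensionless.
    """
    r = -2 * (eta0 ** 2 * y_m - z_m) / ((2 + eta0 * y_m) * (2 * eta0 + z_m))
    t = (2 - eta0 * y_m) / (2 + eta0 * y_m) - r
    return t, r

def amplitude_phase(coeff):
    """Split a coefficient into (beta, theta) with theta in [0, 2*pi)."""
    return abs(coeff) ** 2, cmath.phase(coeff) % (2 * math.pi)

if __name__ == "__main__":
    # Hypothetical purely reactive element (no ohmic loss).
    t, r = coefficients(0.004j, 200j)
    beta_t, theta_t = amplitude_phase(t)
    beta_r, theta_r = amplitude_phase(r)
    print("beta_T + beta_R =", beta_t + beta_r)
    print("|theta_T - theta_R| =", abs(theta_t - theta_r))
```

Adding a real part to either value models loss, in which case the sum of the two amplitude coefficients falls below one.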
Fig. 8 Illustration of three practical protocols for operating STAR-RISs. (a) Energy splitting (ES). (b) Mode switching (MS). (c) Time switching (TS)
5.2 Operating Protocols
5.2.1 Energy Splitting
For ES, all elements of the STAR-RIS are assumed to operate in T&R mode, as shown in Fig. 8a. For given transmission and reflection amplitude coefficients, the signals incident upon each element are split into transmitted and reflected signals having different energy. In a practical implementation, the amplitude and phase shift coefficients of each element for transmission and reflection can be jointly optimized for achieving diverse design objectives in wireless networks.
5.2.2 Mode Switching
In MS, all elements of the STAR-RIS are partitioned into two groups. Specifically, one group contains the elements that operate in T mode, while the other group contains the elements operating in R mode. As shown in Fig. 8b, an MS STAR-RIS can be viewed as being composed of a conventional reflecting-only RIS and a transmitting-only RIS of reduced sizes. For this protocol, the element-wise mode selection and the corresponding transmission and reflection phase shift coefficients can be jointly optimized. The resulting "on-off" type of protocol (i.e., transmission or reflection) makes MS easy to implement. However, the drawback is that MS generally cannot match the transmission and reflection gain of ES, since only a subset of the elements is selected for transmission and the remaining subset for reflection.
Table 3 Summary of proposed protocols for operating STAR-RISs
Protocol | Optimization variables | Advantages | Disadvantages
ES | Amplitude and phase shift coefficients of each element for transmission and reflection | High flexibility | Large number of design variables
MS | Mode selection of each element; transmission phase shift coefficients for T mode elements; reflection phase shift coefficients for R mode elements | Easy to implement | Reduced transmission and reflection gain
TS | Time allocation; transmission phase shift coefficients of each element during T period; reflection phase shift coefficients of each element during R period | Independent T and R design | High hardware implementation complexity

5.2.3 Time Switching
By contrast, the STAR-RIS employing the TS protocol periodically switches all elements between the T mode and R mode in orthogonal time slots (referred to as T period and R period), as illustrated in Fig. 8c. This is like switching "venetian blinds" in different time slots. The fraction of time allocated to fully transmitting and fully reflecting signals can be optimized to strike a balance between the communication qualities of the front and back sides. Compared to ES and MS, the advantage of TS is that, for a given time allocation, the transmission and reflection coefficients are not coupled; hence they can be optimized independently. Nevertheless, periodically switching the elements imposes stringent time synchronization requirements, thus increasing the implementation complexity compared to ES and MS. In Table 3, we summarize the ES, MS, and TS optimization variables and identify their respective advantages and disadvantages.
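The three protocols differ only in how the per-element coefficients of (11) are constrained, which the following sketch makes explicit. It is an illustration under assumptions not made in the chapter: elements are taken to be lossless, so the reflection amplitude coefficient is one minus the transmission amplitude coefficient, and all function names are hypothetical.

```python
import cmath
import math

def element_coeffs(beta_t, theta_t, theta_r):
    """Return (T_m, R_m) for one element, assuming beta_r = 1 - beta_t."""
    t = math.sqrt(beta_t) * cmath.exp(1j * theta_t)
    r = math.sqrt(1.0 - beta_t) * cmath.exp(1j * theta_r)
    return t, r

def energy_splitting(betas_t, thetas_t, thetas_r):
    """ES: every element is in T&R mode with its own ratio in [0, 1]."""
    return [element_coeffs(b, tt, tr)
            for b, tt, tr in zip(betas_t, thetas_t, thetas_r)]

def mode_switching(t_mode_flags, thetas_t, thetas_r):
    """MS: binary ratios, each element is fully in T mode or fully in R mode."""
    betas = [1.0 if flag else 0.0 for flag in t_mode_flags]
    return energy_splitting(betas, thetas_t, thetas_r)

def time_switching(t_fraction, thetas_t, thetas_r):
    """TS: all elements transmit for t_fraction of the time, reflect otherwise."""
    m = len(thetas_t)
    t_period = energy_splitting([1.0] * m, thetas_t, thetas_r)
    r_period = energy_splitting([0.0] * m, thetas_t, thetas_r)
    return [(t_fraction, t_period), (1.0 - t_fraction, r_period)]

if __name__ == "__main__":
    phases = [0.0, math.pi / 2, math.pi]
    print(energy_splitting([0.5, 0.2, 0.9], phases, phases))
    print(mode_switching([True, False, True], phases, phases))
    print(time_switching(0.4, phases, phases))
```

Written this way, MS is visibly ES restricted to binary amplitude coefficients, while TS trades the amplitude coupling for a time-allocation variable.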
5.3 Numerical Examples
In this subsection, we present numerical examples to characterize the performance of STAR-RISs employing the proposed operating protocols and to compare them with other baselines. More specifically, we consider the case where a two-antenna AP communicates with two single-antenna users with the aid of a STAR-RIS having M elements. One of the users is assumed to be located in the STAR-RIS's transmission half-space, referred to as the T user, and the other user is assumed to be located in the reflection half-space, termed the R user. The direct AP-user links are assumed to be blocked, and only the STAR-RIS transmission/reflection-side AP-user links are available, which are assumed to obey the Rician fading channel model having a path loss exponent of 2.2. For the considered setup, we investigate both unicast and multicast scenarios [31]. In particular, the AP sends different messages to different users in the unicast scenario while conveying a common message to both users in the multicast scenario. The minimum transmit power required by the AP for satisfying target user rates versus the number of STAR-RIS elements is studied. For comparison, one baseline is considered, where the full-space coverage is achieved by employing one conventional reflecting-only RIS and one transmitting-only RIS, each of which has $M/2$ elements.

Fig. 9 Transmit power versus the number of elements in a STAR-RIS-aided downlink MISO network. The target rates of the users are set to 1 bit/s/Hz and 3.46 bit/s/Hz in the unicast and multicast scenarios, respectively. (a) Unicast communication. (b) Multicast communication. (Axes: required transmit power in dBm versus number of elements M; curves: ES, MS, TS, and the baseline.)

Figures 9a and 9b show that, for STAR-RISs with different operating protocols and the baseline schemes in both scenarios, the minimum transmit power required by the AP decreases upon increasing M, since a higher transmission/reflection gain can be achieved. Regarding the performance of STAR-RISs, it is interesting to observe that TS achieves the best performance in the unicast scenario, whereas ES is preferable in the multicast scenario. This is because TS achieves interference-free communication for each user in the unicast scenario. By contrast, since the multicast scenario does not introduce inter-user interference, ES exploits the entire available communication time, while TS cannot. It is also observed that ES always outperforms MS. This is indeed expected, since, from a theoretical point of view, MS can be regarded as a special case of ES with binary amplitude coefficients for each element. These results highlight the importance of employing different operating protocols for satisfying different communication objectives. Furthermore, regarding the performance comparison, it can be observed that, independent of the adopted operating protocols, STAR-RISs always outperform conventional RISs in the unicast scenario of Fig. 9a.
Since conventional RISs employ fixed element-based mode selection and each omni-surface element employs identical transmission and reflection coefficients, the two schemes cannot fully exploit the DoFs available at each element to enhance the desired signal strength and mitigate the inter-user interference as STAR-RISs can, thus yielding the worst performance for unicast. For the multicast scenario of Fig. 9b, conventional RISs outperform only TS STAR-RISs, and only for small M. As the conventional RIS setup is a special case of MS STAR-RISs, unlike TS, it can fully exploit the available communication time. This advantage allows conventional RISs to achieve a higher performance than TS when M is small, i.e., the case where the available DoFs are limited for both schemes and using the entire available communication time dominates the achieved performance. However, when M increases, it is expected that conventional RISs become the worst option again due to the significant loss of DoFs. This drawback also causes the performance gap between conventional RISs and STAR-RISs to become more pronounced as M increases. The provided performance comparisons confirm the effectiveness of employing the proposed STAR-RISs in wireless networks.
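The multicast ordering of the protocols can be made plausible with a back-of-the-envelope calculation. The sketch below is a deliberately simplified toy model and does not reproduce Fig. 9: it assumes a single-antenna AP, identical cascaded channel amplitudes `a` for every element and both users, perfect phase alignment, an equal ES split, an equal MS partition, and an equal TS time allocation. All names and default values are hypothetical.

```python
import math

def snr_target(rate_bps_hz, time_fraction=1.0):
    """SNR needed to reach the rate when active for only part of the time."""
    return 2 ** (rate_bps_hz / time_fraction) - 1

def required_power_dbm(rate, power_gain, time_fraction=1.0, noise_dbm=-90.0):
    """Transmit power in dBm so that power * gain / noise meets the SNR target."""
    snr_db = 10 * math.log10(snr_target(rate, time_fraction))
    return snr_db + noise_dbm - 10 * math.log10(power_gain)

def multicast_powers(m, a, rate):
    """Required power per protocol in the toy multicast setting."""
    es = required_power_dbm(rate, 0.5 * (m * a) ** 2)   # half the energy, all M elements
    ms = required_power_dbm(rate, (m / 2 * a) ** 2)     # full energy, M/2 elements per side
    ts = required_power_dbm(rate, (m * a) ** 2, 0.5)    # full gain, half the time
    return {"ES": es, "MS": ms, "TS": ts}

if __name__ == "__main__":
    for m in (10, 20, 40):
        print(m, multicast_powers(m, 1e-4, 3.46))
```

In this toy setting ES needs about 3 dB less power than MS for any M, and TS pays for its halved transmission time through a much higher SNR target, which is consistent with the qualitative multicast discussion above.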
6 Promising Applications of STAR-RISs in 6G
Having presented practical protocols for operating STAR-RISs, in this section, we discuss several attractive applications of STAR-RISs in next-generation networks for both outdoor and indoor environments.
6.1 Outdoor, Outdoor-Indoor, and Indoor Coverage Extension
One of the most promising applications of STAR-RISs is to improve the coverage area/quality of wireless networks, especially when the links between the base stations (BSs) or access points (APs) and users are severely blocked by obstacles (e.g., trees along roads, buildings, and metallic shells of vehicles). In outdoor communications, similar to conventional reflecting-only RISs, STAR-RISs can be mounted on building facades and roadside billboards to create an additional communication link. More innovatively, STAR-RISs can also be accommodated by the windows of vehicles (e.g., cars, aircraft, and cruise ships) to enhance the signal strength received inside by exploiting their transmission capability, thus extending the coverage area/quality of BSs and satellites. For outdoor-to-indoor communications, the severe penetration loss caused by building walls gravely restricts the coverage provided by outdoor BSs, especially in mmWave and THz communications. In fact, STAR-RISs constitute an efficient technique for creating an outdoor-to-indoor bridge. For indoor communications, STAR-RISs are more appealing than conventional reflecting-only RISs. As conventional reflecting-only RISs merely achieve half-space coverage, the signals emerging from the AP may require multi-hop bounces for reaching the target user. However, by exploiting both transmission and reflection, the resultant full-space coverage may reduce the propagation distance, thus increasing the received signal power. In a nutshell, STAR-RISs substantially outperform conventional reflecting-only RISs, since they not only possess the same capabilities as conventional reflecting-only RISs but also support additional design options due to their transmission capability.
6.2 Transmission-Reflection NOMA
Non-orthogonal multiple access (NOMA) is a promising next-generation candidate that facilitates flexible resource allocation, offers high spectrum efficiency, and supports massive connectivity. For NOMA to achieve a large performance gain over orthogonal multiple access (OMA), it is important to pair users having different channel conditions. However, for conventional reflecting-only RISs, the benefits of NOMA may not be fully reaped since the channel conditions of users in the local reflected space are generally similar. Exploiting STAR-RISs enables a novel communication framework, namely, transmission-reflection NOMA, where a pair of users at the transmission- and reflection-oriented sides can be grouped together for facilitating NOMA. By carefully optimizing the element-based energy splitting ratio of the proposed ES protocol or the element-based mode selection of the proposed MS protocol, sufficiently different transmitted and reflected channel conditions can be achieved. As a result, the proposed STAR-RIS-aided transmission-reflection NOMA framework is capable of achieving a higher gain than conventional reflecting-only RIS-aided NOMA.
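Why NOMA benefits from pairing users with different channel conditions can be seen from the standard two-user downlink rate expressions. The sketch below is a generic textbook-style illustration rather than the chapter's system model: it assumes power-domain NOMA with perfect successive interference cancellation at the strong user, a fixed power split `a_strong`, equal-time OMA as the reference, and normalized channel power gains. All names and numbers are hypothetical.

```python
import math

def noma_rates(g_strong, g_weak, rho, a_strong=0.2):
    """Rates (strong, weak) in bit/s/Hz for two-user power-domain NOMA.

    rho is the transmit SNR; the weak user receives the larger power share
    and treats the strong user's signal as noise.
    """
    a_weak = 1.0 - a_strong
    r_weak = math.log2(1 + a_weak * rho * g_weak / (a_strong * rho * g_weak + 1))
    r_strong = math.log2(1 + a_strong * rho * g_strong)  # after removing the weak user's signal
    return r_strong, r_weak

def oma_rates(g_strong, g_weak, rho):
    """Rates (strong, weak) when each user gets half of the time at full power."""
    return 0.5 * math.log2(1 + rho * g_strong), 0.5 * math.log2(1 + rho * g_weak)

if __name__ == "__main__":
    rho = 100.0  # 20 dB transmit SNR
    for g_s, g_w in ((1.0, 1.0), (10.0, 0.1)):
        gain = sum(noma_rates(g_s, g_w, rho)) - sum(oma_rates(g_s, g_w, rho))
        print(f"gains ({g_s}, {g_w}): NOMA minus OMA sum rate = {gain:.3f} bit/s/Hz")
```

With equal channel gains the two schemes give the same sum rate in this model, whereas a large disparity, such as one that a STAR-RIS could create between its transmission and reflection sides, yields a clear NOMA gain.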
6.3 Full-Space Physical Layer Security
RISs are also capable of improving the physical layer security (PLS), since the channel conditions of the eavesdroppers can be degraded by impairing their signal propagation. However, for conventional reflecting-only RIS-aided secure communication, the legitimate users and eavesdroppers are assumed to be located at the same side of the RISs, even though this idealized simplifying assumption may not hold in practice. STAR-RISs relax this assumption: because they manipulate the signals on both sides of the surface, legitimate users and eavesdroppers located on either side can be taken into account.
6.4 Indoor Localization and Sensing
By overcoming signal blockages and providing full-space coverage, STAR-RISs are capable of improving both the localization and sensing capability of wireless networks, especially in indoor environments. For example, the employment of STAR-RISs in smart factories can improve the positioning of mobile robots and the data rate of control links.
There are also other promising application scenarios for STAR-RISs in 6G networks, such as STAR-RIS-aided SWIPT, STAR-RIS-assisted visible light communications (VLCs), STAR-RIS-aided mmWave/THz communications, and STAR-RIS-augmented robotic communications. These applications constitute interesting future research directions.
7 Conclusions
In this chapter, reconfigurable intelligent surface (RIS)-assisted wireless communication was overviewed with a particular focus on the new concept of STAR-RISs. First, the basics of conventional reflecting-only RISs were introduced. To achieve a full-space SRE, STAR-RISs were put forward by identifying their key differences compared with conventional reflecting-only RISs from the perspectives of hardware design, physics principles, and communication system design. Four possible hardware implementations and three hardware models for STAR-RISs were further presented. Moreover, the basic signal model of STAR-RISs was introduced, and three practical protocols were proposed for their operation, followed by numerical case studies showing the superiority of STAR-RISs over conventional RISs. Finally, several promising application scenarios for STAR-RISs in 6G were identified for both outdoor and indoor environments.
References
1. K. B. Letaief, W. Chen, Y. Shi, J. Zhang, and Y.-J. A. Zhang, "The roadmap to 6G: AI empowered wireless networks," IEEE Commun. Mag., vol. 57, no. 8, pp. 84–90, 2019.
2. W. Saad, M. Bennis, and M. Chen, "A vision of 6G wireless systems: Applications, trends, technologies, and open research problems," IEEE Network, vol. 34, no. 3, pp. 134–142, 2020.
3. L. Sanguinetti, E. Björnson, and J. Hoydis, "Towards massive MIMO 2.0: Understanding spatial correlation, interference suppression, and pilot contamination," IEEE Trans. Commun., vol. 68, no. 1, pp. 232–257, 2020.
4. M. D. Renzo, M. Debbah, D.-T. Phan-Huy, A. Zappone, M.-S. Alouini, C. Yuen, V. Sciancalepore, G. C. Alexandropoulos, J. Hoydis, H. Gacanin et al., "Smart radio environments empowered by reconfigurable AI meta-surfaces: An idea whose time has come," EURASIP Journal on Wireless Communications and Networking, vol. 2019, no. 1, pp. 1–20, 2019.
5. Y. Liu, X. Liu, X. Mu, T. Hou, J. Xu, M. Di Renzo, and N. Al-Dhahir, "Reconfigurable intelligent surfaces: Principles and opportunities," IEEE Commun. Surv. Tutor., vol. 23, no. 3, pp. 1546–1577, 2021.
6. M. Di Renzo, K. Ntontin, J. Song, F. H. Danufane, X. Qian, F. Lazarakis, J. De Rosny, D.-T. Phan-Huy, O. Simeone, R. Zhang, M. Debbah, G. Lerosey, M. Fink, S. Tretyakov, and S. Shamai, "Reconfigurable intelligent surfaces vs. relaying: Differences, similarities, and performance comparison," IEEE Open J. Commun. Soc., vol. 1, pp. 798–807, 2020.
7. E. Björnson, Ö. Özdogan, and E. G. Larsson, "Intelligent reflecting surface vs. decode-and-forward: How large surfaces are needed to beat relaying?" IEEE Wireless Commun. Lett., vol. 9, no. 2, pp. 244–248, 2020.
8. T. Bai, C. Pan, Y. Deng, M. Elkashlan, and A. Nallanathan, "Latency minimization for intelligent reflecting surface aided mobile edge computing," IEEE J. Sel. Areas Commun., vol. 38, no. 11, pp. 2666–2682, 2020.
9. Y. Cao, T. Lv, W. Ni, and Z. Lin, "Sum-rate maximization for multi-reconfigurable intelligent surface-assisted device-to-device communications," IEEE Trans. Commun., vol. 69, no. 11, pp. 7283–7296, 2021.
10. Z. Chu, W. Hao, P. Xiao, and J. Shi, "Intelligent reflecting surface aided multi-antenna secure transmission," IEEE Wireless Commun. Lett., vol. 9, no. 1, pp. 108–112, 2020.
11. C. Pan, H. Ren, K. Wang, W. Xu, M. Elkashlan, A. Nallanathan, and L. Hanzo, "Multicell MIMO communications relying on intelligent reflecting surfaces," IEEE Trans. Wireless Commun., vol. 19, no. 8, pp. 5218–5233, 2020.
12. Q. Wu and R. Zhang, "Weighted sum power maximization for intelligent reflecting surface aided SWIPT," IEEE Wireless Commun. Lett., vol. 9, no. 5, pp. 586–590, 2020.
13. B. O. Zhu, J. Zhao, and Y. Feng, "Active impedance metasurface with full 360° reflection phase tuning," Scientific reports, vol. 3, p. 3059, 2013.
14. N. K. Emani, A. V. Kildishev, V. M. Shalaev, and A. Boltasseva, "Graphene: a dynamic platform for electrical control of plasmonic resonance," Nanophotonics, vol. 4, no. 1, pp. 214–223, 2015.
15. M. Di Renzo, F. Habibi Danufane, X. Xi, J. de Rosny, and S. Tretyakov, "Analytical modeling of the path-loss for reconfigurable intelligent surfaces – anomalous mirror or scatterer?" in Proc. IEEE 21st Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), 2020, pp. 1–5.
16. C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and C. Yuen, "Reconfigurable intelligent surfaces for energy efficiency in wireless communication," IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4157–4170, 2019.
17. I. F. Akyildiz, C. Han, and S. Nie, "Combating the distance problem in the millimeter wave and terahertz frequency bands," IEEE Commun. Mag., vol. 56, no. 6, pp. 102–108, 2018.
18. Y. Liu, X. Mu, J. Xu, R. Schober, Y. Hao, H. V. Poor, and L. Hanzo, "STAR: Simultaneous transmission and reflection for 360° coverage by intelligent surfaces," IEEE Trans. Wireless Commun., vol. 28, no. 6, pp. 102–109, 2021.
19. N. DOCOMO. "DOCOMO conducts world's first successful trial of transparent dynamic metasurface". [Online]. Available: www.nttdocomo.co.jp/english/info/media_center/pr/2020/0117_00.html
20. B. O. Zhu, K. Chen, N. Jia, L. Sun, J. Zhao, T. Jiang, and Y. Feng, "Dynamic control of electromagnetic wave propagation with the equivalent principle inspired tunable metasurface," Sci. Rep., vol. 4, no. 1, pp. 1–7, 2014.
21. C. Pfeiffer and A. Grbic, "Metamaterial huygens' surfaces: tailoring wave fronts with reflectionless sheets," Physical review letters, vol. 110, no. 19, p. 197401, 2013.
22. L. La Spada, C. Spooner, S. Haq, and Y. Hao, "Curvilinear metasurfaces for surface wave manipulation," Sci. Rep., vol. 9, no. 1, pp. 1–10, Feb. 2019.
23. N. Mohammadi Estakhri, C. Argyropoulos, and A. Alù, "Graded metascreens to enable a new degree of nanoscale light management," Phil. Trans. R. Soc. A., vol. 373, no. 2049, p. 20140351, 2015.
24. M. Di Renzo, A. Zappone, M. Debbah, M. S. Alouini, C. Yuen, J. de Rosny, and S. Tretyakov, "Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and the road ahead," IEEE J. Sel. Areas Commun., vol. 38, no. 11, pp. 2450–2525, 2020.
25. H. Zhang, S. Zeng, B. Di, Y. Tan, M. Di Renzo, M. Debbah, Z. Han, H. V. Poor, and L. Song, "Intelligent omni-surfaces for full-dimensional wireless communications: Principles, technology, and implementation," IEEE Commun. Mag., vol. 60, no. 2, pp. 39–45, 2022.
26. M. Nemati, B. Maham, S. R. Pokhrel, and J. Choi, "Modeling RIS empowered outdoor-to-indoor communication in mmwave cellular networks," IEEE Trans. Commun., vol. 69, no. 11, pp. 7837–7850, 2021.
27. W. Zhu, D. B. Farmer, K. A. Jenkins, B. Ek, S. Oida, X. Li, J. Bucchignano, S. Dawes, E. A. Duch, and P. Avouris, "Graphene radio frequency devices on flexible substrate," Appl. Phys. Lett., vol. 102, no. 23, p. 233102, 2013.
28. R. J. Mailloux, Phased array antenna handbook. Artech house, 2017.
29. J. Xu, Y. Liu, X. Mu, and O. A. Dobre, "STAR-RISs: Simultaneous transmitting and reflecting reconfigurable intelligent surfaces," IEEE Commun. Lett., vol. 25, no. 9, pp. 3134–3138, 2021.
30. N. M. Estakhri and A. Alu, "Wave-front transformation with gradient metasurfaces," Phys. Rev. X, vol. 6, no. 4, p. 041008, 2016.
31. X. Mu, Y. Liu, L. Guo, J. Lin, and R. Schober, "Simultaneously transmitting and reflecting (STAR) RIS aided wireless communications," IEEE Trans. Wireless Commun., vol. 21, no. 5, pp. 3083–3098, 2022.
Full-Duplex Transceivers for Next-Generation Wireless Communication Systems Ian P. Roberts and Himal A. Suraweera
I. P. Roberts, Department of Electrical and Computer Engineering, University of California, Los Angeles, CA, USA
H. A. Suraweera, Department of Electrical and Electronic Engineering, University of Peradeniya, Peradeniya, Sri Lanka
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_16
1 Introduction For more than a century, wireless communication systems have almost exclusively operated in a half-duplex fashion, where transmission and reception of radio waves are separated (or orthogonalized) in the time domain, the frequency domain, or both. Put simply, the signals transmitted and received by a traditional half-duplex system occupy different frequency bands or different times, referred to as frequency-division duplexing (FDD) and time-division duplexing (TDD), respectively. Half-duplex operation is necessitated by the self-interference (SI) that arises when a transceiver attempts to receive signals while simultaneously transmitting in the same spectrum. In most cases, SI is many orders of magnitude stronger than a relatively weak signal-of-interest (or desired receive signal), which has presumably propagated tens or hundreds of meters. This makes it virtually impossible to accurately recover the desired receive signal from their combination without taking additional measures to mitigate the effects of SI [1, 2]. By receiving in a neighboring frequency band or in a different time slot than its transmissions, a half-duplex transceiver avoids inflicting SI onto a desired receive signal, hence the usage of FDD and TDD.
By their nature, both FDD and TDD consume radio resources by dedicating time-frequency resources to either transmission or reception. Of course, this would not be an issue if practical systems were not resource-constrained. In reality, all practical wireless communication systems operate on limited time-frequency resources. At the very least, most systems are confined to certain frequency bands by regulatory bodies, such as the Federal Communications Commission (FCC) in the United States. The consumption of radio resources by half-duplex operation has motivated researchers to explore in-band full-duplex operation¹ [1, 3–7]. Starting in the late 2000s, researchers began heavily investigating and developing means to mitigate SI and bring full-duplex to life. Since then, full-duplex has matured and has recently found new life in millimeter-wave (mmWave) networks [8, 9], in joint communication and sensing [10], and through advancements in machine learning [11]. In this chapter, we introduce full-duplex operation and highlight its enhancements. Then, we overview full-duplex solutions for traditional radios and those for modern and next-generation wireless systems. We conclude with a look ahead at the future of full-duplex to motivate and steer its research, development, and deployment.
1.1 What Is Full-Duplex? Full-duplex capability allows a transceiver to concurrently transmit and receive over the same frequency spectrum. In other words, both transmission and reception can make use of the full available frequency spectrum all the time (Fig. 1). As mentioned, when operating in this full-duplex fashion, SI is inflicted onto a desired receive signal. To equip communication systems with full-duplex capability, engineers have developed a variety of ways to mitigate SI using radio frequency (RF) components, analog and digital filters, and other creative means. We will outline a variety of these strategies in this chapter. If SI can be sufficiently mitigated, a full-duplex transceiver can reliably receive while transmitting in-band, unlocking a number of enhancements, which we overview in the next section. As depicted in Fig. 2, a full-duplex base station may transmit and receive concurrently with a neighboring user that also has full-duplex capability. This makes better use of radio resources and, as intuition may suggest, leads to a potential doubling of spectral efficiency, as compared to half-duplex techniques. In other words, radio resources are being used twice as efficiently with full-duplex, since they are used for both transmission and reception, rather than divided as with FDD and TDD, as illustrated in Fig. 1.
1 We use the term “full-duplex” to refer to in-band full-duplex operation, in particular, as opposed to out-of-band full-duplex, which has been used to describe systems capable of simultaneously transmitting and receiving via FDD.
[Figure: three time-frequency panels (axes: time, frequency) with regions labeled "transmit", "receive", and "transmit and receive".]
Fig. 1 The time-frequency resources consumed by transmission and reception with (left) FDD; (middle) TDD; and (right) full-duplex. In practice, guard bands are usually necessary for both FDD and TDD, which consume additional radio resources
[Figure: a full-duplex base station and a full-duplex user, with self-interference marked at each device.]
Fig. 2 A full-duplex base station transmitting and receiving in-band with a full-duplex user. SI manifests at both devices, requiring each to take measures to sufficiently cancel it
[Figure: a full-duplex base station and two half-duplex users, with self-interference marked at the base station and cross-link interference marked between the users.]
Fig. 3 A full-duplex base station transmitting to one half-duplex user while receiving from another half-duplex user in-band. SI manifests at the base station, whereas cross-link interference is inflicted onto the user receiving by the user transmitting
Figure 3 depicts another full-duplex operating mode, where a full-duplex base station transmits to a half-duplex user while receiving from another half-duplex user. This mode of operation can also potentially double spectral efficiency. It is important to note that cross-link interference is inflicted on the user receiving by the user transmitting, the level of which depends on a number of factors, chiefly the users’ locations and the environment.
Takeaways Full-duplex operation is an exciting alternative to half-duplexing with FDD and TDD since it makes better use of radio resources. To enable full-duplex operation, however, self-interference must be mitigated sufficiently in order to reliably recover desired receive signals while transmitting in-band.
2 Enhancements Introduced by Full-Duplex Compared to half-duplex operation, full-duplex can introduce enhancements to communication systems on a link level [1, 2, 5, 6], in medium access and spectrum sharing [12–17], and at the network level [18, 19]—which we overview in this section.
2.1 Link-Level Analysis The impacts of full-duplex can be readily observed by examining familiar link-level expressions foundational to wireless communication systems. To do so, consider the full-duplex operating mode depicted in Fig. 3, where a full-duplex base station communicates with two half-duplex users. Taking the perspective of the full-duplex base station, we refer to its transmit link and receive link. The full-duplex capacity of the system, in the absence of interference, can be written as
$$C_{\mathrm{fd}} = \log_2\left(1 + \mathrm{SNR}_{\mathrm{tx}}\right) + \log_2\left(1 + \mathrm{SNR}_{\mathrm{rx}}\right), \tag{1}$$
where $\mathrm{SNR}_{\mathrm{tx}}$ and $\mathrm{SNR}_{\mathrm{rx}}$ are the maximum signal-to-noise ratios (SNRs) of the transmit and receive links, respectively. If employing TDD to duplex transmission and reception instead of full-duplex, the achievable sum spectral efficiency is
$$R_{\mathrm{TDD}} = \alpha \cdot \log_2\left(1 + \mathrm{SNR}_{\mathrm{tx}}\right) + (1 - \alpha) \cdot \log_2\left(1 + \mathrm{SNR}_{\mathrm{rx}}\right), \tag{2}$$
where $\alpha$ is the fraction of time allocated to transmission, with the remainder allocated to reception. If employing FDD, this achievable sum spectral efficiency becomes
$$R_{\mathrm{FDD}} = \alpha \cdot \log_2\left(1 + \frac{\mathrm{SNR}_{\mathrm{tx}}}{\alpha}\right) + (1 - \alpha) \cdot \log_2\left(1 + \frac{\mathrm{SNR}_{\mathrm{rx}}}{1 - \alpha}\right), \tag{3}$$
where $\alpha$ is the fraction of the total bandwidth allocated to transmission, with the remainder allocated to reception. Here, since the integrated noise power is proportional to bandwidth, FDD enjoys SNR increases inversely proportional to bandwidth. The expression of $R_{\mathrm{TDD}}$ as presented implicitly assumes an instantaneous transmit power constraint. Under an average transmit power constraint, which is less practical, $R_{\mathrm{TDD}}$ and $R_{\mathrm{FDD}}$ coincide [20]. Under full-duplex operation, the signal-to-interference-plus-noise ratio (SINR) of a desired receive signal (i.e., a signal-of-interest) on a given link is
$$\mathrm{SINR} = \frac{P_{\mathrm{des}}}{P_{\mathrm{noise}} + P_{\mathrm{int}}} = \frac{\mathrm{SNR}}{1 + \mathrm{INR}}, \tag{4}$$
where $P_{\mathrm{des}}$ is the power of the desired receive signal; $P_{\mathrm{noise}}$ is the noise power; and $P_{\mathrm{int}}$ is the power of interference (i.e., SI on the receive link, cross-link interference on the transmit link). The right-hand side of (4) can be obtained by dividing the numerator and denominator by the noise power $P_{\mathrm{noise}}$, where $\mathrm{SNR}$ is the SNR of the desired receive signal and $\mathrm{INR}$ is the interference-to-noise ratio (INR). The power of SI, and hence $\mathrm{INR}$ on the receive link, depends on the quality of SI mitigation employed at the full-duplex transceiver. For now, we can consider the degree of (residual) SI $P_{\mathrm{SI}}$ as being some level below the transmit power $P_{\mathrm{tx}}$ at the full-duplex transceiver. For instance, suppose our full-duplex base station is capable of reducing SI to a power level of
$$P_{\mathrm{SI}} = \frac{P_{\mathrm{tx}}}{L}, \tag{5}$$
where $L$ is the total amount of mitigation (i.e., cancellation) achieved by its full-duplex solution. The total amount of SI mitigation $L$ may capture a variety of efforts, including antenna isolation and/or additional SI cancellation filtering, which we will discuss further in the next section. Since the power of a desired receive signal $P_{\mathrm{des}}$ can be quite close to the noise floor $P_{\mathrm{noise}}$ in practical systems, SI power $P_{\mathrm{SI}}$ must be near or below the noise floor to ensure it does not prohibitively erode full-duplex resource gains. This means it is not uncommon for $L$ to be on the order of 100 dB for practical systems. For example, with a transmit power of $P_{\mathrm{tx}} = 20$ dBm and a noise floor of $P_{\mathrm{noise}} = -90$ dBm, $L = 110$ dB is required for $P_{\mathrm{SI}} = P_{\mathrm{noise}}$ (i.e., $\mathrm{INR} = 0$ dB). Achieving this degree of mitigation is precisely what has hindered the adoption of full-duplex since the advent of wireless communications and what has made successful demonstrations of full-duplex so impressive [7, 21–24].
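The link-budget arithmetic above can be checked numerically. The following Python sketch is illustrative only: the helper names (`required_sic_db`, `sinr_db`) are hypothetical, and the 10 dB SNR in the usage lines is an assumed example value. It evaluates (5) in decibels and the right-hand side of (4).

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def linear_to_db(value):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(value)


def required_sic_db(p_tx_dbm, p_noise_dbm, target_inr_db=0.0):
    """Mitigation L (in dB) so that residual SI, P_SI = P_tx / L,
    sits target_inr_db above the noise floor."""
    return p_tx_dbm - (p_noise_dbm + target_inr_db)


def sinr_db(snr_db, inr_db):
    """SINR = SNR / (1 + INR), with inputs and output in dB."""
    return snr_db - linear_to_db(1.0 + db_to_linear(inr_db))


# Values from the text: P_tx = 20 dBm, P_noise = -90 dBm, INR = 0 dB.
print(required_sic_db(20.0, -90.0))       # 110.0
# Assumed example: a 10 dB SNR link with residual SI at the noise floor.
print(round(sinr_db(10.0, 0.0), 2))       # 6.99
# Pushing SI 10 dB below the noise floor (L = 120 dB) recovers most of the SNR.
print(round(sinr_db(10.0, -10.0), 2))     # 9.59
```

Even with SI cancelled exactly to the noise floor, the link in this example still loses about 3 dB of SINR, which is why solutions often target residual SI below the noise floor.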
A Summary of Key Power Ratios of the Full-Duplex System in Fig. 3
$\mathrm{SNR}_{\mathrm{tx}}$    Strength of the desired signal on the transmit link
$\mathrm{SNR}_{\mathrm{rx}}$    Strength of the desired signal on the receive link
$\mathrm{INR}_{\mathrm{tx}}$    Strength of cross-link interference on the transmit link
$\mathrm{INR}_{\mathrm{rx}}$    Strength of SI on the receive link
$\mathrm{SINR}_{\mathrm{tx}}$   Effective quality of the desired signal on the transmit link
$\mathrm{SINR}_{\mathrm{rx}}$   Effective quality of the desired signal on the receive link
When full-duplexing transmission and reception, the maximum achievable spectral efficiency when treating residual interference as noise can be expressed as
$$R_{\mathrm{fd}} = \log_2\left(1 + \mathrm{SINR}_{\mathrm{tx}}\right) + \log_2\left(1 + \mathrm{SINR}_{\mathrm{rx}}\right) \le C_{\mathrm{fd}}. \tag{6}$$
When $\mathrm{SINR}_{\mathrm{tx}} \to \mathrm{SNR}_{\mathrm{tx}}$ and $\mathrm{SINR}_{\mathrm{rx}} \to \mathrm{SNR}_{\mathrm{rx}}$, then $R_{\mathrm{fd}} \to C_{\mathrm{fd}}$. These expressions illustrate the potential resource gains offered by full-duplex compared to FDD and TDD, since there are no pre-log fractions; how much of this gain is realized, however, depends heavily on SI and cross-link interference being low. In Fig. 4, we compare full-duplex operation against half-duplex operation with FDD and TDD by plotting their rate regions for various levels of SI ($\mathrm{INR}_{\mathrm{rx}}$) and cross-link interference, where $\mathrm{SNR}_{\mathrm{tx}} = \mathrm{SNR}_{\mathrm{rx}} = 10$ dB. Here, each line depicts the boundary of its rate region, encompassing all feasible transmit-receive spectral efficiency pairs $(R_{\mathrm{tx}}, R_{\mathrm{rx}})$, and each star (⋆) indicates the point that maximizes the sum spectral efficiency. The simple time-sharing nature of TDD is shown as the diagonal line connecting the two points of greedy time-sharing. FDD offers improvements over TDD, courtesy of its SNR gains when shrinking bandwidth, as mentioned. When SI power is mitigated to a level equal to the noise power (i.e., $\mathrm{INR}_{\mathrm{rx}} = 0$ dB) and cross-link interference is also at the noise floor, the sum spectral efficiency can clearly exceed FDD and TDD, but only marginally. As SI and cross-link interference are reduced to below the noise floor, however, the achievable rate region of full-duplex expands, approaching its capacity region.
Fig. 4 The rate region boundaries for various multiplexing strategies when $\mathrm{SNR}_{\mathrm{tx}} = \mathrm{SNR}_{\mathrm{rx}} = 10$ dB. Full-duplex, shown as dashed blue and red lines, can outperform TDD and FDD with SI and cross-link interference (shown as CLI) sufficiently mitigated. Stars (⋆) indicate points that maximize the sum spectral efficiency
At lower SNRs, lower INRs are required for appreciable full-duplex gains, since the effects of interference are magnified by the steep slope of $\log_2(1 + x)$ at low $x$, tightening the requirements on SI and cross-link interference mitigation. In addition, the SNR gains introduced by FDD are magnified since $\log_2(1 + x) \approx x / \ln 2$ is nearly linear at low $x$, so doubling $x$ nearly doubles it, reducing the gap between FDD and the full-duplex capacity.
At higher SNRs, the gap between full-duplex and half-duplex grows, and the effects of interference diminish due to the saturating nature of $\log_2(1 + x)$ at high $x$. Higher INRs can be tolerated at high SNRs, relaxing the requirements on mitigating SI and cross-link interference.
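To make the comparison among (1), (2), (3), and (6) concrete, the sketch below evaluates each expression at the operating point used for Fig. 4 (both SNRs equal to 10 dB). It is a minimal illustration rather than a reproduction of the figure: the function names are hypothetical, and the equal split $\alpha = 0.5$ and the specific INR values are assumptions chosen for the example.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def spectral_efficiency(sinr):
    """log2(1 + SINR) in bits/s/Hz for a linear-scale SINR."""
    return math.log2(1.0 + sinr)


def rate_tdd(snr_tx, snr_rx, alpha):
    """Eq. (2): alpha is the fraction of time used for transmission."""
    return (alpha * spectral_efficiency(snr_tx)
            + (1.0 - alpha) * spectral_efficiency(snr_rx))


def rate_fdd(snr_tx, snr_rx, alpha):
    """Eq. (3): alpha is the fraction of bandwidth used for transmission.
    Requires 0 < alpha < 1 to avoid division by zero."""
    return (alpha * spectral_efficiency(snr_tx / alpha)
            + (1.0 - alpha) * spectral_efficiency(snr_rx / (1.0 - alpha)))


def rate_fd(snr_tx, snr_rx, inr_tx=0.0, inr_rx=0.0):
    """Eq. (6) with SINR = SNR / (1 + INR) from Eq. (4).
    With zero INR on both links this reduces to the capacity in Eq. (1)."""
    return (spectral_efficiency(snr_tx / (1.0 + inr_tx))
            + spectral_efficiency(snr_rx / (1.0 + inr_rx)))


snr = db_to_linear(10.0)  # SNR_tx = SNR_rx = 10 dB, as in Fig. 4
print("TDD, alpha = 0.5:", round(rate_tdd(snr, snr, 0.5), 2))
print("FDD, alpha = 0.5:", round(rate_fdd(snr, snr, 0.5), 2))
for inr_db in (0.0, -10.0):  # assumed INR, applied to both links
    inr = db_to_linear(inr_db)
    print("FD, INR =", inr_db, "dB:", round(rate_fd(snr, snr, inr, inr), 2))
print("FD capacity:", round(rate_fd(snr, snr), 2))
```

With these assumed values, the sum spectral efficiency is about 3.46 bits/s/Hz for TDD, 4.39 for FDD, and 5.17 for full-duplex with both INRs at 0 dB, against a full-duplex capacity of about 6.92.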
2.2 In Medium Access and Spectrum Sharing Half-duplex transceivers have long been ubiquitous in wireless networks, and communication standards and practices have accordingly been built on this half-duplex assumption. The ability to transmit and receive simultaneously and in-band is a transformative technology that can unlock new approaches to medium access and spectrum sharing that are more efficient than those built on a half-duplex assumption. In turn, full-duplex can facilitate wireless networks that deliver higher throughput, inflict lower interference, and make better use of spectrum.
Overcoming the Hidden Node Problem To illustrate this, consider the famous hidden node problem in wireless networks: if two users are distant from one another but both within earshot of a nearby access point, the two users may be unaware when the other is transmitting to the access point. This can lead to collisions at the access point, and hence a waste of radio resources, if not dealt with. Conventional approaches to overcome this use handshaking between users and the access point to grant a user permission to transmit (e.g., request-to-send (RTS) and clear-to-send (CTS) mechanisms) along with random backoffs. By upgrading the access point with full-duplex capability, more efficient approaches to medium access can be employed [12, 15, 16]. For instance, as illustrated in Fig. 5, the full-duplex access point can broadcast a busy tone anytime it is receiving from a user. This busy tone can be heard by all users in the network, informing them to not transmit. Compared to conventional approaches to medium access, this approach consumes no additional radio resources, avoids the overhead associated with handshaking between users and the access point, and reduces latency.
[Figure: a full-duplex access point between two half-duplex users, with self-interference marked at the access point and a busy tone broadcast to the users.]
Fig. 5 While receiving from one user, a full-duplex access point broadcasts a busy tone instructing all other users not to transmit, preventing collisions and overcoming the hidden node problem
Dynamic Spectrum Access and Cognitive Radios The number of devices with wireless connectivity has grown profoundly over the past two decades and will seemingly continue to grow for years to come. The amount of available spectrum has not grown at a comparable pace, however. In light of this, researchers have proposed spectrum sharing and cognitive radios to dynamically and opportunistically make better use of frequency spectrum when it is free [25]. For instance, a cognitive radio may sense a frequency band to see if it is being used by nearby devices. If there appears to be no activity, the cognitive radio may begin transmitting information. Periodically, the cognitive radio may halt transmission to sense the channel to ensure that it does not inflict interference on incumbents that have rights to the band, which is a waste of resources if none are detected. Full-duplex can make this process more efficient by empowering the cognitive radio to continuously sense the channel while transmitting [13, 14, 16]. This allows it to transmit information more efficiently, since it does not have to halt transmission to sense the channel, and enables the cognitive radio to react more quickly to the presence of incumbents. Without full-duplex capability, the cognitive radio would presumably be unaware of the presence of incumbents until the end of its transmission, potentially causing interference that leads to collisions. Techniques discussed for overcoming the hidden node problem can be applied in this setting, as well, to instruct nearby cognitive radios to not transmit.
Channel Sensing to Reduce the Effects of Interference As a final example of full-duplex applied to medium access, we consider the case where an access point serves users in the presence of neighboring nodes that may inflict interference, as explored in [17]. For instance, one can consider two Wi-Fi access points operating independently but near one another, an example of which can be seen in Fig. 6. When one access point transmits to a user, successful reception at that user may be corrupted by neighboring interference, requiring the access point to retransmit the data. Naturally, this consumes radio resources and can lead to delays in communication. With full-duplex capability, the access point may sense the channel while transmitting, allowing it to potentially halt transmission to avoid collisions caused by neighboring interference and subsequently redirect resources to another user that may not be impacted by this interference [16, 17]. It is important to note that the requirements on mitigating SI are stricter when decoding data carried by a desired receive signal, compared to those for channel sensing, which is often merely detecting energy levels in a particular frequency band.
[Figure: a full-duplex access point, two half-duplex users, and a nearby interferer; labels: channel sensing detects interference, self-interference, reroute resources, potential collision, interference.]
Fig. 6 An access point equipped with full-duplex capability can sense the channel while transmitting, allowing it to reroute resources in the presence of interference that would otherwise cause collisions [17]
2.3 Network-Level Enhancements The benefits of upgrading transceivers with full-duplex capability can be felt at the network level in a few ways. Even when only some devices are full-duplex-capable, while the remainder are half-duplex, a wireless network can enjoy improvements in throughput and latency [18, 19]. In fact, gains in network throughput can exceed the doubling of spectral efficiency familiar from the link level [18, 19]. This can be attributed to the fact that full-duplex can reduce multiplexing delays, reduce overhead associated with medium access control, and enable new ways to manage interference, all of which can improve network throughput. We elaborate more on network-level enhancements of full-duplex in Sect. 4.6.
Other Applications There are applications of full-duplex technology beyond what was highlighted herein, such as in joint communication and sensing [10, 26], secure communications [27], military communications [28, 29], radar [10], and more [5, 6].
3 Self-Interference Cancellation Successfully equipping a device with full-duplex capability relies on mitigating (or cancelling) SI to levels that are sufficiently low [2]. The amount of self-interference cancellation (SIC) needed depends on the particular application. In most settings, full-duplex solutions aim to cancel SI to near or below the receiver noise floor (i.e., roughly $\mathrm{INR}_{\mathrm{rx}} \le 0$ dB). This ensures that the full-duplex resource gains are not eroded by the presence of high SI, as highlighted in Fig. 4. The residual SI is that which remains after efforts of SIC. In this section, we outline methods of SIC in both the analog and digital domains. Regardless of domain, the motivation behind SIC can largely be summarized as leveraging the fact that a transceiver has knowledge of its own transmitted signal and can therefore potentially reconstruct SI and subtract it from the received signal, leaving the desired portion virtually free of SI. In many cases, a staged approach to SIC is employed as illustrated in Fig. 7, where a portion of SI is cancelled using an analog filter and a significant portion of the remainder is cancelled using digital filtering. This staged approach is depicted in Fig. 8, where a full-duplex transceiver with separate antennas for transmission and reception employs an analog SIC filter between its antennas at RF and a digital SIC filter.
3.1 Analog Self-Interference Cancellation As illustrated in Fig. 8, analog SIC typically exists as a digitally controlled analog filter placed between the transmitter and receiver of a full-duplex transceiver. Analog SIC filters come in many forms, existing as RF, intermediate frequency, and baseband circuitry and even as optical filters [2, 23, 30–32]. Analog SIC is often driven by tapping off a small portion of the upconverted RF transmit signal. This transmit signal undergoes filtering within the analog SIC filter before being injected at the receiver. The injected signal is an inverted reconstruction of SI, which destructively combines with SI when added to the received signal. After this combining, there is some degree of residual SI due to imperfect reconstruction, which may stem from estimation errors, hardware limitations, and hardware imperfections. By tapping off the transmit signal after the transmit chain, analog SIC will inherently incorporate transmit-side impairments unbeknownst to baseband, such as power amplifier (PA) nonlinearities, which have proven to be a dominant factor in SIC [24, 33]. Other approaches, sometimes called digitally assisted approaches, use a dedicated transmit chain to drive analog SIC, as opposed to tapping off the transmit signal directly [2]. This approach cannot capture transmit-side impairments present in SI as well, however, since this second transmit chain naturally will not contain all artifacts of the true transmit chain.
[Figure: block diagram in which the desired receive signal, noise, and self-interference are summed; reconstructed self-interference is subtracted by analog SIC and then by digital SIC, followed by receive processing and decoding.]
Fig. 7 A received signal undergoes analog and digital SIC before undergoing conventional receive processing to recover desired receive data
[Figure: transceiver block diagram with transmit data, a digital transmitter, and an RF transmit chain radiating the transmit signal; an analog SIC filter and a digital SIC filter between the transmit and receive paths; an RF receive chain and a digital receiver producing receive data from the desired receive signal plus self-interference.]
Fig. 8 SI manifests when a full-duplex transceiver attempts to simultaneously transmit and receive using the same frequency spectrum. Here, separate antennas are used for transmission and reception, with analog and digital SIC used to reconstruct and subsequently cancel SI incurred at the receiver.
[Figure: tapped delay line from input signal to output signal, with delays $\tau$ between taps, tap weights $x_0, x_1, x_2, \ldots, x_N$, and a summed output.]
Fig. 9 An example analog SIC filter (an N-tap FIR filter) with tunable tap weights $\{x_i\}$ and fixed, uniform tap delay $\tau$
Configuring an Analog Self-Interference Cancellation Filter Tuning an analog SIC filter to effectively cancel SI largely consists of measuring SI and then configuring the filter to reconstruct its inverse. Analog SIC can be implemented as a time-domain filter (as seen in Fig. 9) or as a frequency-domain filter, meaning particular methods may vary, but all tackle the same goal of reconstructing SI [2]. One method of time-domain analog SIC is to estimate the impulse response of the SI channel and then configure the analog filter to produce this (inverted) impulse response estimate, effectively equalizing SI. Estimation of the SI channel is typically executed by transmitting a pilot signal during a quiet period, when no desired receive signal is present. One difficulty with practically executing this method lies in the fact that estimation of the SI channel takes place digitally,
meaning estimation of the channel of interest for analog SIC may be complicated by artifacts of the transmit and receive chains before and after analog SIC. This can be further complicated by the fact that an analog filter may not have an ideal impulse response itself, making it difficult to reliably produce the desired impulse response. As an attempt to overcome these challenges, another approach is to measure SI and then measure the impulse response of the analog filter. Then, the filter can be configured to produce an inverted reconstruction of SI. For instance, consider a column vector of measured SI time-domain samples $\mathbf{y}$ (during a quiet period) and a matrix $\mathbf{A}$ whose $i$-th column is the measured impulse response of the $i$-th tap of the filter. Analog FIR filter weights $\mathbf{x}$ can be computed to minimize the error in reconstructing an inverted copy of the measured SI as
$$\mathbf{x}^{\star} = \arg\min_{\mathbf{x}} \left\| -\mathbf{y} - \mathbf{A}\mathbf{x} \right\|_2^2, \tag{7}$$
which has the well-known closed-form solution $\mathbf{x}^{\star} = -(\mathbf{A}^{*}\mathbf{A})^{-1}\mathbf{A}^{*}\mathbf{y}$. This approach has been shown to be fairly robust, since it accounts for the imperfect impulse response of each of the filter's taps. While it may seem fairly straightforward to implement analog SIC, it is practically quite challenging in most cases, especially outside of well-controlled lab settings. Most notably, there is little margin for error in SIC due to the overwhelming strength of SI, reinforcing the need for extremely accurate, adaptable, and low-overhead SI reconstruction. Another challenging aspect is the need to miniaturize analog SIC filters into form-factors that integrate into devices such as cell phones, laptops, wireless routers, base stations, and the like [2]. Miniaturization is especially challenging in settings where the delay spread of SI has the potential to be high, since propagation delays need to be physically realized within the analog SIC filter.
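The least-squares configuration in (7) can be sketched in a few lines. The example below is a minimal, noise-free illustration in pure Python under stated assumptions: the tap responses produced by `tap_response`, the weights `x_true`, and all function names are hypothetical stand-ins for measured data, and the normal equations are solved by Gaussian elimination only to keep the sketch free of dependencies. A practical implementation would more likely call a numerically robust least-squares routine from a linear algebra library.

```python
def tap_response(i, num_samples):
    """Hypothetical measured impulse response of tap i: a unit impulse at
    delay i plus a small leakage into the following sample."""
    h = [0j] * num_samples
    h[i] = 1.0 + 0.0j
    if i + 1 < num_samples:
        h[i + 1] = 0.1 - 0.05j
    return h


def normal_equations(A, y):
    """Form A^* A and -A^* y, so that solving gives x = -(A^* A)^{-1} A^* y."""
    m, n = len(A), len(A[0])
    gram = [[sum(A[k][i].conjugate() * A[k][j] for k in range(m))
             for j in range(n)] for i in range(n)]
    rhs = [-sum(A[k][i].conjugate() * y[k] for k in range(m)) for i in range(n)]
    return gram, rhs


def solve(G, b):
    """Gaussian elimination with partial pivoting (works for complex entries)."""
    n = len(b)
    M = [list(G[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / M[i][i]
    return x


num_samples, num_taps = 8, 3
columns = [tap_response(i, num_samples) for i in range(num_taps)]
# A[k][i] is sample k of the measured response of tap i (columns of A).
A = [[columns[i][k] for i in range(num_taps)] for k in range(num_samples)]

# Synthetic SI that the filter can cancel exactly with weights x_true.
x_true = [0.5 + 0.2j, -0.3j, 0.1 + 0.0j]
y = [-sum(A[k][i] * x_true[i] for i in range(num_taps)) for k in range(num_samples)]

x_star = solve(*normal_equations(A, y))
# Residual SI after injecting the filter output: y + A x.
residual = [y[k] + sum(A[k][i] * x_star[i] for i in range(num_taps))
            for k in range(num_samples)]
print("weights:", x_star)
print("residual SI energy:", sum(abs(r) ** 2 for r in residual))
```

Because the synthetic SI lies exactly in the span of the tap responses, the recovered weights match `x_true` up to rounding and the residual is negligible. With measured data, noise and model mismatch would leave a nonzero residual.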
3.2 Digital Self-Interference Cancellation Cancelling SI through digital signal processing is naturally an attractive option in addition to analog SIC. The flexibility and sophistication of digital filtering can be applied to estimate and cancel SI and have had impressive success [24, 33, 34]. As depicted in Figs. 7 and 8, digital SIC is executed after analog SIC and therefore aims to cancel residual SI that remains after prior efforts of cancellation. Naturally, one may ask whether digital SIC can cancel all SI, rendering analog SIC unnecessary. In general, this is not possible for a few reasons, stemming from hardware limitations and nonidealities. Limited Dynamic Range of Analog-to-Digital Converters With reasonable resolution and appropriate gain control before analog-to-digital conversion, quantization
noise is rarely an issue in traditional half-duplex systems. In full-duplex systems, on the other hand, the strength of SI has the potential to saturate analog-to-digital converters (ADCs), even with ideal automatic gain control (AGC) and a reasonable number of bits [1, 33]. Since AGC acts on the combination of a desired signal plus SI and noise at the ADC input, the strength of quantization noise is dictated largely by that of SI. In such cases, only a portion of the ADC’s total dynamic range is effectively used to quantize a desired signal. Consequently, even if SI could be completely reconstructed and cancelled digitally, its effects may remain in the form of increased quantization noise, which can severely and irreversibly degrade the quality of a desired receive signal. This reinforces the need to sufficiently cancel SI before it reaches the ADC input, which is often most reliably done via analog SIC. Transceiver Nonidealities Practical transceivers introduce nonidealities in the transmit and receive chains, such as amplifier nonlinearities, I/Q imbalance, transmitter thermal noise, and phase noise, which complicate digital SIC since the digital domain does not have knowledge of these imperfections [24]. When these nonidealities are not negligible, this requires digital SIC to accurately estimate and subsequently cancel them, which can be computationally complex. To reduce this burden, analog SIC can be well positioned to cancel transmit-side impairments using the RF transmit signal as input to its cancellation filter, which inherently will include nonidealities introduced by transmit PAs and transmit thermal noise, for example. In addition, analog SIC can importantly ensure that the power of residual SI is sufficiently low such that it does not overwhelm and saturate receive-chain components such as low noise amplifiers (LNAs), which practically have a limited dynamic range that can be exceeded by SI [33]. 
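The dynamic-range argument above can be made concrete with a rough calculation. The sketch below relies on assumptions not stated in the chapter: it uses the textbook rule of thumb that an ideal b-bit converter driven by a full-scale sinusoid achieves a signal-to-quantization-noise ratio (SQNR) of about 6.02b + 1.76 dB, and the bit depth, AGC headroom, and SI levels are hypothetical example values. The function names are illustrative.

```python
import math


def ideal_sqnr_db(num_bits):
    """Rule-of-thumb SQNR of an ideal ADC for a full-scale sinusoid."""
    return 6.02 * num_bits + 1.76


def desired_sqnr_db(num_bits, si_to_desired_db, headroom_db=0.0):
    """Approximate ratio of desired-signal power to quantization noise when
    AGC scales the sum of SI and the desired signal to sit headroom_db below
    full scale (thermal noise is neglected in the AGC target for simplicity)."""
    # Power of (desired + SI) relative to the desired signal alone, in dB.
    total_over_desired_db = 10.0 * math.log10(
        1.0 + 10.0 ** (si_to_desired_db / 10.0))
    return ideal_sqnr_db(num_bits) - headroom_db - total_over_desired_db


# Assumed example: 12-bit ADC, 10 dB headroom, SI 60 dB above the desired signal.
without_analog_sic = desired_sqnr_db(12, 60.0, headroom_db=10.0)
# Same setup after a hypothetical 40 dB of analog SIC ahead of the ADC.
with_analog_sic = desired_sqnr_db(12, 20.0, headroom_db=10.0)
print(round(without_analog_sic, 1), round(with_analog_sic, 1))
```

Under these assumptions, quantization noise sits only about 4 dB below the desired signal without analog SIC, compared with roughly 44 dB after 40 dB of analog SIC. No amount of digital cancellation can recover the difference, since the quantization noise has already been added.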
Recent Breakthroughs Using Machine Learning In addition to classical signal processing approaches for digital SIC, solutions based on machine learning have been gaining traction and have shown impressive results [11, 35–40]. The main motivation for the use of machine learning over classical approaches for digital SIC is to capture transceiver nonidealities with reduced complexity. Classical signal processing approaches have been able to effectively estimate and account for transceiver impairments when reconstructing and subsequently cancelling SI. This is done by modeling transceiver impairments with established, parameterized models, but the estimation of model parameters is computationally expensive with classical methods. Machine learning has been shown to offer performance comparable to that of classical methods in capturing transceiver impairments when reconstructing SI, but with reduced complexity. Experimental validation of these digital SIC solutions based on machine learning has proven their effectiveness [11, 35, 37]. In addition, machine learning can be used to reduce the complexity of SIC in multi-antenna systems. Rather than merely replicating single-antenna SIC solutions as an extension to multi-antenna systems, researchers have shown that machine learning can reduce the size and complexity of digital SIC by learning correlations between antennas [39].
[Figure: a three-port circulator with the transmitter at Port 1, a shared antenna at Port 2, and the receiver at Port 3; labels: transmit signal, desired receive signal, leakage, reflection, environment.]
Fig. 10 A circulator can be used to establish isolation between a transmitter and receiver sharing an antenna. Leakage through the circulator and reflections off the environment give rise to SI, however
3.3 Circulators and Antenna Isolation An RF component known as a circulator has been used in many monostatic radar and communication applications as a duplexer when a single antenna is used for simultaneous transmission and reception of RF signals [1, 2]. In its simplest form, a circulator is a three-port, passive device where a signal entering a given port is “circulated” to the next port in the rotation. An example of this device being used with a single antenna shared by transmission and reception can be seen in Fig. 10. Transmit signals enter Port 1 of the circulator and exit at Port 2, where they are radiated by the antenna. Signals received by the antenna enter Port 2 and are circulated to Port 3, where they exit the circulator and enter the receive chain. This establishes isolation between the transmitter and receiver of a full-duplex device using a single antenna for transmission and reception. One may reason that with an ideal circulator and with two radios operating in free space, full-duplex operation is trivial since perfect isolation is achieved between a radio’s transmit signal and a desired receive signal. In reality, a circulator effectively offers limited RF isolation between its ports, which introduces SI at the receiver. This is due to a number of factors, most notably the leakage between ports, reflections caused by imperfect matching at the antenna, and reflections off the environment. Circulators with small form-factors that offer high isolation for full-duplex are an active area of research with immense potential [41, 42]. Nonetheless, analog and digital SIC can be used in conjunction with a circulator for single-antenna full-duplex transceivers. In such cases, SIC aims to cancel circulator leakage, as well as reflections off the environment and from the antenna.
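As a rough illustration of why a circulator alone is typically not enough, the sketch below totals the SI contributed by port leakage and antenna mismatch. All numbers are hypothetical: 20 dB of circulator isolation and 15 dB of antenna return loss are assumed, together with the 20 dBm transmit power and −90 dBm noise floor from the example in Sect. 2.1. The two contributions are assumed to add in power, and reflections off the environment and circulator insertion loss are ignored.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert power from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def mw_to_dbm(p_mw):
    """Convert power from milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)


def si_after_circulator_dbm(p_tx_dbm, isolation_db, return_loss_db):
    """SI power at the receive port from leakage (Port 1 to Port 3) plus the
    antenna reflection circulated from Port 2 to Port 3, added in power."""
    leakage_mw = dbm_to_mw(p_tx_dbm - isolation_db)
    reflection_mw = dbm_to_mw(p_tx_dbm - return_loss_db)
    return mw_to_dbm(leakage_mw + reflection_mw)


p_si = si_after_circulator_dbm(p_tx_dbm=20.0, isolation_db=20.0, return_loss_db=15.0)
# Further cancellation needed to bring SI down to a -90 dBm noise floor.
remaining_sic_db = p_si - (-90.0)
print(round(p_si, 1), round(remaining_sic_db, 1))
```

With these assumed values, SI at the receiver is still about 6 dBm, leaving roughly 96 dB for analog and digital SIC to remove. In this example the antenna reflection, not the circulator leakage, is the larger contribution.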
Takeaways Digital SIC is an effective and flexible route to enabling full-duplex, but it is bottlenecked in practice by imperfections and limitations of hardware. As such, it is often used in conjunction with analog SIC, which can inherently account for hardware impairments, relax the requirements of digital SIC, and prevent SI from saturating receive-chain components. Circulators and other duplexers can provide isolation between the transmitter and receiver of a full-duplex transceiver when using a single antenna, effectively weakening the SI that analog and digital SIC must tackle.
Full-Duplex Transceivers for Next-Generation Wireless Communication Systems
4 A New Frontier: Full-Duplex Millimeter-Wave Systems

To meet the ever-growing demand for high-rate wireless access, cellular networks have turned to mmWave carrier frequencies, typically classified as ranging from around 30 GHz to 100 GHz [43]. Fifth-generation (5G) cellular networks and IEEE 802.11ad/ay, for instance, leverage frequency bands that span hundreds of megahertz or more. These wide swaths of spectrum facilitate higher data rates and enable new applications in entertainment, industry, and sensing. The widespread deployment of mmWave networks has faced hurdles thus far, but they are expected to see greater success through the end of the 2020s. Both mmWave communication systems and full-duplex technology were explored concurrently during the 2010s and were proposed as core technologies for next-generation wireless networks. The combination of the two, full-duplex mmWave communication systems, has not been explored as extensively. Only recently has this topic garnered noteworthy attention from industry and academia [8, 9, 44, 45].
4.1 Prelude: Full-Duplex MIMO Transceivers

With multiple antennas at a transmitter and a receiver, there is the potential to multiplex more than one data stream over the resulting multiple-input multiple-output (MIMO) wireless channel via spatial signal processing [20]. MIMO communication transformed wireless networks forever by offering multiplicative rate gains over traditional single-input single-output communication systems. Given their prominence, the extension of full-duplex to MIMO transceivers was imperative. With multiple antennas at the transmitter and receiver of a full-duplex transceiver, SI is inflicted onto each receive antenna by each transmit antenna, leading to a MIMO SI channel. Analog and digital SIC extend quite naturally to full-duplex MIMO transceivers [24, 46, 47]. Perhaps more exciting, though, is the potential to mitigate SI through precoding and combining (i.e., spatial processing) at the transmitter and receiver of the full-duplex transceiver [48–51]. By strategically transmitting energy into the SI channel and receiving energy from it, SI can be mitigated spatially while still communicating desired signals in a MIMO fashion. As the number of antennas grows, and especially in the massive MIMO regime, the prospects of spatial cancellation are even more promising. There is extensive literature on the
I. P. Roberts and H. A. Suraweera
subject of full-duplex MIMO systems; we encourage interested readers to explore [46–52] and references therein for more details. In the remainder of this section, we consider a particular class of full-duplex MIMO transceivers: those at mmWave frequencies. Solutions for full-duplex mmWave systems draw inspiration from those for traditional full-duplex MIMO systems at sub-6 GHz but face unique challenges and are subject to new transceiver- and network-level factors.
4.2 What’s New at Millimeter-Wave Frequencies?

Communication at mmWave is more than a mere shift in carrier frequency, as elegantly stated in [53]. In general, path loss increases with carrier frequency, which necessitates the use of dense antenna arrays to supply high beamforming gains that can deliver link margins that sustain high-rate communication. Antenna arrays on mmWave network infrastructure are typically equipped with 64–256 antennas, whereas user equipment may be equipped with 4–16 elements. Fortunately, antenna footprint shrinks as carrier frequency increases, allowing dense antenna arrays to fit in convenient form-factors. The severe path loss and susceptibility to blockage at mmWave frequencies, coupled with highly directional beamforming, reduce inter-user interference and facilitate base station deployments much denser than traditional sub-6 GHz macrocell deployments. All of this has led to new transceiver-level and network-level challenges and solutions at mmWave.
4.3 Exciting Potential to Tackle Self-Interference via Beamforming

In Sect. 4.1, we mentioned that a multi-antenna full-duplex system can design its transmit precoder and receive combiner to reduce SI coupled over the MIMO SI channel. At mmWave, the prospect of such spatial SIC notably improves for a few reasons [8, 9, 44]. First of all, path loss and blockage increase at mmWave frequencies, compared to sub-6 GHz frequencies, which presumably weakens SI as it couples from a transmitter to receiver, both directly over the air and due to reflections off the environment; reflectivity increases at mmWave, however, which may lead to more SI from the environment. Second, with denser antenna arrays, the degrees of freedom available to cancel SI increase, compared to traditional sub-6 GHz MIMO systems, which typically only have 2–8 antennas. It is important to note that, as the number of antennas has increased immensely at mmWave, the number of data streams communicated has remained of similar order. Hybrid analog/digital beamforming and analog-only beamforming architectures, which are ubiquitous thus far in mmWave communication systems, fundamentally limit the number of data streams to the number of digital-to-analog converters (DACs) and
Fig. 11 The rate region boundaries for TDD, FDD, and full-duplex using analog SIC, beamforming cancellation (shown as BFC), and the combination of the two. Beamforming cancellation consumes spatial resources, which introduces a gap between it and the full-duplex capacity. This gap can be reduced via analog and/or digital SIC. Reproduced from [9] with permission
[Figure: axes are Transmit Link Spectral Efficiency and Receive Link Spectral Efficiency; labels are Full-Duplex Capacity, Transmit Zero-Forcing, Receive Zero-Forcing, BFC, BFC + Analog SIC, Imperfect Analog SIC, Time-Division Duplexing, and Frequency-Division Duplexing]
ADCs, which is comparable to that in conventional sub-6 GHz MIMO systems. With one or few data streams communicated and tens or hundreds of antennas, many degrees of freedom are potentially available for cancelling SI via strategic design of transmit and receive beamformers at a full-duplex mmWave transceiver. Excitingly, if beamforming can cancel SI sufficiently, the need for analog and/or digital SIC vanishes, meaning there is the potential for full-duplex mmWave systems to operate without any additional hardware or complex signal processing. Plenty of existing work has highlighted this by designing transmit and receive beamformers in such a way that SI is mitigated while maintaining downlink and uplink to users [8, 9, 44, 48, 52, 54–65]. Interestingly, unlike traditional analog and digital SIC solutions, transmit beamforming introduces the unique opportunity to reduce the degree of SI that ever reaches the receive antennas. Like analog SIC, transmit and receive beamforming can be used to prevent SI from saturating receive chain components. For instance, in [64], a hybrid beamforming design is presented that guarantees SI is below some power level at each LNA and each ADC at the receiver of the full-duplex mmWave device, ensuring they do not saturate. In addition to using beamforming alone to mitigate SI, researchers have also considered analog and/or digital SIC in conjunction, which relaxes the cancellation requirements of beamforming, allowing it to better serve uplink and downlink at the cost of added hardware or signal processing [47, 65–71]. The rate region boundaries with spatial cancellation via beamforming (sometimes termed beamforming cancellation) and analog SIC are shown in Fig. 11 [9].
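To make spatial cancellation concrete, the following Python sketch illustrates receive zero-forcing in its simplest form: a receive beam is projected onto the subspace orthogonal to the SI direction, so that no SI couples through it. This is only a toy illustration under stated assumptions, not a design from the cited works. It assumes fully digital (unquantized) beamforming weights, perfect knowledge of the SI channel, and a single transmit beam; the helper names `matvec`, `inner`, and `receive_zero_forcing` and all numerical values are hypothetical.

```python
import cmath
import math


def matvec(H, x):
    """Product of matrix H (list of rows) with column vector x."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]


def inner(a, b):
    """Hermitian inner product a^* b of two equal-length complex vectors."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))


def receive_zero_forcing(w0, H, f):
    """Remove from w0 its component along g = H f, so that w^* H f = 0."""
    g = matvec(H, f)
    alpha = inner(g, w0) / inner(g, g)
    return [wi - alpha * gi for wi, gi in zip(w0, g)]


# Hypothetical toy setup: 2 transmit antennas and 3 receive antennas.
H = [[0.9 + 0.1j, 0.2 - 0.3j],
     [0.4 - 0.2j, 0.7 + 0.5j],
     [0.1 + 0.6j, -0.3 + 0.2j]]
f = [1 / math.sqrt(2), cmath.exp(1j * math.pi / 4) / math.sqrt(2)]
h_rx = [1.0 + 0.0j, 0.5 - 0.5j, -0.2 + 0.8j]

w_mf = h_rx                              # matched filter toward the uplink user
w_zf = receive_zero_forcing(w_mf, H, f)  # same beam with the SI direction removed

si_before = abs(inner(w_mf, matvec(H, f))) ** 2  # |w^* H f|^2 before projection
si_after = abs(inner(w_zf, matvec(H, f))) ** 2   # numerically zero after projection
gain_before = abs(inner(w_mf, h_rx)) ** 2        # desired-signal coupling before
gain_after = abs(inner(w_zf, h_rx)) ** 2         # desired-signal coupling after
```

Comparing `gain_after` with `gain_before` shows the price paid for the null: one spatial degree of freedom is spent on SI rather than on the desired signal. This is the sense in which beamforming cancellation consumes spatial resources.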
Fig. 12 A full-duplex mmWave transceiver transmits to a downlink user while receiving from an uplink user in-band. By strategically constructing its analog beamformers $\mathbf{f}$ and $\mathbf{w}$, it can reduce the level of SI coupled across the MIMO channel $\mathbf{H}$ while delivering service to the users.
[Figure labels: Full-Duplex Transceiver; Transmitter with $N_t$ antennas and beamformer $\mathbf{f}$; Receiver with $N_r$ antennas and beamformer $\mathbf{w}$; phase and amplitude resolutions $b_{\mathrm{phs}}$ and $b_{\mathrm{amp}}$; Self-Interference Channel $\mathbf{H}$; Cross-Link Interference; DL; UL]
4.4 Example Beamforming Design Problems

We now overview two sample spatial SIC design problems for full-duplex mmWave systems, with the goal of introducing readers to the design objectives and considerations surrounding such research problems and those taking a similar form.

Analog Beamforming Design Consider the system illustrated in Fig. 12. Suppose a full-duplex mmWave base station equipped with analog-only beamforming (as opposed to hybrid digital/analog beamforming [53]) is serving an uplink user and a downlink user simultaneously and in-band, both of which are single-antenna devices. It is currently practical to consider the use of separate, independently controlled transmit and receive arrays at the base station equipped with $N_t$ and $N_r$ antennas, respectively [9]. Let $\mathbf{f} \in \mathbb{C}^{N_t \times 1}$ and $\mathbf{w} \in \mathbb{C}^{N_r \times 1}$ be the transmit and receive beamforming vectors used at the full-duplex base station. Let $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ be the MIMO SI channel matrix manifesting between the transmit and receive arrays. Let $\mathbf{h}_{\mathrm{tx}}^{*} \in \mathbb{C}^{1 \times N_t}$ be the channel vector from the base station to the downlink user. Let $\mathbf{h}_{\mathrm{rx}} \in \mathbb{C}^{N_r \times 1}$ be the channel vector from the uplink user to the base station. The full-duplex base station will rely solely on beamforming to mitigate SI with no additional analog or digital SIC. Practical analog beamforming networks are composed of digitally controlled phase shifters. In addition to phase control, some networks also offer quantized amplitude control through digitally controlled attenuators or variable-gain amplifiers (VGAs) (see Sect. 4.5). Quantized phase and amplitude control confines all physically realizable analog beamformers to come from some discrete sets, which can be captured as simply $\mathbf{f} \in \mathcal{F}$ and $\mathbf{w} \in \mathcal{W}$. The transmit link and receive link SNRs are functions of their beamformers and can be expressed as

$$\mathrm{SNR}_{\mathrm{tx}}(\mathbf{f}) = \frac{P_{\mathrm{tx}}^{\mathrm{BS}} \cdot |\mathbf{h}_{\mathrm{tx}}^{*} \mathbf{f}|^2}{P_{\mathrm{noise}}^{\mathrm{UE}}}, \qquad \mathrm{SNR}_{\mathrm{rx}}(\mathbf{w}) = \frac{P_{\mathrm{tx}}^{\mathrm{UE}} \cdot |\mathbf{w}^{*} \mathbf{h}_{\mathrm{rx}}|^2}{P_{\mathrm{noise}}^{\mathrm{BS}}}, \tag{8}$$
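where $P_{\mathrm{tx}}^{\mathrm{BS}}$ and $P_{\mathrm{tx}}^{\mathrm{UE}}$ are the transmit powers of the base station and the uplink user, while $P_{\mathrm{noise}}^{\mathrm{BS}}$ and $P_{\mathrm{noise}}^{\mathrm{UE}}$ are the noise powers of the base station and the downlink user. The SI and cross-link interference terms of the system are

$$\mathrm{INR}_{\mathrm{rx}}(\mathbf{f}, \mathbf{w}) = \frac{P_{\mathrm{tx}}^{\mathrm{BS}} \cdot |\mathbf{w}^{*} \mathbf{H} \mathbf{f}|^2}{P_{\mathrm{noise}}^{\mathrm{BS}}}, \qquad \mathrm{INR}_{\mathrm{tx}} = \frac{P_{\mathrm{tx}}^{\mathrm{UE}} \cdot |h_{\mathrm{CL}}|^2}{P_{\mathrm{noise}}^{\mathrm{UE}}}, \tag{9}$$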
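where $\mathrm{INR}_{\mathrm{rx}}$ is a function of the transmit and receive beams at the full-duplex base station and $\mathrm{INR}_{\mathrm{tx}}$ is solely a function of the cross-link channel $h_{\mathrm{CL}}$ from the uplink user to the downlink user. Together, these desired and interference terms form the SINRs of the two links as

$$\mathrm{SINR}_{\mathrm{tx}}(\mathbf{f}) = \frac{\mathrm{SNR}_{\mathrm{tx}}(\mathbf{f})}{1 + \mathrm{INR}_{\mathrm{tx}}}, \qquad \mathrm{SINR}_{\mathrm{rx}}(\mathbf{f}, \mathbf{w}) = \frac{\mathrm{SNR}_{\mathrm{rx}}(\mathbf{w})}{1 + \mathrm{INR}_{\mathrm{rx}}(\mathbf{f}, \mathbf{w})}, \tag{10}$$

which determine the achievable spectral efficiencies as

$$R_{\mathrm{tx}}(\mathbf{f}) = \log_2\left(1 + \mathrm{SINR}_{\mathrm{tx}}(\mathbf{f})\right), \qquad R_{\mathrm{rx}}(\mathbf{f}, \mathbf{w}) = \log_2\left(1 + \mathrm{SINR}_{\mathrm{rx}}(\mathbf{f}, \mathbf{w})\right), \tag{11}$$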
when treating interference as noise. Notice that the transmit link quality $\mathrm{SINR}_{\mathrm{tx}}$ and therefore its spectral efficiency $R_{\mathrm{tx}}$ are solely functions of the transmit beam $\mathbf{f}$. The fate of the receive link, however, is determined by both $\mathbf{w}$ and $\mathbf{f}$ due to SI. It would be desirable to design $\mathbf{f}$ and $\mathbf{w}$ such that they deliver high SNRs and couple low SI. This motivates the following analog beamforming design problem, which aims to maximize the sum spectral efficiency of the system, while requiring the transmit and receive beams to be physically realizable.

$$\begin{align} \max_{\mathbf{f}, \mathbf{w}} \quad & R_{\mathrm{tx}}(\mathbf{f}) + R_{\mathrm{rx}}(\mathbf{f}, \mathbf{w}) \tag{12a} \\ \mathrm{s.t.} \quad & \mathbf{f} \in \mathcal{F}, \ \mathbf{w} \in \mathcal{W} \tag{12b} \end{align}$$
Several existing works on mmWave full-duplex aim to solve this problem or one of similar form [60, 61, 72]. In general, this problem is difficult to solve due to the non-convexity of the objective from the coupling of $\mathbf{f}$ and $\mathbf{w}$ and the fact that $\mathcal{F}$ and $\mathcal{W}$ are non-convex sets. Researchers typically instead tackle problems that are more readily solved and still yield high sum spectral efficiency. One practical issue with solving analog beamforming problems of this type is that they are executed for each user pair, which can consume prohibitive amounts of radio resources (e.g., for channel estimation and over-the-air feedback) and computational resources. In fact, the time required to solve these sorts of problems may not translate to timescales of practical systems, even with modern computing power. Moreover, many existing solutions rely on unrealistic assumptions, such as real-time downlink/uplink MIMO channel knowledge (i.e., $\mathbf{h}_{\mathrm{tx}}^{*}$ and $\mathbf{h}_{\mathrm{rx}}$), which is not obtainable in practical mmWave systems today.
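As a rough illustration of problem (12), the sketch below evaluates the sum spectral efficiency defined through (8)–(11) and finds its maximizer by exhaustive search over all phase-quantized beams of a deliberately tiny array. It is not the approach of the cited works. It assumes phase-only control with unit-norm beams, perfect channel knowledge, and hypothetical channel and power values. The exhaustive search is feasible only because the toy sets contain 16 beams each; with $N$ antennas and $b$ phase bits the set size grows as $2^{bN}$, which is one way to see why the problem is hard in practice.

```python
import cmath
import itertools
import math


def inner(a, b):
    """Hermitian inner product a^* b of two equal-length complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def matvec(H, x):
    """Product of matrix H (list of rows) with column vector x."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]


def phase_codebook(n_antennas, bits):
    """All unit-norm beams whose entries use one of 2**bits uniform phases."""
    phases = [2 * math.pi * k / 2 ** bits for k in range(2 ** bits)]
    scale = 1 / math.sqrt(n_antennas)
    return [[scale * cmath.exp(1j * th) for th in combo]
            for combo in itertools.product(phases, repeat=n_antennas)]


def sum_spectral_efficiency(f, w, H, h_tx, h_rx, h_cl, p):
    """Evaluate R_tx(f) + R_rx(f, w) following (8)-(11).

    h_tx is stored as a column vector so that inner(h_tx, f) equals h_tx^* f.
    """
    snr_tx = p["P_tx_BS"] * abs(inner(h_tx, f)) ** 2 / p["P_noise_UE"]
    snr_rx = p["P_tx_UE"] * abs(inner(w, h_rx)) ** 2 / p["P_noise_BS"]
    inr_rx = p["P_tx_BS"] * abs(inner(w, matvec(H, f))) ** 2 / p["P_noise_BS"]
    inr_tx = p["P_tx_UE"] * abs(h_cl) ** 2 / p["P_noise_UE"]
    r_tx = math.log2(1 + snr_tx / (1 + inr_tx))
    r_rx = math.log2(1 + snr_rx / (1 + inr_rx))
    return r_tx + r_rx


# Hypothetical toy values (linear scale), chosen only for illustration.
H = [[0.8 + 0.2j, 0.3 - 0.1j],
     [0.2 + 0.4j, 0.6 - 0.3j]]
h_tx = [1.0 + 0.0j, 0.5 + 0.5j]
h_rx = [0.7 - 0.2j, 0.4 + 0.6j]
h_cl = 0.05
p = {"P_tx_BS": 1.0, "P_tx_UE": 1.0, "P_noise_BS": 0.01, "P_noise_UE": 0.01}

F = phase_codebook(2, 2)  # stands in for the realizable set of transmit beams
W = phase_codebook(2, 2)  # stands in for the realizable set of receive beams


def rate(f, w):
    return sum_spectral_efficiency(f, w, H, h_tx, h_rx, h_cl, p)


best_f, best_w = max(itertools.product(F, W), key=lambda fw: rate(*fw))
best_rate = rate(best_f, best_w)
```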
Analog Beamforming Codebook Design In the previous example, we considered the problem of designing transmit and receive beams that maximize sum spectral efficiency. Now, to circumvent some of the practical challenges mentioned in the previous example, let us consider the goal of designing transmit and receive codebooks that maximize sum spectral efficiency in full-duplex mmWave systems. Let us build on the previous example and the notation used therein. Suppose it is required that the transmit beam $\mathbf{f}$ be selected from some set of $M_{\mathrm{tx}}$ transmit beams $\{\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_{M_{\mathrm{tx}}}\}$, called a transmit codebook, which is common in practical systems. Likewise, suppose it is required that the receive beam $\mathbf{w}$ be selected from some codebook of $M_{\mathrm{rx}}$ receive beams $\{\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_{M_{\mathrm{rx}}}\}$. Here, $M_{\mathrm{tx}}$ and $M_{\mathrm{rx}}$ are on the order of tens or hundreds at most, meaning it is realistically the case that $M_{\mathrm{tx}} \leq |\mathcal{F}|$ and $M_{\mathrm{rx}} \leq |\mathcal{W}|$. It is fairly straightforward to design transmit and receive codebooks for traditional half-duplex mmWave systems, typically done by tessellating beams to cover a desired service region to ensure that a user falling in this region can be served with high gain with at least one beam from the codebook. The design of codebooks for full-duplex mmWave systems, on the other hand, is far more involved and has only been investigated in [54, 73] thus far. Consider the following design problem, which aims to design codebook matrices $\mathbf{F} \in \mathbb{C}^{N_t \times M_{\mathrm{tx}}}$ and $\mathbf{W} \in \mathbb{C}^{N_r \times M_{\mathrm{rx}}}$ that maximize the expected sum spectral efficiency across a known user distribution for a given $\mathbf{H}$, for instance.

$$\begin{align} \max_{\mathbf{F}, \mathbf{W}} \quad & \mathbb{E}\left[ \max_{\mathbf{f}, \mathbf{w}} \ R_{\mathrm{tx}}(\mathbf{f}) + R_{\mathrm{rx}}(\mathbf{f}, \mathbf{w}) \right] \tag{13a} \\ \mathrm{s.t.} \quad & \mathbf{f} = [\mathbf{F}]_{:,i}, \ \mathbf{w} = [\mathbf{W}]_{:,j} \tag{13b} \\ & [\mathbf{F}]_{:,i} \in \mathcal{F} \ \forall \ i = 1, \dots, M_{\mathrm{tx}} \tag{13c} \\ & [\mathbf{W}]_{:,j} \in \mathcal{W} \ \forall \ j = 1, \dots, M_{\mathrm{rx}} \tag{13d} \end{align}$$
Here, these codebook matrices are structured such that the $i$-th column of $\mathbf{F}$ is $\mathbf{f}_i$ and the $j$-th column of $\mathbf{W}$ is $\mathbf{w}_j$, both of which are required to be physically realizable beamforming vectors, hence (13c) and (13d). To serve each user pair, some transmit and receive beams $\mathbf{f}$ and $\mathbf{w}$ are selected from their respective codebooks. With random user placement, any transmit and receive beam may be chosen from the codebooks, meaning desirable codebooks $\mathbf{F}$ and $\mathbf{W}$ would offer low $\mathrm{INR}_{\mathrm{rx}}(\mathbf{f}, \mathbf{w})$ for all possible $\mathbf{f}$ and $\mathbf{w}$ while still being capable of delivering high beamforming gains. This is a difficult problem to solve, largely due to the daunting objective of maximizing average sum spectral efficiency, but it provides good direction for desirable codebooks $\mathbf{F}$ and $\mathbf{W}$. In [54, 73], a similar problem is tackled, as shown below in problem (14). Here, the objective is instead to minimize average SI coupled between possible transmit and receive beams across the channel $\mathbf{H}$, effectively minimizing average $\mathrm{INR}_{\mathrm{rx}}(\mathbf{f}, \mathbf{w})$. In doing so, high beamforming gain and broad coverage over some transmit and receive coverage regions are enforced by (14b) and (14c), where $\sigma_{\mathrm{tx}}^2$
Fig. 13 (left) A broadside beam from two conventional beamforming codebooks. (right) Beams created by [54, 73] that span a coverage region with high gain while coupling low SI

Fig. 14 A mmWave base station conducts beam alignment by sweeping a codebook of candidate beams, selecting that which maximizes SNR to serve a given user
[Figure labels: Base Station (BS); User; Codebook of Beams; Beam Sweep; Beam Alignment]
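and $\sigma_{\mathrm{rx}}^2$ are design parameters that throttle the so-called coverage variance of each codebook. In essence, these constraints ensure codebooks can reliably deliver high $\mathrm{SNR}_{\mathrm{tx}}$ and high $\mathrm{SNR}_{\mathrm{rx}}$, while the objective aims to minimize $\mathrm{INR}_{\mathrm{rx}}$.

$$\begin{align} \min_{\mathbf{F}, \mathbf{W}} \quad & \left\| \mathbf{W}^{*} \mathbf{H} \mathbf{F} \right\|_{\mathrm{F}}^2 \tag{14a} \\ \mathrm{s.t.} \quad & \left\| N_t \cdot \mathbf{1} - \mathrm{diag}\left(\mathbf{A}_{\mathrm{tx}}^{*} \mathbf{F}\right) \right\|_2^2 \leq \sigma_{\mathrm{tx}}^2 \cdot N_t^2 \cdot M_{\mathrm{tx}} \tag{14b} \\ & \left\| N_r \cdot \mathbf{1} - \mathrm{diag}\left(\mathbf{A}_{\mathrm{rx}}^{*} \mathbf{W}\right) \right\|_2^2 \leq \sigma_{\mathrm{rx}}^2 \cdot N_r^2 \cdot M_{\mathrm{rx}} \tag{14c} \\ & [\mathbf{F}]_{:,i} \in \mathcal{F} \ \forall \ i = 1, \dots, M_{\mathrm{tx}} \tag{14d} \\ & [\mathbf{W}]_{:,j} \in \mathcal{W} \ \forall \ j = 1, \dots, M_{\mathrm{rx}} \tag{14e} \end{align}$$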
A codebook of beams output by this design is shown in Fig. 13. Compared to traditional beams, the beams produced by [54, 73] make use of side lobes to cancel SI spatially while providing adequate coverage across the service region from $-60^\circ$ to $60^\circ$. The codebooks designed with this framework proved to offer far greater robustness to SI and similar beamforming gain when compared to conventional codebooks, which allowed them to deliver sum spectral efficiencies $R_{\mathrm{tx}} + R_{\mathrm{rx}}$ that approach the full-duplex capacity without analog or digital SIC. Designs like this
are particularly exciting because they have the potential to accommodate codebook-based beam alignment while also mitigating SI through beamforming. For more details and more extensive evaluation of this design, please see [54, 73].
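The objective (14a) has a simple interpretation that a few lines of Python can make explicit: the squared Frobenius norm of $\mathbf{W}^{*} \mathbf{H} \mathbf{F}$ is the sum of $|\mathbf{w}_j^{*} \mathbf{H} \mathbf{f}_i|^2$ over every transmit-receive beam pair in the two codebooks. The sketch below only evaluates this quantity for given codebooks; it does not solve problem (14), and it omits the coverage constraints (14b) and (14c). The function name `si_coupling` and the two-antenna toy channel are hypothetical.

```python
import math


def inner(a, b):
    """Hermitian inner product a^* b of two equal-length complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def matvec(H, x):
    """Product of matrix H (list of rows) with column vector x."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]


def si_coupling(F_cols, W_cols, H):
    """Objective (14a): sum of |w_j^* H f_i|^2 over all beam pairs.

    F_cols and W_cols hold the codebook columns as lists of vectors.
    """
    total = 0.0
    for f in F_cols:
        g = matvec(H, f)  # SI direction excited by transmit beam f
        for w in W_cols:
            total += abs(inner(w, g)) ** 2
    return total


# Hypothetical rank-one SI channel between two 2-element arrays.
H = [[1.0 + 0.0j, 1.0 + 0.0j],
     [1.0 + 0.0j, 1.0 + 0.0j]]

s = 1 / math.sqrt(2)
b0 = [s + 0.0j, s + 0.0j]   # beam aligned with the SI channel
b1 = [s + 0.0j, -s + 0.0j]  # beam orthogonal to the SI channel

total_si = si_coupling([b0, b1], [b0, b1], H)
average_si = total_si / 4   # four transmit-receive beam pairs in total
```

In this toy case only the pair `(b0, b0)` couples any SI, so the total is dominated by a single beam pair. Minimizing the sum, as (14a) does, therefore targets the average coupling rather than the worst pair.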
4.5 Key Practical Challenges and Considerations

We now outline important considerations when designing solutions for practical full-duplex mmWave systems, some of which have already been touched on in this chapter.

Digitally Controlled Analog Beamforming Networks Unlike digital beamforming, which takes place in software/logic, analog beamforming is executed using phase shifter components, potentially along with attenuators and/or amplifiers, all of which are digitally controlled. In other words, some finite number of bits are dedicated to realizing the phase and amplitude of each beamforming weight. For instance, the discrete set of physically realizable phase settings by phase shifters with settings uniformly distributed between 0 and $2\pi$ with resolution $b_{\mathrm{phs}}$ bits can be expressed as

$$\mathcal{P} = \left\{ \theta_i = \frac{(i - 1) \cdot 2\pi}{2^{b_{\mathrm{phs}}}} : i = 1, \dots, 2^{b_{\mathrm{phs}}} \right\}, \tag{15}$$

meaning it is practically required, for the $i$-th beamforming weight $w_i$, that $\mathrm{angle}(w_i) \in \mathcal{P}$ for all $i$. In practice, it should be noted that both phase and amplitude control typically have some error associated with them, which is generally frequency-dependent. Some phased arrays employ phase shifters based on a vector modulator architecture, where the in-phase and quadrature components of a signal can be scaled independently to produce a desired phase shift, in which case phase shifter settings are presumably no longer uniformly spaced. Note that such an architecture also offers a means to scale the amplitude of the output signal. Practical beamforming-based full-duplex solutions for mmWave systems should account for the limitations imposed by a particular analog beamforming architecture. Quantized control of each beamforming weight leads to non-convex sets that are difficult to optimize over. Blindly projecting a solution onto the set of physically realizable beamforming vectors can be detrimental, as small errors in this full-duplex setting are magnified by the sheer strength of SI relative to a desired signal. This motivates the use of high-resolution phase shifters in full-duplex mmWave systems and/or ways to handle the non-convexity posed by quantized phase shifters.
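The phase set in (15) and the "blind projection" mentioned above can be sketched in a few lines of Python. The functions below build $\mathcal{P}$ for a given resolution and snap each beamforming weight to the nearest realizable phase while leaving its magnitude untouched. The function names are hypothetical, uniform phase spacing is assumed as in (15), and amplitude quantization and hardware errors are ignored.

```python
import cmath
import math


def phase_set(bits):
    """The set P in (15): 2**bits phases uniformly spaced over [0, 2*pi)."""
    return [(i - 1) * 2 * math.pi / 2 ** bits for i in range(1, 2 ** bits + 1)]


def nearest_phase(angle, bits):
    """Element of P closest to angle, measured on the circle (wraps at 2*pi)."""
    def circular_distance(theta):
        return abs(cmath.phase(cmath.exp(1j * (angle - theta))))
    return min(phase_set(bits), key=circular_distance)


def project_weights(w, bits):
    """Blindly snap each weight's phase to P, keeping its magnitude."""
    return [abs(wi) * cmath.exp(1j * nearest_phase(cmath.phase(wi), bits))
            for wi in w]


# Hypothetical unquantized weights and a 3-bit phase shifter (step of pi/4).
w = [cmath.exp(1j * 0.3), cmath.exp(1j * 2.0)]
w_q = project_weights(w, 3)

# Per-weight phase error introduced by the projection, in radians.
phase_errors = [abs(cmath.phase(a / b)) for a, b in zip(w, w_q)]
```

With $b_{\mathrm{phs}}$ bits, the per-weight phase error of this projection is at most $\pi / 2^{b_{\mathrm{phs}}}$. Such errors perturb a carefully placed null, which is one way to see why higher-resolution phase shifters help in the full-duplex setting.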
Accommodating Codebook-Based Beam Alignment Codebook-based analog beamforming is a critical component of mmWave communication systems [53, 74]. Rather than measure a high-dimensional MIMO channel and subsequently configure an analog beamforming network, modern mmWave systems instead rely on beam
alignment procedures to identify promising beamforming directions, typically via exploration of a codebook of candidate beams [53, 74, 75]. This offers a simple and robust way to configure an analog beamforming network without downlink/uplink MIMO channel knowledge a priori, which is not obtainable in practice. Let $\bar{\mathcal{F}}$ and $\bar{\mathcal{W}}$ be transmit and receive beamforming codebooks used at a full-duplex device. Executing beam alignment on each link independently (in a half-duplex fashion) would aim to solve (or approximately solve) the following problems or ones taking a similar form.

$$\mathbf{f}^{\star} = \underset{\mathbf{f} \in \bar{\mathcal{F}}}{\mathrm{argmax}} \ \mathrm{SNR}_{\mathrm{tx}}(\mathbf{f}), \qquad \mathbf{w}^{\star} = \underset{\mathbf{w} \in \bar{\mathcal{W}}}{\mathrm{argmax}} \ \mathrm{SNR}_{\mathrm{rx}}(\mathbf{w}) \tag{16}$$
If selecting the transmit and receive beams independently to maximize each link’s SNR, the selected beam pair may couple high SI when using traditional beamforming codebooks. In other words, $\mathrm{INR}_{\mathrm{rx}}(\mathbf{f}^{\star}, \mathbf{w}^{\star})$ may be much greater than 0 dB. In fact, this has been confirmed by recent measurements [76], which we cover shortly in Sect. 4.7. Therefore, one can imagine it would be preferable from a full-duplex perspective to jointly select transmit and receive beams that deliver high $\mathrm{SNR}_{\mathrm{tx}}$ and $\mathrm{SNR}_{\mathrm{rx}}$ and also couple low $\mathrm{INR}_{\mathrm{rx}}$. This is precisely the motivation for [77], which is also introduced in Sect. 4.7. If codebooks could be designed such that all possible $(\mathbf{f}^{\star}, \mathbf{w}^{\star})$ couple sufficiently low SI, then beam alignment may be conducted on the transmit and receive links independently, as shown in (16), with guarantees of low SI regardless of which beams are selected. This is the motivation for [54, 73] and the codebook design problem introduced in Sect. 4.4. Creating solutions like these that accommodate beam alignment will be critical to the deployment of full-duplex mmWave systems.

Self-Interference Channel Estimation and Limited Channel Knowledge As mentioned just previously, current practical mmWave systems circumvent downlink/uplink channel estimation via beam alignment, meaning they do not have knowledge of the transmit and receive MIMO channels $\mathbf{h}_{\mathrm{tx}}$ and $\mathbf{h}_{\mathrm{rx}}$. Practical beamforming-based solutions for full-duplex mmWave systems should account for this. Efficient and accurate estimation of the SI MIMO channel $\mathbf{H}$ is a research problem still in its infancy. This is largely because modeling $\mathbf{H}$ is still an open research problem itself, whose outcomes may inspire strategies for its estimation. MIMO channel estimation in mmWave transceivers is complicated by the sheer size of these channels and the fact that DACs and ADCs observe the channel through the lens of analog beamformers [53].
Routes to reduce estimation overhead would be valuable contributions, potentially by leveraging static portions of the SI channel (e.g., the direct coupling between the arrays) and/or by accurate channel modeling. Nonetheless, whatever SI channel estimation strategies are developed will naturally be imperfect to some degree, suggesting that practical designs should be robust to channel estimation error. Robustness is especially critical in full-duplex settings, since small errors in mitigating SI can be detrimental due to its overwhelming strength.
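Returning to the beam alignment discussion around (16), the sketch below contrasts independent beam selection with a simple joint selection that also accounts for SI. It works purely from per-beam and per-beam-pair measurements, so no channel knowledge is needed. The joint rule here is a plain exhaustive search that maximizes the sum spectral efficiency of (10) and (11); it is offered only as an illustration and is not the method of [77]. All function names and the linear-scale SNR and INR values are hypothetical.

```python
import math


def independent_selection(snr_tx, snr_rx):
    """Solve (16): pick each beam index to maximize its own link's SNR."""
    i = max(range(len(snr_tx)), key=lambda k: snr_tx[k])
    j = max(range(len(snr_rx)), key=lambda k: snr_rx[k])
    return i, j


def sum_rate(snr_tx_i, snr_rx_j, inr_rx_ij, inr_tx):
    """Sum spectral efficiency from (10) and (11) for one beam pair."""
    r_tx = math.log2(1 + snr_tx_i / (1 + inr_tx))
    r_rx = math.log2(1 + snr_rx_j / (1 + inr_rx_ij))
    return r_tx + r_rx


def joint_selection(snr_tx, snr_rx, inr_rx, inr_tx):
    """Exhaustively pick the beam pair (i, j) with the highest sum rate."""
    pairs = [(i, j) for i in range(len(snr_tx)) for j in range(len(snr_rx))]
    return max(pairs, key=lambda ij: sum_rate(
        snr_tx[ij[0]], snr_rx[ij[1]], inr_rx[ij[0]][ij[1]], inr_tx))


# Hypothetical measurements for 3 transmit beams and 3 receive beams.
snr_tx = [100.0, 80.0, 20.0]   # SNR_tx for each transmit beam
snr_rx = [90.0, 70.0, 10.0]    # SNR_rx for each receive beam
inr_rx = [[50.0, 0.5, 0.1],    # INR_rx for transmit beam i (row) and
          [5.0, 2.0, 0.1],     # receive beam j (column)
          [0.3, 0.1, 0.1]]
inr_tx = 0.1                   # cross-link interference, beam-independent

i_ind, j_ind = independent_selection(snr_tx, snr_rx)
i_jnt, j_jnt = joint_selection(snr_tx, snr_rx, inr_rx, inr_tx)

rate_ind = sum_rate(snr_tx[i_ind], snr_rx[j_ind], inr_rx[i_ind][j_ind], inr_tx)
rate_jnt = sum_rate(snr_tx[i_jnt], snr_rx[j_jnt], inr_rx[i_jnt][j_jnt], inr_tx)
```

In this toy case the SNR-maximizing pair couples strong SI, while the joint rule gives up a little receive SNR for a much lower $\mathrm{INR}_{\mathrm{rx}}$. The exhaustive search needs a measurement for every beam pair, which hints at why measurement overhead is a practical concern.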
Leveraging User Selection A full-duplex mmWave base station will likely serve multiple downlink users and uplink users over many time slots. As assumed thus far, let us consider the case where the base station can serve a single downlink-uplink user pair in a full-duplex fashion at any given time, multiplexing user pairs in time. The degree of SI coupled at the full-duplex base station depends on the transmit and receive beams when serving a particular downlink-uplink user pair. In addition, the degree of cross-link interference depends on the two users being served. When given a pool of user pairs needing service, one can therefore imagine that strategically selecting which user pair to serve has the potential to be a powerful tool to improve full-duplex performance [9]. This concept has not been fully fleshed out in existing literature and deserves future study. Interesting future work, for instance, would be the design of intelligent schedulers that incorporate full-duplex user selection to improve link-level spectral efficiency and gains in network-level throughput.
4.6 Full-Duplex Integrated Access and Backhaul

A particularly motivating application of full-duplex in mmWave cellular networks is in integrated access and backhaul (IAB) [45, 78, 79], where a fiber-connected base station serves nearby users and is responsible for maintaining a wireless backhaul to one or more nearby base stations, as illustrated in Fig. 15. By using the same pool of mmWave spectrum for access and wireless backhauling, dense mmWave networks can be deployed with fewer dedicated fiber connections, which reduces the cost, time, and permitting associated with deployment. Like other multi-hop wireless networks, however, IAB networks have faced scaling challenges due to degraded throughput and higher latency as the network grows. Recent work has investigated the use of full-duplex to overcome these obstacles [19, 71].

Inter-sector and Intra-sector Full-Duplex An IAB node can operate in a full-duplex fashion in two main ways, depending on scope. Consider a sectorized IAB
Fig. 15 A full-duplex mmWave IAB node receives wireless backhaul from a fiber-connected donor on one sector while transmitting access to a downlink user on another sector
[Figure labels: Donor; Fiber; Backhaul; IAB; Self-Interference; Access; Cross-Link Interference; UE]

Fig. 16 (left) An inter-sector full-duplex IAB node full-duplexes transmission and reception across sectors. (right) An intra-sector full-duplex IAB node full-duplexes transmission and reception within each sector
[Figure labels: BS; IAB; receive; transmit; transmit and receive]
node with three sectors, each equipped with a transceiver serving a $120^\circ$ field of view. The first potential full-duplex operating mode, which we refer to as inter-sector full-duplex, allows each sector to either transmit or receive, meaning SI may be inflicted by one sector onto one or both of the other sectors. Figure 15 depicts inter-sector full-duplex, for instance, where one sector receives while another transmits in-band. With inter-sector full-duplex, sectors are no longer required to collectively transmit or collectively receive but rather can be scheduled independently, and the transceiver on each sector can be merely half-duplex-capable. This full-duplex mode unlocks scheduling opportunities that are otherwise not available, allowing the network to achieve higher throughput, as we will highlight shortly. The other potential full-duplex operating mode, which we refer to as intra-sector full-duplex and illustrate in Fig. 16, equips each sector with a full-duplex transceiver. In this case, each sector may transmit to and receive from a downlink-uplink user pair within its field of view. Notice that, when each sector operates simultaneously and in-band with its neighboring sectors, SI is inflicted from each transmitting sector onto each receiving sector. Naturally, intra-sector full-duplex has the potential to outperform inter-sector full-duplex, but its gains have not been thoroughly investigated. Early deployments of full-duplex mmWave base stations, especially for IAB, will likely be of the inter-sector form such as the one shown in Fig. 17, since SI is likely less severe and half-duplex transceivers can be used per sector.

Recent Progress Validating Full-Duplex IAB Full-duplex IAB networks are studied in [19] to characterize the throughput and latency gains when upgrading IAB nodes from half-duplex to full-duplex transceivers. Note that, in [19], users were kept as half-duplex devices, which is a realistic assumption for the foreseeable future.
The authors show through analysis and simulation that, with full-duplex IAB nodes, user latency can reduce four-fold and user throughput can improve eight-fold for fourth-hop users—far transcending the familiar doubling of spectral efficiency offered by full-duplex at the link level. In general, [19] shows that users further from the donor enjoy greater performance improvements with full-duplex. This can be explained by the fact that full-duplex IAB networks can meet latency and throughput targets that half-duplex IAB networks cannot, yielding relative gains that can tend to infinity. Ultimately, the gains offered by full-duplex are the following, thanks to
Fig. 17 The 28 GHz phased array platform used for measurements of SI in [76, 77, 80]. Transmit array on right; receive array on left. Received SI power depends on the steering direction of the transmit and receive beams. The multi-panel triangular platform shown here is a relevant deployment option for mmWave small-cell base stations and IAB nodes. Reproduced from [76] with permission
the scheduling opportunities it unlocks: certain links that must be orthogonalized with half-duplex IAB nodes need not be with full-duplex, allowing packets to more quickly propagate through the multi-hop network. Compared to their half-duplex counterparts, full-duplex IAB networks can facilitate reduced latency, higher throughput, fairer service, and deeper networks, even with imperfect SIC [19].
4.7 Recent Experimental Research Outcomes

The majority of research on full-duplex mmWave systems has been theoretical in nature, using simulation to validate proposed solutions. Recently, there has been work experimentally investigating full-duplex mmWave systems, two efforts of which we introduce herein.

Measurements of mmWave Self-Interference SI was studied quite extensively over the past decade or so, largely in the context of sub-6 GHz transceivers. To progress development of full-duplex mmWave systems, a necessary first step is to better understand SI in mmWave systems. Measurements of mmWave SI in [71, 81–86] provide useful insights but do not offer a means to evaluate proposed mmWave full-duplex solutions since they provide neither a MIMO SI channel model nor adequate beam-based measurements; most of these were taken using horn or lens antennas, not phased arrays. To evaluate beamforming-based mmWave full-duplex solutions thus far, researchers have primarily used highly idealized channel models. To address these shortcomings, a measurement campaign of SI at 28 GHz was conducted in [76, 80] using a multi-panel 16×16-element phased array platform.
Fig. 18 The empirical CDF of the nearly 6.5 million measurements of 28 GHz SI using 16×16 planar arrays in [76], along with a fitted log-normal distribution. Less than 1% of transmit-receive beam pairs yield $\mathrm{INR}_{\mathrm{rx}} \leq 0$ dB. Highly directional mmWave beams do not necessarily offer sufficient isolation for full-duplex but strategically selecting them can. Reproduced from [76] with permission
In this campaign [76, 80], a spatial inspection of SI was conducted in an anechoic chamber by electronically sweeping the beams of the transmit phased array and receive phased array across a number of combinations in azimuth and elevation. For each transmit direction and receive direction, SI power was measured, for a total of nearly 6.5 million measurements. This work showed that SI indeed tends to be well above the noise floor, even with highly directional mmWave beams, but select transmit-receive beam pairs coupled levels of SI below the noise floor without any additional cancellation, as visible in the CDF of measurements in Fig. 18. These measurements revealed large-scale trends based on steering direction, along with noteworthy small-scale phenomena when beams undergo small shifts (on the order of one degree). The authors provide a statistical characterization of their measurements, allowing researchers to draw realistic realizations of SI and conduct statistical analyses. A key takeaway from this work is that a commonly used idealized near-field channel model (i.e., the spherical-wave channel model [87]) is not a suitable one for practical mmWave full-duplex systems, as summarized by Fig. 19. This motivates the need for a new measurement-backed channel model for SI in full-duplex mmWave systems.

Beam Selection for Full-Duplex mmWave Communication Systems A particularly exciting observation from the measurements in [76] was that slightly shifting the transmit and/or receive beams at the full-duplex transceiver (on the order of one degree) could significantly reduce SI, often by 20 dB or more. This motivated the work of [77], in which the authors propose the first beam selection methodology for full-duplex mmWave communication systems. Traditional beam
I. P. Roberts and H. A. Suraweera
Fig. 19 (left) The azimuth cut of SI measurements from [76]. (right) The simulated counterpart of (left) using a popular idealized near-field channel model for SI [87]. The stark difference between the two motivates the need for a new, measurement-backed SI channel model. Reproduced from [76] with permission
selection typically selects beams that maximize SNR via codebook-based beam alignment measurements, as highlighted in (16). In [77], the authors propose a measurement-driven beam selection methodology, called STEER, atop conventional beam alignment that incorporates SI into transmit and receive beam selection at a full-duplex mmWave transceiver. Suppose a full-duplex base station serves a downlink user and uplink user as illustrated in Fig. 12. Taking the perspective of a full-duplex base station, let $\mathcal{A}_{\mathrm{tx}}$ be a set (a codebook) of $N_{\mathrm{tx}}$ candidate transmit beam steering directions (azimuth-elevation pairs) used during beam alignment, and let $\mathcal{A}_{\mathrm{rx}}$ be a codebook of $N_{\mathrm{rx}}$ candidate receive beam steering directions defined analogously (e.g., see Fig. 14).
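$$\mathcal{A}_{\mathrm{tx}} = \left\{ \left(\theta_{\mathrm{tx}}^{(i)}, \phi_{\mathrm{tx}}^{(i)}\right) : i = 1, \dots, N_{\mathrm{tx}} \right\} \tag{17}$$
$$\mathcal{A}_{\mathrm{rx}} = \left\{ \left(\theta_{\mathrm{rx}}^{(j)}, \phi_{\mathrm{rx}}^{(j)}\right) : j = 1, \dots, N_{\mathrm{rx}} \right\} \tag{18}$$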
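Solving the following beam selection problems through conventional beam alignment yields initial beam selections at the full-duplex base station.
$$\left(\theta_{\mathrm{tx}}^{(i^\star)}, \phi_{\mathrm{tx}}^{(i^\star)}\right) = \operatorname*{argmax}_{(\theta,\phi) \in \mathcal{A}_{\mathrm{tx}}} \mathrm{SNR}_{\mathrm{tx}}(\theta, \phi) \tag{19}$$
$$\left(\theta_{\mathrm{rx}}^{(j^\star)}, \phi_{\mathrm{rx}}^{(j^\star)}\right) = \operatorname*{argmax}_{(\theta,\phi) \in \mathcal{A}_{\mathrm{rx}}} \mathrm{SNR}_{\mathrm{rx}}(\theta, \phi) \tag{20}$$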
While these initial beam selections may yield high SNRs, they are likely to couple high levels of SI, as shown by the measurements in [76]. To identify transmit and receive beams that the base station can use to deliver high SNRs and low SI, the authors of [77] propose STEER.
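The conventional selections in (19) and (20) amount to an argmax over per-beam SNR measurements. The following is a minimal sketch of that step, not the procedure of any particular standard or of [77]: the function name `select_beam`, the codebook spacing, and the random SNR values are all hypothetical placeholders standing in for real beam alignment measurements.

```python
import random


def select_beam(codebook, snr_db):
    """Return (direction, index) of the codebook entry with the highest SNR."""
    if len(codebook) != len(snr_db):
        raise ValueError("one SNR measurement is needed per codebook entry")
    best = max(range(len(codebook)), key=lambda k: snr_db[k])
    return codebook[best], best


# Hypothetical transmit codebook A_tx: azimuth-elevation pairs in degrees.
A_tx = [(az, el) for az in range(-60, 61, 8) for el in (-8, 0, 8)]

# Placeholder SNR values in dB; in practice these come from beam alignment.
rng = random.Random(0)
snr_tx_db = [rng.gauss(10.0, 5.0) for _ in A_tx]

(theta_tx, phi_tx), i_star = select_beam(A_tx, snr_tx_db)
print(f"initial transmit beam: index {i_star}, ({theta_tx}, {phi_tx}) deg")
```

The same function would be applied to the receive codebook to obtain the initial receive direction. Note that SI plays no role in this step, which is the gap STEER addresses.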
Full-Duplex Transceivers for Next-Generation Wireless Communication Systems
Fig. 20 The spatial neighborhoods surrounding a given transmit direction (panel "Transmit, $\mathcal{T}^{(i)}$") and receive direction (panel "Receive, $\mathcal{R}^{(j)}$"), shown as filled circles. The size of the neighborhoods $\mathcal{N}(\Delta\theta, \Delta\phi, \delta\theta, \delta\phi)$ is dictated by $(\Delta\theta, \Delta\phi)$ and their resolution by $(\delta\theta, \delta\phi)$. The sub-neighborhood $(\Delta\vartheta, \Delta\varphi)$ is relevant in problem (26) [77]. Reproduced from [77] with permission
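STEER leverages the small-scale variability observed in the measurements of [76] to preserve high $\mathrm{SNR}_{\mathrm{tx}}$ and high $\mathrm{SNR}_{\mathrm{rx}}$ while reducing $\mathrm{INR}_{\mathrm{rx}}$. To identify attractive steering directions for full-duplex operation, STEER measures the SI incurred when transmitting and receiving around the spatial neighborhoods surrounding the initial transmit and receive steering directions, as described by Fig. 20. Quantifying the size of these spatial neighborhoods, let $\Delta\theta$ and $\Delta\phi$ be maximum absolute azimuthal and elevational deviations from the given transmit direction and receive direction. Discretizing these neighborhoods, let $\delta\theta$ and $\delta\phi$ be the measurement resolution in azimuth and elevation, respectively, which should not be larger than $(\Delta\theta, \Delta\phi)$. For instance, $(\delta\theta, \delta\phi) = (1^\circ, 1^\circ)$ while $(\Delta\theta, \Delta\phi) = (2^\circ, 2^\circ)$. The spatial neighborhood $\mathcal{N}$ surrounding a transmit/receive direction can be expressed using the azimuthal neighborhood $\mathcal{N}_\theta$ and elevational neighborhood $\mathcal{N}_\phi$ defined as
$$\mathcal{N}_\theta(\Delta\theta, \delta\theta) = \left\{ m \cdot \delta\theta : m \in \left[ -\left\lfloor \frac{\Delta\theta}{\delta\theta} \right\rfloor, \left\lfloor \frac{\Delta\theta}{\delta\theta} \right\rfloor \right] \right\} \tag{21}$$
$$\mathcal{N}_\phi(\Delta\phi, \delta\phi) = \left\{ n \cdot \delta\phi : n \in \left[ -\left\lfloor \frac{\Delta\phi}{\delta\phi} \right\rfloor, \left\lfloor \frac{\Delta\phi}{\delta\phi} \right\rfloor \right] \right\} \tag{22}$$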
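where $\lfloor \cdot \rfloor$ is the floor operation and $[a, b] = \{a, a + 1, \dots, b - 1, b\}$. The complete neighborhood is the product of the azimuthal and elevational neighborhoods as
$$\mathcal{N}(\Delta\theta, \Delta\phi, \delta\theta, \delta\phi) = \left\{ (\theta, \phi) : \theta \in \mathcal{N}_\theta(\Delta\theta, \delta\theta),\ \phi \in \mathcal{N}_\phi(\Delta\phi, \delta\phi) \right\}. \tag{23}$$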
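The spatial neighborhoods $\mathcal{T}^{(i^\star)}$ and $\mathcal{R}^{(j^\star)}$ surrounding the transmit and receive directions output by conventional beam selection are, respectively, written as
$$\mathcal{T}^{(i^\star)} = \underbrace{\left(\theta_{\mathrm{tx}}^{(i^\star)}, \phi_{\mathrm{tx}}^{(i^\star)}\right)}_{\text{initial selection}} + \underbrace{\mathcal{N}(\Delta\theta, \Delta\phi, \delta\theta, \delta\phi)}_{\text{neighborhood}} \tag{24}$$
$$\mathcal{R}^{(j^\star)} = \underbrace{\left(\theta_{\mathrm{rx}}^{(j^\star)}, \phi_{\mathrm{rx}}^{(j^\star)}\right)}_{\text{initial selection}} + \underbrace{\mathcal{N}(\Delta\theta, \Delta\phi, \delta\theta, \delta\phi)}_{\text{neighborhood}}. \tag{25}$$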
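The neighborhood construction in (21) through (25) can be sketched in a few lines. This is an illustrative sketch only: the function names `axis_offsets`, `neighborhood`, and `shift` are hypothetical, angles are assumed to be in degrees, and the small tolerance added before the floor is an assumption of this sketch to guard against floating-point ratios such as 0.3/0.1 evaluating just below an integer.

```python
import math
from itertools import product


def axis_offsets(delta_max, delta_res):
    """Offsets m * delta_res for integer m in [-floor(max/res), floor(max/res)]."""
    m_max = math.floor(delta_max / delta_res + 1e-9)  # tolerance is an assumption
    return [m * delta_res for m in range(-m_max, m_max + 1)]


def neighborhood(d_theta, d_phi, res_theta, res_phi):
    """Product of the azimuthal and elevational offset sets, as in (23)."""
    return list(product(axis_offsets(d_theta, res_theta),
                        axis_offsets(d_phi, res_phi)))


def shift(direction, offsets):
    """Translate a set of offsets to surround a steering direction, as in (24)-(25)."""
    theta0, phi0 = direction
    return [(theta0 + d_theta, phi0 + d_phi) for d_theta, d_phi in offsets]


# The example from the text: 1-degree resolution over a 2-degree neighborhood.
N = neighborhood(2, 2, 1, 1)
T = shift((30, 0), N)  # (30, 0) is a hypothetical initial transmit direction
print(len(N), "candidate directions per beam")
```

With these example values each neighborhood holds 5 × 5 directions, so an exhaustive sweep over transmit-receive pairs would need 625 SI measurements, which illustrates why the number of measurements matters in practice.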
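Let $\mathrm{INR}_{\mathrm{rx}}(\theta_{\mathrm{tx}}, \phi_{\mathrm{tx}}, \theta_{\mathrm{rx}}, \phi_{\mathrm{rx}})$ be the receive link INR due to SI when transmitting toward $(\theta_{\mathrm{tx}}, \phi_{\mathrm{tx}})$ and receiving toward $(\theta_{\mathrm{rx}}, \phi_{\mathrm{rx}})$ at the full-duplex base station. STEER solves the following beam selection problem to net a transmit direction $(\theta_{\mathrm{tx}}^\star, \phi_{\mathrm{tx}}^\star)$ and receive direction $(\theta_{\mathrm{rx}}^\star, \phi_{\mathrm{rx}}^\star)$ that the full-duplex transceiver will use.
$$\left(\theta_{\mathrm{tx}}^\star, \phi_{\mathrm{tx}}^\star, \theta_{\mathrm{rx}}^\star, \phi_{\mathrm{rx}}^\star\right) = \operatorname*{argmin}_{\substack{(\theta_{\mathrm{tx}}, \phi_{\mathrm{tx}}) \\ (\theta_{\mathrm{rx}}, \phi_{\mathrm{rx}})}} \ \min_{(\Delta\vartheta, \Delta\varphi)} \ \Delta\vartheta^2 + \Delta\varphi^2 \tag{26a}$$
$$\text{s.t.} \quad \mathrm{INR}_{\mathrm{rx}}(\theta_{\mathrm{tx}}, \phi_{\mathrm{tx}}, \theta_{\mathrm{rx}}, \phi_{\mathrm{rx}}) \leq \max\left(\mathrm{INR}_{\mathrm{rx}}^{\mathrm{tgt}}, \mathrm{INR}_{\mathrm{rx}}^{\min}\right) \tag{26b}$$
$$(\theta_{\mathrm{tx}}, \phi_{\mathrm{tx}}) \in \left(\theta_{\mathrm{tx}}^{(i^\star)}, \phi_{\mathrm{tx}}^{(i^\star)}\right) + \mathcal{N}(\Delta\vartheta, \Delta\varphi, \delta\theta, \delta\phi) \tag{26c}$$
$$(\theta_{\mathrm{rx}}, \phi_{\mathrm{rx}}) \in \left(\theta_{\mathrm{rx}}^{(j^\star)}, \phi_{\mathrm{rx}}^{(j^\star)}\right) + \mathcal{N}(\Delta\vartheta, \Delta\varphi, \delta\theta, \delta\phi) \tag{26d}$$
$$0 \leq \Delta\vartheta \leq \Delta\theta, \quad 0 \leq \Delta\varphi \leq \Delta\phi \tag{26e}$$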
Here, $\mathrm{INR}_{\mathrm{rx}}^{\min}$ is the minimum INR over the entire $(\Delta\theta, \Delta\phi)$ spatial neighborhood, and $\mathrm{INR}_{\mathrm{rx}}^{\mathrm{tgt}}$ is an INR target the system aims for. Taking the maximum of the two in (26b) keeps the problem feasible even when the target cannot be met anywhere in the neighborhood. By minimizing the distance $\Delta\vartheta^2 + \Delta\varphi^2$, STEER minimizes the deviation of the selected beams from the initial selections and thus can preserve high $\mathrm{SNR}_{\mathrm{tx}}$ and $\mathrm{SNR}_{\mathrm{rx}}$. Reducing $\mathrm{INR}_{\mathrm{rx}}$, therefore, leads to SINR improvements over the initial beam selections. In [77], the authors present an algorithm to solve problem (26) with a minimal number of SI measurements. Evaluation of STEER with 28 GHz phased arrays highlights its ability to reduce SI while preserving high SNRs, courtesy of noteworthy variability of SI over small spatial neighborhoods. This can be seen in Fig. 21, which shows STEER's potential as a full-duplex solution without any supplemental analog or
Fig. 21 (left) The CDF of $\mathrm{INR}_{\mathrm{rx}}$ for various neighborhood sizes $(\Delta\theta, \Delta\phi)$. (right) The CDF of the gap between $\mathrm{SINR}_{\mathrm{rx}}$ and its upper bound $\mathrm{SNR}_{\mathrm{rx}}$ for various neighborhood sizes $(\Delta\theta, \Delta\phi)$. STEER reliably reduces $\mathrm{INR}_{\mathrm{rx}}$, as evident in (left), while maintaining high beamforming gain, shifting $\mathrm{SINR}_{\mathrm{rx}}$ closer to $\mathrm{SNR}_{\mathrm{rx}}$ as shown in (right) [77]. Reproduced from [77] with permission
digital SIC. SI can be reliably reduced with STEER to below the noise floor with $(\Delta\theta, \Delta\phi) = (2^\circ, 2^\circ)$, and by preserving SNR while doing so, it can increase SINR toward its upper bound.
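To make problem (26) concrete, the sketch below solves it by brute force over every transmit-receive pair in the neighborhoods. It is not the algorithm of [77], which is designed to use far fewer SI measurements; it only illustrates the objective and constraints. All names (`steer_exhaustive`, `measure_inr_db`, `toy_inr_db`, `sinr_db`) and the toy INR model are hypothetical, and the tie-breaking rule (smallest total beam displacement, then lowest INR) is an assumption of this sketch. The helper `sinr_db` uses the standard relation SINR = SNR / (1 + INR).

```python
import math
from itertools import product


def offsets(delta_max, delta_res):
    """Integer multiples of the resolution within +/- delta_max, as in (21)-(22)."""
    m_max = math.floor(delta_max / delta_res + 1e-9)  # tolerance is an assumption
    return [m * delta_res for m in range(-m_max, m_max + 1)]


def steer_exhaustive(tx0, rx0, measure_inr_db, size, res, inr_target_db):
    """Brute-force search for problem (26); returns (tx, rx, inr_db)."""
    grid = list(product(offsets(size[0], res[0]), offsets(size[1], res[1])))
    candidates = []
    for (dt_tx, dp_tx), (dt_rx, dp_rx) in product(grid, grid):
        tx = (tx0[0] + dt_tx, tx0[1] + dp_tx)
        rx = (rx0[0] + dt_rx, rx0[1] + dp_rx)
        inr = measure_inr_db(tx, rx)
        # Smallest sub-neighborhood that contains both shifted beams.
        d_az = max(abs(dt_tx), abs(dt_rx))
        d_el = max(abs(dp_tx), abs(dp_rx))
        cost = d_az ** 2 + d_el ** 2
        total_shift = dt_tx ** 2 + dp_tx ** 2 + dt_rx ** 2 + dp_rx ** 2
        candidates.append((cost, total_shift, inr, tx, rx))
    inr_min = min(c[2] for c in candidates)
    threshold = max(inr_target_db, inr_min)  # constraint (26b)
    feasible = [c for c in candidates if c[2] <= threshold]
    _, _, inr, tx, rx = min(feasible, key=lambda c: c[:3])
    return tx, rx, inr


def sinr_db(snr_db, inr_db):
    """SINR in dB from SNR and INR in dB, using SINR = SNR / (1 + INR)."""
    return snr_db - 10.0 * math.log10(1.0 + 10.0 ** (inr_db / 10.0))


def toy_inr_db(tx, rx):
    """Hypothetical INR model: SI falls as beams move off (30, 0) and (-20, 0)."""
    return 15.0 - 10.0 * abs(tx[0] - 30) - 4.0 * abs(rx[1])


tx, rx, inr = steer_exhaustive((30, 0), (-20, 0), toy_inr_db,
                               size=(2, 2), res=(1, 1), inr_target_db=0.0)
print("selected beams:", tx, rx, "INR (dB):", inr)
```

In this toy setting the search leaves the receive beam in place and shifts only the transmit beam, since that is the smallest deviation meeting the INR target. A measured INR map would replace `toy_inr_db` in any realistic use.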
5 A Look Ahead: What Is the Future of Full-Duplex?
We conclude this chapter by highlighting several key topics that need further research and development from engineers in industry and academia to advance and mature full-duplex technology.

Real-Time, Deployment-Ready Full-Duplex Solutions
Most full-duplex solutions are validated in simulation, lab settings, or controlled environments. Moreover, most evaluations ignore the overhead associated with configuring a full-duplex solution, which is a key hurdle in practical deployments. In practice, the radio resources consumed to configure a full-duplex solution must not outweigh the gains it offers. It is essential that full-duplex solutions are designed and evaluated with real-time deployments in mind: with strict overhead requirements, with the ability to adapt to dynamic environments, and with minimal form factors and power consumption.

Further Advancement of Full-Duplex mmWave and Terahertz Systems
While increased attention has been devoted to the research and development of full-duplex mmWave systems, there remain plenty of open problems that need addressing before such systems are brought to life. Continuing to develop means to mitigate SI is certainly welcome, along with characterizing the gains full-duplex can offer mmWave networks and prototyping proofs-of-concept. In addition, identifying and creating applications that particularly benefit from full-duplex would offer new directions and requirements for its solutions. For instance, the role of full-duplex in reconfigurable intelligent surface (RIS) applications has been investigated recently as it poses a means to better use RIS resources and aid in the cancellation of SI [88, 89]. Finally, exploring full-duplex terahertz systems, their applications, and how solutions for such systems may differ from those at mmWave would be valuable future work.

Further Advancement of Machine Learning to Enable Full-Duplex
Beyond existing work, the prospects of using machine learning for full-duplex are still quite open-ended, especially beyond digital SIC. Using machine learning to configure and adaptively update analog SIC filters, for instance, or to configure beamformers that mitigate SI in full-duplex mmWave systems is a topic that has yet to be fully explored. In addition, machine learning may be able to reduce the effects of transceiver impairments in full-duplex through digital predistortion. There are also the prospects of using machine learning to intelligently schedule users and proactively manage cross-link interference within full-duplex networks.
Network-Level Studies Comparing Full-Duplex to Other Duplexing Technologies
To justify the deployment of wireless networks equipped with full-duplex, it is paramount that researchers conduct studies that demonstrate its network-level gains over traditional strategies, such as TDD, FDD, and space-division multiple access, as well as the recently proposed cross-division duplexing (XDD) [90]. This has been examined fairly extensively for sub-6 GHz wireless networks but less so for mmWave networks and applications of IAB, in particular.

Full-Duplex in Joint Communication and Sensing Systems
Joint communication and sensing is expected to play an important role in the next evolution of wireless networks. Full-duplex solutions have the opportunity to facilitate high-fidelity sensing while transmitting by eliminating SI [10, 26]. Sensing information may be used to directly improve communication performance [91] or for higher-level applications. In addition, there are opportunities for full-duplex to enable the sensing and jamming of eavesdroppers to establish more secure communication [27]. Finally, the relationship between SI channel estimation and environmental sensing poses an opportunity for the two to supplement and/or justify one another. For instance, accurate sensing of the environment may yield an estimate of the SI channel and, in turn, enable full-duplex operation.

Integrating Full-Duplex into Wireless Standards
Wireless networks of today have been built on decades of a half-duplex assumption at each transceiver. Full-duplex offers immediate upgrades to a transceiver, but, to effectively make use of this powerful capability, wireless networks must support it. This motivates the need for research on seamlessly adopting full-duplex operation into existing wireless standards. Work items and studies in the 3rd Generation Partnership Project (3GPP) continue to investigate the merits and standardization of full-duplex in cellular systems, especially in the context of IAB in Releases 17 and 18 [45].

Acknowledgments
We would like to thank Jeffrey G. Andrews, Sriram Vishwanath, Aditya Chopra, Thomas Novlan, Manan Gupta, Rajesh K. Mishra, and Hardik B. Jain for the discussions, feedback, and collaborations that contributed to the preparation of this chapter. The work of Ian P. Roberts was supported by the U.S. National Science Foundation under Grant 1610403.
References
1. A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman, “In-band full-duplex wireless: Challenges and opportunities,” IEEE J. Sel. Areas Commun., vol. 32, no. 9, pp. 1637–1652, Sep. 2014.
2. K. E. Kolodziej, B. T. Perry, and J. S. Herd, “In-band full-duplex technology: Techniques and systems survey,” IEEE Trans. Microw. Theory Techn., vol. 67, no. 7, pp. 3025–3041, Feb. 2019.
3. S. Chen, M. A. Beach, and J. P. McGeehan, “Division-free duplex for wireless applications,” Electronics Letters, vol. 34, pp. 147–148, Jan. 1998.
4. J. I. Choi, M. Jain, K. Srinivasan, P. Levis, and S. Katti, “Achieving single channel, full duplex wireless communication,” in Proc. ACM MobiCom, Sep. 2010, pp. 1–12.
5. S. Hong, J. Brand, J. I. Choi, M. Jain, J. Mehlman, S. Katti, and P. Levis, “Applications of self-interference cancellation in 5G and beyond,” IEEE Commun. Mag., vol. 52, no. 2, pp. 114–121, Feb. 2014.
6. H. Alves, T. Riihonen, and H. A. Suraweera, Full-Duplex Communications for Future Wireless Networks. Singapore: Springer, 2020.
7. M. Chung, M. S. Sim, J. Kim, D. K. Kim, and C.-B. Chae, “Prototyping real-time full duplex radios,” IEEE Commun. Mag., vol. 53, no. 9, pp. 56–63, Sep. 2015.
8. Z. Xiao, P. Xia, and X. Xia, “Full-duplex millimeter-wave communication,” IEEE Wireless Commun., vol. 24, no. 6, pp. 136–143, Dec. 2017.
9. I. P. Roberts, J. G. Andrews, H. B. Jain, and S. Vishwanath, “Millimeter-wave full duplex radios: New challenges and techniques,” IEEE Wireless Commun., vol. 28, no. 1, pp. 36–43, Feb. 2021.
10. C. B. Barneto, S. D. Liyanaarachchi, M. Heino, T. Riihonen, and M. Valkama, “Full duplex radio/radar technology: The enabler for advanced joint communication and sensing,” IEEE Wireless Commun., vol. 28, no. 1, pp. 82–88, Feb. 2021.
11. Y. Kurzo, A. Burg, and A. Balatsoukas-Stimming, “Design and implementation of a neural network aided self-interference cancellation scheme for full-duplex radios,” in Proc. Asilomar Conf. Signals, Sys., and Comput., Oct. 2018, pp. 589–593.
12. Y. Liao, K. Bian, L. Song, and Z. Han, “Full-duplex MAC protocol design and analysis,” IEEE Commun. Lett., vol. 19, no. 7, pp. 1185–1188, Jul. 2015.
13. Y. Liao, T. Wang, L. Song, and Z. Han, “Listen-and-talk: Full-duplex cognitive radio networks,” in Proc. IEEE GLOBECOM, Dec. 2014, pp. 3068–3073.
14. Y. Liao, L. Song, Z. Han, and Y. Li, “Full duplex cognitive radio: a new design paradigm for enhancing spectrum usage,” IEEE Commun. Mag., vol. 53, no. 5, pp. 138–145, May 2015.
15. L. Song, Y. Liao, K. Bian, L. Song, and Z. Han, “Cross-layer protocol design for CSMA/CD in full-duplex WiFi networks,” IEEE Commun. Lett., vol. 20, no. 4, pp. 792–795, Apr. 2016.
16. K. M. Thilina, H. Tabassum, E. Hossain, and D. I. Kim, “Medium access control design for full duplex wireless systems: challenges and approaches,” IEEE Commun. Mag., vol. 53, no. 5, pp. 112–120, May 2015.
17. R. K. Mishra, Y. Chen, and I. P. Roberts, “Collision detection in dense Wi-Fi networks using self-interference cancellation,” in Proc. IEEE ICC Wkshp., Jun. 2020, pp. 1–6.
18. S. Haddad, A. Özgür, and E. Telatar, “Can full-duplex more than double the capacity of wireless networks?” in Proc. IEEE ISIT, Jun. 2017, pp. 963–967.
19. M. Gupta, I. P. Roberts, and J. G. Andrews, “System-level analysis of full-duplex self-backhauled millimeter wave networks,” IEEE Trans. Wireless Commun., vol. 22, no. 2, pp. 1130–1144, Feb. 2023.
20. R. W. Heath Jr. and A. Lozano, Foundations of MIMO Communication. Cambridge, UK: Cambridge University Press, 2018.
21. M. Duarte, C. Dick, and A. Sabharwal, “Experiment-driven characterization of full-duplex wireless systems,” IEEE Trans. Wireless Commun., vol. 11, no. 12, pp. 4296–4307, Nov. 2012.
22. M. Jain, J. I. Choi, T. Kim, D. Bharadia, S. Seth, K. Srinivasan, P. Levis, S. Katti, and P. Sinha, “Practical, real-time, full duplex wireless,” in Proc. ACM MobiCom, Sep. 2011, pp. 301–312.
23. D. Bharadia, E. McMilin, and S. Katti, “Full duplex radios,” in Proc. ACM SIGCOMM, Aug. 2013, pp. 375–386.
24. D. Korpi, “Full-duplex wireless: Self-interference modeling, digital cancellation, and system studies,” Ph.D. dissertation, Tampere University of Technology, Dec. 2017.
25. S. Haykin, “Cognitive radio: brain-empowered wireless communications,” IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp. 201–220, Feb. 2005.
26. Z. Xiao and Y. Zeng, “Waveform design and performance analysis for full-duplex integrated sensing and communication,” IEEE J. Sel. Areas Commun., vol. 40, no. 6, pp. 1823–1837, Mar. 2022.
27. X. Wang, Z. Fei, J. A. Zhang, and J. Huang, “Sensing-assisted secure uplink communications with full-duplex base station,” IEEE Commun. Lett., vol. 26, no. 2, pp. 249–253, Dec. 2022.
28. G. Zheng, I. Krikidis, J. Li, A. P. Petropulu, and B. Ottersten, “Improving physical layer secrecy using full-duplex jamming receivers,” IEEE Trans. Signal Process., vol. 61, no. 20, pp. 4962–4974, Oct. 2013.
29. T. Riihonen, D. Korpi, O. Rantula, H. Rantanen, T. Saarelainen, and M. Valkama, “Inband full-duplex radio transceivers: A paradigm shift in tactical communications and electronic warfare?” IEEE Commun. Mag., vol. 55, no. 10, pp. 30–36, Oct. 2017.
30. J. Suarez, K. Kravtsov, and P. R. Prucnal, “Incoherent method of optical interference cancellation for radio-frequency communications,” IEEE J. Quantum Electron., vol. 45, no. 4, pp. 402–408, Mar. 2009.
31. X. Han, B. Huo, Y. Shao, and M. Zhao, “Optical RF self-interference cancellation by using an integrated dual-parallel MZM,” IEEE Photon. J., vol. 9, no. 2, pp. 1–8, Apr. 2017.
32. X. Su, X. Han, S. Fu, S. Wang, C. Li, Q. Tan, G. Zhu, C. Wang, Z. Wu, Y. Gu, and M. Zhao, “Optical multipath RF self-interference cancellation based on phase modulation for full-duplex communication,” IEEE Photon. J., vol. 12, no. 4, pp. 1–14, Jun. 2020.
33. D. Korpi, T. Riihonen, V. Syrjälä, L. Anttila, M. Valkama, and R. Wichman, “Full-duplex transceiver system calculations: Analysis of ADC and linearity challenges,” IEEE Trans. Wireless Commun., vol. 13, no. 7, pp. 3821–3836, Jul. 2014.
34. D. Korpi, L. Anttila, V. Syrjälä, and M. Valkama, “Widely linear digital self-interference cancellation in direct-conversion full-duplex transceiver,” IEEE J. Sel. Areas Commun., vol. 32, no. 9, pp. 1674–1687, Sep. 2014.
35. H. Guo, J. Xu, S. Zhu, and S. Wu, “Realtime software defined self-interference cancellation based on machine learning for in-band full duplex wireless communications,” in Proc. IEEE ICNC, Mar. 2018, pp. 779–783.
36. A. Balatsoukas-Stimming, “Non-linear digital self-interference cancellation for in-band full-duplex radios using neural networks,” in Proc. IEEE SPAWC, Jun. 2018, pp. 1–5.
37. Y. Kurzo, A. T. Kristensen, A. Burg, and A. Balatsoukas-Stimming, “Hardware implementation of neural self-interference cancellation,” IEEE Trans. Emerg. Sel. Topics Circuits Syst., vol. 10, no. 2, pp. 204–216, Jun. 2020.
38. H. Guo, S. Wu, H. Wang, and M. Daneshmand, “DSIC: Deep learning based self-interference cancellation for in-band full duplex wireless,” in Proc. IEEE GLOBECOM, Dec. 2019, pp. 1–6.
39. Y. Chen, R. K. Mishra, D. Schwartz, and S. Vishwanath, “MIMO full duplex radios with deep learning,” in Proc. IEEE ICC Wkshp., Jun. 2020, pp. 1–6.
40. A. Balatsoukas-Stimming, “Joint detection and self-interference cancellation in full-duplex systems using machine learning,” in Proc. Asilomar Conf. Signals, Sys., and Comput., Oct. 2021, pp. 989–992.
41. L. Laughlin, M. A. Beach, K. A. Morris, and J. L. Haine, “Optimum single antenna full duplex using hybrid junctions,” IEEE J. Sel. Areas Commun., vol. 32, no. 9, pp. 1653–1661, Jun. 2014.
42. T. Dinc, M. Tymchenko, A. Nagulu, D. Sounas, A. Alu, and H. Krishnaswamy, “Synchronized conductivity modulation to realize broadband lossless magnetic-free non-reciprocity,” Nature Commun., vol. 8, no. 11, pp. 1–9, Oct. 2017.
43. J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
44. X. Liu, Z. Xiao, L. Bai, J. Choi, P. Xia, and X.-G. Xia, “Beamforming based full-duplex for millimeter-wave communication,” Sensors, vol. 16, no. 7, p. 1130, Jul. 2016.
45. 3GPP, “3GPP TS 38.174: New WID on IAB enhancements,” 2021. [Online]. Available: https://www.3gpp.org/dynareport/38174.htm
46. D. Bharadia and S. Katti, “Full duplex MIMO radios,” in Proc. USENIX NSDI, Apr. 2014, pp. 359–372.
47. S. Huberman and T. Le-Ngoc, “MIMO full-duplex precoding: A joint beamforming and self-interference cancellation structure,” IEEE Trans. Wireless Commun., vol. 14, no. 4, pp. 2205–2217, Apr. 2015.
48. T. Riihonen, S. Werner, and R. Wichman, “Mitigation of loopback self-interference in full-duplex MIMO relays,” IEEE Trans. Signal Process., vol. 59, no. 12, pp. 5983–5993, Aug. 2011.
49. H. A. Suraweera, I. Krikidis, G. Zheng, C. Yuen, and P. J. Smith, “Low-complexity end-to-end performance optimization in MIMO full-duplex relay systems,” IEEE Trans. Wireless Commun., vol. 13, no. 2, pp. 913–927, Jan. 2014.
50. H. Q. Ngo, H. A. Suraweera, M. Matthaiou, and E. G. Larsson, “Multipair full-duplex relaying with massive arrays and linear processing,” IEEE J. Sel. Areas Commun., vol. 32, no. 9, pp. 1721–1737, Jun. 2014.
51. B. P. Day, A. R. Margetts, D. W. Bliss, and P. Schniter, “Full-duplex MIMO relaying: Achievable rates under limited dynamic range,” IEEE J. Sel. Areas Commun., vol. 30, no. 8, pp. 1541–1553, Sep. 2012.
52. E. Everett, C. Shepard, L. Zhong, and A. Sabharwal, “SoftNull: Many-antenna full-duplex wireless via digital beamforming,” IEEE Trans. Wireless Commun., vol. 15, no. 12, pp. 8077–8092, Dec. 2016.
53. R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Apr. 2016.
54. I. P. Roberts, H. B. Jain, S. Vishwanath, and J. G. Andrews, “Millimeter wave analog beamforming codebooks robust to self-interference,” in Proc. IEEE GLOBECOM, Dec. 2021, pp. 1–6.
55. R. López-Valcarce and N. González-Prelcic, “Beamformer design for full-duplex amplify-and-forward millimeter wave relays,” in Proc. ISWCS, Aug. 2019, pp. 86–90.
56. R. López-Valcarce and N. González-Prelcic, “Analog beamforming for full-duplex millimeter wave communication,” in Proc. ISWCS, Aug. 2019, pp. 687–691.
57. J. Palacios, J. Rodríguez-Fernández, and N. González-Prelcic, “Hybrid precoding and combining for full-duplex millimeter wave communication,” in Proc. IEEE GLOBECOM, Dec. 2019, pp. 1–6.
58. K. Satyanarayana, M. El-Hajjar, P. Kuo, A. Mourad, and L. Hanzo, “Hybrid beamforming design for full-duplex millimeter wave communication,” IEEE Trans. Veh. Technol., vol. 68, no. 2, pp. 1394–1404, Feb. 2019.
59. L. Zhu, J. Zhang, Z. Xiao, X. Cao, X. Xia, and R. Schober, “Millimeter-wave full-duplex UAV relay: Joint positioning, beamforming, and power control,” IEEE J. Sel. Areas Commun., vol. 38, no. 9, pp. 2057–2073, Sep. 2020.
60. J. M. B. da Silva, A. Sabharwal, G. Fodor, and C. Fischione, “1-bit phase shifters for large-antenna full-duplex mmWave communications,” IEEE Trans. Wireless Commun., vol. 19, no. 10, pp. 6916–6931, Oct. 2020.
61. R. López-Valcarce and M. Martínez-Cotelo, “Full-duplex mmWave MIMO with finite-resolution phase shifters,” IEEE Trans. Wireless Commun., May 2022, (early access).
62. A. Koc and T. Le-Ngoc, “Full-duplex mmWave massive MIMO systems: A joint hybrid precoding/combining and self-interference cancellation design,” IEEE Open J. Commun. Society, vol. 2, pp. 754–774, Mar. 2021.
63. Y. Cai, Y. Xu, Q. Shi, B. Champagne, and L. Hanzo, “Robust joint hybrid transceiver design for millimeter wave full-duplex MIMO relay systems,” IEEE Trans. Wireless Commun., vol. 18, no. 2, pp. 1199–1215, Feb. 2019.
64. I. P. Roberts, J. G. Andrews, and S. Vishwanath, “Hybrid beamforming for millimeter wave full-duplex under limited receive dynamic range,” IEEE Trans. Wireless Commun., vol. 20, no. 12, pp. 7758–7772, Dec. 2021.
65. I. P. Roberts, H. B. Jain, and S. Vishwanath, “Equipping millimeter-wave full-duplex with analog self-interference cancellation,” in Proc. IEEE ICC Wkshp., Jun. 2020, pp. 1–6.
66. G. C. Alexandropoulos, M. A. Islam, and B. Smida, “Full duplex hybrid A/D beamforming with reduced complexity multi-tap analog cancellation,” in Proc. IEEE SPAWC, May 2020, pp. 1–5.
67. G. C. Alexandropoulos and M. Duarte, “Joint design of multi-tap analog cancellation and digital beamforming for reduced complexity full duplex MIMO systems,” in Proc. IEEE ICC, May 2017, pp. 1–7.
68. V. Singh, S. Mondal, A. Gadre, M. Srivastava, J. Paramesh, and S. Kumar, “Millimeter-wave full duplex radios,” in Proc. ACM MobiCom, Apr. 2020.
69. C. Zhang and X. Luo, “Adaptive digital self-interference cancellation for millimeter-wave full-duplex backhaul systems,” IEEE Access, vol. 7, pp. 175 542–175 553, Dec. 2019.
70. T. Dinc, A. Chakrabarti, and H. Krishnaswamy, “A 60 GHz CMOS full-duplex transceiver and link with polarization-based antenna and RF cancellation,” IEEE J. Solid-State Circuits, vol. 51, no. 5, pp. 1125–1140, May 2016.
71. G. Y. Suk, S.-M. Kim, J. Kwak, S. Hur, E. Kim, and C.-B. Chae, “Full duplex integrated access and backhaul for 5G NR: Analyses and prototype measurements,” IEEE Wireless Commun., vol. 29, no. 4, pp. 40–46, Aug. 2022.
72. R. López-Valcarce and M. Martínez-Cotelo, “Analog beamforming for full-duplex mmWave communication with low-resolution phase shifters,” in Proc. IEEE ICC, Jun. 2021, pp. 1–6.
73. I. P. Roberts, S. Vishwanath, and J. G. Andrews, “LONESTAR: Analog beamforming codebooks for full-duplex millimeter wave systems,” IEEE Trans. Wireless Commun., Jan. 2023, (early access).
74. Y. Heng, J. G. Andrews, J. Mo, V. Va, A. Ali, B. L. Ng, and J. C. Zhang, “Six key challenges for beam management in 5.5G and 6G systems,” IEEE Commun. Mag., vol. 59, no. 7, pp. 74–79, Jul. 2021.
75. J. Wang, Z. Lan, C.-W. Pyo, T. Baykas, C.-S. Sum, M. Rahman, J. Gao, R. Funada, F. Kojima, H. Harada, and S. Kato, “Beam codebook based beamforming protocol for multi-Gbps millimeter-wave WPAN systems,” IEEE J. Sel. Areas Commun., vol. 27, no. 8, pp. 1390–1399, Oct. 2009.
76. I. P. Roberts, A. Chopra, T. Novlan, S. Vishwanath, and J. G. Andrews, “Beamformed self-interference measurements at 28 GHz: Spatial insights and angular spread,” IEEE Trans. Wireless Commun., vol. 21, no. 11, pp. 9744–9760, Jun. 2022.
77. I. P. Roberts, A. Chopra, T. Novlan, S. Vishwanath, and J. G. Andrews, “STEER: Beam selection for full-duplex millimeter wave communication systems,” IEEE Trans. Wireless Commun., vol. 70, no. 10, pp. 6902–6917, Oct. 2022.
78. M. Cudak, A. Ghosh, A. Ghosh, and J. G. Andrews, “Integrated access and backhaul: A key enabler for 5G millimeter-wave deployments,” IEEE Commun. Mag., vol. 59, no. 4, pp. 88–94, Apr. 2021.
79. C. Dehos, J. L. González, A. D. Domenico, D. Kténas, and L. Dussopt, “Millimeter-wave access and backhauling: the solution to the exponential data traffic increase in 5G mobile communications systems?” IEEE Commun. Mag., vol. 52, no. 9, pp. 88–95, Sep. 2014.
80. A. Chopra, I. P. Roberts, T. Novlan, and J. G. Andrews, “28 GHz phased array-based self-interference measurements for millimeter wave full-duplex,” in Proc. IEEE WCNC, Apr. 2022, pp. 2583–2588.
81. S. Rajagopal, R. Taori, and S. Abu-Surra, “Self-interference mitigation for in-band mmWave wireless backhaul,” in Proc. IEEE CCNC, Jan. 2014, pp. 551–556.
82. Y. Kohda, K. Takano, D. Nakano, N. Ohba, T. Yamane, and Y. Katayama, “Single-channel full-duplex mmWave link using phased-array for Ethernet,” in Proc. IEEE CCNC, Jan. 2015, pp. 400–405.
83. B. Lee, J. Lim, C. Lim, B. Kim, and J. Seol, “Reflected self-interference channel measurement for mmWave beamformed full-duplex system,” in Proc. IEEE GLOBECOM Wkshp., Dec. 2015, pp. 1–6.
84. H. Yang, Y. He, C. Jen, C. Liu, S. Jou, X. Yin, M. Ma, and B. Jiao, “Interference measurement and analysis of full-duplex wireless system in 60 GHz band,” in Proc. IEEE APCCAS, Oct. 2016, pp. 273–276.
85. Y. He, X. Yin, and H. Chen, “Spatiotemporal characterization of self-interference channels for 60-GHz full-duplex communication,” IEEE Antennas Wireless Propag. Lett., vol. 16, pp. 2220–2223, May 2017.
86. K. Haneda, J. Järveläinen, A. Karttunen, and J. Putkonen, “Self-interference channel measurements for in-band full-duplex street-level backhaul relays at 70 GHz,” in Proc. IEEE PIMRC, Sep. 2018, pp. 199–204.
87. J.-S. Jiang and M. A. Ingram, “Spherical-wave model for short-range MIMO,” IEEE Trans. Commun., vol. 53, no. 9, pp. 1534–1541, Sep. 2005.
88. Y. Liu, Q. Hu, Y. Cai, G. Yu, and G. Y. Li, “Deep-unfolding beamforming for intelligent reflecting surface assisted full-duplex systems,” IEEE Trans. Wireless Commun., Dec. 2021, (early access).
89. P. P. Perera, V. G. Warnasooriya, D. Kudathanthirige, and H. A. Suraweera, “Sum rate maximization in STAR-RIS assisted full-duplex communication systems,” in Proc. IEEE ICC, May 2022, pp. 1–6.
90. H. Ji, Y. Kim, K. Muhammad, C. Tarver, M. Tonnemacher, T. Kim, J. Oh, B. Yu, G. Xu, and J. Lee, “Extending 5G TDD coverage with XDD: Cross division duplex,” IEEE Access, vol. 9, pp. 51 380–51 392, Mar. 2021.
91. A. Ali, N. González-Prelcic, R. W. Heath, and A. Ghosh, “Leveraging sensing at the infrastructure for mmWave communication,” IEEE Commun. Mag., vol. 58, no. 7, pp. 84–89, Jul. 2020.
Optical Wireless Communication
Iman Tavakkolnia, Hossein Kazemi, Elham Sarbazi, and Harald Haas
1 Introduction
The 6G vision is structured around several key performance indicators (KPIs) to deliver significant performance improvements beyond existing technologies. 6G will be human-centric, will transform societies for the better, and will boost global growth, aligned with global sustainable development goals. Optical wireless communication (OWC) is able to significantly enhance network capabilities towards these KPIs by targeting data rate, connection density, latency, energy footprint, and security. OWC can be deployed in various scenarios ranging from ultra-dense cells for access networks (i.e. indoor light-fidelity (LiFi) networks) to tera-bit-per-second links for backhaul connectivity. This is possible because of the unique characteristics of light waves and the vast available bandwidth in the optical range of the electromagnetic spectrum, i.e. 590 THz in the visible and infrared range compared to the 300 GHz for the entire radio frequency (RF) range. The unregulated spectrum available for OWC is the gateway to ultra-high data rate services for 6G in conjunction with efficient and intelligent utilisation of RF technologies. 6G will be developed in a way that hybrid networks co-exist with complementary features to address challenges such as high peak data rate requirements and blockage. For instance, OWC can provide the tera-bit-per-second data rates needed for real-time holographic applications, while the network can switch to RF communication when line-of-sight blockage occurs on the OWC link.
I. Tavakkolnia (✉) · H. Kazemi · E. Sarbazi · H. Haas
LiFi Research and Development Centre, Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow, UK
e-mail: [email protected]; [email protected]; [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_17
I. Tavakkolnia et al.
For full-duplex OWC network applications, visible light is used for downlink communication as well as the primary functionality of illumination enabled by the development of light emitting diode (LED) arrays. Infrared light is used for uplink communication. It has been shown that giga-bit-per-second data transmission is possible by using off-the-shelves LEDs and an efficient multiplexing technique [1]. Mobile user channel models are derived for typical indoor scenarios, and spatial multiplexing and diversity techniques are used to guarantee a reliable mobile communication system [2]. Moreover, OWC is a proven solution which can offload a significant portion of the data traffic from the already congested RF spectrum. As a matter of fact, OWC has been thriving in recent years and is included in communication standards such as IEEE 802.15.7 and 802.11.bb task groups. LiFi, which was introduced in 2011, is a network architecture based on OWC in visible and infrared spectra and encompasses all aspects of a high speed network [3]. While a wide spectrum is available for OWC, individual devices suffer from a limited electrical bandwidth originated from the slow electrical-to-optical signal conversion response of devices. Therefore, different wavelengths may be used in a wavelength division multiplexing (WDM) system. Laser diodes (LDs) can typically achieve significantly higher bandwidth compared to LEDs, while having narrow emission spectrum useful for dense WDM and spatial mapping of beams. An indoor environment can be covered by a large number of extra small cells using an array of LDs. This can be achieved by using specific optical devices which can guide the light to the required direction (i.e. covering the whole working area) [4] or using intelligent alignment system (i.e. virtually covering the whole working area) [5]. With recent advances in device fabrication, vertical-cavity surface emitting laser (VCSEL) diodes with several GHz of bandwidth have been fabricated. 
A VCSEL-based OWC system also benefits from features such as high power conversion efficiency, low cost, long lifetime, and large array integration. Therefore, the high data rate required for 6G can be achieved through several alternatives depending on the application. When LDs are used, eye safety should also be carefully considered [6]. It has been shown that terabit-per-second data rates are achievable with powers below the eye-safety limit [7].
The performance of digital communication systems is constantly improving, but demand and energy consumption are increasing at a rate greater than the performance improvement. Therefore, it is vital to develop novel off-grid communication systems that provide high-speed connectivity without any additional demand for energy from the grid. The dual use of solar cells as energy harvesting devices and data detectors for OWC is very promising, since it will eventually lead to the realisation of self-powered connected devices [8, 9]. The importance of this technology becomes evident given the projection that there will be 500 billion Internet-of-Things (IoT) devices by 2030, generating petabyte-scale data in real time [10]. Photovoltaics (PVs), besides their primary energy harvesting functionality, can detect rapid changes in the incident light intensity. Therefore, incorporating PVs in devices will improve both the energy efficiency and the data rate of future machine-type communication systems. It is now time to imagine an IoT device in a 6G network which is only powered by the energy that it has
Optical Wireless Communication
harvested itself from the ambient RF (using antennas) and light (using solar cells) electromagnetic spectra. The harvested energy is used for processing, sensing, and communication. Advanced materials play an important role in the development of this concept as they can enable efficient indoor energy harvesting [11, 12].
Reflective (or reconfigurable) intelligent surface (RIS) technology is a revolutionary concept which applies to both RF and OWC systems. Traditional communication systems rely on estimation of the communication channel, adaptation of the transmission system, and effective signal processing to achieve the required performance levels. In contrast, RIS aims at "engineering" the channel itself to enhance the performance of the communication system [13]. This is realised by placing reflective surfaces, such as metasurfaces, in appropriate locations, where they can control the direction, amplitude, and phase of incident electromagnetic waves. As a result, energy efficiency, spectral efficiency, security, and reliability can be improved significantly. This matches the 6G paradigm, which incorporates some form of intelligence in all aspects of its structure [14]. In OWC, RIS technology can be used to mitigate the link blockage effect, efficiently exploit the non-line-of-sight (NLoS) rays, establish virtual line-of-sight (LoS) links, improve the channel condition for multiple-input multiple-output transmission, increase secrecy capacity, and reduce power consumption.
In this chapter, a brief overview of OWC systems is given with a guided referral to useful references for further reading. Afterwards, three promising aspects of OWC systems that are highly relevant to the 6G KPIs are explored in more detail. These are terabit-per-second multi-beam system design, zero-energy OWC-enabled systems, and LiFi through RIS.
2 Optical Wireless System Architecture
An OWC system consists of a transmitter front-end, a receiver front-end, and the channel, as shown in Fig. 1. Similar to an RF communication system, the digital data is first processed and converted to an analogue waveform, but in contrast, the resulting signal is modulated on an optical wave. The data is most commonly directly modulated on the intensity of the optical wave. This type of modulation is
Fig. 1 A typical OWC architecture. Transmit chain: input bits, digital processing, DAC, modulation on optical waves (LD or LED). The signal then passes through the optical wireless channel. Receive chain: signal detection (PD or PV), ADC, digital processing, output bits. DAC: digital-to-analogue converter. ADC: analogue-to-digital converter
I. Tavakkolnia et al.
called intensity modulation. It is also possible to utilise both the intensity and the phase of optical waves to convey the signal (i.e. coherent modulation), but this requires extra components and adds complexity to the system. The transmitted optical signal passes through the wireless channel and is then detected by the receiver front-end, provided that the light beams or photons reach the optical detector. The receiver front-end consists of a photodetector (PD), which converts the optical signals into electrical signals. Afterwards, the electrical signals are processed and the digital data is retrieved. Both transmitter and receiver front-ends may include optical components, such as lenses, filters, dichroic mirrors, and concentrators, to guide the optical signal to or from the wireless channel. Therefore, the overall channel impulse response can be represented as
$$h(t) = h_{Tx}(t) \otimes h_c(t) \otimes h_{Rx}(t), \tag{1}$$
where $h_{Tx}(t)$, $h_c(t)$, and $h_{Rx}(t)$ are the impulse responses of the transmitter, channel, and receiver, respectively, and $\otimes$ is the convolution operator.
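As a rough numerical illustration of (1), the following sketch cascades three sampled impulse responses by discrete convolution. The first-order exponential shapes, the time constants, the sample spacing, and the helper names `exp_response` and `convolve` are assumptions made only for this example; they are not taken from the chapter.

```python
import math

def exp_response(tau, dt, n):
    """Sampled first-order impulse response (1/tau)*exp(-t/tau); an assumed shape."""
    return [math.exp(-k * dt / tau) / tau for k in range(n)]

def convolve(a, b, dt):
    """Discrete approximation of the continuous-time convolution of a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj * dt
    return out

dt = 0.05e-9                           # sample spacing of 0.05 ns (assumed)
h_tx = exp_response(2.0e-9, dt, 400)   # transmitter front-end, 2 ns time constant (assumed)
h_c = exp_response(1.0e-9, dt, 400)    # channel, 1 ns time constant (assumed)
h_rx = exp_response(0.5e-9, dt, 400)   # receiver front-end, 0.5 ns time constant (assumed)

# Overall response of (1): h = h_tx (*) h_c (*) h_rx
h = convolve(convolve(h_tx, h_c, dt), h_rx, dt)

# The area under h approximates the DC gain of the cascade.
dc_gain = sum(h) * dt
print(f"Number of samples in h: {len(h)}")
print(f"Approximate DC gain of the cascade: {dc_gain:.3f}")
```

Because each assumed response has approximately unit area, the cascade also has a DC gain close to one; the small excess comes from the coarse rectangular-rule sampling.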
2.1 Transmitter
Light sources can operate in different spectra, including the visible range (about 400 nm to 700 nm) and the infrared range (above 700 nm to 1 mm). LEDs and LDs are the two main types of light sources, and they are used in different applications suited to their specific properties. The power conversion efficiency, output optical power, output light spectrum, and modulation bandwidth of an LED or LD are considered when selecting light sources for an OWC system. The modulation bandwidth refers to how fast changes in the incoming electrical signal can be applied to the optical wave by the LED or LD. This is determined by the physical characteristics of the device, such as charge mobility and structural capacitance [15].
2.1.1
Light-Emitting Diodes
The normalised frequency response of a typical LED can be modelled as a low-pass filter with parameters extracted from experiments and numerical fittings [16]. Such a simplified model can be useful for general system design, but the actual usable modulation bandwidth depends on the exact system structure as well as the modulation format. The transfer function of the LED can be expressed as [17]
$$H(\omega) = \left(1 + \frac{i\omega}{2\pi W}\right)^{-1}, \tag{2}$$
where W is the 3-dB bandwidth of the system corresponding to the frequency at which the power transmitted through the system is reduced to half of its direct
Fig. 2 LED nonlinear model: output intensity versus input voltage, showing the turn-on voltage, the DC bias point, the AC signal, the linear fit, and the peak intensity
current (DC) value. The 3-dB bandwidth can be related to the rise time $\tau_r$ and the fall time $\tau_f$, defined as the time between the 10% and 90% points of the voltage when it is increasing and decreasing, respectively, after a square pulse is applied. This relation is expressed as $W = \ln 9 / (\pi(\tau_r + \tau_f))$ [17]. Light sources are typically nonlinear devices, as there is a nonlinear relationship between the input electrical signal and the output optical signal [18]. As shown in Fig. 2, the input signal should be kept in the semi-linear region of the LED operation. If any sample of the signal is larger than the peak intensity or smaller than the turn-on voltage, the optical intensity is clipped at the respective value. Therefore, the LED is usually biased at a point where the effects of clipping at the bottom and at the top are minimised [19]. Without any kind of optics (e.g. a lens), the radiation pattern of the light emitted by the LED follows the Lambertian law. This means that the optical power at a certain location, $P_\Omega$, can be determined from the total output optical power, $P_{LED}$, and other parameters of the LED as
$$P_\Omega = \Omega P_{LED} \frac{m+1}{2\pi} \cos^m \phi, \tag{3}$$
where $\phi$ is the radiant angle and $\Omega$ is the radiant solid angle. The parameter $m$ denotes the Lambertian emission order, which is related to the half-power semi-angle, $\Phi$, by $m = -1/\log_2(\cos \Phi)$. This radiation pattern is the basis for calculating the channel gains (see (20)). Phosphor-converted white LEDs, multi-chip LEDs (combinations of several single-wavelength LEDs), micro-LEDs ($\mu$LED), quantum dot LEDs (QLED), and organic LEDs (OLED) are various kinds of LEDs that have been used for OWC. While traditionally the modulation bandwidth of LEDs was limited to several MHz, recent advances in semiconductor fabrication have enabled the development of
LEDs with hundreds of MHz of usable bandwidth [20, 21]. The output light of an LED is divergent and incoherent, which offers wide coverage areas and easy deployment as a dual-purpose device (i.e. communication and illumination). In contrast, the output light of an LD is collimated and coherent, and LDs typically exhibit a higher modulation bandwidth. This makes LDs ideal for high-speed OWC and point-to-point data transmission [22].
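The LED relations above, namely the low-pass response (2), the rise/fall-time expression for $W$, and the Lambertian pattern (3), can be evaluated with a few lines of code. The sketch below is illustrative only: the function names and the numerical values (20 ns rise and fall times, a 60° half-power semi-angle, 1 W of optical power) are assumptions, not values from the chapter.

```python
import math

def led_bandwidth(tau_r, tau_f):
    """3-dB bandwidth W = ln(9) / (pi * (tau_r + tau_f)), in Hz."""
    return math.log(9) / (math.pi * (tau_r + tau_f))

def led_response(f, w):
    """Low-pass response of (2) at frequency f in Hz, using omega = 2*pi*f."""
    return 1.0 / (1.0 + 1j * f / w)

def lambertian_order(semi_angle_deg):
    """Lambertian order m = -1 / log2(cos(Phi)) for half-power semi-angle Phi."""
    return -1.0 / math.log2(math.cos(math.radians(semi_angle_deg)))

def lambertian_power(p_led, solid_angle, phi_deg, m):
    """Power of (3) radiated into a small solid angle at radiant angle phi."""
    phi = math.radians(phi_deg)
    return solid_angle * p_led * (m + 1) / (2 * math.pi) * math.cos(phi) ** m

# Assumed example values, for illustration only.
W = led_bandwidth(20e-9, 20e-9)
m = lambertian_order(60.0)
on_axis = lambertian_power(1.0, 1e-3, 0.0, m)
half_angle = lambertian_power(1.0, 1e-3, 60.0, m)

print(f"3-dB bandwidth: {W / 1e6:.2f} MHz")
print(f"|H|^2 at f = W: {abs(led_response(W, W)) ** 2:.3f}")
print(f"Lambertian order: {m:.3f}")
print(f"Power ratio at the half-power semi-angle: {half_angle / on_axis:.3f}")
```

The last two printed ratios act as sanity checks: the power response falls to one half at $f = W$, and the radiated power falls to one half of its on-axis value at the half-power semi-angle.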
2.1.2
Vertical-Cavity Surface-Emitting Lasers
Among the various types of semiconductor LDs, VCSELs are one of the strongest contenders to realise ultra-high transmission rates for 6G OWC systems due to their intrinsically high performance and low cost. In contrast to edge-emitting lasers (EELs), the light emission of a VCSEL is perpendicular to its top surface, and its output laser beam is characterised by a symmetric transverse mode profile. VCSELs have found many applications in data communication and sensing domains, such as optical fibre communications, free-space optical (FSO) interconnects, laser printers, atomic clocks, lidar for cellphone cameras, and automotive collision avoidance systems, among others [23, 24]. VCSELs are efficiently coupled to multi-mode fibre (MMF) cables with excellent mode matching, so they have replaced EELs in short-range fibre-optic communication systems, particularly for Gigabit Ethernet local area networks (LANs). Other prominent features of VCSELs that make them a suitable candidate for ultra-high-speed optical wireless system design include the following [23, 25]: (1) a high modulation bandwidth, typically greater than 10 GHz; (2) a high electrical-to-optical (E/O) power conversion efficiency of at least 50%; (3) cost-efficient fabrication by virtue of their compatibility with large-scale integration (LSI) processes; (4) ease of the initial probe test without the need for separating devices into discrete chips; (5) convenient bonding and mounting; and (6) the possibility for multiple LDs to be densely packed and precisely arranged in two-dimensional (2D) arrays.
A single-mode VCSEL generates an output optical field in the fundamental transverse electromagnetic mode (TEM) with a narrow-band optical spectrum centred at the laser wavelength. The amplitude envelope of the resulting beam on the transverse plane is described by a Gaussian function: the optical power is maximum at the centre of the beam spot and falls off as a Gaussian function of the radial distance from the centre [26].
Although the Gaussian beam profile on the transverse plane has an infinite tail that never reaches zero, the radius of the beam spot is usually defined as the radial distance at which the normalised intensity drops to $e^{-2}$ of its value on the propagation axis. The radius of the beam spot is of the following form [26]:
$$w(z) = w_0 \sqrt{1 + \left(\frac{z}{z_R}\right)^2}, \tag{4}$$
Fig. 3 Longitudinal section of a Gaussian beam propagating in the positive direction of the z-axis along with its main characteristic parameters. The initial beam waist of radius $w_0$ formed by the laser source is located at $z = 0$. The red curve shows the side view of the circular boundary of the Gaussian beam spot of radius $w(z)$ at any distance $z$ from the beam waist. The dashed blue lines are the asymptotes of the Gaussian beam for $z \gg z_R$, representing the far-field divergence angle $\theta$ with respect to the propagation axis
where $w_0$ is the radius of the beam waist and $z_R$ is the Rayleigh range, which is related to the beam waist radius $w_0$ and the laser wavelength $\lambda$ via [26]:
$$z_R = \frac{\pi w_0^2}{\lambda}. \tag{5}$$
Based on (5), the near-field and far-field regions of a Gaussian beam are specified by $z \ll z_R$ and $z \gg z_R$, respectively. Figure 3 depicts the longitudinal section of a Gaussian beam along the z-axis and its characteristic parameters. The wavefront of a Gaussian beam evolves while propagating in a given direction. The wavefront radius of curvature at distance $z$ from the beam waist is given by [26]:
$$R(z) = z \left(1 + \left(\frac{z_R}{z}\right)^2\right). \tag{6}$$
According to (6), the near-field and far-field approximations of the wavefront radius of curvature are obtained as $R(z) \approx z_R^2/z$ and $R(z) \approx z$, respectively. Thus, $R(z)$ is inversely proportional to $z$ in the near field, whereas it increases linearly with $z$ in the far field. Hence, the following result holds:
$$\lim_{z \to 0} R(z) = \lim_{z \to \infty} R(z) = \infty. \tag{7}$$
This means that the wavefront of a Gaussian beam is planar at the beam waist and becomes curved as the beam propagates. The wavefront radius of curvature reaches a minimum of $R_{min} = 2z_R$ at $z = z_R$. In the far field, the wavefronts flatten again ($R(z) \to \infty$) and appear to originate from a point source at the centre of the beam waist. From (4), for $z \gg z_R$, the beam spot radius approaches the asymptotic value:
$$w(z) \approx \frac{w_0 z}{z_R} = \frac{\lambda z}{\pi w_0}, \tag{8}$$
thus varying linearly with $z$. Therefore, the circular beam spot in the far field is the base of a cone whose vertex lies at the centre of the beam waist, with a divergence half-angle of $\theta$, as shown in Fig. 3. Mathematically, the far-field divergence angle is derived from (8) under the small-angle approximation:
$$\theta = \lim_{z \to \infty} \tan^{-1}\left(\frac{w(z)}{z}\right) \approx \lim_{z \to \infty} \frac{w(z)}{z} = \frac{\lambda}{\pi w_0}. \tag{9}$$
The spatial distribution of a Gaussian beam along the propagation axis is described by the intensity profile on the transverse plane as follows [26]:
$$I(\rho, z) = \frac{2P_t}{\pi w^2(z)} \exp\left(-\frac{2\rho^2}{w^2(z)}\right), \tag{10}$$
where $P_t$ is the transmit optical power and $\rho$ is the radial distance from the centre of the beam spot. The optical power collected within an arbitrary area $A$ on the transverse plane is calculated as:
$$P_r = \int_A I(\rho, z)\, dA. \tag{11}$$
The optical power contained within the beam spot of radius $w(z)$ is then obtained as follows:
$$P_r = \int_0^{w(z)} \frac{2P_t}{\pi w^2(z)} \exp\left(-\frac{2\rho^2}{w^2(z)}\right) 2\pi\rho\, d\rho = \left(1 - e^{-2}\right) P_t \approx 0.86 P_t. \tag{12}$$
In other words, the optical power of a Gaussian beam is spatially concentrated around the propagation axis, and at any axial distance $z$ from the transmitter, the beam spot contains about 86% of the total transmit power.
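The Gaussian beam relations (4) to (12) lend themselves to a short numerical check. The sketch below is a minimal illustration: the 850 nm wavelength and 5 µm waist radius are assumed example values, the function names are hypothetical, and `power_fraction` uses the closed form $1 - \exp(-2a^2/w^2(z))$ obtained by evaluating the integral in (12) for a centred circular aperture of arbitrary radius $a$.

```python
import math

def rayleigh_range(w0, wavelength):
    """Rayleigh range z_R of (5)."""
    return math.pi * w0 ** 2 / wavelength

def beam_radius(z, w0, wavelength):
    """Beam spot radius w(z) of (4)."""
    return w0 * math.sqrt(1 + (z / rayleigh_range(w0, wavelength)) ** 2)

def wavefront_radius(z, w0, wavelength):
    """Wavefront radius of curvature R(z) of (6); requires z != 0."""
    return z * (1 + (rayleigh_range(w0, wavelength) / z) ** 2)

def divergence_half_angle(w0, wavelength):
    """Far-field divergence half-angle of (9), in radians."""
    return wavelength / (math.pi * w0)

def power_fraction(aperture_radius, z, w0, wavelength):
    """Fraction of P_t inside a centred circular aperture at distance z."""
    w = beam_radius(z, w0, wavelength)
    return 1 - math.exp(-2 * aperture_radius ** 2 / w ** 2)

# Assumed example values, for illustration only.
wavelength = 850e-9   # m
w0 = 5e-6             # m
z_r = rayleigh_range(w0, wavelength)
theta = divergence_half_angle(w0, wavelength)
z = 2.0               # link distance in m, deep in the far field

print(f"Rayleigh range: {z_r * 1e6:.1f} um")
print(f"Divergence half-angle: {math.degrees(theta):.2f} deg")
print(f"Beam radius at {z} m: {beam_radius(z, w0, wavelength) * 100:.2f} cm")
print(f"R(z_R) / z_R: {wavefront_radius(z_r, w0, wavelength) / z_r:.1f}")
frac = power_fraction(beam_radius(z, w0, wavelength), z, w0, wavelength)
print(f"Power fraction inside w(z): {frac:.4f}")
```

With these values the link distance is many Rayleigh ranges long, so the far-field approximation (8) and the exact expression (4) give practically the same beam radius.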
2.2 Receiver
The main component of the receiver front-end is the device which captures the optical wave and converts it to an electrical signal. This is usually referred to as direct detection when only the intensity of the optical wave carries the data, resulting in an intensity modulation direct detection (IM/DD) system. Retrieving the phase of the optical wave is possible via coherent detection, which requires either additional components, e.g. local oscillators, or a specific type of signal processing, e.g. Kramers-Kronig modulation and processing.
Photodetectors are the main type of receivers for a wide range of OWC applications. Photodetectors are semiconductor devices fabricated from different materials, including silicon (Si), germanium (Ge), gallium arsenide (GaAs), gallium aluminium arsenide (GaAlAs), and indium gallium arsenide (InGaAs). Various device properties can be realised using each of these materials in the structure of the photodetector. For instance, silicon PDs are cost-effective and have a high responsivity in the range of 800 nm to 1000 nm. They are used widely for the visible and near-infrared (NIR) ranges. InGaAs and Ge PDs are suitable for longer wavelengths. For OWC applications, the noise that a PD generates significantly affects the communication performance. The noise level depends on the temperature as well as on the band gap of the semiconductor. Higher values of the band gap correspond to lower noise levels. Silicon PDs have a larger band gap than Ge PDs; therefore, thermal pair generation is weaker in Si PDs, resulting in lower noise at the same temperature. PDs can be fabricated in a positive-intrinsic-negative (PIN) structure, in which an intrinsic region with light n-type doping is inserted between p-type and n-type semiconductor regions. This type of PD requires a reverse bias voltage to operate. The bias voltage, i.e.
external electric field, applies the force required to generate the current from the free electrons and holes. The other type of PD is called an avalanche photodiode (APD), which is fairly similar to a PIN PD but requires a much higher reverse bias voltage. This is required to trigger an avalanche effect, which significantly increases the internal gain.
Solar cells can also be used as receivers for OWC. Unlike PDs, they do not require additional power (i.e. a bias voltage) to operate, and moreover they can efficiently harvest solar energy. This can potentially offset the energy used by the rest of the receiver and enable an off-grid communication system. Solar cells typically have a smaller bandwidth than PDs, but they are by default manufactured with multiple cells connected in series and parallel configurations. This results in a large active area as a communication receiver and increases the energy harvesting capability. There are various types of solar cells, including silicon, perovskite, GaAs, and organic cells, to name a few. Imaging sensors are another type of device that can be used as the optical receiver, but they are only suitable for very few applications, as their bandwidth and energy efficiency are generally lower than those of PDs and solar cells.
2.2.1
Receiver Design Trade-offs
Designing receivers with the aim of achieving multi-Gb/s data rates imposes a twofold challenge: (i) the area-bandwidth trade-off and (ii) the gain-field-of-view (FOV) trade-off. In the following, these trade-offs are briefly explained.
Area-Bandwidth Trade-off The overall bandwidth of a PD is estimated as [27]:
$$B = \frac{1}{\sqrt{\left(2\pi (R_s + R_L) C_p\right)^2 + \left(\frac{l_d}{0.44 v_s}\right)^2}}, \tag{13}$$
where $R_s$ denotes the junction series resistance, $R_L$ is the load resistance of the transimpedance amplifier (TIA), $l_d$ is the depletion region length, and $v_s$ is the carrier saturation velocity. The junction capacitance, $C_p$, is given by $C_p = \epsilon_0 \epsilon_r A_{PD} / l_d$, where $\epsilon_0$ is the permittivity of vacuum, $\epsilon_r$ is the relative permittivity of the semiconductor, and $A_{PD}$ denotes the area of the depletion region (i.e. the PD effective area). According to (13), the two principal bandwidth limits are due to the RC time constant (the left term in the denominator) and the carrier transit time (the right term in the denominator). Also, the quantum efficiency of the PD is given by [27]:
$$\eta = 1 - e^{-\alpha l_d}, \tag{14}$$
where $\alpha$ is the absorption coefficient, which varies depending on the semiconductor material and the wavelength. It is evident from (13) and (14) that the PD junction capacitance, the PD transit time, and the quantum efficiency are interdependent through $l_d$; hence there are trade-offs involved in the design of a high-bandwidth PD. Figure 4 shows the bandwidth expression given in (13) and the quantum efficiency given in (14) for a silicon PD over a range of depletion region lengths. For smaller values of $l_d$, the bandwidth is determined by the RC time constant and the quantum efficiency is low. By contrast, for larger values of $l_d$, the quantum efficiency improves and the bandwidth is determined by the transit-time effect. As shown in this figure, there exists an optimum length for the depletion region, denoted by $l_{max}$, which yields the maximum PD bandwidth. Note that $l_{max}$ does not necessarily result in a high quantum efficiency. Therefore, an optimisation procedure is generally required to obtain a wide bandwidth while maintaining the minimum desired quantum efficiency [27]. For $l_d = l_{max}$, the PD bandwidth is described as [27]:
$$B = \frac{1}{\sqrt{4\pi \epsilon_0 \epsilon_r (R_s + R_L) \dfrac{A_{PD}}{0.44 v_s}}}. \tag{15}$$
Fig. 4 Bandwidth and quantum efficiency of a silicon photodiode as functions of the depletion region length, $l_d$, with $A_{PD} = 100\ \mu\text{m}^2$, $v_s = 4.8 \times 10^4$ m/s, $R_s = 5\ \Omega$, $R_L = 50\ \Omega$, $\alpha = 1\ \mu\text{m}^{-1}$, $\epsilon_r = 11.48$
Fig. 5 Area-bandwidth trade-off with $l = l_{opt}$
The above equation directly expresses the area-bandwidth trade-off for a PD. Note that (15) is an upper bound on the bandwidth of a PD, and the condition $l_d \neq l_{max}$ results in smaller bandwidth values. Figure 5 illustrates the bandwidth $B$ versus the PD area assuming $l = l_{opt}$, i.e. the optimum depletion region length. This figure shows that $B$ decreases rapidly as $A_{PD}$ increases.
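The interplay between (13), (14), and (15) can be reproduced numerically. The sketch below uses the parameter values quoted in the caption of Fig. 4; the function names are hypothetical, and the expression used for the optimum depletion length is an assumption of this example, obtained by equating the RC and transit-time terms of (13), which is the condition that minimises their sum of squares.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity in F/m

def pd_bandwidth(l_d, area, r_s, r_l, eps_r, v_s):
    """PD bandwidth of (13) for depletion region length l_d (SI units)."""
    c_p = EPS0 * eps_r * area / l_d               # junction capacitance
    rc_term = 2 * math.pi * (r_s + r_l) * c_p     # RC time-constant limit
    transit_term = l_d / (0.44 * v_s)             # carrier transit-time limit
    return 1.0 / math.sqrt(rc_term ** 2 + transit_term ** 2)

def quantum_efficiency(l_d, alpha):
    """Quantum efficiency of (14)."""
    return 1 - math.exp(-alpha * l_d)

def optimum_depletion_length(area, r_s, r_l, eps_r, v_s):
    """Length at which the RC and transit-time terms of (13) are equal."""
    return math.sqrt(2 * math.pi * (r_s + r_l) * EPS0 * eps_r * area * 0.44 * v_s)

def max_bandwidth(area, r_s, r_l, eps_r, v_s):
    """Upper bound of (15) on the PD bandwidth."""
    return 1.0 / math.sqrt(4 * math.pi * EPS0 * eps_r * (r_s + r_l) * area / (0.44 * v_s))

# Parameter values from the caption of Fig. 4, converted to SI units.
params = dict(area=100e-12, r_s=5.0, r_l=50.0, eps_r=11.48, v_s=4.8e4)
alpha = 1e6  # absorption coefficient of 1 per micron, in 1/m

l_max = optimum_depletion_length(**params)
b_max = max_bandwidth(**params)

print(f"Optimum depletion length: {l_max * 1e6:.3f} um")
print(f"Maximum bandwidth: {b_max / 1e9:.1f} GHz")
print(f"Quantum efficiency at the optimum: {quantum_efficiency(l_max, alpha):.2f}")
for scale in (0.25, 0.5, 1.0, 2.0, 4.0):
    b = pd_bandwidth(scale * l_max, **params)
    print(f"l_d = {scale:4.2f} * l_max -> B = {b / 1e9:6.1f} GHz")
```

The loop at the end shows the bandwidth dropping on either side of the optimum length, which is the behaviour described for Fig. 4.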
According to Fig. 5 and the discussion above, to maximise the bandwidth of a solid-state PD, the junction capacitance must be minimised, which dictates that the PD must have a very small effective area; e.g. a side length of less than 50 μm is required for bandwidths higher than 10 GHz. This presents a significant challenge in the design of optical receivers, since the optical signals must be collected, aligned with, and coupled into a very small active area of the PD with minimal loss.
Gain-FOV Trade-off The small area of a high-bandwidth PD can be compensated by using appropriate light-concentrating optics. This improves the optical power collection efficiency of the receiver and thus the received signal-to-noise ratio (SNR). However, the use of optical concentrators for increasing the collection area limits the receiver FOV due to the law of conservation of etendue [28]. In the following, imaging and non-imaging optical concentrators are briefly introduced and then the gain-FOV trade-off is discussed. Imaging optics are intended to form an image of the object (e.g. the source), while the main purpose of non-imaging optics is the effective and efficient collection, concentration, transfer, and distribution of optical power. In other words, non-imaging optics are designed either to collect the optical power and concentrate it at a specified target or to manipulate the light intensity profile to create a certain spatial and angular pattern [28]. Among various imaging optical components, converging lenses are widely used for focussing the incident light beam down to any desired spot size, as small as the PD's active area, provided that it is larger than the diffraction limit. The diffraction limit determines the maximum resolution that can be achieved by an optical system. It is defined by the equation $d = 2.44 \lambda N$, where $d$ is the minimum resolvable distance, $\lambda$ is the wavelength of the light, and $N$ is the f-number of the imaging optics (i.e.
the ratio of the focal length to the diameter of the clear aperture). To minimise the size of the receiver, it is desirable to use a lens with a short focal length. However, to collect the largest amount of optical power, a lens with the largest possible clear aperture is required. This means that the lens must have a small f-number. Achieving the desired small spot sizes (e.g. a diameter of 25–30 μm), limited only by diffraction effects, also requires a lens with a small f-number. However, lenses with a small f-number are very expensive and difficult to manufacture. For applications requiring efficient light concentration, non-imaging optical components are more appealing, as they often provide more flexible designs in terms of the range of incident angles and alignment issues. The most commonly used non-imaging optical component is the compound parabolic concentrator (CPC), which can approach the maximum theoretical concentration gain determined by the law of conservation of etendue [28]. Figure 6 illustrates the gain-FOV trade-off for a CPC by plotting the gain and length of a reflective CPC with an exit aperture diameter of 1 mm against the CPC half-angle FOV. For $\text{FOV} = 10^\circ$, this CPC offers an optical gain of about 33. Such a CPC is 1.9 cm long, and its effective collection area is about 0.26 cm$^2$.
Fig. 6 Gain-FOV trade-off of a reflective CPC with a 1 mm exit aperture
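The CPC figures quoted above (a gain of about 33, a length of 1.9 cm, and a collection area of about 0.26 cm$^2$ at a 10° half-angle FOV) can be checked with the standard relations for an ideal hollow reflective CPC. These relations are not stated in the chapter and are used here as assumptions: a concentration gain of $1/\sin^2\theta$, an entrance radius equal to the exit radius divided by $\sin\theta$, and a length equal to the sum of the two radii divided by $\tan\theta$. The function names are hypothetical.

```python
import math

def cpc_gain(half_fov_deg):
    """Ideal concentration gain 1 / sin^2(theta) of a hollow 3D CPC."""
    return 1.0 / math.sin(math.radians(half_fov_deg)) ** 2

def cpc_entrance_radius(exit_radius, half_fov_deg):
    """Entrance aperture radius a = a' / sin(theta)."""
    return exit_radius / math.sin(math.radians(half_fov_deg))

def cpc_length(exit_radius, half_fov_deg):
    """Length (a + a') / tan(theta) of the ideal CPC."""
    a_in = cpc_entrance_radius(exit_radius, half_fov_deg)
    return (a_in + exit_radius) / math.tan(math.radians(half_fov_deg))

exit_radius = 0.5e-3  # 1 mm exit aperture diameter, as in Fig. 6
for fov in (5, 10, 20, 30):
    gain = cpc_gain(fov)
    length_cm = cpc_length(exit_radius, fov) * 100
    area_cm2 = math.pi * cpc_entrance_radius(exit_radius, fov) ** 2 * 1e4
    print(f"FOV = {fov:2d} deg: gain = {gain:6.1f}, "
          f"length = {length_cm:5.2f} cm, collection area = {area_cm2:5.3f} cm^2")
```

The loop makes the trade-off explicit: widening the FOV shortens the concentrator but reduces both its gain and its collection area.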
2.3 Channel Model
A general IM/DD optical wireless channel may include multiple transceivers. $N_t$ light sources (e.g. one LED or an array of LEDs) transmit the signal, which is received by $N_r$ receiver units (e.g. PDs). The resulting channel is described as:
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{16}$$
where $\mathbf{x}$ is the transmitted signal vector of size $N_t \times 1$, and $\mathbf{y}$ and $\mathbf{n}$ are $N_r \times 1$ vectors representing the received signal and the noise at each PD, respectively. The noise here can include all noise sources, such as shot noise and thermal noise. The noise variance can be expressed as
$$\sigma^2 = 2 q I_{PD} B + \frac{4 K_B T_a B}{R_L}, \tag{17}$$
where $q$ is the elementary charge, $I_{PD}$ is the average photocurrent of the PD, $B$ is the modulation bandwidth, $T_a$ is the absolute temperature, $K_B = 1.38 \times 10^{-23}$ J/K is the Boltzmann constant, and $R_L$ denotes the load resistance in the receiver circuit. The channel matrix $\mathbf{H}$ is given by:
$$\mathbf{H} = \begin{pmatrix} h_{1,1} & \cdots & h_{1,N_t} \\ \vdots & \ddots & \vdots \\ h_{N_r,1} & \cdots & h_{N_r,N_t} \end{pmatrix}, \tag{18}$$
where the entries $h_{i,j}$ ($i = 1, \ldots, N_r$ and $j = 1, \ldots, N_t$) are the channel gains of the links between the $j$th transmitter and the $i$th PD, which can be expressed as:
$$h_{i,j} = h^{LoS}_{i,j} + h^{NLoS}_{i,j}, \tag{19}$$
Fig. 7 LoS and NLoS channel gain schematics. LoS panel: transmitter, receiver, radiance angle $\phi$, incidence angle $\psi$, and receiver FOV $\Psi$. NLoS panel: transmitter, reflecting surface elements, and receiver
where $h^{LoS}_{i,j}$ is the LoS channel gain and $h^{NLoS}_{i,j}$ is the NLoS channel gain. The channel gain depends on the distance between the transmitter and receiver pairs (i.e. the user position) and on the orientation of each PD. Figure 7 shows the LoS link geometry for a receiver (Rx) and transmitter (Tx) pair, where $\phi$ is the angle of radiance and $\psi$ is the incidence angle. The receiver field-of-view (FOV) is denoted by $\Psi$. The LoS channel gain of an optical link between a light source and a PD separated by a distance $d$ is given by [29]:
$$h^{LoS}_{i,j} = \frac{m+1}{2\pi d^2} A \cos^m(\phi) \cos(\psi)\, \text{rect}\left(\frac{\psi}{\Psi}\right), \tag{20}$$
where $A$ is the PD area. Furthermore, $\text{rect}(\psi/\Psi) = 1$ for $0 \leq \psi \leq \Psi$ and 0 otherwise. For calculating the NLoS component of the channel gain, the environment is segmented into a number of surface elements which reflect the light beams. These surface elements are modelled as Lambertian radiators described by (20) with $m = 1$ [30]. By using frequency-domain instead of time-domain analysis, an infinite number of reflections can be taken into account to obtain an accurate estimate of the diffuse channel. The NLoS channel gain including an infinite number of reflections can then be expressed as [30]:
$$h^{NLoS}_{i,j} = \mathbf{r}^T \mathbf{G}_\rho \left(\mathbf{I} - \mathbf{E}\mathbf{G}_\rho\right)^{-1} \mathbf{t}, \tag{21}$$
where the vectors $\mathbf{t}$ and $\mathbf{r}$ represent the LoS links from the transmitter Tx to all the surface elements of the room and from all the surface elements of the room to the receiver Rx, respectively. Also, $(\cdot)^T$ denotes the transpose operator. The matrix $\mathbf{G}_\rho = \text{diag}(\rho_1, \ldots, \rho_N)$ is the reflectivity matrix of all $N$ reflectors, $\mathbf{E}$ is the $N \times N$ matrix of LoS transfer functions for the links between all surface elements, and $\mathbf{I}$ is the identity matrix. In (21), the elements of $\mathbf{E}$, $\mathbf{r}$, and $\mathbf{t}$ are found according to (20) and Fig. 7 between pairs of Rx, Tx, and surface elements. For a fixed pair of transmitter and receiver, the channel is modelled as above without any randomness. However, slight changes in the configuration can change
the channel significantly. For instance, the random orientation of handheld devices (e.g. mobile phones) imposes a stochastic distribution on the channel model. Experimental analysis has shown that the distributions of the elemental rotation angles yaw, $\alpha$, pitch, $\beta$, and roll, $\gamma$, which fully determine the orientation of the device, are well fitted by a Laplace distribution for sitting activities, whereas they are closer to a Gaussian distribution for walking activities [31]. Similarly, due to the nature of OWC, the link between a transmitter and a receiver can be blocked by an opaque object such as a human body. Moreover, movements of objects around the transmitter and receiver can change the light reflections, hence changing the NLoS channel gains. Various studies have used movement models to evaluate the effect of link blockage on the system performance [32, 33].
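For a single deterministic link, the LoS gain (20) and the noise variance (17) can be computed directly. The following sketch is only an illustration under assumed values (a 1 cm$^2$ PD placed 2 m directly below a source with $m = 1$, a 60° FOV, and a 50 Ω load at 300 K); the function names and the rounded value of the elementary charge are assumptions of this example.

```python
import math

Q_E = 1.602e-19   # elementary charge in C (rounded)
K_B = 1.38e-23    # Boltzmann constant in J/K, as quoted in the text

def los_gain(d, phi_deg, psi_deg, area, m, fov_deg):
    """LoS channel gain of (20); zero when the incidence angle is outside the FOV."""
    if not 0.0 <= psi_deg <= fov_deg:
        return 0.0
    phi = math.radians(phi_deg)
    psi = math.radians(psi_deg)
    return (m + 1) / (2 * math.pi * d ** 2) * area * math.cos(phi) ** m * math.cos(psi)

def noise_variance(i_pd, bandwidth, temperature, r_load):
    """Shot-noise plus thermal-noise variance of (17)."""
    shot = 2 * Q_E * i_pd * bandwidth
    thermal = 4 * K_B * temperature * bandwidth / r_load
    return shot + thermal

# Assumed example geometry and receiver values, for illustration only.
h_direct = los_gain(d=2.0, phi_deg=0.0, psi_deg=0.0, area=1e-4, m=1.0, fov_deg=60.0)
h_tilted = los_gain(d=2.0, phi_deg=30.0, psi_deg=30.0, area=1e-4, m=1.0, fov_deg=60.0)
h_blocked = los_gain(d=2.0, phi_deg=70.0, psi_deg=70.0, area=1e-4, m=1.0, fov_deg=60.0)
sigma2 = noise_variance(i_pd=1e-6, bandwidth=100e6, temperature=300.0, r_load=50.0)

print(f"LoS gain, aligned link: {h_direct:.3e}")
print(f"LoS gain, 30 deg off-axis at the same distance: {h_tilted:.3e}")
print(f"LoS gain, outside the FOV: {h_blocked:.1f}")
print(f"Noise variance: {sigma2:.3e} A^2")
```

Changing `psi_deg` in this sketch mimics the effect of device orientation described above: the gain falls with $\cos\psi$ and drops to zero once the source leaves the receiver FOV.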
2.4 Modulation Techniques
Two ways of modulating signals on optical waves are commonly used for OWC. Direct intensity modulation is performed by applying the data-carrying electrical signal directly to the LED (Fig. 2) or LD as the driving voltage (or current), hence modulating the intensity of the output optical signal. This is widely used for OWC due to its implementation simplicity. External modulation is the other type, and it is based on electro-optic materials such as lithium niobate crystals. These materials react to an applied electric field. For instance, the refractive index changes with the applied voltage and consequently causes variations in the properties of the light passing through. By using different structures and applying data-carrying electric fields, the amplitude, phase, or polarisation of the output optical signal can be modulated [34]. External modulation is used in optical fibre and free-space communication systems and can lead to high data rates because of the wide modulation bandwidth of external modulators, i.e. tens of GHz. Advanced, efficient external modulators are most suitable for point-to-point and backhaul links in applications such as data centre links [35]. This section mainly focusses on IM/DD systems, which are the main enabler for 6G applications due to their relative simplicity and wide range of use cases.
Most of the modulation techniques for OWC are based on existing RF methods with modifications that adapt them to the physical characteristics of OWC systems. These include single-carrier modulation techniques, such as on-off keying (OOK), pulse-position modulation (PPM), and M-ary pulse amplitude modulation (MPAM), as well as multicarrier techniques, such as several variations of orthogonal frequency division multiplexing (OFDM). Some techniques are specifically designed for OWC, such as colour shift keying (CSK), for which the data is modulated on the relative intensities of multiple colours (e.g.
red, green, and blue). The details of different techniques can be found in references such as [36]. Here, DC-biased optical OFDM (DCO-OFDM) is reviewed as a widely used IM/DD modulation technique in various OWC applications.
OFDM is a multicarrier modulation technique with a long history in wireless communication systems. OFDM elegantly addresses the issue of inter-symbol interference which is particularly severe in high data rate wireless systems. The available system bandwidth is divided into a number of orthogonal subchannels of equal bandwidth. Data is modulated on these subchannels. Due to the relatively small bandwidth of a subchannel compared to the system bandwidth, the frequency response per subchannel is generally flat. The cyclic prefix is added in time domain which is at least as long as the maximum multipath delay. This means that a single tap equaliser can be deployed in the frequency domain to equalise the channel. As a result, intersymbol interference is effectively mitigated. DCO-OFDM also operates based on the same principles after applying necessary modifications specific to OWC. Among the advantages of OWC is the possibility of implementation using lowcost commercially available light sources. As stated earlier in the description of IM/DD, data signals are modulated on the output optical power of the light source (i.e. intensity of light) by changing its driving current. Therefore, the time domain signal should only contain real and positive samples. Single carrier modulation schemes such as pulse amplitude modulation (PAM) can be easily adopted for VLC. However, OFDM generally produces complex-valued bipolar time-domain samples, and thus, it must be modified to be applicable to VLC. This can be achieved by using a property of the Fourier transform. Specifically, by imposing the Hermitian symmetry on the signal in the frequency domain, the time domain signal would be real. Subsequently, a constant direct current (DC) bias is added to the signal so that the signal becomes mostly positive. However, depending on the value of the DC bias a few samples may still be negative which need to be clipped to zero before transmission. 
Taking into account the Hermitian symmetry and the DC bias, the frequency-domain subcarrier vector, $\mathbf{X}$, is described as

$$\mathbf{X} = \left[0, X_1, \cdots, X_{\frac{N_s}{2}-1}, 0, X^*_{\frac{N_s}{2}-1}, \cdots, X^*_1\right], \tag{22}$$
where $N_s$ is the number of available subcarriers. Each subcarrier is modulated by an $M_k$-ary quadrature amplitude modulation ($M_k$-QAM) symbol. The value of $M_k$ depends on the available SNR at that subcarrier, $\gamma_k$, and is chosen to meet the minimum bit error ratio (BER) required for forward error correction. This technique is referred to as adaptive bit loading. The use of adaptive bit loading leads to the maximum achievable data rate [37]. The BER at the $k$th subcarrier is approximated as follows [38]:
$$\mathrm{BER}_k = \frac{4}{\log_2 M_k}\left(1 - \frac{1}{\sqrt{M_k}}\right) \times \sum_{l=1}^{\min(2,\sqrt{M_k})} Q\left((2l-1)\sqrt{\frac{3\gamma_k}{2(M_k-1)}}\right), \tag{23}$$
where $Q(\cdot)$ is the Gaussian Q-function. An iterative algorithm can be used for bit loading to determine the maximum $M_k$ for $k = 2, \cdots, N_s/2-1$. At each subcarrier, $\mathrm{BER}_k$ is calculated for the available SNR, $\gamma_k$, for values of $M_k$ increasing from $M_k = 2$. The maximum value of $M_k$ is found such that $\mathrm{BER}_k$ is still below the target. The overall data rate is given by Tsonev et al. [39]:

$$\rho = \frac{\sum_{k=2}^{\frac{N_s}{2}-1} \log_2 M_k}{(N_s + N_{\mathrm{CP}})/(2B)}, \tag{24}$$
where $B$ is the single-sided system bandwidth, and $N_{\mathrm{CP}}$ is the cyclic prefix size.
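The iterative bit loading procedure and the rate expression (24) can be illustrated with the following minimal Python sketch. Several details are assumptions made only for this example: candidate constellations are restricted to powers of two up to 1024-QAM, the upper summation limit in (23) is taken as the integer part of $\min(2,\sqrt{M_k})$, and the per-subcarrier SNR values, frame size, and bandwidth are hypothetical.

```python
import math


def q_func(x):
    """Gaussian Q-function expressed through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_mqam(m, snr):
    """BER approximation of Eq. (23) for M-QAM at linear SNR `snr`."""
    n_terms = int(min(2, math.sqrt(m)))  # assumption: floor of the upper limit
    scale = 4.0 / math.log2(m) * (1.0 - 1.0 / math.sqrt(m))
    arg = math.sqrt(3.0 * snr / (2.0 * (m - 1)))
    return scale * sum(q_func((2 * l - 1) * arg) for l in range(1, n_terms + 1))


def load_bits(snrs_db, target_ber=1e-3, max_bits=10):
    """Per subcarrier, the largest number of bits whose BER meets the target (0 if none)."""
    bits = []
    for snr_db in snrs_db:
        snr = 10.0 ** (snr_db / 10.0)
        best = 0
        for b in range(1, max_bits + 1):
            if ber_mqam(2 ** b, snr) <= target_ber:
                best = b
        bits.append(best)
    return bits


def data_rate(bits, n_s, n_cp, bandwidth_hz):
    """Eq. (24): bits per OFDM frame divided by the frame duration (N_s + N_CP)/(2B)."""
    return sum(bits) / ((n_s + n_cp) / (2.0 * bandwidth_hz))


# Hypothetical SNRs (dB) on subcarriers k = 2, ..., N_s/2 - 1 for N_s = 16.
snrs_db = [25.0, 22.0, 18.0, 14.0, 10.0, 5.0]
bits = load_bits(snrs_db)
rate = data_rate(bits, n_s=16, n_cp=4, bandwidth_hz=10e6)
print(bits, f"{rate / 1e6:.1f} Mb/s")
```

Subcarriers whose SNR cannot support even the smallest constellation are simply left unloaded, so they contribute zero bits to the sum in (24).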
3 Terabit Optical Wireless Backhaul Link Design Using VCSEL Arrays

The proliferation of Internet-enabled premium services such as live video streaming with 4K and 8K ultra-high-definition (UHD) resolutions, immersive experience and three-dimensional (3D) stereoscopic vision with virtual reality (VR) and augmented reality (AR), holographic telepresence, on-demand cloud computing, and multi-access edge computing will push wireless connectivity to its limits in years to come [40]. The real-time operation of these technologies will require unprecedented system capacities on the order of multiple Terabit/s (Tb/s) for wireless access networks. Attaining such an ambitious goal constitutes one of the KPIs of future 6G wireless systems [41]. In the following, the application of VCSELs for designing an ultra-high-speed optical wireless backhaul link is discussed.

Terabit 6G wireless access networks will impose a substantial overhead on the backhaul capacity, and a cost-effective solution for backhaul connectivity is a major challenge. A high-capacity wireless backhaul link can be designed based on a LoS laser-based OWC system to support Tb/s aggregate data rates. This is especially appealing for indoor scenarios. Terrestrial FSO systems are commonly used for outdoor applications to establish high-speed, cable-free links between high-rise buildings. While FSO communications suffer from severe outdoor channel impairments such as weather-dependent absorption loss and atmospheric turbulence, short-range laser-based OWC under the stable and acclimatised conditions of indoor environments is less challenging, as there is no need for bulky FSO transceivers equipped with expensive subsystems to counteract adverse outdoor effects.
3.1 Spatial Multiplexing MIMO OWC System

A single infrared laser beam can provide data rates of tens of Gb/s, depending on the communication range and the permissible transmit power limit due to eye safety considerations. An effective technique to enhance the capacity of an OWC system based on IM/DD and unlock transmission rates in excess of 1 Tb/s is to use a spatial multiplexing (SM) multiple-input multiple-output (MIMO) system configuration. In an SM-MIMO OWC system, multiple independent data streams are sent in parallel through the shared optical wireless medium while using the same modulation bandwidth.
3.2 VCSEL-Based Optical Wireless Channel

The frequency response of VCSELs is relatively flat near DC, so the most important quantity characterising a VCSEL-based optical wireless channel is the DC gain. For a single-input single-output (SISO) OWC system with IM/DD, the DC gain $H_0$ relates the transmitted and received average optical powers as follows:

$$H_0 = \frac{P_r}{P_t}. \tag{25}$$
When operating over a link distance $L$, the received power is obtained by substituting (10) in (11) and evaluating the integral at $z = L$. The corresponding DC gain of the VCSEL-based optical wireless channel is:

$$H_0 = \int_{A_{\mathrm{PD}}} \frac{2}{\pi w^2(L)} \exp\left(-\frac{2\rho^2}{w^2(L)}\right) \mathrm{d}A, \tag{26}$$
where $A_{\mathrm{PD}}$ is the photosensitive area of the PD used at the receiver. Provided that the transmitter and receiver are perfectly aligned with each other, and a circular PD of radius $r_{\mathrm{PD}}$ is used, a closed-form expression for (26) is derived:

$$H_0 = 1 - \exp\left(-\frac{2 r_{\mathrm{PD}}^2}{w^2(L)}\right). \tag{27}$$
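As a numerical sanity check, the closed form (27) can be compared with a direct evaluation of the integral (26) in polar coordinates. The sketch below assumes the standard Gaussian-beam relations $w(z) = w_0\sqrt{1 + (z/z_R)^2}$ with $z_R = \pi w_0^2/\lambda$ and a far-field half-angle $\theta = \lambda/(\pi w_0)$, which are taken here to correspond to the beam equations referenced as (9) and (10). The numerical values are those quoted later in the chapter for the simulation setup.

```python
import math

WAVELENGTH = 850e-9  # m, laser wavelength
W0 = 100e-6          # m, effective beam waist radius
LINK = 2.0           # m, link distance
R_PD = 3e-3          # m, PD radius


def beam_radius(z, w0=W0, wavelength=WAVELENGTH):
    """Gaussian beam radius w(z) = w0 * sqrt(1 + (z / z_R)^2), z_R = pi * w0^2 / lambda."""
    z_r = math.pi * w0 ** 2 / wavelength
    return w0 * math.sqrt(1.0 + (z / z_r) ** 2)


def dc_gain_closed_form(r_pd, w):
    """Eq. (27): aligned circular PD of radius r_pd and beam radius w."""
    return 1.0 - math.exp(-2.0 * r_pd ** 2 / w ** 2)


def dc_gain_numeric(r_pd, w, steps=20000):
    """Midpoint-rule evaluation of Eq. (26) with dA = 2 * pi * rho * d(rho)."""
    d_rho = r_pd / steps
    total = 0.0
    for n in range(steps):
        rho = (n + 0.5) * d_rho
        intensity = 2.0 / (math.pi * w ** 2) * math.exp(-2.0 * rho ** 2 / w ** 2)
        total += intensity * 2.0 * math.pi * rho * d_rho
    return total


w_l = beam_radius(LINK)
theta_deg = math.degrees(WAVELENGTH / (math.pi * W0))
h0 = dc_gain_closed_form(R_PD, w_l)
print(f"w(L) = {w_l * 1e3:.2f} mm, theta = {theta_deg:.3f} deg, H0 = {h0:.3f}")
```

Under these assumptions the beam radius and divergence angle agree with the values of about 5.4 mm and 0.16 degrees reported for the 2 m link.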
3.3 Structure of Arrays and MIMO Channel

An SM-MIMO OWC system can be realised by means of an array of VCSELs. Figure 8 exemplifies a $4 \times 4$ MIMO link, comprising a $2 \times 2$ VCSEL array and a $2 \times 2$ PD array. The transmitter and receiver optics are not shown for simplicity. The array structure used in Fig. 8 can be extended to a $K \times K$ square in order to form an $N_t \times N_r$ MIMO OWC system where $N_t = N_r = K^2$. Here, the assumption of $N_t = N_r$ is not a necessary requirement, and, in fact, the receiver array can be designed such that $N_r \geq N_t$.

Fig. 8 A $4 \times 4$ MIMO OWC system using a $2 \times 2$ VCSEL array and a $2 \times 2$ PD array [7]

Fig. 9 The structure of a $K \times K$ VCSEL array and a $K \times K$ PD array, forming an $N_t \times N_r$ MIMO OWC system with $N_t = N_r = K^2$ [7]. (a) $K \times K$ VCSEL array. (b) $K \times K$ PD array

Figure 9 depicts a $K \times K$ VCSEL array and a $K \times K$ PD array on the $x'y'$ plane. The gap between adjacent elements of the PD array is controlled by a design parameter $\delta > 0$, which is referred to as the inter-element spacing hereinafter. For those PDs that are close to the edges of the array, there is a margin of $\frac{\delta}{2}$ with respect to the edges. The centre-to-centre distance for neighbouring PDs along rows or columns of the array, as shown in Fig. 9b, is related to the PD radius and the inter-element spacing by:

$$d_{\mathrm{PD}} = 2 r_{\mathrm{PD}} + \delta. \tag{28}$$
With this arrangement, the array side length is $W = K d_{\mathrm{PD}}$, and the total dimensions of the array are $W \times W$. The MIMO optical wireless channel is identified by an $N_r \times N_t$ matrix of DC gains for all transmission paths between the VCSEL and PD arrays, denoted by $\mathbf{H}_0 = [H_{ij}]_{N_r \times N_t}$, where the entry $H_{ij}$ corresponds to the link from $\mathrm{VCSEL}_j$ to $\mathrm{PD}_i$. The spatial overlap between the beam spots at the PD array, as shown in Fig. 8, causes cross talk in the MIMO optical wireless channel. The far-field divergence angle of a Gaussian beam is an important determining factor for the cross talk performance of this MIMO OWC system. For any operating wavelength, the divergence angle is inversely proportional to the beam waist radius via (9). While larger beam waists are desired, commercial single-mode VCSELs exhibit a beam waist radius of about $1\ \mu$m [24]. This is equivalent to a divergence angle of more than $15^\circ$. Consequently, it is necessary to use appropriate optics at the transmitter to reduce the divergence angle of VCSELs, thus making the laser beam spots sufficiently small at the receiver. Under perfect alignment, the channel gain between $\mathrm{VCSEL}_j$ and $\mathrm{PD}_i$ is evaluated based on (26), resulting in [7]:

$$H_{ij} = \int_{-r_{\mathrm{PD}}}^{r_{\mathrm{PD}}} \int_{-\zeta}^{\zeta} \frac{2}{\pi w^2(L)} \exp\left(-2\,\frac{(x - \check{x}_i + \hat{x}_j)^2 + (y - \check{y}_i + \hat{y}_j)^2}{w^2(L)}\right) \mathrm{d}x\,\mathrm{d}y, \tag{29}$$

where $\zeta = \sqrt{r_{\mathrm{PD}}^2 - y^2}$ and $(\check{x}_i, \check{y}_i)$ and $(\hat{x}_i, \hat{y}_i)$ are the 2D coordinates of the $i$th element in the transmitter and receiver arrays, respectively, on the $x'y'$ plane. Equation (29) can be computed efficiently using numerical integration methods.

For an ultra-high-speed LoS optical wireless link using narrow laser beams, it is imperative to maintain alignment. However, it is very difficult in practice to align the transmitter and receiver precisely, and as a result the performance is degraded by misalignment errors. The exact analytical channel model for Gaussian beams under misalignment errors is available in [7]. It applies to the generalised link configuration where the transmitter and receiver have arbitrary positions and orientation angles relative to each other, and it has been verified by means of optical ray tracing simulations in Zemax OpticStudio [7].
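A simple way to evaluate (29) numerically is a midpoint rule over the circular PD area. The sketch below builds the DC gain matrix for a $2 \times 2$ arrangement. It assumes, purely for illustration, that the VCSEL array has the same pitch $d_{\mathrm{PD}}$ as the PD array and is perfectly aligned with it, and it parameterises each link by the offset between the beam centre and the PD centre. The beam radius, PD radius, and spacing are the values used in the chapter's simulations.

```python
import math

W_L = 5.4e-3             # m, beam spot radius at the receiver
R_PD = 3e-3              # m, PD radius
DELTA = 6e-3             # m, inter-element spacing
D_PD = 2 * R_PD + DELTA  # Eq. (28)


def grid_centres(k, pitch):
    """Centres of a k-by-k square lattice with the given pitch, centred on the origin."""
    offsets = [(n - (k - 1) / 2.0) * pitch for n in range(k)]
    return [(x, y) for y in offsets for x in offsets]


def link_gain(dx, dy, r_pd=R_PD, w=W_L, n=150):
    """Midpoint-rule estimate of Eq. (29) for a beam centred (dx, dy) from the PD centre."""
    h_y = 2.0 * r_pd / n
    total = 0.0
    for a in range(n):
        y = -r_pd + (a + 0.5) * h_y
        zeta = math.sqrt(r_pd ** 2 - y ** 2)
        h_x = 2.0 * zeta / n
        for b in range(n):
            x = -zeta + (b + 0.5) * h_x
            total += math.exp(-2.0 * ((x - dx) ** 2 + (y - dy) ** 2) / w ** 2) * h_x * h_y
    return 2.0 / (math.pi * w ** 2) * total


def channel_matrix(k):
    """Rows index PDs, columns index VCSELs (assumed to share the PD lattice)."""
    pds = grid_centres(k, D_PD)
    vcsels = grid_centres(k, D_PD)
    return [[link_gain(vx - px, vy - py) for (vx, vy) in vcsels] for (px, py) in pds]


H = channel_matrix(2)
for row in H:
    print(" ".join(f"{g:.2e}" for g in row))
```

The diagonal entries reproduce the aligned closed form (27), while the off-diagonal entries quantify the cross talk caused by overlapping beam spots.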
3.4 MIMO-OFDM Transceiver

In order to achieve Tb/s data rates with the laser-based MIMO OWC link, a large number of VCSELs are placed with small distances between them in the transmitter array as shown in Fig. 9, leading to overlaps between their beam spots at the receiver array. Besides the cross talk caused by the overlapping beams, misalignment errors between the MIMO arrays give rise to a loss in signal-to-interference-plus-noise ratio (SINR) performance. An SM MIMO-OFDM system based on singular value decomposition (SVD) is used to maximise the aggregate data rate by sending independent data streams over the MIMO optical wireless channel. Also, the commonly used DCO-OFDM technique along with adaptive modulation based on QAM is adopted to ensure a high spectral efficiency [42]. Figure 10 depicts a simplified block diagram of the MIMO-OFDM transceiver architecture. The SVD precoding and decoding stages shown in Fig. 10 are performed provided that the $N_r \times N_t$ MIMO channel matrix is known at the transmitter and receiver. In the following, the transceiver system is first described without the use of SVD.

Fig. 10 MIMO-OFDM transceiver architecture based on DCO-OFDM [7]
3.5 Aggregate Data Rate

At the transmitter, the $N_t$ input binary data streams are individually mapped to a sequence of complex QAM symbols. The resulting sequences are buffered into blocks of size $N_t \times N_{\mathrm{QAM}}$. They are then loaded onto the $N_{\mathrm{QAM}}$ data-carrying subcarriers of the $N_t$ OFDM frames in positive frequencies, where $N_{\mathrm{QAM}} = \frac{N_{\mathrm{FFT}}}{2} - 1$. For each OFDM frame, the number of symbols is extended to $N_{\mathrm{FFT}}$ according to Hermitian symmetry, and the DC and Nyquist frequency subcarriers are zero-padded to produce a real-valued signal after the $N_{\mathrm{FFT}}$-point IFFT operation. A proper DC bias level is then added in the time domain to obtain a positive signal compatible with IM/DD channels.

At the receiver array, after filtering out the DC component and perfect sampling, the vector of received photocurrents is obtained. Let $\tilde{\mathbf{X}}_k \in \mathbb{R}^{N_t \times 1}$ be the vector of symbols modulated on the $k$th subcarrier in the frequency domain. After the FFT operation, the received symbols are extracted from the data-carrying subcarriers and then demodulated using maximum likelihood detection. The vector of received signals on the $k$th subcarrier for $k = 0, 1, \ldots, N_{\mathrm{FFT}} - 1$ is written in the form:

$$\mathbf{Y}_k = R_{\mathrm{PD}} \sqrt{P_{\mathrm{elec}}}\, \mathbf{H}_k \tilde{\mathbf{X}}_k + \mathbf{Z}_k, \tag{30}$$

where $R_{\mathrm{PD}}$ is the PD responsivity, $P_{\mathrm{elec}}$ is the average electrical power of each OFDM symbol, $\mathbf{H}_k$ is the frequency response of the MIMO channel, and $\mathbf{Z}_k$ is the additive white Gaussian noise (AWGN) vector. Note that without SVD processing, $N_r = N_t$ holds, in which case $\mathbf{Y}_k \in \mathbb{R}^{N_t \times 1}$ and $\mathbf{H}_k \in \mathbb{R}^{N_t \times N_t}$. Considering the strong LoS components due to the narrow laser beams used, the channel is nearly flat, so that $\mathbf{H}_k = \mathbf{H}_0\ \forall k$. The $i$th element of the noise vector comprises the thermal noise and shot noise of the $i$th branch of the receiver and the relative intensity noise (RIN) caused by all the VCSELs. The RIN is defined as the mean square of instantaneous power fluctuations divided by the squared average power of the laser source [43]. The total noise variance is denoted by $\sigma_i^2$. Based on (30), the received SINR per subcarrier for the $i$th link is derived as follows:

$$\gamma_i = \frac{R_{\mathrm{PD}}^2 H_{ii}^2 P_{\mathrm{elec}}}{\sum_{j \neq i} R_{\mathrm{PD}}^2 H_{ij}^2 P_{\mathrm{elec}} + \sigma_i^2}. \tag{31}$$
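With SVD processing, the MIMO channel can be transformed into a set of parallel independent subchannels in the frequency domain. In particular, the SVD of $\mathbf{H}_k \in \mathbb{R}^{N_r \times N_t}$, with $N_r \geq N_t$, is $\mathbf{H}_k = \mathbf{U}_k \boldsymbol{\Lambda}_k \mathbf{V}_k^*$, where $\mathbf{U}_k \in \mathbb{R}^{N_r \times N_r}$ and $\mathbf{V}_k \in \mathbb{R}^{N_t \times N_t}$ are unitary matrices, $*$ denotes conjugate transpose, and $\boldsymbol{\Lambda}_k \in \mathbb{R}^{N_r \times N_t}$ is a rectangular diagonal matrix of the ordered singular values, i.e. $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{N_t} > 0$ [44]. Note that $\mathbf{H}_k = \mathbf{H}_0\ \forall k$, as discussed, so the subscript $k$ can be dropped from the singular values. After SVD decoding at the receiver, the $N_t$-dimensional vector of received symbols on the $k$th subcarrier for $k = 0, 1, \ldots, N_{\mathrm{FFT}} - 1$ becomes:

$$\tilde{\mathbf{Y}}_k = R_{\mathrm{PD}} \sqrt{P_{\mathrm{elec}}}\, \boldsymbol{\Lambda}_k \tilde{\mathbf{X}}_k + \tilde{\mathbf{Z}}_k, \tag{32}$$

where $\tilde{\mathbf{Z}}_k = \mathbf{U}_k^* \mathbf{Z}_k$. Since the statistics of the noise vector are preserved under a unitary transformation, the $i$th elements of $\tilde{\mathbf{Z}}_k$ and $\mathbf{Z}_k$ have the same variance $\sigma_i^2$. Therefore, the received SNR per subcarrier for the $i$th link is derived as follows:

$$\gamma_i = \frac{R_{\mathrm{PD}}^2 \lambda_i^2 P_{\mathrm{elec}}}{\sigma_i^2}. \tag{33}$$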
Based on DCO-OFDM with adaptive QAM, the achievable rate for $\mathrm{VCSEL}_i$ is [7]:

$$R_i = \xi B \log_2\left(1 + \frac{\gamma_i}{\Gamma}\right), \tag{34}$$

where $\xi = \frac{N_{\mathrm{FFT}} - 2}{N_{\mathrm{FFT}}}$ is the subcarrier utilisation ratio, $B$ is the single-sided bandwidth of the system, and:

$$\Gamma = \frac{-\ln(5\,\mathrm{BER})}{1.5}, \tag{35}$$

represents the SINR gap due to the required bit error ratio (BER). Hence, the aggregate data rate of the MIMO-OFDM system is expressed as:

$$R = \sum_{i=1}^{N_t} R_i = \xi B \sum_{i=1}^{N_t} \log_2\left(1 + \frac{\gamma_i}{\Gamma}\right). \tag{36}$$
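To make the difference between (31) and (33) concrete, the following Python sketch evaluates the aggregate rate (36) for a toy $2 \times 2$ channel with and without SVD processing. The two channel matrices, the FFT size, the normalised electrical power, and the noise variance are hypothetical values chosen only for illustration. The singular values are obtained from the closed-form expression for a real $2 \times 2$ matrix, so no linear algebra library is needed.

```python
import math

R_PD = 0.4        # A/W, PD responsivity
BANDWIDTH = 20e9  # Hz, single-sided bandwidth
TARGET_BER = 1e-3
N_FFT = 1024      # hypothetical FFT size
P_ELEC = 1.0      # hypothetical normalised electrical power
SIGMA2 = 1e-4     # hypothetical noise variance in matching units

GAP = -math.log(5.0 * TARGET_BER) / 1.5  # Eq. (35)
XI = (N_FFT - 2) / N_FFT                 # subcarrier utilisation ratio


def rate(gamma):
    """Eq. (34) for one link."""
    return XI * BANDWIDTH * math.log2(1.0 + gamma / GAP)


def sinr_without_svd(h):
    """Eq. (31) for each row i of a square channel matrix h."""
    out = []
    for i, row in enumerate(h):
        signal = (R_PD * row[i]) ** 2 * P_ELEC
        interference = sum((R_PD * g) ** 2 * P_ELEC for j, g in enumerate(row) if j != i)
        out.append(signal / (interference + SIGMA2))
    return out


def singular_values_2x2(h):
    """Singular values of a real 2x2 matrix from the eigenvalues of H^T H."""
    (a, b), (c, d) = h
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    root = math.sqrt(max(s * s - 4.0 * det * det, 0.0))
    return math.sqrt((s + root) / 2.0), math.sqrt(max((s - root) / 2.0, 0.0))


def snr_with_svd(h):
    """Eq. (33) for each singular value."""
    return [(R_PD * lam) ** 2 * P_ELEC / SIGMA2 for lam in singular_values_2x2(h)]


channels = {
    "aligned": [[0.46, 0.01], [0.01, 0.46]],  # weak cross talk (hypothetical)
    "shifted": [[0.25, 0.20], [0.20, 0.25]],  # strong cross talk (hypothetical)
}
results = {}
for name, h in channels.items():
    r_plain = sum(rate(g) for g in sinr_without_svd(h))
    r_svd = sum(rate(g) for g in snr_with_svd(h))
    results[name] = (r_plain, r_svd)
    print(f"{name}: no SVD {r_plain / 1e9:.1f} Gb/s, SVD {r_svd / 1e9:.1f} Gb/s")
```

In the strongly coupled case the interference term in (31) dominates the denominator, whereas the SVD-based subchannels of (33) are limited by noise only, which is why the gap between the two rates widens as cross talk grows.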
3.6 Performance Results

The system performance is evaluated based on computer simulations using the parameters listed in Table 1. The VCSEL and noise parameters are adopted from [24, 45]. The target BER is $10^{-3}$. Results are presented for an effective beam waist radius of $w_0 = 100\ \mu$m by assuming that there is a convex lens next to each bare VCSEL to widen its output beam waist in order to reduce the far-field beam divergence. For a link distance of $L = 2$ m, the beam spot radius and the divergence angle are $w(L) = 5.4$ mm and $\theta = 0.16^\circ$. An optical transmit power of $P_t = 1$ mW per VCSEL is selected on account of eye safety considerations [7]. The generalised misalignment model from [7] is used to simulate radial displacements as well as orientation errors of the transmitter and receiver arrays.

Table 1 Simulation parameters

Parameter            Description                  Value
$L$                  Link distance                2 m
$P_t$                Transmit power per VCSEL     1 mW
$\lambda$            Laser wavelength             850 nm
$B$                  System bandwidth             20 GHz
$\mathrm{RIN}$       Laser noise                  $-155$ dB/Hz
$r_{\mathrm{PD}}$    PD radius                    3 mm
$R_{\mathrm{PD}}$    PD responsivity              0.4 A/W
$\delta$             Inter-element spacing        6 mm

The impact of misalignment is studied for an $N_t \times N_r$ MIMO OWC system with $N_t = 25$ (i.e. a $5 \times 5$ VCSEL array) for three configurations of the PD array with different numbers of elements as shown in Fig. 11, where $N_r = 25, 41, 81$ in Configs. I, II, and III, respectively. In these configurations, the number of elements on the PD array is increased from $N_r = 25$ to $N_r = 81$ while the array dimensions remain the same. For the array structure depicted in Fig. 9, the fill factor (FF) can be calculated as $\mathrm{FF} = \frac{N_r A_{\mathrm{PD}}}{W^2}$, where $A_{\mathrm{PD}}$ is the PD area. The corresponding FF values for Configs. I, II, and III are $\mathrm{FF} = 20\%, 32\%, 64\%$.

Fig. 11 PD array configurations for an $N_t \times N_r$ MIMO link with $N_t = 25$ [7]. (a) Config. I ($N_r = 25$). (b) Config. II ($N_r = 41$). (c) Config. III ($N_r = 81$)

Fig. 12 Aggregate data rate as a function of the radial displacement [7]

Figure 12 plots the aggregate data rate versus the radial displacement error of the PD array relative to the VCSEL array when the radial displacement takes place horizontally. Note that vertical displacement is similar to this case due to the symmetry of the square lattice of the arrays. It can be observed from Fig. 12 that without SVD, the data rate rapidly drops to zero after $r_{\mathrm{DE}} = 6$ mm. When using SVD for Config. I, the rate performance exhibits a decaying oscillation with exactly five peaks. Each peak corresponds to an event in which a given number of columns of the VCSEL array are aligned with those of the PD array. The use of Config. II slightly improves the valleys because of the second lattice of PDs placed on the vertices of the first lattice. This keeps the aggregate data rate above 1 Tb/s for $r_{\mathrm{DE}} \leq 17$ mm. By comparison, significant performance improvements are attained with SVD and Config. III, in which 81 PDs are densely packed onto the receiver array, thus increasing the chance of acquiring strong eigenmodes from the MIMO channel as $r_{\mathrm{DE}}$ increases. This comes at the cost of higher hardware complexity arising from the channel estimation and SVD computation for a $25 \times 81$ MIMO setup.

Fig. 13 Aggregate data rate as a function of the orientation angle error at the transmitter [7]

Figure 13 plots the aggregate data rate as a function of the orientation angle error at the transmitter. The system performance is highly sensitive to the transmitter orientation error: a small error of about $1.7^\circ$ is enough to make the data rate zero. This is because the transmitter is 2 m away from the receiver, and hence small deviations in its orientation angle lead to large displacements of the beam spots at the other end. The results in Fig. 13 follow similar trends to those in Fig. 12, except for the different scales of the horizontal axis. In fact, an orientation angle error of $1.7^\circ$ is equivalent to a displacement error of 60 mm. Consequently, $\phi_a = 1.7^\circ$ causes the beam spots of the VCSELs to completely miss the receiver array. Therefore, the transmitter orientation error can be viewed as an equivalent radial displacement error if the beam spot size at the receiver array is sufficiently small. In this case, an orientation angle error of $\phi_a$ at the transmitter over a link distance of $L$ translates into an equivalent radial displacement of $r_{\mathrm{DE}} = L \sin(\phi_a)$.
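The fill factors and the equivalence between orientation error and radial displacement quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the PD radius, inter-element spacing, array size, and link distance stated in this section; the variable names are illustrative.

```python
import math

R_PD = 3e-3    # m, PD radius
DELTA = 6e-3   # m, inter-element spacing
K = 5          # the 5 x 5 lattice that fixes the array dimensions
LINK = 2.0     # m, link distance

d_pd = 2 * R_PD + DELTA      # Eq. (28): centre-to-centre distance
side = K * d_pd              # array side length W
a_pd = math.pi * R_PD ** 2   # area of one circular PD

# Fill factor FF = N_r * A_PD / W^2 for Configs. I, II, and III.
fill = {n_r: n_r * a_pd / side ** 2 for n_r in (25, 41, 81)}

# Equivalent radial displacement r_DE = L * sin(phi_a) for phi_a = 1.7 degrees.
r_de = LINK * math.sin(math.radians(1.7))

for n_r, ff in fill.items():
    print(f"N_r = {n_r}: FF = {100 * ff:.1f} %")
print(f"W = {side * 1e3:.0f} mm, r_DE = {r_de * 1e3:.1f} mm")
```

The equivalent displacement comes out at roughly the array side length $W$, which is consistent with the beam spots missing the receiver array entirely at that orientation error.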
4 Zero-Energy Transceiver Concept

The next generation of wireless systems will be an integral building block of a data-driven sustainable society, enabled by reliable and green connectivity. In particular, the ultimate goal of a green society is envisioned to be achieved through enabling technologies which target zero-energy and zero-emission communication systems. It should be noted that zero-energy refers to a system which does not demand energy from the grid. The goal is to deploy a wireless communication system which only uses energy harvested from ambient sources, such as RF or light waves, for sending, receiving, and processing digital data. This innovation can play a game-changing role, as the increasing energy consumption of the information and communication technology sector is becoming a global issue.

Light energy harvesting is a traditional method which is widely used nowadays as a renewable energy source. There have been major advances in developing new types of solar cells with higher power conversion efficiency, lower cost, and better durability [46, 47]. Moreover, solar cells are being developed for indoor light energy recycling, with an absorption spectrum which matches that of indoor lights (e.g. the LED spectrum) [12]. While outdoor solar energy harvesting yields on the order of 10 mW/cm$^2$, indoor light energy harvesting can achieve on the order of 10 $\mu$W/cm$^2$ [48]. Even the lower indoor harvested energy can potentially be used to power smaller devices such as sensors or IoT devices. Though batteries are low-cost, it is challenging to deploy billions of IoT devices which require frequent battery charging or replacement. Therefore, it is vital to develop novel off-grid communication systems that provide high-speed connectivity without any additional demand for energy from the grid. Solar cells can serve this purpose as dual-purpose devices which enable simultaneous energy harvesting and data communication.
The concept of the zero-energy communication system is shown in Fig. 14. The solar cell collects ambient light and data-carrying signals. A simple circuit (details in the next section) separates the AC and DC components and routes them to the signal detection and energy storage branches, respectively. A central controlling unit, which is powered by the harvested energy, controls various parts of the system including data transmission and any other functionality, such as sensing. The signal transmission can be via optical, e.g. infrared, or RF communication. This is a simple structure of a zero-energy system. Data rates as high as 784 Mb/s have already been demonstrated using GaAs photovoltaic cells while harvesting about 1 mW of power [49]. Therefore, it is sensible to envisage zero-energy devices for connected applications such as IoT.

Fig. 14 The block diagram of a zero-energy system
4.1 Solar Cell Model

A solar cell can be modelled in several ways depending on the material type and usage. It is important to note that the DC and AC models differ slightly. Figure 15 shows the single-diode DC equivalent circuit of a solar cell [50]. The photo-current generated in the solar cell as a result of the incident light is denoted by $I_{\mathrm{ph}}$. The cell interconnection resistance is modelled by the series resistance $R_s$, and the shunt resistance $R_{\mathrm{sh}}$ models the leakage current in the solar cell. The forward current of the diode is denoted by $I_D$, which can be formulated via the ideal diode equation as:

$$I_D = I_0\left(e^{\frac{V + I R_s}{V_T}} - 1\right), \tag{37}$$

where $I_0$ is the reverse saturation current and $V_T = A K_B T_a / q$ is the junction thermal voltage of the diode. The parameter $A$ is the diode ideality factor. The other parameters in (37) are defined under (17). Therefore, the current through the load resistance, $R_L$, can be expressed as

$$I = I_{\mathrm{ph}} - I_D - \frac{V + I R_s}{R_{\mathrm{sh}}}. \tag{38}$$
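Because $I$ appears on both sides of (38), the current-voltage characteristic has to be obtained numerically. The following Python sketch solves (37) and (38) by bisection and then scans the curve for the short-circuit current, the open-circuit voltage, and the point of maximum output power. All component values are hypothetical and serve only to illustrate the procedure; they do not describe any particular cell.

```python
import math

I_PH = 5e-3          # A, photo-current (hypothetical)
I_0 = 1e-9           # A, reverse saturation current (hypothetical)
R_S = 1.0            # ohm, series resistance (hypothetical)
R_SH = 10e3          # ohm, shunt resistance (hypothetical)
V_T = 1.5 * 0.02585  # V, A * K_B * T_a / q with A = 1.5 near 300 K (hypothetical)


def residual(i, v):
    """Eq. (38) with Eq. (37) substituted, written as f(I) = 0."""
    i_d = I_0 * (math.exp((v + i * R_S) / V_T) - 1.0)
    return I_PH - i_d - (v + i * R_S) / R_SH - i


def cell_current(v, lo=-1.0, hi=1.0):
    """Bisection on the residual, which is strictly decreasing in I."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid, v) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def open_circuit_voltage(lo=0.0, hi=1.0):
    """Voltage at which the terminal current crosses zero."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cell_current(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


i_sc = cell_current(0.0)
v_oc = open_circuit_voltage()
points = [(v_oc * n / 500.0, cell_current(v_oc * n / 500.0)) for n in range(501)]
v_m, i_m = max(points, key=lambda p: p[0] * p[1])
print(f"I_sc = {i_sc * 1e3:.3f} mA, V_oc = {v_oc:.3f} V")
print(f"maximum power point: V = {v_m:.3f} V, I = {i_m * 1e3:.3f} mA")
```

The maximum power point lies strictly inside the interval bounded by the short-circuit and open-circuit conditions, since the output power is zero at both ends of the curve.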
The solar cell operating regime is determined by the short-circuit current, $I_{\mathrm{sc}}$, and the open-circuit voltage, $V_{\mathrm{oc}}$. In other words, the current and voltage developed on the load resistor lie between 0 and $I_{\mathrm{sc}}$ and between 0 and $V_{\mathrm{oc}}$, respectively. However, the maximum power is generated at a different operating point, $(I_m, V_m)$, as shown in Fig. 16. The efficiency of the solar cell is defined as the ratio of the power generated at this point to the incident light power.

Fig. 15 The DC model of the solar cell

Fig. 16 The current-voltage curve and the maximum power point

Fig. 17 The solar cell model for simultaneous energy harvesting and communication

In order to use the solar cell for simultaneous energy harvesting and communication, a receiver circuit should be used which can separate the DC and AC components (as shown in Fig. 14). Such a receiver circuit can be realised via various structures [8, 51, 52]. The circuit shown in Fig. 17 is a common structure for simultaneous energy harvesting and communication. The photo-current source contains both the DC component $I_{\mathrm{ph}}$ and the AC component $i_{\mathrm{ph}}$. The resistor $r$ represents the small-signal equivalent of the solar cell diode in the DC model. A capacitor $C$ is used in parallel with $R_{\mathrm{sh}}$ to model the internal capacitance of the solar panel. An inductor $L$ is set in series with $R_s$ to model the inductance of wire connections within the solar panel. The output of the solar panel is connected to two branches: the communication branch and the energy harvesting branch. In the communication branch, a capacitor $C_0$ is connected to the output of the solar panel to block the DC signal and to pass the AC signal. The energy harvesting branch includes an AC signal filtering inductor $L_0$ and a load resistor $R_L$. The electrical power gain of the circuit can be determined as follows [8]:

$$\left|\frac{v(\omega)}{i_{\mathrm{ph}}(\omega)}\right|^2 = \left|\frac{\dfrac{R_{LC}}{R_s + j\omega L + R_{LC}}}{\dfrac{1}{r} + \dfrac{1}{\frac{1}{j\omega C}} + \dfrac{1}{R_{\mathrm{sh}}} + \dfrac{1}{R_s + j\omega L + R_{LC}}} \cdot \frac{R_C}{\frac{1}{j\omega C_0} + R_C}\right|^2, \tag{39}$$
where $\omega = 2\pi f$ is the angular frequency, $j = \sqrt{-1}$, and $R_{LC}$ is the equivalent resistance of the two parallel branches and is calculated as follows:

$$R_{LC} = \frac{1}{\dfrac{1}{j\omega L_0 + R_L} + \dfrac{1}{\frac{1}{j\omega C_0} + R_C}}. \tag{40}$$
In the next section, a proof-of-concept experimental system is studied, and the frequency response based on the above equation is validated.
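The frequency response in (39) and (40) is straightforward to evaluate with complex arithmetic. The sketch below does so for a set of hypothetical component values, which are not taken from the chapter or from the cited experiments. It also assumes that $R_C$ denotes the load resistance of the communication branch across which $v(\omega)$ is measured, which is how the voltage-divider term in (39) is read here.

```python
import math

# Hypothetical component values, for illustration only.
R_SMALL = 200.0    # ohm, small-signal diode resistance r
R_SH = 10e3        # ohm, shunt resistance
C_PANEL = 30e-9    # F, internal capacitance C
R_S = 1.0          # ohm, series resistance
L_PANEL = 100e-9   # H, wiring inductance L
C_0 = 1e-6         # F, DC-blocking capacitor
R_C = 50.0         # ohm, communication-branch load (assumed meaning of R_C)
L_0 = 10e-3        # H, AC-filtering inductor
R_L = 200.0        # ohm, energy-harvesting load


def parallel_branches(omega):
    """Eq. (40): harvesting branch in parallel with the communication branch."""
    z_harvest = 1j * omega * L_0 + R_L
    z_comm = 1.0 / (1j * omega * C_0) + R_C
    return 1.0 / (1.0 / z_harvest + 1.0 / z_comm)


def power_gain(freq_hz):
    """Eq. (39): |v(w) / i_ph(w)|^2 for a frequency in Hz (must be non-zero)."""
    omega = 2.0 * math.pi * freq_hz
    r_lc = parallel_branches(omega)
    z_series = R_S + 1j * omega * L_PANEL + r_lc
    admittance = 1.0 / R_SMALL + 1j * omega * C_PANEL + 1.0 / R_SH + 1.0 / z_series
    divider = R_C / (1.0 / (1j * omega * C_0) + R_C)
    return abs((r_lc / z_series) / admittance * divider) ** 2


for f in (1.0, 1e3, 1e5, 1e7, 1e8):
    print(f"{f:>12.0f} Hz: {10.0 * math.log10(power_gain(f)):7.1f} dB")
```

With these values the response is band-pass: the blocking capacitor $C_0$ suppresses low frequencies in the communication branch, while the panel capacitance $C$ limits the high-frequency end.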
4.2 Proof-of-Concept System

Inorganic semiconductors are primarily chosen for OWC applications, as opposed to organic semiconductors, due to their higher available modulation bandwidth. However, organic semiconductor technology offers a low production cost, a thin and lightweight profile, substrate flexibility, and simple integration into a wide range of platforms. Moreover, it has been shown that organic light-emitting diodes (OLEDs) can provide an electrical bandwidth of more than 240 MHz, enabling gigabit per second (Gb/s) transmission speeds [21]. Therefore, a transceiver can be realised by combining organic semiconductor transmission and detection technologies [11]. An important advantage of OPVs compared to their inorganic counterparts is their better suitability for indoor use, which makes them ideal for IoT applications. This is due to the excellent spectral overlap between the indoor light spectrum and OPVs, which use conjugated organic semiconductors as the light-absorbing photoactive layer. It is predicted that a solar panel with 1000 single OPV cells and an area of about 0.05 m$^2$ can achieve a data rate of over 100 Gb/s [53].

Fig. 18 The block diagram of the proof-of-concept experiment

Figure 18 shows the block diagram of the proof-of-concept system [11]. LDs with a dominant wavelength of 660 nm are used as light sources. Note that white light with a desired colour temperature can be produced by adding blue and green light sources, which can potentially carry data for WDM as well. A large lens is used at the receiver side to focus each received light beam on one of the OPV cells. The output from each OPV cell is connected to a circuit that contains two branches for signal detection and energy harvesting as shown in Fig. 17. As shown in Fig. 19, the SNR is measured based on a training sequence, and for comparison, it is also theoretically approximated based on the estimated channel gain, noise power, and OPV parameters (capacitance, shunt resistance, etc.). Adaptive bit and energy loading is used with a target bit error ratio (BER) of $4.7 \times 10^{-3}$. The measured $-3$ dB bandwidth is 2.77 MHz, while 30.1 MHz of bandwidth is adaptively modulated. The power dissipated by the load resistor in the energy harvesting branch is regarded as the harvested power. A data rate of $\rho = 147$ Mb/s and a harvested power of $E_H = 3.7$ mW are obtained in the SISO experiment. For the MIMO experiments, multiple LDs and OPV cells are utilised, and data rates of 221 Mb/s and 363 Mb/s with harvested powers of 6.8 mW and 10.9 mW are achieved by the 2-by-2 and 4-by-4 systems, respectively. Note that there is some electrical cross talk between MIMO channels due to the receiver circuit structure, while the optical channel matrix is almost diagonal because of the imaging optics. Zero-forcing (ZF) is used to de-multiplex signals and mitigate the impact of the cross talk. This proof-of-concept experiment demonstrates the potential of an OPV-based system. The same principle can be applied to a large array of OPV cells. Theoretical analysis shows that data rates of tens of Gb/s can be achieved [53].

Fig. 19 The SNR and bit loading results for the proof-of-concept experiment (curves: SNR (Experiment), SNR (Estimation), Adaptive bit loading)
5 LiFi Through Reflective Intelligent Surfaces

Reflective surfaces are a general category of surfaces which can reflect an incident electromagnetic wave while changing its amplitude, phase, and polarisation. The manipulation of the wavefront is governed by the generalised law of reflection or by a change in the physical properties of the surface (e.g. refractive index) [54]. Significant performance improvement has been demonstrated in RF communication through the use of RIS in NLoS scenarios [55]. Despite the differences from RF, RIS is a game-changing concept for future OWC systems as well. The concept is based on engineering the physical propagation environment and controlling the channel using an energy- and cost-efficient approach. The integration of RIS into LiFi networks will have benefits beyond the mitigation of LoS link blockage and can be an enabler for a wide range of new applications.

Figure 20 shows the concept of RIS for OWC. The signal propagated from the array of transmitters (e.g. LEDs or LDs) is detected by some receivers through a LoS link, which may be blocked for other receivers. RIS 1 is tuned to direct the light beams to the blocked user. RIS 2 is used to enhance the communication channel for the non-blocked user. In the following sections, different types of RIS and the corresponding applications in OWC systems are presented.
5.1 Types of RIS

Although some research assumes a non-reconfigurable structure for RIS, a form of intelligent reconfigurable structure is necessary for the integration of RIS in future communication networks. Depending on the location of the transmitter and receiver, the properties of the reflected light can be adjusted to achieve the desired communication performance. There are two main types of RIS for OWC, namely, mirror arrays and metasurfaces.

Fig. 20 The concept of RIS in OWC (labels: transmitter array, RIS 1, RIS 2, LoS blockage, reflected beams)
Mirror Arrays

Mirror arrays consist of a large array of small mirror elements mounted on a form of rotating plane. This plane can rotate about two perpendicular axes. Therefore, the light beams are directed in a desired direction based on the law of specular reflection. In order to avoid large form factors and keep the shape of the mirror array as flat as possible, micro-electro-mechanical systems (MEMSs) are preferred as the rotation mechanism. Each mirror element can be controlled individually. Therefore, light beams can be directed in multiple directions or instead focussed on a single spot. Realisation of RIS is relatively simple when mirror arrays are considered. While there are a number of applications for which mirror arrays are effective, they are generally constrained in their ability to manipulate incoming light beams because they are only able to perform specular reflection. Moreover, the physical rotation of mirrors makes them more vulnerable to failure and subject to device limitations such as angle resolution, maximum rotation angle, and low-speed response.

Metasurfaces

Metasurfaces are planar structures that comprise an arrangement of sub-wavelength conductive patterns repeated over a dielectric substrate. The impinging electromagnetic wave can be reflected in a desired direction without any mechanical movement, i.e. anomalous reflection. This is possible because the induced current and response field interact with the impinging electromagnetic wave based on the metasurface's geometry and composition. Therefore, by applying a voltage and controlling the electric and magnetic properties of the metasurface, the amplitude, phase, and polarisation of the optical signal can be changed. Some materials show a large variation of refractive index when an external electrical field is applied. Both the real and imaginary parts of the refractive index can be tuned, allowing the phase and amplitude of the reflected wave to be controlled.
Elements of the metasurface can be controlled individually, and consequently, destructive or constructive interference of the reflected wave results in collimating the light in a desired direction. Moreover, the optical state of the RIS may change when an electrical field is applied. For instance, the structure of liquid crystal molecules can be changed by applying an electrical field, resulting in different phase shift patterns [56]. Despite extensive research, the realisation of metasurfaces for RIS in OWC is limited to a few lab experiments, and real-world implementation is still challenging.
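For the mirror-array type of RIS described above, the orientation required of each element follows directly from specular reflection: the mirror normal must bisect the directions from the mirror to the transmitter and to the receiver. The Python sketch below computes that normal and checks it with the vector form of the law of reflection. The transmitter, mirror, and receiver coordinates are hypothetical, and practical constraints such as angle resolution and maximum rotation angle are ignored.

```python
import math


def unit(v):
    """Normalise a 3D vector."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def mirror_normal(tx, mirror, rx):
    """Unit normal that specularly reflects a ray from tx towards rx at `mirror`."""
    to_tx = unit(sub(tx, mirror))
    to_rx = unit(sub(rx, mirror))
    return unit(tuple(a + b for a, b in zip(to_tx, to_rx)))


def reflect(direction, normal):
    """Law of reflection: d' = d - 2 (d . n) n for a unit normal n."""
    dot = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * dot * n for d, n in zip(direction, normal))


tx = (0.0, 0.0, 3.0)      # hypothetical ceiling-mounted transmitter (m)
mirror = (2.0, 0.0, 1.5)  # hypothetical wall-mounted mirror element (m)
rx = (1.0, 2.0, 0.8)      # hypothetical receiver with a blocked LoS path (m)

n_hat = mirror_normal(tx, mirror, rx)
incident = unit(sub(mirror, tx))
reflected = reflect(incident, n_hat)
target = unit(sub(rx, mirror))
print("normal:", tuple(round(c, 4) for c in n_hat))
print("reflected ray:", tuple(round(c, 4) for c in reflected))
```

Repeating this calculation per element gives the set of tilt angles that either focus all reflected beams on one receiver or spread them over several.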
5.2 RIS Applications in OWC
As discussed in previous sections, a LoS path is usually assumed for an OWC system. For LiFi access networks, the receiver should be within the cell coverage and orientated towards the transmitter, and for point-to-point links, precise alignment is required. However, a random change of location or orientation, e.g. in mobile scenarios, can cause loss of the LoS link, either through blockage by an opaque object or through an unsuitable orientation. An uncontrolled NLoS link may
alleviate the situation in a small number of scenarios, such as a PD close to a wall and orientated appropriately. However, since reflection occurs at all surfaces in random directions and with significant loss, in most scenarios an uncontrolled NLoS link is neglected or regarded as a small noise contribution. RIS, on the other hand, can direct the light beams which are incident on a large surface towards a desired receiver. Therefore, RIS can improve the communication system performance by mitigating link blockage and blind spots. RIS can also be used to improve the performance of an OWC system even when the LoS link exists. For instance, at the cell edges, the reliability of the communication link reduces because OWC cells are small, and the optical beam directionality in conjunction with random orientation may lead to frequent handover. This degrades the performance further by incurring excessive overhead. RIS can ensure that a reliable link exists during the handover process by directing signal beams to the receiver and interference beams away from the receiver [57]. In MIMO-OWC, optical channel gains are mainly determined by the transceivers’ geometry, and, because there are no fading effects as in RF, the MIMO channel elements may be correlated. This results in a rank-deficient or ill-conditioned channel matrix, which reduces the number of degrees of freedom and significantly degrades the performance. MIMO-OWC systems will use arrays of transceivers in close proximity, and therefore this issue is a bottleneck. By using RIS, different components of the channel matrix can be decorrelated. The reconfigurability of RISs ensures distinguishable and decorrelated channel elements in mobile scenarios. Moreover, the same principle may be used for non-orthogonal multiple access (NOMA), which relies on users having distinct channel gains.
By controlling multipath channel propagation via RIS, it is possible to dynamically control the received power at each user’s location. As stated earlier, zero-energy OWC systems are envisioned for future communication networks. RISs can be powered by efficient PV cells, which can be incorporated on the same structure. Control signals can be detected by the RIS simultaneously through these PV cells. Therefore, an OWC system with RIS will be one step closer to real-world implementation as there is no need for extra wiring and power. Moreover, RIS can be used to control and direct the ambient light towards the zero-energy receivers which need charging. In this section, only a few basic use cases for RIS were presented while many more promising applications can be imagined. It is worth mentioning that there are still several challenges and open problems which need to be addressed in the field of RIS for OWC [56].
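The ill-conditioning of a MIMO-OWC channel with closely spaced transceivers, and the way an RIS path can help, can be illustrated with a small numerical sketch. It assumes the standard Lambertian LoS DC gain with downward-facing LEDs and upward-facing PDs, ignores field of view and optical filters, and uses hypothetical positions. The RIS is not modelled physically: its contribution is an assumed additive gain, equal to half of the direct gain, that reinforces one distinct LED for each PD.

```python
import numpy as np

def lambertian_los_gain(led_pos, pd_pos, area=1e-4, m=1.0):
    """LoS DC gain for an LED facing straight down and a PD facing straight up."""
    delta = np.asarray(led_pos, dtype=float) - np.asarray(pd_pos, dtype=float)
    d = np.linalg.norm(delta)
    cos_angle = delta[2] / d  # same cosine at LED and PD in this geometry
    return (m + 1.0) * area / (2.0 * np.pi * d**2) * cos_angle**m * cos_angle

# Hypothetical geometry in metres: two LEDs 20 cm apart, two PDs 5 cm apart.
leds = [(-0.10, 0.0, 3.0), (0.10, 0.0, 3.0)]
pds = [(-0.025, 0.0, 0.8), (0.025, 0.0, 0.8)]

# h_los[i, j] is the gain from LED j to PD i.
h_los = np.array([[lambertian_los_gain(led, pd) for led in leds] for pd in pds])

# Assumed RIS contribution (illustrative only): each PD receives an extra
# reflected path from "its own" LED, i.e. a gain added on the diagonal.
ris_gain = 0.5 * h_los[0, 0]
h_ris = h_los + ris_gain * np.eye(2)

cond_los = np.linalg.cond(h_los)
cond_ris = np.linalg.cond(h_ris)
print(f"condition number without RIS: {cond_los:.1f}")
print(f"condition number with RIS:    {cond_ris:.1f}")
```

A large condition number means the two spatial streams are nearly indistinguishable at the receiver. Making the channel gains distinct in this way is also what power-domain NOMA relies on.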
References
1. R. Bian, I. Tavakkolnia, and H. Haas, “15.73 Gb/s visible light communication with off-the-shelf LEDs,” Journal of Lightwave Technology, vol. 37, no. 10, pp. 2418–2424, 2019.
2. M. A. Arfaoui, M. D. Soltani, I. Tavakkolnia, A. Ghrayeb, C. M. Assi, M. Safari, and H. Haas, “Measurements-based channel models for indoor LiFi systems,” IEEE Transactions on Wireless Communications, vol. 20, no. 2, pp. 827–842, 2020.
3. H. Haas, L. Yin, Y. Wang, and C. Chen, “What is LiFi?” Journal of Lightwave Technology, vol. 34, no. 6, pp. 1533–1544, 2015.
4. T. Koonen, K. Mekonnen, Z. Cao, F. Huijskens, N. Q. Pham, and E. Tangdiongga, “Ultra-high-capacity wireless communication by means of steered narrow optical beams,” Philosophical Transactions of the Royal Society A, vol. 378, no. 2169, p. 20190192, 2020.
5. R. Singh, F. Feng, Y. Hong, G. Faulkner, R. Deshmukh, G. Vercasson, O. Bouchet, P. Petropoulos, and D. O’Brien, “Design and characterisation of terabit/s capable compact localisation and beam-steering terminals for fiber-wireless-fiber links,” Journal of Lightwave Technology, vol. 38, no. 24, pp. 6817–6826, 2020.
6. M. D. Soltani, E. Sarbazi, N. Bamiedakis, P. de Souza, H. Kazemi, J. M. Elmirghani, I. H. White, R. V. Penty, H. Haas, and M. Safari, “Safety analysis for laser-based optical wireless communications: A tutorial,” Proceedings of the IEEE, vol. 110, no. 8, pp. 1045–1072, Aug. 2022.
7. H. Kazemi, E. Sarbazi, M. D. Soltani, T. E. El-Gorashi, J. M. Elmirghani, R. V. Penty, I. H. White, M. Safari, and H. Haas, “A Tb/s Indoor MIMO Optical Wireless Backhaul System Using VCSEL Arrays,” IEEE Transactions on Communications, 2022.
8. Z. Wang, D. Tsonev, S. Videv, and H. Haas, “On the design of a solar-panel receiver for optical wireless communications with simultaneous energy harvesting,” IEEE Journal on Selected Areas in Communications, vol. 33, no. 8, pp. 1612–1623, 2015.
9. G. Pan, P. D. Diamantoulakis, Z. Ma, Z. Ding, and G. K. Karagiannidis, “Simultaneous lightwave information and power transfer: Policies, techniques, and future directions,” IEEE Access, vol. 7, pp. 28250–28257, 2019.
10. Cisco and SAS, “Cisco and SAS Edge-to-Enterprise IoT Analytics Platform,” https://bit.ly/3pd8Yuo, 2017.
11. I. Tavakkolnia, L. K. Jagadamma, R. Bian, P. P. Manousiadis, S. Videv, G. A. Turnbull, I. D. Samuel, and H. Haas, “Organic photovoltaics for simultaneous energy harvesting and high-speed MIMO optical wireless communications,” Light: Science & Applications, vol. 10, no. 1, pp. 1–11, 2021.
12. C. L. Cutting, M. Bag, and D. Venkataraman, “Indoor light recycling: a new home for organic photovoltaics,” Journal of Materials Chemistry C, vol. 4, no. 43, pp. 10367–10370, 2016.
13. C. Liaskos, S. Nie, A. Tsioliaridou, A. Pitsillides, S. Ioannidis, and I. Akyildiz, “A new wireless communication paradigm through software-controlled metasurfaces,” IEEE Communications Magazine, vol. 56, no. 9, pp. 162–169, 2018.
14. H. Abumarshoud, L. Mohjazi, O. A. Dobre, M. Di Renzo, M. A. Imran, and H. Haas, “LiFi through reconfigurable intelligent surfaces: A new frontier for 6G?” IEEE Vehicular Technology Magazine, vol. 17, no. 1, pp. 37–46, 2021.
15. A. Khalid, G. Cossu, R. Corsini, P. Choudhury, and E. Ciaramella, “1-Gb/s transmission over a phosphorescent white LED by using rate-adaptive discrete multitone modulation,” IEEE Photonics Journal, vol. 4, no. 5, pp. 1465–1473, 2012.
16. D. Tsonev, S. Sinanovic, and H. Haas, “Complete modeling of nonlinear distortion in OFDM-based optical wireless communication,” Journal of Lightwave Technology, vol. 31, no. 18, pp. 3064–3076, 2013.
17. E. F. Schubert, Light-emitting diodes. E. Fred Schubert, 2018.
18. M. S. Mossaad, S. Hranilovic, and L. Lampe, “Visible light communications using OFDM and multiple LEDs,” IEEE Transactions on Communications, vol. 63, no. 11, pp. 4304–4313, 2015.
19. H. Elgala, R. Mesleh, and H. Haas, “An LED model for intensity-modulated optical communication systems,” IEEE Photonics Technology Letters, vol. 22, no. 11, pp. 835–837, 2010.
20. D. M. Maclure, J. J. McKendry, M. S. Islim, E. Xie, C. Chen, X. Sun, X. Liang, X. Huang, H. Abumarshoud, J. Herrnsdorf et al., “10 Gbps wavelength division multiplexing using UV-A, UV-B, and UV-C micro-LEDs,” Photonics Research, vol. 10, no. 2, pp. 516–523, 2022.
21. K. Yoshida, P. P. Manousiadis, R. Bian, Z. Chen, C. Murawski, M. C. Gather, H. Haas, G. A. Turnbull, and I. D. Samuel, “245 MHz bandwidth organic light-emitting diodes used in a gigabit optical wireless data link,” Nature Communications, vol. 11, no. 1, pp. 1–7, 2020.
22. M. A. Fernandes, P. P. Monteiro, and F. P. Guiomar, “Free-space terabit optical interconnects,” Journal of Lightwave Technology, vol. 40, no. 5, pp. 1519–1526, 2022.
23. A. Larsson, “Advances in VCSELs for Communication and Sensing,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 17, no. 6, pp. 1552–1567, Nov. 2011.
24. A. Liu, P. Wolf, J. A. Lott, and D. Bimberg, “Vertical-cavity surface-emitting lasers for data communication and sensing,” Photonics Research, vol. 7, no. 2, pp. 121–136, Feb. 2019.
25. K. Iga, “Surface-Emitting Laser - Its Birth and Generation of New Optoelectronics Field,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 6, no. 6, pp. 1201–1215, Nov. 2000.
26. B. E. A. Saleh and M. C. Teich, Fundamentals of Photonics, 3rd ed. John Wiley & Sons, Inc., 2019, vol. Part I: Optics.
27. S. B. Alexander, Optical Communication Receiver Design. Bellingham: SPIE Press, 1997.
28. R. Winston, J. C. Minano, P. G. Benitez et al., Nonimaging Optics. Elsevier, 2005.
29. J. M. Kahn and J. R. Barry, “Wireless infrared communications,” Proc. IEEE, vol. 85, no. 2, pp. 265–298, Feb. 1997.
30. H. Schulze, “Frequency-Domain Simulation of the Indoor Wireless Optical Communication Channel,” IEEE Trans. Commun., vol. 64, no. 6, pp. 2551–2562, Jun. 2016.
31. M. D. Soltani, A. A. Purwita, Z. Zeng, H. Haas, and M. Safari, “Modeling the random orientation of mobile devices: Measurement, analysis and LiFi use case,” IEEE Transactions on Communications, vol. 67, no. 3, pp. 2157–2172, 2018.
32. M. Obeed, A. M. Salhab, M.-S. Alouini, and S. A. Zummo, “On optimizing VLC networks for downlink multi-user transmission: A survey,” IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2947–2976, 2019.
33. O. Haddad, M.-A. Khalighi, S. Zvanovec, and M. Adel, “Channel characterization and modeling for optical wireless body-area networks,” IEEE Open Journal of the Communications Society, vol. 1, pp. 760–776, 2020.
34. V. Degiorgio and I. Cristiani, Photonics. Springer, 2016.
35. I. Tavakkolnia, C. Chen, Y. Tan, and H. Haas, “Terabit optical wireless-fiber communication with Kramer-Kronig receiver (Part I),” IEEE Communications Letters, 2022.
36. M. S. Islim and H. Haas, “Modulation techniques for Li-Fi,” ZTE Communications, vol. 14, no. 2, pp. 29–40, 2019.
37. H. E. Levin, “A complete and optimal data allocation method for practical discrete multitone systems,” in GLOBECOM’01. IEEE Global Telecommunications Conference (Cat. No. 01CH37270), vol. 1. IEEE, 2001, pp. 369–374.
38. O. Narmanlioglu, R. C. Kizilirmak, T. Baykas, and M. Uysal, “Link adaptation for MIMO OFDM visible light communication systems,” IEEE Access, vol. 5, pp. 26006–26014, 2017.
39. D. Tsonev, S. Videv, and H. Haas, “Unlocking spectral efficiency in intensity modulation and direct detection systems,” IEEE Journal on Selected Areas in Communications, vol. 33, no. 9, pp. 1758–1770, 2015.
40. M. Giordani, M. Polese, M. Mezzavilla, S. Rangan, and M. Zorzi, “Toward 6G Networks: Use Cases and Technologies,” IEEE Communications Magazine, vol. 58, no. 3, pp. 55–61, Mar. 2020.
41. E. Calvanese Strinati, S. Barbarossa, J. L. Gonzalez-Jimenez, D. Ktenas, N. Cassiau, L. Maret, and C. Dehos, “6G: The Next Frontier: From Holographic Messaging to Artificial Intelligence Using Subterahertz and Visible Light Communication,” IEEE Vehicular Technology Magazine, vol. 14, no. 3, pp. 42–50, Sep. 2019.
42. O. Gonzalez, R. Perez-Jimenez, S. Rodriguez, J. Rabadan, and A. Ayala, “OFDM over indoor wireless optical channel,” IEE Proceedings - Optoelectronics, vol. 152, no. 4, pp. 199–204, Aug. 2005.
43. L. A. Coldren, S. W. Corzine, and M. L. Masanovic, Diode Lasers and Photonic Integrated Circuits, 2nd ed. John Wiley & Sons, Inc., 2012.
44. D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.
45. K. Szczerba, P. Westbergh, J. Karout, J. S. Gustavsson, A. Haglund, M. Karlsson, P. A. Andrekson, E. Agrell, and A. Larsson, “4-PAM for High-Speed Short-Range Optical Communications,” IEEE/OSA Journal of Optical Communications and Networking, vol. 4, no. 11, pp. 885–894, Nov. 2012.
46. H. K. H. Lee, J. Wu, J. Barbé, S. M. Jain, S. Wood, E. M. Speller, Z. Li, F. A. Castro, J. R. Durrant, and W. C. Tsoi, “Organic photovoltaic cells–promising indoor light harvesters for self-sustainable electronics,” Journal of Materials Chemistry A, vol. 6, no. 14, pp. 5618–5626, 2018.
47. G. Grancini, C. Roldán-Carmona, I. Zimmermann, E. Mosconi, X. Lee, D. Martineau, S. Narbey, F. Oswald, F. De Angelis, M. Graetzel et al., “One-year stable perovskite solar cells by 2D/3D interface engineering,” Nature Communications, vol. 8, no. 1, pp. 1–8, 2017.
48. O. L. López, H. Alves, R. D. Souza, S. Montejo-Sánchez, E. M. G. Fernández, and M. Latva-Aho, “Massive wireless energy transfer: Enabling sustainable IoT toward 6G era,” IEEE Internet of Things Journal, vol. 8, no. 11, pp. 8816–8835, 2021.
49. J. Fakidis, H. Helmers, and H. Haas, “Simultaneous wireless data and power transfer for a 1Gb/s GaAs VCSEL and photovoltaic link,” IEEE Photonics Technology Letters, vol. 32, no. 19, pp. 1277–1280, 2020.
50. J. A. Nelson, The physics of solar cells. World Scientific Publishing Company, 2003.
51. S. Das, A. Sparks, E. Poves, S. Videv, J. Fakidis, and H. Haas, “Effect of sunlight on photovoltaics as optical wireless communication receivers,” Journal of Lightwave Technology, vol. 39, no. 19, pp. 6182–6190, 2021.
52. S. Sepehrvand, L. N. Theagarajan, and S. Hranilovic, “Rate-power trade-off in simultaneous lightwave information and power transfer systems,” IEEE Communications Letters, vol. 25, no. 4, pp. 1249–1253, 2020.
53. I. Tavakkolnia, L. K. Jagadamma, R. Bian, P. P. Manousiadis, S. Videv, G. A. Turnbull, I. D. Samuel, and H. Haas, “High-speed MIMO communication and simultaneous energy harvesting using novel organic photovoltaics,” in Optical Fiber Communication Conference. Optica Publishing Group, 2021, pp. W7E–7.
54. F. Aieta, A. Kabiri, P. Genevet, N. Yu, M. A. Kats, Z. Gaburro, and F. Capasso, “Reflection and refraction of light from metasurfaces with phase discontinuities,” Journal of Nanophotonics, vol. 6, no. 1, p. 063532, 2012.
55. X. Yu, V. Jamali, D. Xu, D. W. K. Ng, and R. Schober, “Smart and reconfigurable wireless communications: From IRS modeling to algorithm design,” IEEE Wireless Communications, vol. 28, no. 6, pp. 118–125, 2021.
56. V. Jamali, H. Ajam, M. Najafi, B. Schmauss, R. Schober, and H. V. Poor, “Intelligent reflecting surface assisted free-space optical communications,” IEEE Communications Magazine, vol. 59, no. 10, pp. 57–63, 2021.
57. Q. Wu and R. Zhang, “Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network,” IEEE Communications Magazine, vol. 58, no. 1, pp. 106–112, 2019.
Part III
Network Evolution Towards 6G
Cell-Free Massive MIMO Giovanni Interdonato and Stefano Buzzi
List of Acronyms
3GPP    3rd Generation Partnership Project
AP      Access Point
APU     Antenna Processing Unit
BBU     Baseband Unit
BS      Base Station
C-RAN   Cloud-Radio Access Network
CDF     Cumulative Distribution Function
CoMP    Coordinated Multipoint
CPU     Central Processing Unit
CSI     Channel State Information
CSIT    Channel State Information at the Transmitter
CZF     Centralized Zero-Forcing
DAS     Distributed Antenna Systems
DCC     Dynamic Cooperation Clustering
ELAA    Extremely Large Aperture Array
eMBB    Enhanced Mobile Broadband
H-NOMA  Heterogeneous Non-orthogonal Multiple Access
H-OMA   Heterogeneous Orthogonal Multiple Access
JMRZF   Joint Maximum Ratio and Zero-Forcing
LSFD    Large-Scale-Fading Decoding
LTE     Long-Term Evolution
MEC     Mobile Edge Computing
MIMO    Multiple Input Multiple Output
MMF     Maximum–Minimum Fairness
MMSE    Minimum Mean Square Error
mMTC    Massive Machine-Type Communications
mmWave  Millimeter Wave
MR      Maximum Ratio
MRT     Maximum-Ratio Transmission
NOMA    Non-orthogonal Multiple Access
OMA     Orthogonal Multiple Access
OTA     Over the Air
PZF     Partial Zero-Forcing
QAM     Quadrature Amplitude Modulation
QoS     Quality of Service
RAN     Radio Access Network
RRU     Remote Radio Units
SINR    Signal-to-Interference-Plus-Noise Ratio
SNR     Signal-to-Noise Ratio
TDD     Time-Division Duplex
TIN     Treating Interference as Noise
TMMSE   Team Minimum Mean Square Error
UC      User-Centric
UE      User Equipment
URLLC   Ultra-Reliable Low-Latency Communications
ZF      Zero-Forcing

G. Interdonato (✉)
University of Cassino and Southern Latium, Cassino, Italy
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Parma, Italy
e-mail: [email protected]

S. Buzzi
University of Cassino and Southern Latium, Cassino, Italy
Politecnico di Milano, Milan, Italy
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Parma, Italy
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_18
1 Introduction and Motivation
The ultimate mission of a mobile network is to provide excellent communication service to anyone, anywhere, at any time. This motto translates into technical requirements in terms of spectral efficiency (user-experienced data rates), connectivity, communication service availability, and reliability. For enhanced Mobile Broadband (eMBB), the key performance indicator is the spectral efficiency in its different flavors, such as data rate, throughput, and area traffic capacity. Since the spectral efficiency increases logarithmically with the signal-to-interference-plus-noise ratio (SINR) (Shannon–Hartley theorem), eMBB communications are designed with the aim of maximizing the power of the desired signal at the intended receiver while minimizing any source of interference and noise. This can be efficiently achieved by using multiple antennas at both the
receiver and the transmitter and implementing linear signal processing techniques with interference-suppression mechanisms for both transmit precoding and receive combining, such as minimum-mean-square-error (MMSE) and zero-forcing (ZF). The use of MIMO technologies leads to multiple benefits: (i) array gain, (ii) spatial diversity, and (iii) multiplexing gain, among others. Although these three key ingredients contribute to significantly boosting the SINR experienced by the users, there are still two main drawbacks inherent to the network-centric nature of conventional cellular systems, namely the inter-cell interference created by the cell boundaries and the signal blockage due to obstacles between transmitter and receiver. In a conventional cellular network, each user is served by only one access point (AP), typically the one with the best channel quality indicator. The set of locations for which a specific AP is selected determines its coverage area, also known as a cell. The network topology is designed so that multiple APs autonomously provide service over (ideally disjoint) cells. Hence, in such a network/cell-centric deployment, each AP sits at the center of its cell and serves all the users located within its coverage area. As a consequence, users at different distances from the AP experience different quality of service. In particular, the users located at the cell boundaries, referred to as cell-edge users, experience poor spectral efficiency due to significant inter-cell interference and excessive pathloss and shadowing, and they are more likely to suffer signal blockage. Overcoming these limitations entails challenging the cellular paradigm that characterizes current mobile networks. Cellular networks are sub-optimal from a channel capacity point of view because higher SINR can be achieved by jointly and coherently processing each signal at multiple APs [1].
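The gap between cell-center and cell-edge users can be made concrete with a toy calculation. The sketch below is only illustrative: it assumes a single interfering AP, a distance-based pathloss with exponent 3.5, no shadowing or fading, and a hypothetical reference SNR of 90 dB at 1 m. None of these values come from the chapter. The spectral efficiency is computed as log2(1 + SINR).

```python
import numpy as np

def spectral_efficiency(sinr):
    """Shannon spectral efficiency in bit/s/Hz for a given linear SINR."""
    return np.log2(1.0 + sinr)

def downlink_sinr(d_serving, d_interfering, alpha=3.5, snr_ref_db=90.0):
    """SINR with one interfering AP; noise power is normalised to 1.

    snr_ref_db is the (assumed) received SNR at a distance of 1 m.
    """
    snr_ref = 10.0 ** (snr_ref_db / 10.0)
    signal = snr_ref * d_serving ** (-alpha)
    interference = snr_ref * d_interfering ** (-alpha)
    return signal / (interference + 1.0)

inter_ap_distance = 500.0  # metres, assumed

# Cell-center user: 50 m from the serving AP, 450 m from the neighbour.
se_center = spectral_efficiency(downlink_sinr(50.0, inter_ap_distance - 50.0))

# Cell-edge user: equidistant (250 m) from both APs.
se_edge = spectral_efficiency(downlink_sinr(250.0, inter_ap_distance - 250.0))

print(f"cell-center SE: {se_center:.2f} bit/s/Hz")
print(f"cell-edge SE:   {se_edge:.2f} bit/s/Hz")
```

At the cell boundary the desired signal and the interference arrive with the same power, so the SINR cannot exceed 0 dB and the spectral efficiency stays below 1 bit/s/Hz regardless of the transmit power.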
As already mentioned, the large variations in the service experienced by the users are inherent to the cell-centric nature of mobile networks and persist even if massive MIMO technology [2–4] is implemented in the physical layer. Certainly, massive MIMO operation can remarkably alleviate the inter-cell interference thanks to its highly spatially focused transmissions and its ability to capture the energy of the transmitted signal after receive combining. On top of that, the favorable propagation phenomenon triggered by the use of a large number of transmit/receive antennas makes the multi-user interference vanish (asymptotically). This further boosts the SINR of the users but still does not reduce its dynamic range. In principle, proper power control strategies, such as single-cell and multi-cell max–min fairness, can be adopted to improve fairness and thereby provide uniform quality of service throughout the cell or the multi-cell system, respectively. However, such an egalitarian resource allocation policy is often too penalizing for users with good channel conditions, which suffer a performance drop for the greater good, namely maximizing the minimum spectral efficiency in the network. Hence, the cell-edge users jeopardize the performance of the cell-center users. Figure 1 depicts a typical example of a multi-cell system according to the conventional hexagonal cellular network model, wherein each (macro-)cell is served by an AP (or base station). The users emphasized in red represent the aforementioned cell-edge users characterized by poor channel conditions.

Fig. 1 Cellular paradigm: An example of conventional multi-cell system wherein each (macro-)cell is served by an AP (or base station). The base station may be equipped with multiple-antenna elements operating in massive MIMO mode. The users emphasized in red represent the cell-edge users characterized by poor channel conditions

Nowadays, cellular networks can provide high peak spectral efficiency near the AP, but the service variations within each cell are still substantial. This conflicts with the vision of beyond-5G mobile systems of providing uniformly great service to every user and ubiquitous, reliable connectivity. When payments, web surfing, entertainment, and control of autonomous vehicles all rely on wireless connectivity, ensuring uniformly excellent quality of service is a must. This is the fundamental requirement and the primary goal of next-generation mobile networks. Fulfilling such an ambitious requirement led the research community to challenge the cellular paradigm and to start a marked transition from cell-centric to user-centric (or cell-free) architectures, wherein each user is located in the center of its own virtual cell, surrounded by many serving APs, thereby experiencing no cell-edge effects. With the cell-free paradigm, there are ideally no cell-edge users and, as a result, almost uniformly high service quality in a given geographical area. The cell-free paradigm can be attained by densifying the network and letting the APs cooperate with each other. Network ultra-densification and distributed
cooperation can jointly reduce the pathloss, provide macroscopic diversity against shadowing, and lower the blockage probability, since every user is more likely to have at least one AP in its proximity. Hence, unlike its cellular counterpart, a cell-free network can potentially offer ubiquitous coverage and uniformly great service to everyone. However, implementing the cell-free paradigm requires, in turn, an expensive architecture for the fronthaul network and accurate time synchronization to enable joint transmission/reception at multiple APs. These two practical impediments have limited the development of cooperative distributed MIMO systems, also known as network MIMO systems, over the last two decades. Nevertheless, the advent of massive MIMO technology has brought new lifeblood to the network MIMO concept, revealing opportunities for competitive practical implementations. The concept that merges network ultra-densification with cooperative signal processing on top of the massive MIMO physical layer is known as cell-free massive MIMO [5–8]. Figure 2 depicts an example of a cell-free massive MIMO system wherein each user is jointly and coherently served by possibly overlapping subsets of APs (hardware-wise cheaper than a base station) that determine the virtual cell tailored to the needs of the specific user.

Fig. 2 Cell-free paradigm: An example of cell-free massive MIMO system wherein each user is jointly and coherently served by possibly overlapped subsets of APs. The APs are connected by means of a fronthaul network (dashed gray lines) to one or more central processing units (CPUs) that are in charge of the cooperation and synchronization. CPUs are connected to each other through a backhaul network (solid black line). The colored cells represent the essence of the user-centric approach: each user is in the center of its own tailored virtual cell surrounded by multiple serving APs
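The benefit of joint coherent service for a user located between several APs can be sketched with a highly idealised calculation. It assumes single-antenna APs, perfect channel knowledge and synchronization, a single user, and noise power normalised to 1; the per-AP received SNR of 20 dB and the function names are hypothetical. In the cellular case the non-serving APs act as interferers, whereas in the cell-free case all APs transmit the same signal with phases aligned at the user, so the received amplitudes add up.

```python
import numpy as np

def cellular_sinr(rx_snrs, serving):
    """SINR when one AP serves the user and the others interfere."""
    rx_snrs = np.asarray(rx_snrs, dtype=float)
    signal = rx_snrs[serving]
    interference = rx_snrs.sum() - signal
    return signal / (interference + 1.0)  # noise power normalised to 1

def cell_free_snr(rx_snrs):
    """SNR when all APs transmit coherently to the user (amplitudes add)."""
    rx_snrs = np.asarray(rx_snrs, dtype=float)
    return np.sum(np.sqrt(rx_snrs)) ** 2

# A user equidistant from three APs: each AP alone yields 20 dB of SNR (assumed).
rx_snrs = [100.0, 100.0, 100.0]

sinr_cellular = cellular_sinr(rx_snrs, serving=0)
snr_cell_free = cell_free_snr(rx_snrs)

print(f"cellular SINR:  {10 * np.log10(sinr_cellular):.1f} dB")
print(f"cell-free SNR:  {10 * np.log10(snr_cell_free):.1f} dB")
```

In this toy setting the cellular SINR is capped below 1/2 however strong the signals are, while coherent cooperation turns the former interferers into useful signal power. In practice the gain is reduced by imperfect channel estimates, synchronization errors, and interference between users.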
1.1 Historical Background and Taxonomy
Transmit beamforming from co-located antenna arrays makes it possible to increase the power of the signal at the receiver without increasing the total transmit power. Such a coherent transmission can also be performed by multiple APs geographically distributed over a wide area, at the cost of higher implementation complexity. The principle of co-processing the signal at multiple distributed APs to enhance the channel capacity arose in the early 2000s [1], and its implementation took different names over the last two decades. Cell-free massive MIMO is only the latest attempt at providing a successful solution for distributed multiple-antenna technologies. The first attempts consisted of a few macro base stations (BSs) cooperating to serve all the users in their range of influence by using multi-user MIMO signal processing techniques [9, 10]. Following this track, [11] introduced the term “network MIMO” to describe a network wherein multiple cells organically operate as a single large cell serving all the users in its joint coverage area. In this initial vision, cooperating BSs were grouped into disjoint clusters, each one served by a central processing unit (CPU), that is, an entity (either physical or logical) in charge of AP coordination, sharing data, collecting global information, and performing network tasks in a centralized fashion. The exchange of information among clusters occurs over a fronthaul network. The ability of network MIMO to mitigate the inter-cell interference and provide excellent spectral efficiency was confirmed in several studies [11, 12] under realistic operational conditions. Although inter-cell interference was significantly alleviated within each cluster, the system performance was nevertheless degraded by the inter-cluster interference. Hence, the interference was only shifted to another level.
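The statement that transmit beamforming raises the received power without raising the total transmit power can be checked with a short sketch. It assumes a narrowband channel known perfectly at the transmitter, unit-modulus channel coefficients with random phases (a LoS-like assumption made here so that the gain is exactly the number of antennas), and maximum-ratio transmission as the beamformer. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 16  # number of transmit antennas (assumed)

# Unit-modulus channel coefficients with random phases.
h = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=M))

def received_power(h, w):
    """Received signal power for channel h and precoding vector w."""
    return np.abs(h @ w) ** 2

# Baseline: all power on a single antenna.
w_single = np.zeros(M, dtype=complex)
w_single[0] = 1.0

# Maximum-ratio transmission: conjugate phases, unit total power.
w_mrt = np.conj(h) / np.linalg.norm(h)

p_single = received_power(h, w_single)
p_mrt = received_power(h, w_mrt)

print(f"total transmit power (MRT): {np.linalg.norm(w_mrt) ** 2:.2f}")
print(f"array gain: {10 * np.log10(p_mrt / p_single):.2f} dB")
```

Both precoders have unit norm, so the total transmit power is identical; the gain comes purely from the signals adding in phase at the receiver. Distributed APs can exploit the same principle, but each AP then needs its own channel estimate and a common phase and time reference, which is the added implementation complexity mentioned above.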
Figure 3 shows an example of a network MIMO system with clusters of BSs, that is, unions of multiple cells, set up to mitigate the inter-cell interference but suffering from inter-cluster interference. The latter was partially overcome in [12], which proposed an overlapped clustering deployment where adjacent BSs were grouped into clusters overlapping in space such that users would not experience the effects of the cluster boundaries.

Fig. 3 Cellular cluster paradigm: An example of conventional multi-cell system operating in network MIMO mode, wherein clusters of cells serve all the users in their joint coverage area. Although some users benefit from the cluster cooperation, there are still users (emphasized in red) at the cluster edge experiencing poor channel conditions

Cooperative multi-cell MIMO is a term introduced in [13–15] to characterize a practical network MIMO system. These works emphasized the practical limitations of “conventional” network MIMO, since its fully centralized signal processing required high computational complexity and the exchange of instantaneous channel state information (CSI) among the BSs. Conversely, in cooperative multi-cell MIMO, distributed precoding/combining with partial (or no) CSI sharing was advocated in order to reduce the requirements on the fronthaul network and the computational complexity. Such distributed signal processing, accompanied by local channel estimation at each AP and dynamic cooperation clustering, gave a good recipe for practical implementations of network MIMO. Dynamic cooperation clustering (DCC) revisits the overlapped clustering architecture proposed in [12] and consists in setting up tailored clusters of nearby BSs, based on knowledge of the channel statistics, to surround any user in the network. On the one hand, this removes the inter-cluster interference relative to static clustering, since a moving user triggers a cluster reconfiguration and is therefore unlikely to sit at the edge of a cluster. On the other hand, DCC confines the cooperation to a handful of adjacent BSs, thereby alleviating the fronthaul load. Distributed Antenna Systems (DAS) [16] and virtual MIMO [17] are synonyms of network MIMO. Both describe a system where distributed antennas, coordinated by a central processor, virtually act as a single MIMO system. Coordinated MultiPoint (CoMP) is the commercial name of network MIMO in LTE-Advanced, included in LTE Release 11 [18], and implements several network/cooperative MIMO techniques for inter-cell interference mitigation [19, 20] and cooperation clustering [21]. The first CoMP designs and deployments concerned macro-cell networks; they were later extended to multi-tier architectures [22], bringing inter-cell interference coordination to the small-cell level while tackling the inter-cluster interference at higher hierarchical layers by means of massive MIMO BSs [23]. The term Extremely Large Aperture Array (ELAA) was introduced in [24] to characterize a system consisting of distributed antennas that jointly and coherently serve multiple users. This term points out the large dimension of the distributed array, that is, of all the antennas seen as a whole, with antenna separation
in the order of meters. This structure is opposed to compact arrays where the antenna separation is in the order of the wavelength (e.g., co-located massive MIMO). Thus, any network-MIMO-based communication system described above can equivalently be referred to as an ELAA, as can the very large aperture massive MIMO systems investigated in [25] and the extra-large-scale massive MIMO systems studied in [26, 27]. Lastly, Cloud-Radio Access Network (C-RAN) identifies a radio access network architecture rather than a physical layer technology. C-RAN is a cloud-computing-based architecture for radio access networks [28] wherein the baseband processing takes place in a central processor referred to as the baseband unit (BBU), connected to several remote radio units (RRUs) via an optical fiber fronthaul. The RRU is simply a device forwarding signals between the BBU and the users. Based on this characterization, we can infer that C-RAN is suitable for implementing network MIMO systems operating in a fully centralized fashion. However, full centralization is critical in C-RAN, since the BBU and the fronthaul network may turn out to be the bottleneck of the architecture as the number of users and APs grows large. To limit this issue, dynamic cooperation clustering techniques were investigated for C-RAN architectures in [29, 30]: each user is surrounded by selected serving RRUs such that it “falls” in the center of the created cluster. C-RAN architectures might be used to implement a cell-free massive MIMO system, although some baseband processing capability is required at the APs in order to decentralize tasks such as channel estimation and distributed precoding/combining.
1.2 Brief Review of the State-of-the-Art on Cell-Free Massive MIMO

The term cell-free massive MIMO first appeared in [5, 31, 32]. Since then, this topic has received much attention from the research community. The pioneering work [5] defined a system model with single-antenna APs and users, provided an achievable spectral efficiency for the uplink and the downlink with maximum-ratio combining and transmission, respectively, and devised resource allocation strategies for both power control and pilot assignment. Importantly, [5] compared the performance of cell-free massive MIMO against small cells, showing the superiority of the former especially in terms of 95%-likely per-user throughput with max–min fairness optimization.

Early works on cell-free massive MIMO focused on theoretical performance and neglected practical aspects. For instance, [5] assumed that all the APs in the network would serve all the users. This is clearly not feasible in practical implementations covering wide areas, as it would require a very high degree of cooperation and an expensive fronthaul network, which in turn would lead to higher signal processing complexity, higher signaling overhead, and stricter requirements for latency and synchronization. In a nutshell, involving more and more APs and users in the
joint coherent processing leads to the system scalability issues described in [33, 34]. Therefore, the dynamic cooperation clustering framework already introduced in cooperative MIMO networks and CoMP has been readily adapted to the cell-free massive MIMO system model to keep the network scalable as the number of APs and users grows unboundedly. Many valid strategies have been investigated to implement DCC, such as the channel-statistics-dependent [7, 35–37] and received-power-dependent [35] AP selection schemes, and the joint pilot assignment and AP selection scheme proposed in [8]. The common objective of these strategies is to select the best (possibly overlapping) subset of APs to serve each user, so as to confine the cell-free processing within a cluster without significant loss in performance.

On a parallel research track, significant effort has been spent on finding a good trade-off between near-optimal performance and the degree of cooperation among the APs. A network-wide fully centralized operation is optimal from a performance viewpoint but hard to implement in practice. Conversely, a fully distributed operation, where most of the network tasks are carried out locally, has low implementation requirements but performs modestly. Some papers advocated the use of maximum ratio (MR) processing locally at each AP to spare the fronthaul network [5, 38–40], while better performance can be achieved at the expense of higher fronthauling requirements by using partially or fully centralized processing [41–46].

The vast majority of the cell-free massive MIMO literature focuses on sub-6 GHz systems, as in these frequency bands the role of joint coherent processing in increasing the spectral efficiency is crucial. At higher frequency bands, high data rates can be obtained simply by capitalizing on the large available bandwidth, and therefore, cell-free processing may be, in some cases, unnecessary [47].
However, the main principles of the cell-free processing can be easily extended to the millimeter-wave (mmWave) bands. Examples in this direction can be found in [48–52] where problems related to channel models, channel estimation, limited fronthaul capacity, and user mobility are investigated.
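The AP selection idea behind DCC mentioned above can be illustrated with a minimal sketch. It assumes a rule in the spirit of the received-power-dependent schemes: each user keeps the smallest set of APs whose large-scale fading coefficients add up to a chosen fraction of the total. The function name `select_serving_aps`, the `fraction` parameter, and the numerical coefficients are hypothetical and are not taken from [35] or any other cited work.

```python
def select_serving_aps(betas, fraction=0.95):
    """Return the indices of the smallest AP set whose large-scale fading
    coefficients sum to at least `fraction` of the user's total.

    betas[l] is the (linear-scale) large-scale fading coefficient between
    the user and AP l. The selection rule itself is an illustrative assumption.
    """
    if not betas:
        return []
    # Visit the APs from the strongest to the weakest.
    order = sorted(range(len(betas)), key=lambda l: betas[l], reverse=True)
    total = sum(betas)
    chosen, accumulated = [], 0.0
    for l in order:
        chosen.append(l)
        accumulated += betas[l]
        if accumulated >= fraction * total:
            break
    return sorted(chosen)


# Hypothetical coefficients from six APs to one user.
betas = [1e-9, 4e-8, 2.5e-7, 3e-10, 6e-8, 5e-9]
cluster = select_serving_aps(betas, fraction=0.95)
print("serving APs:", cluster)
```

Because the rule depends only on channel statistics, the cluster needs to be recomputed only when the large-scale fading changes, for example when the user moves appreciably.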
2 Key Ingredients, Operation, and Caveats

Cell-free massive MIMO builds on the principles of network MIMO, but its implementation relies on the massive MIMO physical layer technology and thereby inherits all its advantages. A cell-free massive MIMO system is characterized by a large number of antennas compared to the number of users to be served. This leads to larger beamforming and diversity gains as well as to a greater ability to multiplex the users in the spatial domain. The use of many antennas also triggers favorable propagation, which further reduces the multi-user interference, and channel hardening, which reduces the random fluctuations of the effective channel gain. As a consequence, there is no need, as in network MIMO, to adopt advanced and power-hungry signal processing techniques to deal with the user interference: linear signal processing (e.g., MMSE/ZF precoding/combining) yields nearly optimal performance and reduces the hardware complexity and the circuit power consumption. Besides, by operating in TDD mode, the channel estimation overhead can be significantly reduced, as the acquisition of the CSI can be conveniently carried out only in the uplink. Last but not least, simplified resource allocation strategies can be based on the channel statistics rather than on the instantaneous CSI, enabling an excellent trade-off between performance and practicality. All these elements mark a sharp departure from the past. Just as massive MIMO is deemed to be a scalable version of multi-user MIMO, cell-free massive MIMO can be considered a scalable, low-complexity, and practical embodiment of network MIMO.

Fig. 4 Key ingredients of cell-free massive MIMO (panels: Massive MIMO PHY; Macro-diversity gain; User-centric approach; Uniformly great service). When joint coherent processing meets a distributed dense topology, the result is an outstanding macro-diversity gain. The user-centric implementation ensures uniformly great QoS to every user, while the massive MIMO baseline operation confers an intrinsic practicality and scalability to the implementation of the cell-free network

Figure 4 illustrates four key ingredients of cell-free massive MIMO:

1. Massive MIMO baseline operation at the physical layer. Cell-free massive MIMO can ideally be thought of as a massive MIMO BS whose antenna elements have been split and spread out over a wide area.
2. Extraordinary macro-diversity gain. When joint coherent processing meets a distributed dense topology, geography comes into play by enriching the levels of space diversity. This entails higher communication reliability and stronger channel gains.
3. User-centric approach [7, 8, 35]. Transmission and reception are centered on the users, that is, they are set up to ensure each user the best QoS. This can be done by serving any user from all the APs in its surroundings, as shown in Fig. 2. As a result, each user is in the center of its own virtual cell, hence the terms cell-free and user-centric.
4. Uniformly great service [5] is the final result of all the previous ingredients. As each user is in the center of its own virtual cell served by many APs, it is likely that all the users experience similar QoS, meaning that there is little variation in the service throughout the network.

To give a first sense of the benefits that a cell-free network provides over a cellular network, we next present a performance comparison in terms of spectral efficiency per user and macro-diversity gain. For an exhaustive analytical description of the system and signal models, we refer the reader to the textbook [8] and references therein.

Figures 5 and 6 show the spectral efficiency delivered to the user at different locations in a coverage area served by nine APs deployed in a regular grid. The results in Fig. 5 refer to a cellular network wherein the APs do not cooperate, so that an arbitrary user is served only by the AP that provides the largest channel gain (presumably the AP closest to the user). Such a setup can be thought of as a small-cell network. The results in Fig. 6 refer to a cell-free network wherein all the APs cooperate to serve the user at any location. The maximal spectral efficiency is set to 8 bit/s/Hz per user, which is attainable with uncoded 256-QAM.

Fig. 5 Spectral efficiency delivered by a cellular network to users at different locations in a coverage area served by nine single-antenna APs deployed in a regular grid. The APs operate autonomously. A maximal spectral efficiency of 8 bit/s/Hz is set, which corresponds to uncoded 256-QAM. These results were included in [6] (horizontal axes: Position [m])

We mainly observe two performance improvements:
• Cell-free massive MIMO delivers a more uniform spectral efficiency than small cells over the coverage area, as is clear from the larger surface over which the maximal spectral efficiency is achieved. The joint signal processing across the APs converts the inter-cell interference into useful signal. As a result, the spectral efficiency is limited only by the signal propagation losses.
• Cell-free massive MIMO ensures much smaller service variations than small cells throughout the network. Indeed, the minimum achievable spectral efficiency is slightly smaller than 6 bit/s/Hz per user in the cell-free setup and below 1 bit/s/Hz for its cellular counterpart. In cellular networks, the performance is severely limited by the inter-cell interference, and the QoS at the cell edge is extremely poor. Conversely, this result confirms that there are basically no cell-edge users in cell-free massive MIMO; that is, the effects of the cell boundaries on the user performance are erased by the cooperation in the signal processing.

Fig. 6 Spectral efficiency delivered by a cell-free network to users at different locations in a coverage area served by nine cooperating single-antenna APs deployed in a regular grid. A maximal spectral efficiency of 8 bit/s/Hz is set, which corresponds to uncoded 256-QAM. These results were included in [6] (horizontal axes: Position [m])

Similar conclusions can be drawn when comparing the spectral efficiencies (SEs) provided by a cell-free and a co-located massive MIMO system, the total number of antennas being equal. The simulation results in Fig. 7 show the cumulative distribution function (CDF) of the downlink spectral efficiency per user provided by cell-free and co-located massive MIMO. The coverage area is 100 × 100 m² and is served by a total of 100 antennas. In the cell-free setup, the antennas are distributed over 100 single-antenna APs deployed in a regular grid. In the cellular setup, the antennas are co-located in one BS placed at the cell center. The number of active users, randomly distributed in the network, is 20. The propagation model follows that in [53],
assuming that the carrier frequency is 5.2 GHz and that the heights of the user and AP antennas are 1.65 m and 5 m, respectively. The downlink SEs shown in Fig. 7 are achieved by assuming conjugate beamforming, which is the only viable option for single-antenna APs operating in a distributed fashion, and statistical CSI knowledge at the users. The SE comparison in Fig. 7 clearly shows that co-located massive MIMO provides an SE with large variations among the users, namely a very high peak SE to a few lucky users and a modest SE to the rest. Conversely, cell-free massive MIMO delivers a uniformly great service, guaranteeing a higher SE than co-located massive MIMO to about 80% of the users in the network.

Fig. 7 CDFs of the downlink spectral efficiency per user provided by a cell-free and a co-located massive MIMO system with 100 antennas in a coverage area of 10,000 m². In the cell-free setup, the antennas are distributed over 100 single-antenna APs deployed in a regular grid. In the cellular setup, the antennas are co-located in one BS placed at the cell center. The downlink SEs are achieved by assuming conjugate beamforming and statistical CSI knowledge at the users

The distribution of the SE over the coverage area is shown in the heatmap in Fig. 8. Compared to the previous simulation scenario, these simulations assume imperfect CSI knowledge at the users. The heatmap highlights how co-located massive MIMO (bottom subfigure) can serve excellently only the users close to the BS, while the cell-edge users experience a lower SE. Instead, cell-free massive MIMO erases the cell-edge effects by leveraging the AP cooperation, thereby providing a uniform SE throughout the coverage area. Lastly, Fig. 9 emphasizes how cell-free massive MIMO is able to support a larger number of users than co-located massive MIMO, which is due to its great ability to multiplex the users in the spatial domain. Indeed, coherent transmission at multiple APs increases the signal-to-noise ratio (SNR) at the receiver, as it enables constructive interference. Unlike for co-located
arrays, where transmit precoding forms a beam pattern, multiple geographically distributed antennas transmitting jointly and coherently produce a local amplification of the signal energy within a ball around the receiver. This ball can be shrunk (hence the spatial resolution of the transmission is increased) by involving more APs in the coherent transmission.

Fig. 8 Heatmap of the downlink spectral efficiency per user [bit/s/Hz/user] provided by a cell-free (top) and a co-located (bottom) massive MIMO system with 100 antennas in a coverage area of 10,000 m² (both axes in meters). In the cell-free setup, the antennas are distributed over 100 single-antenna APs deployed in a regular grid. In the cellular setup, the antennas are co-located in one BS at the cell center. The downlink SEs are achieved by assuming conjugate beamforming and imperfect CSI knowledge at the users

The spectral efficiency improvements provided by cell-free massive MIMO are essentially due to the extraordinary level of macro-diversity experienced by the users. The macro-diversity gain can be quantified by the effective channel gain experienced by an arbitrary user. Assuming, for the sake of simplicity, single-antenna APs and users, the channel gain in cell-free networks corresponds to the squared norm of the channel vector consisting of the scalar channel responses from an arbitrary user to all the serving APs. For cellular networks (i.e., the small-cell setup), the channel gain is the squared modulus of the largest scalar channel response.
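The two channel gain definitions above can be compared numerically with a short sketch. It assumes a hypothetical log-distance path-loss model with Rayleigh fading (not the propagation model of [53]), nine APs on a regular grid, and one channel realization; `large_scale_gain`, `channel_gains`, and all parameter values are illustrative names and numbers introduced here.

```python
import math
import random


def large_scale_gain(distance_m, exponent=3.7, ref_loss_db=30.0):
    """Hypothetical log-distance path loss, returned in linear scale."""
    d = max(distance_m, 1.0)  # avoid the singularity at zero distance
    return 10 ** (-(ref_loss_db + 10 * exponent * math.log10(d)) / 10)


def channel_gains(user_xy, ap_positions, rng):
    """Return (cell-free gain, cellular gain) for one channel realization."""
    powers = []
    for (x, y) in ap_positions:
        beta = large_scale_gain(math.hypot(user_xy[0] - x, user_xy[1] - y))
        # Rayleigh fading: h_l ~ CN(0, beta_l)
        std = math.sqrt(beta / 2)
        h = complex(rng.gauss(0, std), rng.gauss(0, std))
        powers.append(abs(h) ** 2)
    cell_free = sum(powers)  # squared norm of the channel vector
    cellular = max(powers)   # squared modulus of the strongest response
    return cell_free, cellular


rng = random.Random(0)
isd = 100.0  # assumed intersite distance in meters
aps = [(isd * (i + 0.5), isd * (j + 0.5)) for i in range(3) for j in range(3)]
user = (rng.uniform(0, 3 * isd), rng.uniform(0, 3 * isd))
cf, cell = channel_gains(user, aps, rng)
gain_db = 10 * math.log10(cf / cell)
print(f"macro-diversity gain for this realization: {gain_db:.2f} dB")
```

By construction the cell-free gain can never be smaller than the cellular one, since the sum of the per-AP powers includes the largest term; how large the gap is depends on how many APs contribute comparably, which is governed by the AP density.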
Fig. 9 The average downlink spectral efficiency per user provided by a cell-free and a co-located massive MIMO system as the total number of active users in the network grows
Figure 10 shows the CDF of the channel gain expressed in dB. We observe that, with a large intersite distance (ISD), the least fortunate users (i.e., the lower percentiles of the CDF) gain approximately 5 dB from the cell-free processing. In this scenario, it is more likely that only a few APs contribute significantly to the service; hence, the difference between cellular and cell-free operation may be negligible, as can be seen at the higher percentiles. Conversely, at small ISDs, all the users obtain a 5 to 20 dB higher channel gain with the cell-free processing. In such a scenario, the user-centric approach can be implemented effectively, as the AP density is sufficient to surround each user. A more detailed comparison between cell-free massive MIMO, co-located massive MIMO, and small cells can be found in [8] and references therein.
2.1 Uplink Operation

The uplink implementation requiring the highest degree of cooperation among the APs, and thereby imposing the highest fronthauling requirements, is a fully centralized operation where the CPU is in charge of all the network tasks. Such a centralized operation can be implemented by means of a C-RAN architecture where the APs act as remote radio heads (RRHs) that forward their received baseband signals to the CPU for processing; see, for example, the system model in [8, Sec. 5.1]. Let us recall that the CPU is either a physical or a logical entity; thus network-wide information
and processing need not converge to a single network node but might in turn be distributed over multiple edge-cloud processors. Decentralizing some network tasks, such as channel estimation, power control, and receive combining, relieves the fronthaul network.

Fig. 10 CDFs of the channel gain in dB experienced by the users in a cell-free and a cellular network (curves for ISD: 5 m and ISD: 100 m; horizontal axis: Channel gain [dB]; vertical axis: CDF). The users are distributed uniformly at random over a nominal area of 1 km² and served by 2500 single-antenna APs deployed in a regular grid with different intersite distances. These results were included in [6]

The uplink implementation with no cooperation among the APs is obtained when the data from every user are decoded locally at each AP based on its own channel estimates. This, however, turns cell-free massive MIMO into a small-cell system. Instead, the uplink implementation with the lowest degree of cooperation among the APs consists in decentralizing channel estimation, receive combining, and power control, and in performing the final data decoding at the CPU, where the inputs from multiple APs are properly combined. This data fusion at the CPU can conveniently rely only on the channel statistics, so that there is no need to exchange CSI over the fronthaul network. This is the basic principle of the so-called large-scale-fading decoding (LSFD) proposed in [54]. In general, partially distributed operations constitute a good trade-off between performance and complexity; see some examples in [8, Sec. 5.2]. Importantly, distributed and partially distributed operations require APs equipped with a baseband processor. Decentralization is key to preserving the network scalability, that is, the ability of the network to handle a growing workload as the number of APs and users grows, and to keeping the fronthauling overhead relatively low so as to prevent severe quantization distortion.
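The principle of fusing local estimates at the CPU using channel statistics only can be sketched for a toy case. The sketch assumes a single user, single-antenna APs, perfect local CSI, Rayleigh fading, and local MR combining; the fusion weights 1/(β_l + σ²) are an assumption made here (the ratio of the mean to the variance of each local estimate for a unit-power symbol) and are not the multi-user LSFD weights of [54]. All function names are hypothetical.

```python
import math
import random


def local_mr_estimate(h, y):
    """Local maximum-ratio combining at a single-antenna AP."""
    return h.conjugate() * y


def cpu_fusion(local_estimates, betas, weights):
    """Fuse per-AP estimates at the CPU using channel statistics only."""
    numerator = sum(a * s for a, s in zip(weights, local_estimates))
    # The normalization uses E{|h_l|^2} = beta_l, not the instantaneous CSI.
    denominator = sum(a * b for a, b in zip(weights, betas))
    return numerator / denominator


def simulate(betas, noise_var, symbol, blocks, rng):
    """Average the fused estimate of `symbol` over many coherence blocks."""
    # Assumed statistics-only weights for this single-user toy model.
    weights = [1.0 / (b + noise_var) for b in betas]
    noise_std = math.sqrt(noise_var / 2)
    total = 0j
    for _ in range(blocks):
        estimates = []
        for b in betas:
            std = math.sqrt(b / 2)
            h = complex(rng.gauss(0, std), rng.gauss(0, std))
            n = complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            estimates.append(local_mr_estimate(h, h * symbol + n))
        total += cpu_fusion(estimates, betas, weights)
    return total / blocks


rng = random.Random(1)
betas = [4.0, 1.0, 0.25, 0.1]  # hypothetical normalized large-scale gains
symbol = complex(1, 1) / math.sqrt(2)
avg_estimate = simulate(betas, noise_var=1.0, symbol=symbol, blocks=4000, rng=rng)
print("average fused estimate:", avg_estimate)
```

Only one complex scalar per AP and per symbol travels over the fronthaul in this sketch, and the CPU never sees the channel realizations, which is the property the text attributes to LSFD.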
2.2 Downlink Operation

As in the uplink, the downlink implementation requiring the highest degree of cooperation among the APs, and thereby imposing the highest fronthauling requirements, is a fully centralized operation where the CPU is in charge of all the network tasks; see, for example, the system model in [8, Sec. 6.1]. Global CSI knowledge is required at the CPU to design the transmit precoding vectors. The data signals that the CPU has encoded and precoded are then sent over the fronthaul network to the APs, which simply forward them to the users. Importantly, when the precoding vectors are designed in a centralized fashion, interference suppression mechanisms can be implemented in order to maximize the spectral efficiency.

In the fully distributed operation, see for example [8, Sec. 6.2], the CPU sends over the fronthaul network only the encoded data to the APs involved in the service. Transmit precoding is carried out locally at each AP based on its own local channel estimates. Although this solution is the most convenient from a practical point of view, the achieved performance is modest, as the APs are not able to cancel out each other's interference. Each AP can only suppress the interference that it generates itself by exploiting its few antennas.

Partially distributed downlink operations, such as that proposed in [46], strike a good trade-off between inter-AP interference cancelation and SNR increase. The fronthauling overhead is kept relatively low by confining the CSI exchange to small subsets of APs.
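The difference between centralized interference suppression and local conjugate beamforming can be made concrete with a small sketch. It assumes two single-antenna users, eight single-antenna APs, perfect global CSI at the CPU, and no power normalization; the helper names (`mr_precoder`, `zf_precoder`, and so on) are hypothetical, and the 2×2 inverse is written out by hand only to keep the example free of external libraries.

```python
import random


def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def mr_precoder(H):
    """Conjugate beamforming: each AP only needs its own channel coefficients."""
    return hermitian(H)


def zf_precoder(H):
    """Centralized zero forcing for two users: W = H^H (H H^H)^{-1}."""
    Hh = hermitian(H)
    return matmul(Hh, inverse_2x2(matmul(H, Hh)))


rng = random.Random(0)
num_aps = 8
# H[k][m]: channel from AP m to user k (hypothetical i.i.d. Rayleigh fading).
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_aps)]
     for _ in range(2)]

G_mr = matmul(H, mr_precoder(H))  # effective 2x2 channel with MR
G_zf = matmul(H, zf_precoder(H))  # effective 2x2 channel with ZF
leak_mr = abs(G_mr[0][1])         # signal of user 2 leaking to user 1
leak_zf = abs(G_zf[0][1])
print(f"cross-user leakage: MR {leak_mr:.3f}, ZF {leak_zf:.1e}")
```

The ZF precoder requires the full channel matrix in one place, which is exactly the global CSI requirement discussed above, whereas each column entry of the MR precoder depends only on the channel seen by the corresponding AP.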
2.3 Implementation Constraints, Desiderata, and Challenges

As already mentioned, dynamic cooperation clustering is needed to implement a practical and scalable cell-free network. The main idea of this framework is that each user is served by a tailored cluster consisting of properly selected APs. This confines the cooperation (i.e., data sharing and resource allocation tasks) within a handful of adjacent APs, thereby alleviating the fronthauling load. Moreover, dynamic clustering reduces the inter-cluster interference compared to static clustering, as a user is less likely to stand at the edge of a cluster.

When implementing a cell-free network, and in particular when designing the transmit precoding and the receive combining, the desiderata are the following:

1. Low computational complexity. When designing interference-suppression precoding/combining schemes, the complexity of computing the beamforming vectors lies mainly in the computation of the pseudo-inverse matrix, and hence depends on its size. In cell-free massive MIMO, however, due to the distributed network topology, the dynamic range of the interference from/to different APs may be significantly large. Hence, there is no need to design a precoder/combiner that suppresses all the interference contributions, but rather only the most significant ones, in terms of both inter-cell and intra-cell interference. For instance, the
authors in [42] proposed the so-called local partial zero-forcing (PZF), a fully distributed precoding scheme that provides an adaptable trade-off between interference cancelation and boosting of the desired signal, with no additional fronthauling overhead, and that is implementable by APs with very few antennas. This scheme suppresses only the most significant intra-cell interference contributions; hence, the inter-cell interference is not tackled. The performance of its uplink counterpart was studied in [55] with similar results and conclusions. The work [46] proposes a semi-distributed variant of PZF that also partially suppresses the inter-cell interference by exchanging a limited amount of CSI over the fronthaul network. As only a subset of APs uses centralized ZF, while the rest use maximum-ratio transmission (MRT), the dimension of the pseudo-inverse and the amount of exchanged data are determined by the cardinality of the ZF set. Such a joint maximum-ratio and zero-forcing (JMRZF) precoding scheme converges to the performance of centralized ZF (CZF) as the number of APs included in the ZF set grows, at the expense of increased computational complexity and fronthauling overhead.

2. Scalability. To make the precoding/combining scheme scalable, that is, practically implementable as the number of APs and users increases, the transmission strategy must be built on top of a dynamic cooperation clustering strategy that is determined a priori over a longer time scale and reflects the macroscopic changes of the network. Hence, cooperation clustering strategies relying on the channel statistics may be suitable, albeit sub-optimal, as shown in [36].
According to the definitions given in [8], a precoding/combining scheme is scalable if the associated signal processing for transmission/reception and the fronthaul signaling for data and CSI sharing have finite complexity and resource requirements for each AP as the number of users grows to infinity. The scalability requirement thus entails that the total computational complexity and fronthaul requirements are independent of the number of users but proportional to the number of APs because, as long as each AP is equipped with a local baseband processor and a sufficient fronthaul connection, it can accomplish its necessary tasks irrespective of how large the network is [8]. When designing a scalable linear precoding scheme, a designer must ensure that:
• The number of complex multiplications required per coherence block to compute the precoding vector of an arbitrary user is independent of the number of users.
• The number of complex scalars to be shared over each fronthaul link per coherence block is independent of the number of users.

3. Low-dimensional CSI exchange. Although the dynamic cooperation clustering framework confines the data sharing to a subset of APs, hence relieving the fronthaul network, the sharing of the CSI at the transmitters (CSIT) needs to be performed within much tighter time constraints and hence may dominate the fronthaul overhead. A viable approach to limit the CSIT sharing consists in implementing cooperative transmission strategies only among a few APs, as in [46] and described earlier. Alternatively, each AP may operate on
the basis of possibly different estimates of the global channel state obtained through some arbitrary CSIT acquisition and sharing mechanism. For instance, [56] proposes a novel distributed precoding design, coined team minimum mean square error (TMMSE) precoding, that generalizes centralized MMSE precoding to distributed operations based on transmitter-specific CSIT. On the other hand, the authors of [57] propose a distributed framework for cooperative precoding design in cell-free massive MIMO systems that entirely eliminates the need for fronthaul signaling for CSI exchange. Focusing on the weighted sum mean-squared error (MSE) minimization, a novel over-the-air (OTA) signaling mechanism allows each AP to obtain over the air the cross-term information that would otherwise be exchanged among the APs via fronthaul signaling. Specifically, this is achieved by introducing a new uplink signaling resource and a new CSI combining mechanism that complement the conventional uplink pilot-aided channel estimation. Importantly, the amount of OTA signaling does not scale with the number of APs or users, and there are no delays in the CSI exchange among the APs. These practical benefits come at the cost of extra uplink signaling overhead per bi-directional training iteration, which, however, results in only a minor performance loss with respect to the distributed precoding design via fronthaul signaling.

4. Nearly optimal performance, which can be achieved by a unified framework that potentially combines the benefits of all the approaches described above.

5. Robustness against synchronization errors and channel aging. Proper time–frequency synchronization of the distributed APs is necessary for coherent downlink transmission and uplink reception. The user-centric cooperation clusters might relax the synchronization requirements, since only the geographically closest APs must be well synchronized, while farther APs that are serving other sets of users do not require the same synchronization accuracy.
User mobility also degrades the effectiveness of transmit precoding and receive combining. As the user moves, the propagation channels toward the APs vary over time, and thereby the CSI available at the APs ages with time, an effect called channel aging. Channel aging is considered within a coherence block and is understood as the mismatch between the CSI acquired prior to the transmission and the actual CSI at the reception time (channel estimation error aside).

Power Control. As in any wireless network, power allocation is key in cell-free massive MIMO to tackle the near-far effect and mitigate the interference [5, 38]. Power control strategies can be implemented either in a centralized fashion at the CPU or in a distributed fashion at each AP. Centralized implementations require a certain degree of CSI sharing over the fronthaul network but make it possible to maximize network-wide utility functions, such as the sum or max–min spectral efficiency. In general, these solutions are neither scalable nor of low complexity. Distributed implementations are more practical, as they rely solely on the local channel estimates available at each AP, and are therefore scalable. However, distributed solutions are far from optimal.

Remark 1. Any non-trivial network-wide optimal power allocation is unscalable as the number of APs and/or users grows unboundedly. Distributed power allocations are heuristic, scalable, and of low complexity.
Power control strategies may be carried out at two different time scales. A fast power control may be performed at the small-scale-fading time scale, i.e., within a coherence block, by fine-tuning the norm of the precoding/combining vectors. By doing so, an appropriate power allocation may control the effects of the small-scale fading. In this case, the power control coefficients (absorbed by the precoding/combining vectors) are functions of the channel estimates. A slow power control can be performed at the large-scale-fading time scale, i.e., over multiple coherence blocks, by properly setting the power control coefficients as a function of the channel statistics. Slow power control is well suited to optimization, as the power control coefficients do not need to be updated frequently, that is, on a coherence block basis.

In the uplink, letting all the users transmit with full power is optimal for maximizing the sum spectral efficiency when MMSE receive combining is adopted at the APs. Therefore, any power allocation based on network-wide optimization problems is pointless [8] unless fairness is targeted. In the latter case, fractional power control strategies [58] are able to provide better spectral efficiency to the least fortunate users.

In the downlink, fractional power control along with MMSE transmit precoding constitutes a valid sub-optimal alternative to both sum spectral efficiency maximization and max–min fairness (MMF) power control. Especially when performed in a centralized fashion, fractional power control strikes an excellent trade-off between sum spectral efficiency and fairness [8, 58]. When the precoding is carried out in a distributed fashion, optimizing the power allocation is clearly beneficial regardless of the target. Sum spectral efficiency maximization and MMF power control provide significant improvements with respect to fractional power control, especially if conjugate beamforming (i.e., MRT) is adopted [59].
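A slow, statistics-based fractional power control rule for the uplink can be sketched as follows. The sketch assumes a policy in which each user scales its power inversely with the aggregate large-scale fading toward its serving APs, raised to an exponent between 0 and 1, and normalized so that the weakest user transmits at full power. This specific form, the function name `fractional_power_control`, and the numbers are assumptions made for illustration; the exact policies of [58] and [8] may differ.

```python
def fractional_power_control(betas, p_max, exponent):
    """Assumed uplink fractional power control based on channel statistics.

    betas[k][l]: large-scale fading coefficient between user k and its
    serving AP l. exponent = 0 gives full power to every user, while
    exponent = 1 equalizes the aggregate received power across users.
    """
    strengths = [sum(row) for row in betas]
    weakest = min(strengths)
    return [p_max * (weakest / s) ** exponent for s in strengths]


# Hypothetical coefficients: three users, two serving APs each.
betas = [[4.0, 1.0],
         [0.5, 0.5],
         [2.0, 0.0]]
p_max = 0.1  # assumed maximum transmit power in watts

full_power = fractional_power_control(betas, p_max, exponent=0.0)
half_comp = fractional_power_control(betas, p_max, exponent=0.5)
full_comp = fractional_power_control(betas, p_max, exponent=1.0)
print(full_power, half_comp, full_comp)
```

Since the coefficients depend only on large-scale fading, they need to be updated only over multiple coherence blocks, in line with the slow power control described above.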
In the cell-free massive MIMO literature, MMF power control is surely the policy that has received the most attention from the research community. This is mainly due to two reasons. First, the egalitarian nature of the MMF policy fits perfectly into cell-free massive MIMO, which offers uniformly great service per se, thanks to its user-centric perspective. Indeed, MMF works better in cell-free massive MIMO than in co-located massive MIMO, wherein the cell-edge users jeopardize the performance of the cell-center users. Second, the MMF framework can easily be extended to more general power control strategies by introducing weights that scale the performance requirements, referred to as weighted MMF. Weighted MMF can also be adopted to handle user prioritization. Indeed, spectral efficiency requirements can vary from user to user. For instance, users of real-time applications or users with more expensive subscriptions have higher priority. In this case, a higher priority would correspond to a larger weight.

Remark 2. To preserve the scalability of the system, power control implementations should be either heuristic and fully distributed or sub-optimal and centralized but requiring only partial CSI sharing.

Quantization of Fronthaul Signaling. The vast majority of the works in the cell-free massive MIMO literature assume a fronthaul network with infinite capacity. In practice, the fronthaul capacity is limited, and any signaling sent over the fronthaul
is quantized. Hardware impairments introduce quantization distortion on a sample basis (usually treated as additional uncorrelated interference), while fronthaul compression operates on blocks of samples and can be optimized, for instance, with respect to the number of quantization bits. Some papers in this direction are [60–64]. It is worth noting that the performance of cell-free massive MIMO with centralized operation may be severely degraded in the presence of strict fronthaul constraints, to the point of being worse than that of distributed operation or even making the joint coherent transmission/reception ineffective [47]. For further implementation challenges and practical aspects concerning cell-free massive MIMO, such as digital synchronization and reciprocity calibration, we refer the reader to the work [65] and references therein.
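The effect of the number of quantization bits on the distortion can be illustrated with a minimal sketch. It assumes a sample-by-sample uniform mid-rise quantizer applied to real-valued samples (one rail of a complex baseband signal) and a synthetic test signal; this is a generic textbook quantizer, not the compression scheme of any of the cited works, and `uniform_quantize` and `sqnr_db` are hypothetical names.

```python
import math


def uniform_quantize(x, bits, x_max=1.0):
    """Mid-rise uniform quantizer with 2**bits levels on [-x_max, x_max]."""
    levels = 2 ** bits
    step = 2.0 * x_max / levels
    clipped = min(max(x, -x_max), x_max)
    # Index of the quantization cell; the top edge maps to the last cell.
    index = min(int((clipped + x_max) / step), levels - 1)
    return -x_max + (index + 0.5) * step


def sqnr_db(samples, bits, x_max=1.0):
    """Signal-to-quantization-noise ratio in dB for the given samples."""
    signal = sum(x * x for x in samples)
    noise = sum((x - uniform_quantize(x, bits, x_max)) ** 2 for x in samples)
    return 10.0 * math.log10(signal / noise)


# Synthetic samples standing in for one rail of a fronthaul signal.
samples = [0.9 * math.sin(0.37 * i) for i in range(1000)]
for bits in (4, 6, 8):
    print(bits, "bits:", round(sqnr_db(samples, bits), 1), "dB")
```

Each additional bit roughly halves the quantization step and hence lowers the distortion, at the price of a proportionally higher fronthaul rate per sample, which is the trade-off that fronthaul-constrained designs have to balance.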
3 Enabling Cell-Free Technologies
Cell-free massive MIMO requires a widespread and costly architecture to implement the user-centric philosophy, accurate synchronization and coordination among the APs to carry out the joint coherent transmission/reception, and practical resource allocation schemes to capitalize on the macro-diversity gain while preserving the network scalability. Luckily, there are emerging low-cost and flexible solutions for cell-free network deployments, such as pCell [66, 67] and the Radio Stripes [6, 68], that promise to achieve, to some extent, the outstanding theoretical spectral efficiency and a plethora of practical benefits: reliable communication, network scalability, secure communications, link robustness, low latency, and low power consumption both at the user and at the access point. A recent technology, named RadioWeaves [69], has been conceived, especially for indoor scenarios, to efficiently combine large-scale intelligent surfaces and cell-free wireless access.
Radio Stripes and Sequential Signal Processing The basic principle of the radio stripe technology is to fuse into a single entity the antenna elements, the baseband processing circuitry, and the fronthaul network. Specifically, the antenna elements and the associated antenna processing units (APUs) are serially located inside the protective casing of a cable or a stripe, which also provides power supply, synchronization, and data transfer via a shared bus, as in the example illustrated in Fig. 11. The APU consists of circuit-mounted chips, including phase shifters, modulators, filters, power amplifiers, and analog-to-digital and digital-to-analog converters. Hence, the AP can be thought of as the APU plus its associated antenna elements. A radio stripe system hinges on a compute-and-forward architecture wherein the signal processing occurs sequentially. At the ends of a stripe, a CPU collects the cumulative processed signal and performs network tasks in a centralized fashion.
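The compute-and-forward principle can be sketched in a few lines for the uplink. In the sketch below, each APU combines its own received samples locally and adds the result to the running per-user sums it received from its neighbor, so that the CPU at the end of the stripe sees the accumulated signal. This is a deliberately simplified illustration: it assumes maximum-ratio (MR) combining with perfect channel knowledge, whereas the sequential receivers studied in the literature are MMSE-based, and all names (`sequential_uplink`, `centralized_uplink`, `make_example`) and dimensions are hypothetical.

```python
import numpy as np

def sequential_uplink(H, y):
    """Accumulate per-user MR outputs along the stripe, one APU at a time.

    H: (L, N, K) channels from K users to L APUs with N antennas each.
    y: (L, N) received samples, one row per APU.
    """
    num_apus, _, num_users = H.shape
    bus = np.zeros(num_users, dtype=complex)  # K running sums on the shared bus
    for l in range(num_apus):
        local = H[l].conj().T @ y[l]  # local MR combining at APU l
        bus = bus + local             # per-stream addition, then forward
    return bus

def centralized_uplink(H, y):
    """Same MR combiner applied at the CPU to all raw samples at once."""
    num_apus, num_ant, num_users = H.shape
    H_all = H.reshape(num_apus * num_ant, num_users)
    return H_all.conj().T @ y.reshape(-1)

def make_example(L=8, N=2, K=3, seed=0):
    """Random channels, unit-modulus symbols, and noisy received samples."""
    rng = np.random.default_rng(seed)
    H = (rng.standard_normal((L, N, K))
         + 1j * rng.standard_normal((L, N, K))) / np.sqrt(2)
    s = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, K))
    noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N)))
    return H, H @ s + noise

if __name__ == "__main__":
    H, y = make_example()
    L, N, K = H.shape
    print("same output:", np.allclose(sequential_uplink(H, y),
                                      centralized_uplink(H, y)))
    print("scalars per hop (sequential):", K)
    print("raw samples to collect (centralized):", L * N)
```

With MR combining the equality is immediate, since the centralized output is itself a sum of per-APU terms. The point of the sketch is the payload: each hop carries one value per user, regardless of how many APUs sit on the stripe.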
The signal processing is serialized, that is, the receive/transmit signal processing of each APU is carried out locally, right next to its own antenna elements. When acting as a transmitter, each APU receives a superposition of multiple input data streams from the previous APU
[Fig. 11 labels: antenna elements; APUs with A/D and D/A converters on the I and Q branches and a DSP; internal connector (power, fronthaul, clock); protecting material; CPU]
Fig. 11 Example of a radio stripe where the antenna elements are serially integrated into a cable or stripe. Illustration included in [6]. The APU consists of the circuit-mounted chips necessary for the baseband processing. A shared bus provides power supply, synchronization, and data transfer. The AP can be thought of as the APU plus its associated antenna elements
via the shared bus, applies its precoding vectors, and finally transmits the resulting signal through its associated antenna elements. When acting as a receiver, each APU first processes the radio signal received by its associated antennas, then combines the resulting streams with the data streams received from the previous APU, and finally sends the resulting signal on the shared bus to the next APU. This combination of signals might simply be a per-stream addition operation. Such sequential signal processing over a serial fronthaul network has recently been shown to provide an optimal spectral efficiency in the uplink [70] and a nearly optimal spectral efficiency in the downlink [56, 71, 72].
Remark 3 Cell-free massive MIMO can be efficiently deployed by using a sequential topology without loss in performance but with much lower fronthaul requirements compared to centralized implementations.
Indeed, the work [70] finds the optimal receive combining matrices among the class of sequential receivers that jointly maximize the spectral efficiency and minimize the mean square error. This is attained by using a linear MMSE receiver. Importantly, [70] shows analytically that, with uplink sequential processing, the data estimate computed at the last APU is equivalent to that obtained by centralized processing. However, the required fronthaul capacity per cable/stripe grows with the number of users in the former case and with the number of APs in the latter case. Since in a cell-free massive MIMO system the number of APs is much
larger than the number of users, the fronthaul saved by sequential processing is considerable. Similarly, in the downlink, the team MMSE (TMMSE) precoding for sequential processing, proposed in [56, 71, 72], provides nearly optimal spectral efficiency and minimizes the mean square error under partial CSI sharing. In particular, unidirectional TMMSE, which is obtained by sharing the CSI in only one direction over the serial fronthaul, is a promising intermediate solution for supporting network-wide interference management when centralized precoding is expensive. Besides the advantages described above, the radio stripe system facilitates a practical, flexible, and cheap cell-free massive MIMO deployment by using a sequential network topology. The benefits include diverse practical aspects:
(i) ease of deployment and cable routing;
(ii) cabling is cheaper than in any other topology, e.g., star-like, tree, or mesh. Only two links between any pair of APs are needed for sending/receiving the signals;
(iii) node failures can be tolerated by routing mechanisms, thereby increasing robustness and resilience, while maintenance costs are cut down;
(iv) components are low power and distributed; thus heat dissipation is less of a concern than in “packed” solutions;
(v) while conventional APs are bulky, radio stripes enable non-invasive deployments.
pCell The pCell technology synthesizes a tiny personal cell (hence the terminology “pCell”) for each user in the coverage area. pCells are very small, a fraction of a wavelength in diameter at practical mobile frequencies, and thereby this technology leverages aggressive spatial multiplexing to exploit the interference rather than avoid it.
To synthesize a pCell precisely at the location of each user in space, pCell technology follows the same principles described for cell-free massive MIMO, that is, TDD operation, pilot-based uplink training, centralized transmit digital precoding for the downlink, and receive digital combining for the uplink. Each pCell is an independent radio link, and each device in the coverage area is able to simultaneously utilize the full capacity of the same spectrum, dramatically increasing wireless capacity. This is attained by relying on a software-defined radio Cloud-RAN architecture compatible with the LTE/5G New Radio and Wi-Fi standards. Unlike the radio stripes, wherein signal processing is potentially sequential and distributed, the pCell technology builds on a C-RAN architecture, and thereby signal processing is fully centralized. Both share similar network topology characteristics, in that they avoid deploying expensive star topologies for the fronthaul network and provide flexible deployment. In pCell, the fronthaul network may consist of a few fiber (or Ethernet) links toward crucial NLoS nodes and a fronthaul mesh network connecting nodes in LoS to each other.
RadioWeaves RadioWeaves technology is a wireless infrastructure recently proposed in [69], primarily intended for indoor propagation environments, that deploys a fabric of distributed computing, radio, and storage resources to serve as a massive distributed system. This technology combines cell-free massive MIMO and large intelligent surfaces to provide extraordinary levels of spatial diversity, energy efficiency, link reliability, and connectivity. RadioWeaves deployments are envisioned
to integrate dispersed electronics invisibly in a room environment (similar to the radio stripes), surrounding the users and offering them a cell-free QoS experience. Conceptually, RadioWeaves technology builds on the same principles as pCell and Radio Stripes. Unlike pCell, where the processing is centralized, RadioWeaves and Radio Stripes distribute signal processing and computations as much as possible over the network infrastructure. Besides radiative elements, RadioWeaves includes distributed computing and storage resources in its framework, hence considering a broader architecture at upper protocol layers. In this regard, the Radio Stripes might be a component of a RadioWeaves deployment.
4 Support to 6G Use Cases
In this section, we touch upon potential killer applications of interest in 6G scenarios for which cell-free massive MIMO may give excellent support, thanks to its ubiquitous coverage, proximity to the users, and extraordinary spectral efficiency.
4.1 Multi-access Edge Computing
Over the past few years, we have witnessed an exponential growth of computation-intensive applications requiring ultra-reliable low-latency communications (URLLC), e.g., augmented reality, real-time video image processing, and online gaming. These emerging applications, although supported in 5G, will hardly be realistic 5G drivers but will rather characterize beyond-5G or 6G systems, which will eventually become ultra-reliable and low-latency in all their components (radio access, backhaul, core, and applications). Users’ demands for extra computing resources and the network’s need to support URLLC use cases converge to a paradigm called multi-access/mobile edge computing (MEC) [73, 74]. The goal of MEC is to bring the application and all of its computing and storage as close as possible to the user. It constitutes an approach to indirectly increase the computing capabilities of the devices, and thereby prolong their battery lifetime, by either fully or partially delegating users’ computational tasks to the network, specifically to network entities known as network edge servers (see footnote 1). This architecture therefore has the ability to reduce latency, allow cloud offloading, and alleviate traffic congestion. It has become evident that MEC will play a critical role in enabling the URLLC use cases.
1 The edge servers are network entities figuratively placed at the edge of the cellular access network, that is, between the radio access network and the core network. The edge servers are in charge of collecting, processing, and feeding data back to the users.
The marriage between cell-free massive MIMO and MEC is promising. Imagine equipping each AP with a power-efficient edge server and each CPU with a high-performance edge server that serves as a backup edge computing resource. Then, cell-free massive MIMO, by capitalizing on its dense distributed topology and proximity to the users, can greatly facilitate computation offloading by enabling mobile devices to delegate either all or part of their computational tasks and/or storage to multiple APs, while the CPUs may give computation/storage offloading support to the APs or directly to the users if needed. Besides, the user-centric implementation characterizing cell-free massive MIMO systems guarantees a uniformly great spectral efficiency throughout the network, resulting in equal access to the remote computing resources for every user. Indeed, the amount of computational tasks that each user can remotely offload inevitably depends on its achievable spectral efficiency, which in turn depends on the users’ transmit powers. This coupling calls for a joint optimization of radio and computing resources. The performance of cell-free massive MIMO systems with MEC functionalities has been recently explored in [75–77]. These works study a joint allocation of radio and computing resources under latency constraints in a system consisting of APs equipped with independent MEC servers and a CPU with a central MEC server. Each device is jointly and coherently served by all the surrounding APs and performs offloading to the central MEC server and possibly to the distributed MEC servers of its serving APs. The main finding of these works is that cell-free massive MIMO is more suitable and flexible to support MEC applications than other network architectures, such as co-located massive MIMO and C-RAN, in that it is able to provide the users with low offloading latency and significant power saving by efficiently distributing the computational workload over multiple MEC servers.
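The coupling between spectral efficiency and offloading can be made concrete with a back-of-the-envelope latency model. The sketch below is only an illustration under simple assumptions that are not taken from [75–77]: the offloading latency is the uplink transfer time plus the remote computing time, the result download is neglected, and all function names, parameters, and numbers are hypothetical.

```python
def offload_latency(task_bits, cycles_per_bit, se_bit_s_hz, bandwidth_hz, server_hz):
    """Uplink transfer time plus remote computing time, in seconds."""
    uplink_time = task_bits / (bandwidth_hz * se_bit_s_hz)  # rate = B * SE
    compute_time = task_bits * cycles_per_bit / server_hz
    return uplink_time + compute_time

def local_latency(task_bits, cycles_per_bit, device_hz):
    """Time to run the same task on the device CPU, in seconds."""
    return task_bits * cycles_per_bit / device_hz

def offloading_pays_off(task_bits, cycles_per_bit, se_bit_s_hz,
                        bandwidth_hz, server_hz, device_hz):
    """True if offloading finishes sooner than local execution."""
    remote = offload_latency(task_bits, cycles_per_bit, se_bit_s_hz,
                             bandwidth_hz, server_hz)
    return remote < local_latency(task_bits, cycles_per_bit, device_hz)

if __name__ == "__main__":
    # Hypothetical task: 1 Mbit, 100 cycles/bit, 1 MHz of bandwidth,
    # 1 GHz edge server, 100 MHz device CPU.
    for se in (0.5, 2.0, 4.0):
        t = offload_latency(1e6, 100, se, 1e6, 1e9)
        print(f"SE = {se} bit/s/Hz: offload latency = {t:.2f} s, "
              f"local = {local_latency(1e6, 100, 1e8):.2f} s")
```

In this toy model, a user with a low spectral efficiency spends so long on the uplink that offloading stops being worthwhile, which is why a uniformly good spectral efficiency matters for fair access to the edge servers.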
4.2 Spatial Multiplexing of Heterogeneous Services
With the advent of the mobile application ecosystem and the resulting increase in the data-processing and storage capabilities of smart devices, several heterogeneous services have emerged, setting various stringent communication requirements in terms of data rates, latency, reliability, and massive connectivity. These requirements and related use cases have been summarized by the 3rd Generation Partnership Project (3GPP) into three macroservices, namely, enhanced mobile broadband (eMBB), URLLC, and massive machine-type communications (mMTC) [78]. eMBB services require high peak data rates and stable connectivity and include most of the everyday usage applications: entertainment, multimedia, communication, collaboration, mapping, web surfing, etc. URLLC services include real-time and time-critical applications, such as autonomous driving, automation control, augmented reality, video and image processing, etc. mMTC services enable connectivity between a vast number of miscellaneous devices and include applications such as smart grids, traffic management systems, environmental monitoring, etc.
5G has initially been rolled out mainly as an eMBB service, essentially a faster version of LTE, while mMTC and URLLC requirements continue to be refined and will materialize in the next decade. Academic research and industrial standardization are currently focusing on different coexistence mechanisms for such heterogeneous services, apparently moving away from the initial vision of a sliced network [79]. Slicing the network basically means allocating orthogonal resources (storage, computing, radio communications, etc.) to heterogeneous services so as to guarantee their mutual isolation. This approach is, in a broad sense, generally known as orthogonal multiple access (OMA). As an interesting alternative to orthogonal resource allocation, non-orthogonal multiple access (NOMA) is gaining increasing importance, especially with respect to the allocation of the radio access network (RAN) communication resources. The conventional approach to slicing the RAN is to separate eMBB, mMTC, and URLLC services in the time and/or frequency domains. However, recent studies have revealed the soundness of NOMA relying on efficient coexistence strategies accompanied by aggressive multiplexing in the spatial domain through the use of massive MIMO technologies. In this regard, the terminology Heterogeneous OMA (H-OMA) is often adopted [79] to distinguish the orthogonal resource allocation of heterogeneous services from that of services of the same type, referred to as OMA. (The same distinction applies to H-NOMA with respect to NOMA.) The main focus of massive MIMO and cell-free massive MIMO research has primarily been on increasing the data rates, i.e., targeting the eMBB requirements. However, it has been demonstrated that massive MIMO is not only beneficial for eMBB but also provides significant benefits to URLLC [80, 81] by reducing the outage probability and therefore increasing the reliability. Higher reliability results in fewer retransmissions, which translates into lower latency.
mMTC also benefits from massive MIMO technology [81, 82] by capitalizing on the high energy efficiency to increase devices’ battery lifetime. Besides, favorable propagation enables an aggressive spatial multiplexing of the mMTC devices, facilitating the detection and the random access procedures. Cell-free massive MIMO would make all these advantages more prominent by capitalizing on its ubiquitous nature and extraordinary macro-diversity gain. In the cell-free massive MIMO uplink, the eMBB signals can be conveniently decoded in a centralized fashion at the CPU in order to exploit the coherent combining gain, thereby increasing the data rate, while the URLLC users would be directly served by the nearby APs in a distributed fashion in order to fulfill the strict low-latency requirements. In the downlink, the eMBB traffic is generated at the core network, and thus the CPU can either perform centralized precoding or send data to the APs for joint distributed precoding. Conversely, URLLC traffic would be solely generated at the APs [83], which would operate in a distributed fashion to quickly and reliably reach the URLLC users. Some papers in this direction have shown promising results. For instance, in [84], the non-orthogonal coexistence of URLLC and eMBB services in the uplink of a cell-free network with C-RAN architecture and analog fronthaul is analyzed. An information-theoretic analysis is provided for the performance of URLLC and eMBB traffic under both H-OMA and H-NOMA by considering different decoding
strategies: puncturing, treating the interference caused by the undesired service as noise (TIN), and successive interference cancelation (SIC). Results reveal that higher eMBB rates can be achieved with H-NOMA as compared to H-OMA, while fulfilling the strict requirements of the URLLC communications. Moreover, SIC achieves the best performance, although it increases the complexity at the receiver, while TIN always outperforms puncturing. A similar analysis was conducted in [83], including both uplink and downlink of a cell-free network with C-RAN architecture but additionally considering fading, lack of CSI at the URLLC transmitters, rate adaptation at the eMBB transmitters, and finite digital fronthaul capacity. Whenever the URLLC activation probability is large or the fronthaul capacity is rather limited, the advantages of H-NOMA over H-OMA are significant and rely on an efficient management of the URLLC interference on the eMBB signals. In this regard, puncturing and SIC are suitable techniques for the uplink, when performed prior to fronthaul compression. In the downlink, the URLLC interference management is trickier due to the lack of any knowledge about the random URLLC activation pattern. In general, H-NOMA potentially offers significant gains for each single service as compared to H-OMA, at the cost of a tolerable inter-service interference.
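The qualitative ordering of the decoding strategies can be illustrated with a toy single-link calculation. The sketch below is not the information-theoretic analysis of [83, 84]: it assumes one eMBB link with signal-to-noise ratio `snr`, a URLLC interferer with interference-to-noise ratio `inr` that is active with probability `p_active`, ideal cancelation under SIC, and, for H-OMA, a fraction `alpha` of the resources reserved for URLLC. URLLC reliability is ignored, and all names and values are hypothetical.

```python
import math

def embb_rates(snr, inr, p_active, alpha):
    """Average eMBB rate in bit/s/Hz under four toy coexistence strategies."""
    clean = math.log2(1.0 + snr)                   # rate without URLLC interference
    hit = math.log2(1.0 + snr / (1.0 + inr))       # rate when interference is noise
    return {
        # H-OMA: a fixed share alpha of the resources is reserved for URLLC.
        "H-OMA": (1.0 - alpha) * clean,
        # Puncturing: eMBB samples hit by an active URLLC user are discarded.
        "puncturing": (1.0 - p_active) * clean,
        # TIN: samples hit by URLLC are kept, with interference treated as noise.
        "TIN": (1.0 - p_active) * clean + p_active * hit,
        # SIC: URLLC is decoded and removed first (assumed ideal here).
        "SIC": clean,
    }

if __name__ == "__main__":
    rates = embb_rates(snr=10.0, inr=5.0, p_active=0.2, alpha=0.3)
    for name, rate in rates.items():
        print(f"{name:>10}: {rate:.2f} bit/s/Hz")
```

By construction, this toy model gives SIC at least the TIN rate and TIN at least the puncturing rate, in line with the ordering reported above. Whether H-NOMA beats H-OMA here depends on how `alpha` compares with `p_active`, so the sketch should not be read as reproducing the cited results.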
5 Conclusions
In an age where wireless communication networks are called upon to provide reliable, high-quality connectivity anywhere, anytime, to anyone, emerging mobile applications with diverse heterogeneous requirements and the users’ ever-increasing data hunger pose novel design and implementation challenges that have led academic and industrial research to look beyond the cellular paradigm characterizing conventional mobile systems. In this regard, cell-free massive MIMO is deemed to be a key physical layer technology for 6G systems. It combines the intrinsic efficiency, practicality, and scalability of the massive MIMO operation with an ultra-dense distributed deployment and coordinated multipoint processing. These ingredients are efficiently mixed to implement the so-called user-centric experience and the cell-free paradigm: each user is at the center of its own tailored virtual cell, which is drawn by its personal set of serving APs. Cell-free networks are able to erase the boundary effects, mitigating the interference and providing uniformly great quality of service to every user. With respect to appealing 6G applications and use cases, cell-free massive MIMO promises better-suited and more flexible support for mobile edge computing than other network architectures, in that it is able to provide the users with low offloading latency and significant power saving by efficiently distributing the computational workload over multiple MEC servers. Besides, it promises an efficient joint multiplexing of eMBB, URLLC, and mMTC services by capitalizing on its ubiquitous nature and extraordinary macro-diversity gain. Nevertheless, cell-free massive MIMO requires a widespread and costly architecture to implement the user-centric philosophy, accurate synchronization and
coordination among the APs to carry out the joint coherent transmission/reception, and practical resource allocation schemes to capitalize on the macro-diversity gain while preserving the network scalability. Emerging low-cost and flexible solutions for cell-free network deployments, e.g., the Radio Stripes, promise to achieve, to some extent, the outstanding theoretical spectral efficiency and a plethora of practical benefits such as reliable communication, network scalability, secure communications, link robustness, low latency, and low power consumption. Cell-free massive MIMO testbeds and demos are proliferating, confirming the theoretical gains and the high potential of this concept. The transition from cell-centric to user-centric, cell-free architectures has only just begun.
References
1. S. Shamai and B. M. Zaidel, “Enhancing the cellular downlink capacity via co-processing at the transmitting end,” in IEEE VTS 53rd Vehicular Technology Conference, Spring 2001., vol. 3, 2001, pp. 1745–1749. 2. T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600, 2010. 3. T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo, Fundamentals of Massive MIMO. Cambridge University Press: Cambridge, MA, USA, 2016. 4. E. Björnson, J. Hoydis, and L. Sanguinetti, Massive MIMO networks: Spectral, energy, and hardware efficiency. Now Publishers Inc. Hanover, MA, USA, 2017, vol. 11, no. 3–4. 5. H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO versus small cells,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp. 1834–1850, 2017. 6. G. Interdonato, E. Björnson, H. Q. Ngo, P. Frenger, and E. G. Larsson, “Ubiquitous cell-free massive MIMO communications,” EURASIP Journal on Wireless Communications and Networking, vol. 2019, no. 1, p. 197, 2019. 7. S. Buzzi, C. D’Andrea, A. Zappone, and C. D’Elia, “User-centric 5G cellular networks: Resource allocation and comparison with the cell-free massive MIMO approach,” IEEE Transactions on Wireless Communications, vol. 19, no. 2, pp. 1250–1264, 2020. 8. Ö. Tugfe Demir, E. Björnson, and L. Sanguinetti, Foundations of User-Centric Cell-Free Massive MIMO. Foundations and Trends in Signal Processing, 2021, vol. 14, no. 3–4. 9. S. Zhou, M. Zhao, X. Xu, J. Wang, and Y. Yao, “Distributed wireless communication system: A new architecture for future public wireless access,” IEEE Communications Magazine, vol. 41, no. 3, pp. 108–113, 2003. 10. H. Zhang and H. Dai, “Cochannel interference mitigation and cooperative processing in downlink multicell multiuser MIMO networks,” EURASIP Journal on Wireless Communications and Networking, vol. 2004, no. 2, p.
202654, 2004. 11. S. Venkatesan, A. Lozano, and R. Valenzuela, “Network MIMO: Overcoming intercell interference in indoor wireless systems,” in 2007 41st Asilomar Conference on Signals, Systems and Computers, 2007, pp. 83–87. 12. G. Caire, S. Ramprashad, and H. Papadopoulos, “Rethinking network MIMO: Cost of CSIT, performance analysis, and architecture comparisons,” in 2010 Information Theory and Applications Workshop (ITA), 2010, pp. 1–10. 13. A. Papadogiannis, D. Gesbert, and E. Hardouin, “A dynamic clustering approach in wireless networks with multi-cell cooperative processing,” in 2008 IEEE International Conference on Communications, 2008, pp. 4033–4037.
14. D. Gesbert, S. Hanly, H. Huang, S. Shamai, O. Simeone, and W. Yu, “Multi-cell MIMO cooperative networks: A new look at interference,” IEEE Journal on Selected Areas in Communications, vol. 28, no. 9, pp. 1380–1408, 2010. 15. E. Björnson, R. Zakhour, D. Gesbert, and B. Ottersten, “Cooperative multicell precoding: Rate region characterization and distributed strategies with instantaneous and statistical CSI,” IEEE Transactions on Signal Processing, vol. 58, no. 8, pp. 4298–4310, 2010. 16. W. Choi and J. G. Andrews, “Downlink performance and capacity of distributed antenna systems in a multicell environment,” IEEE Transactions on Wireless Communications, vol. 6, no. 1, pp. 69–73, 2007. 17. W. Feng, Y. Wang, N. Ge, J. Lu, and J. Zhang, “Virtual MIMO in multi-cell distributed antenna systems: Coordinated transmissions with large-scale CSIT,” IEEE Journal on Selected Areas in Communications, vol. 31, no. 10, pp. 2067–2081, 2013. 18. “Coordinated Multi-Point Operation for LTE Physical Layer Aspects,” 3GPP, 2011, (Release 11) Version 11.1.0, 3GPP TR 36.819. 19. R. Irmer, H. Droste, P. Marsch, M. Grieger, G. Fettweis, S. Brueck, H.-P. Mayer, L. Thiele, and V. Jungnickel, “Coordinated multipoint: Concepts, performance, and field trial results,” IEEE Communications Magazine, vol. 49, no. 2, pp. 102–111, 2011. 20. M. Boldi, A. Tölli, M. Olsson, E. Hardouin, T. Svensson, F. Boccardi, L. Thiele, and V. Jungnickel, “Coordinated multipoint (CoMP) systems,” in Mobile and Wireless Communications for IMT-Advanced and Beyond, A. Osseiran, J. Monserrat, and W. Mohr, Eds. Wiley, 2011, pp. 121–155. 21. P. Marsch, S. Brück, A. Garavaglia, M. Schulist, R. Weber, and A. Dekorsy, “Clustering,” in Coordinated multi-point in mobile communications: From theory to practice, P. Marsch and G. Fettweis, Eds. Cambridge University Press, 2011, pp. 139–159. 22. R. Fantini, W. Zirwas, L. Thiele, D. Aziz, and P. 
Baracca, “Coordinated multi-point transmission in 5G,” in 5G Mobile and Wireless Communications Technology, A. Osseiran, J. Monserrat, and P. Marsch, Eds. Cambridge University Press, 2016, p. 248–276. 23. V. Jungnickel, K. Manolakis, W. Zirwas, B. Panzner, V. Braun, M. Lossow, M. Sternad, R. Apelfröjd, and T. Svensson, “The role of small cells, coordinated multipoint, and massive MIMO in 5G,” IEEE Communications Magazine, vol. 52, no. 5, pp. 44–51, 2014. 24. E. Björnson, L. Sanguinetti, H. Wymeersch, J. Hoydis, and T. L. Marzetta, “Massive MIMO is a reality–What is next? Five promising research directions for antenna arrays,” Digital Signal Processing, vol. 94, pp. 3–20, 2019. 25. À. O. Martínez, E. De Carvalho, and J. Ø. Nielsen, “Towards very large aperture massive MIMO: A measurement based study,” in 2014 IEEE Globecom Workshops (GC Wkshps), 2014, pp. 281–286. 26. A. Amiri, M. Angjelichinoski, E. de Carvalho, and R. W. Heath, “Extremely large aperture massive MIMO: Low complexity receiver architectures,” in 2018 IEEE Globecom Workshops (GC Wkshps), 2018, pp. 1–6. 27. E. D. Carvalho, A. Ali, A. Amiri, M. Angjelichinoski, and R. W. Heath, “Non-stationarities in extra-large-scale massive MIMO,” IEEE Wireless Communications, vol. 27, no. 4, pp. 74–80, 2020. 28. “C-RAN white paper: The road towards green RAN,” China Mobile Research Institute, 2014. [Online]. Available: http://labs.chinamobile.com/cran 29. J. Yuan, S. Jin, W. Xu, W. Tan, M. Matthaiou, and K. Wong, “User-centric networking for dense C-RANs: High-SNR capacity analysis and antenna selection,” IEEE Transactions on Communications, vol. 65, no. 11, pp. 5067–5080, 2017. 30. C. Pan, M. Elkashlan, J. Wang, J. Yuan, and L. Hanzo, “User-centric C-RAN architecture for ultra-dense 5G networks: Challenges and methodologies,” IEEE Communications Magazine, vol. 56, no. 6, pp. 14–20, 2018. 31. E. Nayebi, A. Ashikhmin, T. L. Marzetta, and H. 
Yang, “Cell-free massive MIMO systems,” in 2015 49th Asilomar Conference on Signals, Systems and Computers, 2015, pp. 695–699. 32. H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO: Uniformly great service for everyone,” in Proc. IEEE Int. Workshop on Signal Process. Advances in Wireless Commun. (SPAWC), Jun. 2015, pp. 201–205.
33. G. Interdonato, P. Frenger, and E. G. Larsson, “Scalability aspects of cell-free massive MIMO,” in Proc. IEEE International Conference on Communications (ICC), 2019, pp. 1–6. 34. E. Björnson and L. Sanguinetti, “Scalable cell-free massive MIMO systems,” IEEE Transactions on Communications, vol. 68, no. 7, pp. 4247–4261, 2020. 35. H. Q. Ngo, L. N. Tran, T. Q. Duong, M. Matthaiou, and E. G. Larsson, “On the total energy efficiency of cell-free massive MIMO,” IEEE Transactions on Green Communications and Networking, vol. 2, no. 1, pp. 25–39, 2018. 36. M. Attarifar, A. Abbasfar, and A. Lozano, “Subset MMSE receivers for cell-free networks,” IEEE Transactions on Wireless Communications, vol. 19, no. 6, pp. 4183–4194, Jun. 2020. 37. F. Riera-Palou, G. Femenias, A. G. Armada, and A. Pérez-Neira, “Clustered cell-free massive MIMO,” in Proc. IEEE Globecom Workshops (GC Wkshps), 2018, pp. 1–6. 38. E. Nayebi, A. Ashikhmin, T. L. Marzetta, H. Yang, and B. D. Rao, “Precoding and power optimization in cell-free massive MIMO systems,” IEEE Transactions on Wireless Communications, vol. 16, no. 7, pp. 4445–4459, Jul. 2017. 39. G. Femenias, F. Riera-Palou, A. Alvarez-Polegre, and A. Garcia-Armada, “Short-term power constrained cell-free massive-MIMO over spatially correlated Ricean fading,” IEEE Transactions on Vehicular Technology, vol. 69, no. 12, pp. 15 200–15 215, Dec. 2020. 40. G. Interdonato, H. Q. Ngo, and E. G. Larsson, “Enhanced normalized conjugate beamforming for cell-free massive MIMO,” IEEE Transactions on Communications, vol. 69, no. 5, pp. 2863– 2877, May 2021. 41. S. Buzzi, C. D’Andrea, and C. D’Elia, “User-centric cell-free massive MIMO with interference cancellation and local ZF downlink precoding,” in Proc. of International Symposium on Wireless Communication Systems (ISWCS), 2018, pp. 1–5. 42. G. Interdonato, M. Karlsson, E. Björnson, and E. G. 
Larsson, “Local partial zero-forcing precoding for cell-free massive MIMO,” IEEE Transactions on Wireless Communications, vol. 19, no. 7, pp. 4758–4774, 2020. 43. M. Attarifar, A. Abbasfar, and A. Lozano, “Modified conjugate beamforming for cell-free massive MIMO,” IEEE Wireless Communications Letters, vol. 8, no. 2, pp. 616–619, 2019. 44. P. Liu, K. Luo, D. Chen, and T. Jiang, “Spectral efficiency analysis of cell-free massive MIMO systems with zero-forcing detector,” IEEE Transactions on Wireless Communications, vol. 19, no. 2, pp. 795–807, 2020. 45. E. Björnson and L. Sanguinetti, “Making cell-free massive MIMO competitive with MMSE processing and centralized implementation,” IEEE Transactions on Wireless Communications, vol. 19, no. 1, pp. 77–90, 2020. 46. L. Du, L. Li, H. Q. Ngo, T. C. Mai, and M. Matthaiou, “Cell-free massive MIMO: Joint maximum-ratio and zero-forcing precoder with power control,” IEEE Transactions on Communications, vol. 69, no. 6, pp. 3741–3756, Jun. 2021. 47. R. Pinto Antonioli, I. M. Braga, G. Fodor, Y. C. B. Silva, A. L. F. de Almeida and W. C. Freitas, “On the energy efficiency of cell-free systems with limited fronthauls: Is coherent transmission always the best alternative?” IEEE Transactions on Wireless Communications, vol. 21, no. 10, pp. 8729–8743, Oct. 2022. https://doi.org/10.1109/TWC.2022.3169114. 48. M. Alonzo, S. Buzzi, A. Zappone, and C. D’Elia, “Energy-efficient power control in cell-free and user-centric massive MIMO at millimeter wave,” IEEE Transactions on Green Communications and Networking, vol. 3, no. 3, pp. 651–663, 2019. 49. G. Femenias and F. Riera-Palou, “Cell-free millimeter-wave massive MIMO systems with limited fronthaul capacity,” IEEE Access, vol. 7, pp. 44 596–44 612, 2019. 50. Y. Jin, J. Zhang, S. Jin, and B. Ai, “Channel estimation for cell-free mmWave massive MIMO through deep learning,” IEEE Transactions on Vehicular Technology, vol. 68, no. 10, pp. 10 325–10 329, 2019. 51. C. D’Andrea, G.
Interdonato, and S. Buzzi, “User-centric handover in mmWave cell-free massive MIMO with user mobility,” in 2021 29th European Signal Processing Conference (EUSIPCO), 2021, pp. 1–5.
Cell-Free Massive MIMO
531
52. S. Buzzi, C. D‘Andrea, M. Fresia, and X. Wu, “Multi-UE multi-AP beam alignment in user-centric cell-free massive MIMO systems operating at mmWave,” IEEE Transactions on Wireless Communications, vol. 21, no. 11, pp. 8919–8934, Nov. 2022. https://doi.org/10.1109/ TWC.2022.3170787. 53. E. Tanghe, W. Joseph, L. Verloock, L. Martens, H. Capoen, K. V. Herwegen, and W. Vantomme, “The industrial indoor channel: large-scale and temporal fading at 900, 2400, and 5200 MHz,” IEEE Transactions on Wireless Communications, vol. 7, no. 7, pp. 2740–2751, Jul. 2008. 54. E. Nayebi, A. Ashikhmin, T. L. Marzetta, and B. D. Rao, “Performance of cell-free massive MIMO systems with MMSE and LSFD receivers,” in Proc. Asilomar Conf. Signals, Syst., Comput., Nov. 2016, pp. 203–207. 55. J. Zhang, J. Zhang, E. Björnson, and B. Ai, “Local partial zero-forcing combining for cellfree massive MIMO systems,” IEEE Transactions on Communications, vol. 69, no. 12, pp. 8459–8473, Dec. 2021. 56. L. Miretti, E. Björnson, and D. Gesbert, “Precoding for scalable cell-free massive MIMO with radio stripes,” in Proc. IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Sep. 2021, pp. 411–415. 57. I. Atzeni, B. Gouda, and A. Tölli, “Distributed precoding design via over-the-air signaling for cell-free massive MIMO,” IEEE Transactions on Wireless Communications, vol. 20, no. 2, pp. 1201–1216, Feb. 2021. 58. R. Nikbakht, R. Mosayebi, and A. Lozano, “Uplink fractional power control and downlink power allocation for cell-free networks,” IEEE Wireless Communications Letters, vol. 9, no. 6, pp. 774–777, Jun. 2020. 59. G. Interdonato and S. Buzzi, “Conjugate beamforming with fractional-exponent normalization and scalable power control in cell-free massive MIMO,” in Proc. IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Sep. 2021, pp. 396– 400. 60. P. Parida, H. S. Dhillon, and A. F. 
Molisch, “Downlink performance analysis of cell-free massive MIMO with finite fronthaul capacity,” in Proc. IEEE Vehicular Technology Conference (VTC-Fall), Aug. 2018, pp. 1–6. 61. M. Bashar, K. Cumanan, A. G. Burr, H. Q. Ngo, M. Debbah, and P. Xiao, “Max–min rate of cell-free massive MIMO uplink with optimal uniform quantization,” IEEE Transactions on Communications, vol. 67, no. 10, pp. 6796–6815, Oct. 2019. 62. G. Femenias and F. Riera-Palou, “Cell-free millimeter-wave massive MIMO systems with limited fronthaul capacity,” IEEE Access, vol. 7, pp. 44 596–44 612, 2019. 63. M. Bashar, A. Akbari, K. Cumanan, H. Q. Ngo, A. G. Burr, P. Xiao, M. Debbah, and J. Kittler, “Exploiting deep learning in limited-fronthaul cell-free massive MIMO uplink,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 8, pp. 1678–1697, Aug. 2020. 64. H. Masoumi and M. J. Emadi, “Performance analysis of cell-free massive MIMO system with limited fronthaul capacity and hardware impairments,” IEEE Transactions on Wireless Communications, vol. 19, no. 2, pp. 1038–1053, Feb. 2020. 65. U. Gustavsson, P. Frenger, C. Fager, T. Eriksson, H. Zirath, F. Dielacher, C. Studer, A. Pärssinen, R. Correia, J. N. Matos, D. Belo, and N. B. Carvalho, “Implementation challenges and opportunities in beyond-5G and 6G communication,” IEEE Journal of Microwaves, vol. 1, no. 1, pp. 86–100, Jan. 2021. 66. A. Forenza, S. Perlman, F. Saibi, M. Di Dio, R. van der Laan, and G. Caire, “Achieving large multiplexing gain in distributed antenna systems via cooperation with pCell technology,” in 2015 49th Asilomar Conference on Signals, Systems and Computers, 2015, pp. 286–293. 67. S. Perlman and A. Forenza, “An introduction to pCell,” Artemis Networks LLC, White paper, 2015. [Online]. Available: http://www.rearden.com/artemis/An-Introduction-to-pCell-WhitePaper-150224.pdf 68. P. Frenger, J. Hederen, M. Hessler, and G. Interdonato, “Antenna arrangement for distributed massive MIMO,” US Patent App. 
16/435,054, 2019. [Online]. Available: https://patents. google.com/patent/US20190363763A1
532
G. Interdonato and S. Buzzi
69. L. Van der Perre, E. Larsson, F. Tufvesson, L. D. Strycker, E. Björnson, and O. Edfors, “Radioweaves for efficient connectivity: analysis and impact of constraints in actual deployments,” in Proc. of Asilomar Conference on Signals, Systems, and Computers, 2019, pp. 15–22. 70. Z. H. Shaik, E. Björnson, and E. G. Larsson, “MMSE-optimal sequential processing for cellfree massive MIMO with Radio Stripes,” IEEE Transactions on Communications, vol. 69, no. 11, pp. 7775–7789, Nov. 2021. 71. L. Miretti, E. Björnson, and D. Gesbert, “Team precoding towards scalable cell-free massive MIMO networks,” in Proc. Asilomar Conference on Signals, Systems, and Computers, Oct. 2021, pp. 1222–1227. 72. L. Miretti, E. Björnson and D. Gesbert, “Team MMSE precoding with applications to cell-free massive MIMO,” IEEE Transactions on Wireless Communications, vol. 21, no. 8, pp. 6242– 6255, Aug. 2022. https://doi.org/10.1109/TWC.2022.3147895. 73. P. Mach and Z. Becvar, “Mobile edge computing: A survey on architecture and computation offloading,” IEEE Communications Surveys Tutorials, vol. 19, no. 3, pp. 1628–1656, 2017. 74. Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: The communication perspective,” IEEE Communications Surveys Tutorials, vol. 19, no. 4, pp. 2322–2358, 2017. 75. S. Mukherjee and J. Lee, “Edge computing-enabled cell-free massive MIMO systems,” IEEE Transactions on Wireless Communications, vol. 19, no. 4, pp. 2884–2899, 2020. 76. G. Interdonato and S. Buzzi, “The promising marriage of mobile edge computing and cell-free massive MIMO,” in Proc. IEEE International Conference on Communications (ICC), 2022, to appear. 77. ——, “Joint optimization of uplink power and computational resources in mobile edge computing-enabled cell-free massive MIMO,” CoRR, vol. abs/2111.04678, 2021. [Online]. Available: https://arxiv.org/abs/2111.04678 78. 
IMT Vision – Framework and overall objectives of the future development of IMT for 2020 and beyond, ITU-R, 2015. 79. P. Popovski, K. F. Trillingsgaard, O. Simeone, and G. Durisi, “5G wireless network slicing for eMBB, URLLC, and mMTC: A communication-theoretic view,” IEEE Access, vol. 6, pp. 55 765–55 779, 2018. ˇ Stefanovi´c, E. De Carvalho, E. Ström, K. F. Trillingsgaard, A.-S. 80. P. Popovski, J. J. Nielsen, C. Bana, D. M. Kim, R. Kotaba, J. Park, and R. B. Sørensen, “Wireless access for ultra-reliable low-latency communication: Principles and building blocks,” IEEE Network, vol. 32, no. 2, pp. 16–23, 2018. 81. A.-S. Bana, E. D. Carvalho, B. Soret, T. Abrão, J. C. Marinello, E. G. Larsson, and P. Popovski, “Massive MIMO for Internet of Things (IoT) connectivity,” Physical Communication, vol. 37, p. 100859, 2019. 82. E. Björnson, E. De Carvalho, J. H. Sørensen, E. G. Larsson, and P. Popovski, “A random access protocol for pilot allocation in crowded massive MIMO systems,” IEEE Transactions on Wireless Communications, vol. 16, no. 4, pp. 2220–2234, 2017. 83. R. Kassab, O. Simeone, P. Popovski, and T. Islam, “Non-orthogonal multiplexing of ultrareliable and broadband services in fog-radio architectures,” IEEE Access, vol. 7, pp. 13 035– 13 049, 2019. 84. A. Matera, R. Kassab, O. Simeone, and U. Spagnolini, “Non-orthogonal eMBB-URLLC radio access for cloud radio access networks with analog fronthauling,” Entropy, vol. 20, no. 9, 2018.
6G Radio Access Implementation: Challenges and Technologies
Alan Gatherer, Chaitali Sengupta, and Sudipta Sen
1 Introduction
In this chapter, we do not attempt to push the boundaries of the new algorithms that may define 6G communications, but instead look at the critical issue of how to implement these algorithms reliably, once they are developed. We focus on the implementation of the digital baseband portion of the radio access network (RAN), leaving consideration of new radio frequency (RF) frontend implementation to other chapters. We consider this from the point of view of an “open RAN” [1] solution for reasons that become clear in the chapter.
2 The Frog Boil of Radio Access Implementation
In English, there is a story that you can slowly boil a frog, raising the temperature gradually so that it doesn’t notice it until it is too late [2]. Leaving aside the moral question of why anyone would want to do such a thing, “frog boiling” is an expression used to describe an oncoming crisis that arrives so gradually that no one notices until it is upon us. This is the situation that we currently face in the development of flexible 5G modems, and 6G is about to turn the heat up another few notches. In this section, we summarize the causes of this particular frog boil, allowing us to identify the technologies that will allow the cellular network deployments to survive in this hotter pot of water.
A. Gatherer, Cirrus360, Dallas, TX, USA, e-mail: [email protected]
C. Sengupta · S. Sen, Cirrus360, Richardson, TX, USA
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_19
2.1 A Brief, Oversimplified History of IT Radio Access Technology
Much has been made of the differences between cellular technology (CT) and information technology (IT) approaches to network deployment, with CT focused on mobile person-to-person communications and IT focused on computer-to-computer communications and web-scale operations in the cloud. CT standards are developed by an industry group now combined into the 3GPP standards organization [3], which traditionally focused on a “soup to nuts” development to support a complete application stack. IT, in contrast, was developed by the IEEE building on the internet protocol (IP) and focused on developing individual pieces of the stack that would interoperate with the rest of the network [4], without any specific application in mind. In the higher layers of the CT network protocol, as we moved to 4G data pipes and then 5G application variety, CT systems have begun to adopt IT technology, allowing them to scale and maintain the network more efficiently for data- and machine-centric applications. However, IT has little to say about the radio access network (RAN) portion of the CT problem, coming as it does from the fixed computer network space. Usually, IT will deploy hardened and highly optimized silicon solutions in the form of Wi-Fi endpoints. These radio access endpoints, though highly sophisticated, are seen by the network as just “a thing” that presents an Ethernet interface to the network, with both radio signal processing and MAC layer packet control hidden from the network. This is a highly scalable approach for deployment but was developed for systems where radio air interface capacity limits were never approached. Management of user access to the endpoint is serialized to allow one user to burst at a time, and the famous TCP protocol is used to manage backoff to allow some level of fair access.
If multiple frequency bands are available, an endpoint would pick one to sit on and may be able to migrate slowly between them. This is a highly simplified description of a Wi-Fi-based system, and Wi-Fi 6 has adopted spectrum sharing. However, it is essentially true that Wi-Fi and IT networks more generally are based on maximizing the throughput of each user, bursting packets serially, and then managing access to endpoints with a higher-layer protocol. This makes sense because the Wi-Fi endpoints and users are nominally stationary and have a signal-to-noise ratio (SNR) that is constant in time. If you don’t have enough performance, you need to run some more Ethernet in the ceiling and add a new access point. Waiting to transmit, or sharing the spectrum with another user, will not change this. Orthogonal frequency division multiplexing (OFDM) is therefore used to water pour the right number of bits into each tone of a larger spectral slice to optimize for a single user when it bursts. This approach has lost IEEE 802 some ground with respect to 3GPP for secure, mission-critical applications [5].
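The water pouring of bits into tones mentioned above can be illustrated with a minimal sketch of the classic water-filling power allocation, assuming a total power budget and a known gain-to-noise ratio per tone. The function names and the numerical values are hypothetical and chosen only for illustration; they do not describe any particular Wi-Fi implementation.

```python
import math

def water_fill(gains, total_power):
    """Classic water-filling: p_i = max(0, mu - 1/g_i) with sum(p_i) == total_power."""
    # Sort tones from best to worst gain-to-noise ratio.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    powers = [0.0] * len(gains)
    # Try the k best tones, shrinking k until every active tone gets positive power.
    for k in range(len(order), 0, -1):
        active = order[:k]
        mu = (total_power + sum(1.0 / gains[i] for i in active)) / k  # water level
        if mu - 1.0 / gains[active[-1]] > 0:
            for i in active:
                powers[i] = mu - 1.0 / gains[i]
            break
    return powers

def bits_per_tone(gains, powers):
    """Shannon bound on bits per tone for the allocated powers."""
    return [math.log2(1.0 + g * p) for g, p in zip(gains, powers)]

gains = [8.0, 4.0, 1.0, 0.1]  # hypothetical per-tone gain-to-noise ratios
powers = water_fill(gains, total_power=1.0)
bits = bits_per_tone(gains, powers)
```

In this sketch the two weakest tones receive no power at all, which is the behavior one expects when the allocation is optimized for a single bursting user.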
2.2 Chasing SNR in a CT System
In cellular systems, the users are moving constantly and there is a strong (and sometimes correct) assumption that a base transceiver station (BTS) sector will be asked to operate close to its spectral capacity. One good reason for this assumption is that in an outdoor environment, the cost and effort of adding new endpoints, which traditionally come with towers and concrete and site permissions, are onerous. The BTS will monitor the changing SNR across frequency for each user and dynamically map users depending on instantaneous SNR, with users sharing spectrum in time and frequency (by allocation of tones to different users in a slot) and in space (by the use of MIMO and beamforming). The MAC layer tries to balance spectral efficiency against some idea of fairness for users that may find themselves in a bad spot for an extended period of time. The main conclusion to take away from this is that the CT endpoint (that is, the RAN BTS) is not a black box to be bought at your local big box electronics store, but a highly dynamic and constantly evolving system that can have a major impact on the quality of service offered to the network user. In fact, it is not uncommon for safety- and performance-critical applications to choose 5G over Wi-Fi, as the former can be customized to produce specific quality of service (QoS) characteristics [6].
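One common way to balance spectral efficiency against fairness is a proportional-fair metric, shown here as a minimal sketch. We use it only as an illustration of the trade-off described above; the text does not prescribe a specific scheduler, and the function name, the smoothing factor alpha, and the SNR values are assumptions.

```python
import math

def pf_schedule(snr_db, avg_rate, alpha=0.1):
    """Assign each tone of a slot to one user with the proportional-fair metric.

    snr_db[u][t]: instantaneous SNR (dB) of user u on tone t.
    avg_rate[u]:  running average rate of user u (must be > 0, updated in place).
    """
    n_users, n_tones = len(snr_db), len(snr_db[0])
    # Achievable rate per user and tone from the Shannon formula (bit/s/Hz).
    rate = [[math.log2(1.0 + 10 ** (s / 10.0)) for s in row] for row in snr_db]
    served = [0.0] * n_users
    allocation = []
    for t in range(n_tones):
        # PF metric: instantaneous rate divided by the user's long-term average.
        winner = max(range(n_users), key=lambda u: rate[u][t] / avg_rate[u])
        allocation.append(winner)
        served[winner] += rate[winner][t]
    # Exponential moving average keeps starved users competitive in later slots.
    for u in range(n_users):
        avg_rate[u] = (1 - alpha) * avg_rate[u] + alpha * served[u]
    return allocation

snr_db = [[20.0, 18.0, 5.0], [6.0, 7.0, 8.0]]  # hypothetical: user 0 strong, user 1 weak
starved = pf_schedule(snr_db, avg_rate=[10.0, 1.0])  # user 1 has been starved so far
equal = pf_schedule(snr_db, avg_rate=[1.0, 1.0])     # no history: best SNR wins each tone
```

With equal histories the strong user takes the tones where its SNR is higher, whereas a user that has been in a bad spot for some time is favored until its average rate recovers.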
2.3 A Brief History of CT Use Case Evolution
Figure 1 shows, very simplistically, how cellular system usage has evolved through the generations. Cellular systems started life in the 1980s supporting only voice calls. In 2G, the protocol from the air interface to the voice codec was developed in the standard to ensure that a good voice call could be made. RAN solutions were therefore easy to design, with each user being given a specific timeslot (in the case of GSM) or spreading sequence (in the case of CDMA). Implementation of the RAN air interface simply bundled the compute resources into units, each capable of supporting a single voice channel. In fact, CDMA systems were priced in terms of the number of “channel elements” they supported [7], while GSM systems were specified in terms of the number of timeslots they supported. The addition of some data channels in 3G muddied the water. When 4G arrived, voice itself became a data application using Voice over IP (VoIP). As a result, the cellular modem was transformed into a “dumb pipe” for data movement, much more like that of the IT space. However, it still needed to cope with the user movement and capacity limitations mentioned in Sect. 2.2.
Fig. 1 Evolution of cellular systems
5G has upended this evolution. The cellular RAN is now application aware again and is supporting not one, but an unknown number of new applications, based on the 3GPP “three-legged stool” of eMBB, URLLC, and mMTC as shown in Fig. 2. Added to this challenge is the rise of private cellular, which is highly leveraged for specialized applications like mining, manufacturing, and medicine, where heavily tuned and robust QoS is required. The current challenge of 5G RAN implementation is therefore to build RAN endpoints that are efficient at supporting applications from massive Internet of Things (IoT) to low single-digit numbers of users of, for instance, VR headset applications. These applications cannot use the same RAN solution and achieve low cost and high efficiency. In fact, building a hardwired “Swiss army knife” RAN that has many point solutions is expensive and ultimately futile, as there is always a new, not yet thought of, application out there that will not have a knife blade already in the set. Hence, 5G RAN implementation must be much more flexible than 4G. For 6G, we consider the interesting argument made by Gerhard Fettweis at WCNC22 [8] that the transition from odd G to even G is a movement of applications from business use cases to personal use cases. So, just as 3G brought data to cellular but mainly for business use (think BlackBerry email), and 4G extended it to everyone’s personal smartphone, 6G will extend the 5G use cases from factory 4.0 and remote surgery to individual uses such as immersive gaming and holographic communication. In fact, it will spearhead the evolution of the metaverse for wireless users. Implementing such a network for personal use cases across both public and private cellular spectrum is the challenge of 6G RAN implementation.
Fig. 2 5G use cases
2.4 Meeting IT (i.e., Data Centers) Coming the Other Way
Modems were traditionally implemented using embedded system methodology, much like you would see in the development of a critical control system or a rocket
guidance system. Each possible scenario was carefully considered, Heisenbugs¹ were carefully eliminated, and hundreds to thousands of engineers spent many thousands of cumulative hours discussing spreadsheets of scenarios and identifying corner cases. Simple arithmetic shows that even one Heisenbug on average in the lifetime of a single System on Chip (SoC) in a large metro deployment can lead to an unacceptable level of performance degradation, if that Heisenbug creates serious QoS issues. The design complexity challenge is generally NP-complete, and as more requirements are added, it becomes exponentially more difficult to manually find efficient approximate solutions that will meet real-time requirements. As we moved into the later stages of 4G and into 5G, the complexity of the applications to be supported therefore made this methodology difficult to sustain. Imagine designing a rocket guidance system that can also control your robot vacuum cleaner. There was a strong push in the community toward purely, or mostly, software-based RAN running on Commercial Off the Shelf (COTS) hardware, called Virtual RAN (VRAN). However, this does not solve the design complexity problem because complexity in hardware-embedded solutions is still complexity in software COTS solutions. One can consider the problem to be a game played between a deployment machine and a requirements-setting adversary. Each time the deployment machine makes a move, i.e., modifies software and functional mapping to produce a feasible deployment of all requirements, the adversary will try to add a new requirement to cause the deployment to fail its goals. The deployment machine must then adjust its strategy. Just as in a game of chess, the machine attempts to make a move that keeps its position as robust as possible for as many moves as possible into the future. This generally means maximizing timing margin for any parts of the deployment that are close to race conditions or timing failure, and keeping margin on memory use at these critical periods. This approach has been tested for learning good schedules [9]. As the RAN deployment community was struggling with this problem, the IT community (in particular the data center platform community) was coming the other way. They had realized that all-software solutions did not suffice to implement the large machine learning (ML) training tasks that were increasingly being performed in the cloud. Some kind of hardware acceleration specific to ML was required alongside the general-purpose hardware [10]. This is due to the explosion in compute requirements of ML, but it is also due to the slowdown in improvements to general-purpose compute engine efficiency at the end of Moore’s Law [11].
¹ A Heisenbug is a bug that occurs only under very specific conditions in the field. Heisenbugs appear to be sporadic and random, appearing and disappearing at different points in the network, and can be hard to replicate in the lab because they may disappear if test data is extracted from the system (hence the “Heisen” in Heisenbug, referring to Heisenberg’s uncertainty principle in quantum physics).
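The game between the deployment machine and the adversary described above can be sketched as a search for the mapping with the largest worst-case timing margin. The following is a toy illustration under strong assumptions: independent tasks, identical cores, a single deadline, and exhaustive search. The task runtimes and function names are hypothetical, and this is not the learning-based method of [9].

```python
from itertools import product

def worst_case_margin(assignment, runtimes_us, deadline_us, n_cores):
    """Smallest timing slack over all cores for one task-to-core assignment."""
    load = [0.0] * n_cores
    for task, core in enumerate(assignment):
        load[core] += runtimes_us[task]
    return min(deadline_us - core_load for core_load in load)

def most_robust_assignment(runtimes_us, deadline_us, n_cores):
    """Exhaustively pick the assignment with the largest worst-case margin."""
    best, best_margin = None, float("-inf")
    for assignment in product(range(n_cores), repeat=len(runtimes_us)):
        margin = worst_case_margin(assignment, runtimes_us, deadline_us, n_cores)
        if margin > best_margin:
            best, best_margin = assignment, margin
    # A negative margin means no assignment meets the deadline.
    return (best, best_margin) if best_margin >= 0 else (None, best_margin)

runtimes_us = [120.0, 200.0, 80.0, 150.0]  # hypothetical task runtimes
best, margin = most_robust_assignment(runtimes_us, deadline_us=500.0, n_cores=2)

# The adversary adds a new requirement; the machine must re-plan.
runtimes_us.append(300.0)
best2, margin2 = most_robust_assignment(runtimes_us, deadline_us=500.0, n_cores=2)
```

The exhaustive loop visits every one of the n_cores ** n_tasks mappings, which is exactly the exponential growth that makes manual or brute-force approaches impractical as requirements are added.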
2.5 How Boiled Is the Frog?
To summarize this section, the problems faced currently in 5G RAN deployment are as follows:
• Network robustness requirements for high reliability and mission critical technology
• A dramatically more complex applications space, with much more specialization from RAN to RAN
• A continuously evolving applications space needing continuous integration and deployment/delivery (CI/CD) methodology to maintain the RAN in the field
• A continued need for specialization of hardware to improve performance, further complicating deployment and flexibility of the RAN
In the following sections, we consider how this situation will worsen with 6G and provide some guidance on what is being done and what could be done to find a path out of the pot into cooler surroundings.
3 Open RAN for 6G
3.1 Open RAN Today
Open RAN is an operator-driven effort to allow multi-vendor solutions for the RAN. This allows vendors to specialize in specific parts of the RAN (for instance RF design) while the telco operator, enterprise support staff, or a third-party system integrator puts all the pieces together to construct the specific RAN required for the local requirements. O-RAN [1] is a community that has been defining specifications for such an “assembled from parts” solution and has been defining interfaces to allow parts to communicate. For acceleration, O-RAN has also been developing some basic architectural frameworks to allow less programmable parts to be included [12]. The Telecom Infra Project (TIP) is developing reference designs and holding plug fests to simplify entry to the market [13]. O-RAN divides the RAN into units. The basic O-RAN split is shown in Fig. 3. The traditional RAN is now divided into a radio unit (RU), a distributed unit (DU), a centralized unit (CU), and a service management and orchestration unit. Added to this is a new unit, the RAN intelligent controller (RIC). The RIC is an important feature of the O-RAN architecture and is seen by some as the main architectural differentiation of O-RAN. It is divided into a non-real-time RIC and a near-real-time RIC. These components provide interfaces that give opportunities for automation of RAN operations and maintenance. The non-real-time RIC is mostly part of the traditional operations and maintenance (O&M) function, but the near-real-time RIC provides response times close to a millisecond that could enable new features for flexibility and automation of RAN performance. As of writing, the use of the RIC has been very limited, and we believe this is due to structural issues with the rest of the O-RAN architecture that must be addressed before 6G deployments begin. We will discuss this in the next section.
Fig. 3 O-RAN Architecture
3.2 Open RAN for 6G
Machine learning (ML)-based RAN algorithms are seen as a major goal of 6G [14]. An ML solution can be precisely tuned for a specific RAN (for instance, a RAN that happens to be on a highway next to a racetrack), so that each RAN in the network is slightly different in its behavior and is constantly improving its behavior with respect to network-level metrics. But this cannot happen if a battalion of engineers is required to deploy the RAN software for this specific ML-based RAN use case, especially as it is constantly optimizing based on new data sets. Some simple O&M-level optimization may be automated, such as beam tilt [15, 16], but more complicated trade-offs, such as algorithm performance of physical random access channel (PRACH) detection, or even the frequency of random access channel (RACH) messaging, cannot be easily changed without having an impact on the deployment of the RAN on the target hardware platform. Hence, it is critical that RAN deployment is also automated and potentially ML enabled for 6G. O-RAN standardization of interfaces by itself will not enable ML-driven adaptivity to occur in an open manner because automation of the resulting deployment of the O-RAN components is not possible. As a simple example, consider the integration of a radio unit (RU) and a distributed unit (DU). The use of O-RAN specified eCPRI interfaces means the RU and DU can communicate successfully. Suppose, however, that the RU vendor has developed differentiating algorithms on the RU that require specialized parameter feedback from calculations performed on the DU, and that the DU does not yet provide this feedback. The system integrator (SI) can connect the two units, but the RU operation is deficient or even incorrect due to the lack of feedback from the DU. In 6G, this kind of parameter creation and feedback will increase dramatically as we move toward ML-powered algorithms [15, 16].
To support this, the DU and RU must be constructed in an open manner at some granular level to allow the operator, after deployment, to activate or deactivate at the push of a button the features associated with ML training and inference. Eventually the activation/deactivation role will be taken over by other ML algorithms, perhaps running in the RIC. How functionality is mapped to the RU/DU/CU will vary depending on the environment of the RAN [17]. Automation of deployment and redeployment, with integration under performance constraints, is therefore critical to 6G O-RAN.
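The RU/DU integration problem described in this section can be illustrated with a minimal sketch of an automated compatibility check that a system integrator might run before activating RU features. All class, feature, and parameter names here are hypothetical; they are not taken from the O-RAN specifications.

```python
from dataclasses import dataclass, field

@dataclass
class RuFeature:
    name: str
    # Feedback parameters the RU algorithm expects the DU to compute and send.
    needs_from_du: set = field(default_factory=set)

def integration_report(ru_features, du_provides):
    """Split RU features into those the DU can support and those missing feedback."""
    usable, blocked = [], {}
    for feat in ru_features:
        missing = feat.needs_from_du - du_provides
        if missing:
            blocked[feat.name] = sorted(missing)
        else:
            usable.append(feat.name)
    return usable, blocked

ru_features = [
    RuFeature("baseline_beamforming", {"beam_index"}),
    RuFeature("ml_interference_nulling", {"beam_index", "per_ue_interference_estimate"}),
]
du_provides = {"beam_index", "timing_advance"}  # hypothetical DU feedback set
usable, blocked = integration_report(ru_features, du_provides)
```

In this example the two units would connect successfully, yet the ML-based feature would be flagged as blocked, which is the kind of information an automated activation/deactivation mechanism would need.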
3.3 Automating ML-Based Algorithm Deployment for 6G
In this section, we introduce our proposal for an architecture to achieve ML-driven automation of a 6G RAN. The top-level flow of such an architecture is shown in Fig. 4. Machine learning to optimize and tune algorithms is enhanced and made practical for an automated RAN deployment by adding a partitioning and feasibility checking stage. Deployment of the resulting feasible solution then requires an automated deployment stage. The feasibility check must run in a tight loop with the ML optimization, with new constraints being fed back to the ML to adjust the solution to one that can feasibly be deployed on the existing hardware. Note that all this effort is implemented in the cloud, probably in the RIC, based on network analytics input as well as analytics from the actual field deployments that are gathered over a period of time. The field can also supply analytics to the network, as is usual. But to this can now be added analytics from the ML process and the feasibility checker.
Fig. 4 Automated flow for 6G open RAN
Feasibility checking provides a brake on the natural tendency of higher performing algorithms to also be more complex. Changes in algorithms may also change the balance of deployment between CU/DU/RU, with some functionality moving from limited resources in one unit to another. The alternative is to blindly assign classes of functionality to a certain piece of hardware even though other hardware is sitting idle. If fairly generic hardware is used, we can also consider “slicing” of this hardware to provide different virtual compute resources to different units of the RAN, changing the balance of resources between CU and DU, for instance. This slicing is hardware slicing in the RAN, not network slicing of the spectrum resources, and is a convenient way to reduce the overall complexity of feasibility in a hierarchical manner. Progress toward hardware abstraction has been made in the O-RAN Alliance [12] but is focused on the abstraction of discovery and interfaces for hardware acceleration. It does not currently provide an abstracted view of the complete system to allow for the disaggregation of software functionality from hardware capability while supporting real-time constraints. Progress continues to be made in O-RAN, and more sophisticated abstract models of acceleration hardware have been proposed.
With the addition of abstract timing models, these may serve as an abstract model on which to apply automated feasibility checking. It is critical to note that the excellent work on containerization of software in cloud applications, which is summarized for RAN in [12], automates the delivery and deployment of software but does not guarantee its feasibility as a real-time component. Traditional RAN deployments employ function-specific Systems on Chip (SoCs) that enforce a partition of the CU/DU/RU functionality, simplifying the feasibility check. But in 5G and now 6G, we see much more flexible deployment options, making this approach highly inefficient in cost and power.
Fig. 5 Five pillars for a solution for automated integration in open RAN: (1) O-RAN common language for RAN requirements and construction; (2) IP agnostic abstraction of components; (3) automation with the goal of optimization of CAPEX and OPEX; (4) machine learning; (5) explainability for smart decision process
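The tight loop between ML optimization and feasibility checking proposed in this section can be sketched as follows. This is a minimal illustration only: the ML stage is replaced by a stand-in that picks the best-scoring candidate, the platform model has just two limits, and all names and numbers are hypothetical.

```python
def feasibility_check(candidate, platform):
    """Return the violated constraints as (key, bound) pairs; empty means deployable."""
    violations = []
    if candidate["runtime_us"] > platform["slot_budget_us"]:
        violations.append(("runtime_us", platform["slot_budget_us"]))
    if candidate["memory_kb"] > platform["memory_kb"]:
        violations.append(("memory_kb", platform["memory_kb"]))
    return violations

def optimize(candidates, limits):
    """Stand-in for the ML stage: best-scoring candidate that respects known limits."""
    allowed = [c for c in candidates
               if all(c[key] <= bound for key, bound in limits.items())]
    return max(allowed, key=lambda c: c["score"]) if allowed else None

def tune_until_feasible(candidates, platform):
    limits = {}  # constraints learned so far from the feasibility checker
    while True:
        candidate = optimize(candidates, limits)
        if candidate is None:
            return None, limits
        violations = feasibility_check(candidate, platform)
        if not violations:
            return candidate, limits
        limits.update(violations)  # feed the new constraints back to the optimizer

candidates = [
    {"name": "large_net", "score": 0.95, "runtime_us": 700, "memory_kb": 900},
    {"name": "medium_net", "score": 0.90, "runtime_us": 450, "memory_kb": 1200},
    {"name": "small_net", "score": 0.80, "runtime_us": 300, "memory_kb": 400},
]
platform = {"slot_budget_us": 500, "memory_kb": 1024}
chosen, learned = tune_until_feasible(candidates, platform)
```

Each failed check adds a constraint that excludes the rejected candidate, so the loop terminates, and the highest performing algorithm is not necessarily the one deployed.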
3.4 Implementing Feasibility Checking and Its Implications for 6G Deployments
In order to automate feasibility checking, we need to introduce a framework that allows requirements and constraints to be input in a domain-specific manner. Here domain specificity means the use of a language that is designed specifically for the system being automated [18, 19]. For example, the popular MATLAB™ programming language is a domain-specific language for the development of signal processing and control algorithms. Domain specificity is a critical and subtle issue because system developers will want to input data in a format that makes sense to their expertise, and the automation will need to understand it precisely. We identify five pillars on which automation of RAN CU/DU/RU must sit, as shown in Fig. 5, and describe them in more detail in this section.
3.4.1 Open RAN Common Language for RAN Requirements and Construction
This pillar addresses the need to be able to construct a complete RAN solution from multiple vendor IPs using operator- and deployment-specific constraints. We propose a simple domain-specific language, which we call the RAN domain-specific language (RDSL), to achieve this goal. Requirements for the RDSL include the following:
• The construction of the RAN must be described in a manner that precisely defines the construction of the IP with no ambiguity.
• The description must be hardware implementation agnostic as much as possible to allow the same RDSL application to be ported to different hardware SKUs of the same platform or even different platforms with minimal changes. This porting will be at the domain-specific language level, and automation with knowledge of the platform will take charge of its mapping.
• The RDSL must allow for an automated implementation of the construction on the target platform using an abstract platform description and system-constraint parameters that define system-level constraints on the use of the hardware. One important example of system constraints would be timing of input and output of the platform, allowing properly timed hook up of the DU to the RU and CU. System parameters can also allow definition of memory use by the IP blocks so that efficient mapping of IP to hardware can be automatically optimized.
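The requirements listed above can be made concrete with a small sketch of a declarative, hardware-agnostic description and an automated check for ambiguity and memory fit. The RDSL syntax is not given in this chapter, so the structure below is purely hypothetical and written in Python rather than in the RDSL itself; block names, signal names, and memory figures are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    name: str
    inputs: tuple    # names of signals consumed
    outputs: tuple   # names of signals produced
    memory_kb: int   # vendor-declared memory use (a system parameter)

def validate(blocks, external_inputs, memory_budget_kb):
    """Return a list of problems; an empty list means the description is unambiguous and fits."""
    problems = []
    producers = {}
    for b in blocks:
        for sig in b.outputs:
            if sig in producers:
                problems.append(f"signal '{sig}' produced by both {producers[sig]} and {b.name}")
            producers[sig] = b.name
    for b in blocks:
        for sig in b.inputs:
            if sig not in producers and sig not in external_inputs:
                problems.append(f"{b.name} input '{sig}' has no producer")
    total = sum(b.memory_kb for b in blocks)
    if total > memory_budget_kb:
        problems.append(f"memory {total} kB exceeds budget {memory_budget_kb} kB")
    return problems

du_chain = [
    Block("fft", ("iq_samples",), ("freq_grid",), 256),
    Block("channel_estimation", ("freq_grid",), ("channel_estimate",), 512),
    Block("equalizer", ("freq_grid", "channel_estimate"), ("symbols",), 128),
]
ok = validate(du_chain, external_inputs={"iq_samples"}, memory_budget_kb=1024)
# Removing the channel estimator leaves the equalizer with a dangling input.
broken = validate(du_chain[:1] + du_chain[2:], {"iq_samples"}, 1024)
```

Note that nothing in the description refers to a specific core, accelerator, or SKU; only the memory budget passed to the check depends on the platform.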
3.4.2 IP Agnostic Abstraction of Components
To complement the constructive description of the RDSL, for a specific platform, we need an abstract description of the platform up to its middleware. The current effort in O-RAN Alliance WG6 to define extensions to the acceleration abstraction layer (AAL) could allow for a suitable hardware-abstracted platform description. The RDSL is therefore highly synergistic with the current open RAN efforts and can be seen as the “missing piece” to allow full automation of RAN system integration. We also use the system constraints of the RAN to define the required boundaries of IP on the hardware. IP vendors (hardware or software) need to publish timing and resource use requirements for their unit IP, and these can be easily tested in abstraction. Multiple vendors can now compete to develop improved channel estimation, for instance, for improved functional performance, while the operator can use automation to construct the full RAN system to meet the network system performance requirements with this improved IP. The disaggregation of functional performance and system performance is critical to the automation of RAN development.
3.4.3 Automation with the Goal of Optimization of CAPEX and OPEX
With the RDSL, hardware abstraction and system constraints in place, an automation platform can now be employed to explore the space of potential solutions. This exploration can be performed to optimize for one or more soft constraints such as power, latency, resources used, and so on. This is similar to synthesis in chip design
except it works at the abstract IP hook up level rather than the register transfer level of digital design synthesis. The system integrator (operator or enterprise IT support team) can employ such a platform to explore the performance of a particular RAN for different deployments without needing to understand the details of the hardware or be an expert in embedded system development. The feasibility of such a platform running in the cloud has already been demonstrated by Cirrus360. Results for differently constrained synthesis runs will allow optimized solutions for different RAN deployments, such as urban versus rural, or factory versus suburban. The system integrator can focus on exploring trade-offs that optimize the overall network deployment rather than having to choose from one of a small number of fixed designs that may not provide a good solution at the network level. Automation is therefore a critical aspect of the success of open RAN in providing fine-grained analysis and synthesis for specific deployment scenarios. Changes to the RAN deployment are traditionally slowed by concerns about Heisenbugs and timing failures arising as an unintended consequence of changing the runtimes and resource use of IP within the RAN. With automation, any failures of these kinds are flagged by the automation tool, even if they are not visible in test and verification runs. Failure flagging allows the system integrator to adjust the system deployment constraints or add mitigation strategies for any rare corner cases. With this assurance in place, the system integrator can become more aggressive in improving the system-level performance of the RAN. Automation will also allow for continuous tracking of and searching for Heisenbugs and timing failures in the RAN. These can be fixed as they are found. We discuss this more in Sect. 3.4.4.
Automation of the process allows for an abstract digital twin of the RAN deployment, and any new constraints or changes in understanding of the RAN environment can be fed back into this digital twin and a more optimal and performant RAN solution produced for deployment. The deployment could also be automated so that the RAN could continue to learn its environment.
3.4.4 Machine Learning
One source of new constraints for the automation is from machine learning and analytics in the RIC. As the construction and functionality of the RAN are defined in RDSL, the operator can add RIC enabling modules into the RAN as they see fit and can switch on and off monitors in the RAN that feed the RIC with data for analysis. The RIC in turn provides the automation platform with updated environment constraints and these are used to reoptimize the solution automatically. So automation of RAN deployment becomes a key enabler of the value of the RIC and therefore part of the key value chain of open RAN itself.
3.4.5 Explainability for Smart Decision Process
If the automation platform does not find a feasible solution given the system constraints applied, automated techniques can be used to provide explanations for why the RAN will not work correctly with the current requirements and platform. The system integrator can then make intelligent, data-driven decisions on how to modify the system, either by relaxing the system requirements, reducing capacity, or adding hardware. In our experience, much of the art of system integration is in deciding what to do in the event that too much is asked of the available resources. Automation of explainability is therefore a critical tool for open RAN.
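As a rough illustration of the feasibility check and the explanation step described above, the following Python sketch takes per-block worst-case runtimes and memory footprints (the kind of figures an IP vendor would publish), checks them against a slot deadline and a memory budget, and reports which constraint fails and by how much. All names (`IPBlock`, `check_feasibility`) and all numbers are hypothetical. This is not RDSL syntax, not the O-RAN AAL interface, and not any vendor's tool; it also assumes the blocks run strictly one after another on a single processing element, which is a strong simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPBlock:
    name: str
    wcet_us: float  # worst-case execution time published by the vendor (assumed value)
    mem_kb: int     # peak memory use published by the vendor (assumed value)


def check_feasibility(chain, deadline_us, mem_budget_kb):
    """Return (feasible, explanations) for a sequential chain of IP blocks."""
    total_time = sum(block.wcet_us for block in chain)
    peak_mem = max(block.mem_kb for block in chain)  # blocks run one at a time
    explanations = []
    if total_time > deadline_us:
        worst = max(chain, key=lambda block: block.wcet_us)
        explanations.append(
            f"timing: chain needs {total_time:.0f} us, deadline is {deadline_us:.0f} us "
            f"(over by {total_time - deadline_us:.0f} us); largest contributor is {worst.name}"
        )
    if peak_mem > mem_budget_kb:
        worst = max(chain, key=lambda block: block.mem_kb)
        explanations.append(
            f"memory: {worst.name} needs {worst.mem_kb} KB, budget is {mem_budget_kb} KB"
        )
    return (not explanations, explanations)


# Hypothetical uplink processing chain with invented figures.
chain = [
    IPBlock("fft", wcet_us=120, mem_kb=256),
    IPBlock("channel_estimation", wcet_us=260, mem_kb=512),
    IPBlock("decoder", wcet_us=200, mem_kb=384),
]

feasible, reasons = check_feasibility(chain, deadline_us=500, mem_budget_kb=1024)
print("feasible:", feasible)
for reason in reasons:
    print(" -", reason)
```

With an explanation of this kind in hand, the integrator could relax the deadline, swap in a faster channel estimation IP, or add hardware, which are the options the text lists.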
4 Conclusions
The 6G Air Interface will be a machine learning-driven and evolving component of the overall RAN solution. The RAN will automatically adjust the air interface based on training algorithms, and each RAN will independently optimize due to a wide variety of deployments and user applications. The implementation of these algorithms in a firm real-time manner on existing hardware in the field must therefore be automated and potentially learned based on field performance data. Both partitioning of algorithms across CU/DU/RU and ensuring firm, real-time performance require a feasibility checking capability within the RAN. Scheduling and mapping of real-time systems is a well-known NP-complete problem, so automation of both the mapping and the checking is a strong requirement for a machine learned and maintained 6G solution. In this chapter, we identified five pillars that must be developed to support this automation, and we argue that without these pillars and the resulting automation platform for 6G development and maintenance, both 6G and open RAN will fail to live up to their expectations.
References
1. “O-RAN Alliance,” [Online]. Available: https://www.o-ran.org.
2. “Wikipedia Boiling Frog,” [Online]. Available: https://en.wikipedia.org/wiki/Boiling_frog.
3. “3GPP,” [Online]. Available: https://www.3gpp.org/.
4. “IEEE 802,” [Online]. Available: https://www.ieee802.org.
5. “LTE and 5G private wireless networks help port terminal operators move more for less,” [Online]. Available: https://www.nokia.com/blog/lte-and-5g-private-wireless-networks-help-port-terminal-operators-move-more-for-less.
6. “A Case Study in Automation in Mining,” [Online]. Available: https://www.ericsson.com/en/reports-and-papers/consumerlab/reports/a-case-study-on-automation-in-mining.
7. C. Braithwaite and M. Scott, UMTS Network Planning and Development: Design and Implementation of the 3G CDMA Infrastructure, Elsevier, 2003.
8. WCNC, “WCNC 22 program,” 2022. [Online]. Available: https://wcnc2022.ieee-wcnc.org/sites/wcnc2022.ieee-wcnc.org/files/WCNC_slides_V11.pdf.
9. D. Carastan-Santos and R. Y. de Camargo, “Obtaining Dynamic Scheduling Policies with Simulation and Machine Learning,” 2017. [Online]. Available: https://hal.inria.fr/hal-01618940/file/paper-hal.pdf.
10. J. L. Hennessy and D. A. Patterson, “A New Golden Age for Computer Architecture,” February 2019. [Online]. Available: https://cacm.acm.org/magazines/2019/2/234352-a-new-golden-age-for-computer-architecture/fulltext.
11. B. Sterling, “Preparing for the end of Moore’s Law,” Wired Magazine. [Online]. Available: https://www.wired.com/beyond-the-beyond/2020/03/preparing-end-moores-law/.
12. O-RAN Alliance Working Group 6, “O-RAN Acceleration Abstraction Layer General Aspects and Principles 1.01,” Technical Specification O-RAN.WG6.AAL-GAnP-v01.01, July 2021.
13. Telecom Infra Project, “Test and Integration,” [Online]. Available: https://telecominfraproject.com/test-and-integration/.
14. NextG Alliance, “6G Next G Alliance Report: 6G Technologies,” 2022.
15. Y. Heng and J. G. Andrews, “Machine Learning-Assisted Beam Alignment for mmWave Systems,” IEEE Transactions on Cognitive Communications and Networking, vol. 7, no. 4, pp. 1142–1155, 2021.
16. G. L. Santos, P. T. Endo, D. Sadok and J. Kelner, “When 5G Meets Deep Learning: A Systematic Review,” Algorithms, vol. 13, no. 9, p. 208, 2020.
17. B. Ojaghi, F. Adelantado, A. Antonopoulos and C. Verikoukis, “Impact of Network Densification on Joint Slicing and Functional Splitting in 5G,” IEEE Communications Magazine, vol. 60, no. 7, p. 30, July 2022.
18. K. Olukotun, “High Performance Domain Specific Languages using Delite,” 2012. [Online]. Available: https://ppl.stanford.edu/papers/CGO2012-1.pdf. [Accessed 2022].
19. M. Lohstroh, “Toward a Lingua Franca for Deterministic Concurrent Systems,” 2021. [Online]. Available: https://dl.acm.org/doi/10.1145/3448128. [Accessed 2022].
Network Disaggregation
Soohyun Park, Chanyoung Park, Jae Pyoung Kim, Minseok Choi, and Joongheon Kim
1 Introduction
We live in an environment where numerous terminal devices are closely connected and various network structures coexist. The rapid development of communication technology has significantly influenced network diversification, such as large-scale network configurations and three-dimensional network structures [1, 2]. The effectiveness of distributed networks, built on small base stations or edge servers together with mobile devices or distributed infrastructures made possible by 5G communication technology, has already been demonstrated by various research results [3]. With the emergence of 6G or beyond 5G networks, massive connectivity for high-speed, ultra-reliable, and high-quality communication services should be provided to end-users and nodes at the higher tier (e.g., servers). In addition, as more terminal devices become interconnected within the network, robustness and security of communication and networking services have become important factors as well. Meeting these requirements of 6G or beyond 5G networks efficiently calls for changes in the network structure. In addition, efforts to further develop and supplement technologies for services such as faster response time, lower latency, and computing support on light and small devices must be continuously made. As the need for a new network structure is emphasized, network disaggregation is considered a promising technology [3–5].
S. Park · C. Park · J. P. Kim · J. Kim
Artificial Intelligence and Mobility Laboratory, Korea University, Seoul, South Korea
e-mail: [email protected]; [email protected]; [email protected]; [email protected]
M. Choi
Wireless Intelligent Networks Laboratory, Kyung Hee University, Yongin, South Korea
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_20
Network disaggregation creates a new network structure that splits hardware or software resources to enable more complex and flexible designs at a lower cost, which differs from traditional centralized and distributed networks [6–8]. This chapter introduces the newly proposed network disaggregation technology, which is available in 6G networks. First, we explain the history of changes in the network structure from a classical centralized network to disaggregated networks. Research trends related to network disaggregation and the technical details of network disaggregation are described in Sects. 2 and 3, respectively. Finally, Sect. 4 presents open issues and challenges associated with network disaggregation technology.
1.1 Centralized Network
In a typical (traditional) network such as that in Fig. 1, the architecture takes on a form in which many end-users are connected to a central authority (e.g., a central server), and all users can communicate with each other through the central node. In the past, the communication capabilities and techniques of end-users were not sufficiently advanced; therefore, managing them with a centralized network configuration was natural. Generally, a central server has a large storage space, sufficient computing resources, and a large power budget. In a centralized network system, all communication requests and responses between end-users are concentrated at the central node [9, 10]. The central node acts as a manager of the entire system, and the end-users in the coverage area of the central node are guaranteed smooth communication service. Furthermore, the central node helps end-users cope with their limited resources. Data sensed and collected by the end devices are uploaded to a cloud server, while computational tasks can be offloaded to the server.
Fig. 1 Comparison of centralized network and decentralized network
This characteristic of the central node allows end-users to overcome the
performance limitation by sharing the resources of the central server. However, if the central server fails, all users connected to it will be unable to communicate, store data, or process computational tasks. In addition, it becomes difficult for the central server to schedule and manage service requests from all users as the number of end devices increases. In particular, communication delay is inevitable because of the distance between the central node and the end-user and the many accumulated requests that the central node must process.
1.2 Decentralized Network
As previously mentioned, a central server is required for managing all users because of the small hardware size and limited resources of end-user devices and the inability of end-users to communicate without administrators (e.g., central nodes). The development of memory and non-memory semiconductors has improved the power, storage, and computing capacity of end-user devices, which were previously constrained by their physical limitations [11]. With this technology, the number of devices has exploded, and the types of services available to users have diversified. As a result, it has become difficult to handle all users in one central node. Accordingly, a decentralized network structure was proposed in which several distributed nodes located near users can act as a central server. With the further development of communication technology and devices, it became possible to implement a distributed network capable of one-to-one communication between end-users, going beyond a decentralized network environment in which users communicate only through an intermediate node.
1.2.1 Distributed Computing
Because the network edge is capable of processing computing tasks, end devices do not have to offload their tasks to a distant cloud server every time. Instead, network nodes between the server and end devices, such as routers or gateways, can deal with offloading requests from users; this architecture is called fog computing. Furthermore, the technique of handling computing tasks via the user devices and small access points at the network edge with sufficient computing resources and power is known as edge computing [12, 13]. These distributed computing nodes can be located anywhere in the network, such as virtual edge servers and city infrastructures. Consequently, this brings the computing resources and capabilities of the cloud server close to users. The user can then determine where in the entire network to offload or process the computational tasks or programs. Meanwhile, a central node can also assign tasks to end devices to reduce the processing time. As a result, a resource-sharing network can be established, and its resource efficiency can be significantly enhanced because it allows the flexible usage of
computing resources in the entire network. In addition, this distributed computing structure is beneficial in terms of the completion time of computing tasks because of the relatively short distance between the end devices and edge computing servers [12, 13]. Moreover, the shorter distance is also beneficial in terms of transmission delay for distributed computing structures (i.e., fog computing and edge computing). Compared to the central server, a distributed server may be slightly less resource-rich, but it still performs better than the user device. Thus, it can perform and manage user task offloading like the central server. Similar to the computing nodes, these distributed servers can be located anywhere in the network, such as virtual edge servers and city infrastructures (e.g., small base stations). In addition, improved computing and caching resources at the network edge allow end devices to store and train a heavy deep neural network (DNN) model. This has led to the explosive growth of various distributed learning techniques. Despite such improvements, end devices still do not have sufficient resources to store, update, and train the DNN model and infer the results on their own. To mitigate these limitations, various approaches to training and inference at end devices have been studied. For example, split computing and split learning allow the DNN model to be partitioned, so that end devices store a small fraction of the model rather than the entire model [10, 14]. The computational load involved in tasks such as training the DNN model and the inference process is distributed among the server and the end devices, reducing the intensive computational burden originally placed on the end devices. This distributed computing network structure enables distributed learning techniques, such as federated learning, split learning, and multi-agent deep reinforcement learning, in which learning toward a common purpose is divided among and processed by multiple distributed devices.
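The completion-time argument above can be made concrete with a small calculation. The Python sketch below compares running a task on the device, on an edge server, and on a distant cloud server, under the simplifying assumption that completion time is upload time (task size divided by link rate) plus round-trip latency plus compute time (required cycles divided by processing rate). Queuing, result download, and energy are ignored, and the names `Site`, `completion_time`, and `best_site`, as well as every number, are illustrative assumptions rather than values from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    cpu_hz: float      # available processing rate in cycles per second
    uplink_bps: float  # link rate from the device to this site (0 means no upload needed)
    rtt_s: float       # round-trip network latency to this site in seconds


def completion_time(site, task_bits, task_cycles):
    """Upload time + round-trip latency + compute time at the chosen site."""
    upload = task_bits / site.uplink_bps if site.uplink_bps > 0 else 0.0
    return upload + site.rtt_s + task_cycles / site.cpu_hz


def best_site(sites, task_bits, task_cycles):
    """Pick the site with the smallest estimated completion time."""
    return min(sites, key=lambda site: completion_time(site, task_bits, task_cycles))


# Invented figures: the edge is slower than the cloud but much closer.
sites = [
    Site("device", cpu_hz=1e9, uplink_bps=0, rtt_s=0.0),
    Site("edge", cpu_hz=10e9, uplink_bps=100e6, rtt_s=0.005),
    Site("cloud", cpu_hz=50e9, uplink_bps=20e6, rtt_s=0.080),
]

heavy_task = dict(task_bits=8e6, task_cycles=2e9)  # 1 MB input, compute-heavy
light_task = dict(task_bits=8e6, task_cycles=1e7)  # same input, little computation

for label, task in (("heavy", heavy_task), ("light", light_task)):
    choice = best_site(sites, **task)
    print(label, "->", choice.name, f"{completion_time(choice, **task):.3f} s")
```

Under these assumptions the compute-heavy task is best served by the edge server, while the light task is best kept on the device because the upload time dominates. Different link rates or task sizes would shift the decision.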
1.2.2 Distributed Caching
Similar to the distribution of computing resources, cache resources can be shared among network nodes and end devices for efficient utilization such that heterogeneous user request demands can be met. Multimedia content services (e.g., video streaming) have been shown to generate approximately 72% of global data traffic, and the most popular 20% of contents account for almost 80% of video data traffic [10, 15]. This means that user requests may overlap and repeat, which makes it possible to reduce the repeated data traffic caused by popular content requests. Heavy data traffic can be generated in the core network, and excessive delivery latency may be expected, when end devices always request content from the remote server holding the entire file library [16]. A simple idea is to cache popular content in network nodes (e.g., routers, base stations, access points, or end devices) in advance of user requests. Then, user requests can be served directly by network nodes with the desired content,
which would be much closer to the user compared to the remote server; thus, it can reduce the transmission delay. Similar to the distribution of computing resources, it is possible to configure a cache-based distributed network in which the storage is divided. Such a network can provide more effective services than centralized networks in situations where requests for content are made repeatedly, such as video streaming or content sharing. Limited distributed storage can be used efficiently by storing only the necessary data in advance. In addition, because service requests that would otherwise be concentrated on one central server are divided among multiple caches, simultaneous user requests can be handled more flexibly than with a central server [17, 18]. Distributed storage can be applied to elements constituting mobile networks, such as vehicles, UAVs, and road side units (RSUs). In particular, mobile caches can maximize performance in distributed networks when utilized in dynamic network environments that require real-time data sharing, such as those of autonomous vehicles.
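To illustrate why caching only the most popular content can absorb a large share of requests, the following Python sketch computes the expected cache hit ratio when a node stores the top-ranked items of a content library. It assumes that request popularity follows a Zipf distribution, which is a common modeling assumption and not something stated in this chapter; the library size, the exponent, and the function names `zipf_popularity` and `hit_ratio` are likewise illustrative.

```python
def zipf_popularity(num_items, exponent=1.0):
    """Request probability of each item, ranked from most to least popular."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_items + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


def hit_ratio(popularity, cache_size):
    """Expected fraction of requests served by a cache holding the most popular items."""
    return sum(sorted(popularity, reverse=True)[:cache_size])


# Hypothetical library of 1000 contents with Zipf exponent 1.0.
popularity = zipf_popularity(num_items=1000, exponent=1.0)

for fraction in (0.05, 0.2, 0.5):
    size = round(fraction * len(popularity))
    print(f"cache {fraction:.0%} of library -> hit ratio {hit_ratio(popularity, size):.2f}")
```

With these assumed parameters, caching the top 20% of the library yields a hit ratio of roughly 0.79, which is in line with the 20%/80% skew cited above. A flatter popularity profile (smaller exponent) would lower the hit ratio for the same cache size.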
1.2.3 Toward Network Disaggregation
Given the size of current data centers (DCs), it is easier to optimize them through disaggregation to increase efficiency. Disaggregated DCs make it easier to adopt the latest technologies because different resources are physically separated, and the infrastructure can also be customized to the workloads so that tasks are handled with maximum performance at a lower cost. As shown in Fig. 2, at a high level, disaggregation is implemented by using independent hardware blades for each resource, such as CPU, GPU, ASIC, memory, storage, and network interfaces [8].
Fig. 2 Physically disaggregated DC
2 Research Trends
2.1 Software Defined Network
The network can be divided into two planes: (1) a control plane that controls network equipment and (2) a data plane that transmits data. A conventional router performs the roles of both the control and data planes. However, this structure has the problems that each piece of hardware must be managed manually and that the entire structure is expensive. In addition, a conventional router depends on the physical equipment, whereas a dynamic network structure is required to support very large services with mixed traffic. To address these issues, the software-defined network (SDN), with a structure such as that in Fig. 3, was developed. SDN starts from the idea of virtualizing the network, and the router only plays the role of the data plane. The role of the control plane that controls the network equipment is performed centrally by a remote controller, which is physically far from the routers. In other words, the remote controller in SDN calculates the routing paths through software and distributes the optimal routing paths to the routers, such that all routers only transmit data according to the remote controller’s commands. Since a remote controller can manage multiple network devices through SDN, the cost of management and operation can be reduced. In addition, SDN has high scalability and flexibility because network resources can be expanded or reduced as needed. Although the quality of transmission (QoT) is evaluated based on the generalized signal-to-noise ratio (GSNR), the QoT of modern optical networks using advanced SDN technology is affected by network abstraction. Wavelength selective switches (WSSs) and bandwidth variable transponders (BVTs) use SDN technology for optical communication to select the most suitable modulation format for a given
Fig. 3 Architecture of SDN
light path (LP), making optical communication more flexible and versatile. Building on these advantages, BVTs have helped single-vendor optical communication systems evolve into multi-vendor systems and, finally, into disaggregated networks. In this development process, it is necessary to dynamically and accurately estimate the QoT for each LP. For this purpose, the QoT of disaggregated optical communication is evaluated using SDN implemented with SDN-enabled sliceable BVTs (S-BVTs) based on multicarrier modulation (MCM). In this case, not only can the SDNs of different service providers be used, but different BVTs can also be employed. In other words, the SDN approach offers the advantage of reducing the cost of dynamically allocating networks and reconfiguring transceivers. SDN research on wavelength division multiplexing (WDM) transmission is expected to make great strides in the foreseeable future [19, 20].
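The division of labor between the remote controller and the routers described earlier in this section can be sketched in a few lines of Python. In this sketch, the controller holds a global view of the topology, computes least-cost paths with Dijkstra's algorithm, and installs a next-hop table in every router; the routers then forward purely by table lookup. The four-node topology, the link costs, and the function names are hypothetical, link costs are assumed to be symmetric, and no real SDN controller API or southbound protocol is modeled.

```python
import heapq


def shortest_path_tree(graph, source):
    """Dijkstra: return {node: predecessor} for least-cost paths from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return prev


def build_forwarding_tables(graph):
    """Control plane: compute every router's next hop toward every destination."""
    tables = {router: {} for router in graph}
    for dst in graph:
        prev = shortest_path_tree(graph, dst)  # tree rooted at the destination
        for router, next_hop in prev.items():
            # With symmetric costs, the predecessor in the tree rooted at dst
            # is the neighbor one step closer to dst.
            tables[router][dst] = next_hop
    return tables


def forward(tables, src, dst):
    """Data plane: each router only looks up the next hop the controller installed."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path


# Hypothetical topology: router -> {neighbor: link cost}
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}

tables = build_forwarding_tables(graph)
print(forward(tables, "A", "D"))
```

Because all path computation sits in `build_forwarding_tables`, changing the routing policy or reacting to a link-cost change only requires recomputing and redistributing the tables, not touching the forwarding logic in the routers.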
2.2 Virtual Machine
A virtual machine (VM) is a virtual environment built on a physical hardware system. It has its own CPU, memory, storage, and network interfaces and acts as a virtual computer system, as shown in Fig. 4. A virtual machine is a computer implemented in software, but it runs applications and operating systems like a physical computer. In other words, a virtual machine behaves the same as a separate computer system.
Fig. 4 Architecture of virtual machine
The resources configured for each VM are not always the same as the actual hardware and can be used more efficiently according to the tasks performed by the VM. For example, the cost of managing the infrastructure of a network server can be reduced by separating the functionality from the hardware and reusing previously created VMs on another hardware system. The virtualization concept has improved in various ways over the years. The current approach is to package the data related to multiple applications and functions into a single container and then distribute it as a single entity on either physical or virtual machines. Several virtualized components have been used to handle various networking functions. This is called network functions virtualization (NFV), and the common and specialized networking functions virtualized in this way are called virtual network functions (VNFs). Virtualization capabilities for hardware and software disaggregation provide open-source characteristics, resiliency, location independence, elastic performance, and statistical multiplexing.
2.3 Network Disaggregation
Since both centralized and distributed networks manage all of the software and data on one server, if a specific server component is defective or if the capacity of a specific resource is insufficient, the entire server system will be affected. These existing network structures incur the costs necessary for network maintenance and management and are inefficient in terms of network design and complicated operations. Therefore, a new method for network construction is proposed as a more advanced solution. The network disaggregation method decomposes a typical server structure into modules that can be selected adaptively as needed. This makes it possible to design an innovative network structure that breaks away from the existing one, for example by partially reusing a specific module. Network disaggregation separates and modularizes server components (resources) based on the types of hardware, software, and operating systems. Compared with the aforementioned traditional network paradigm, a disaggregated network consists of basic component units organized by resource type. This results in flexibility, scalability, and resource reuse efficiency for network configurations. A detailed description of this technique is provided in Sect. 3.
3 Technical Details
3.1 Categorized by Disaggregation Levels
According to the level of resource disaggregation, DCs can be subdivided into partially disaggregated and fully disaggregated DCs.
• Partially Disaggregated: Recently, storage has been separated from the central server to use partially disaggregated resources. The storage is interconnected with the rest of the computing resources (i.e., CPU and memory) through a switching fabric. This may require a network interface card (NIC) to support storage communication. However, since CPU and memory are still integrated into the computing node, their usage is limited in a partially disaggregated environment. Low resource utilization in modern DCs can be related to the mismatch between the diverse resource usage of running applications and the fixed amount of resources integrated into the physical blade servers in the DC. In addition, integrating all resources within a server chassis makes it impossible to change or upgrade only one or a few types of resources. To overcome these limitations, a fully disaggregated structure has emerged, which eliminates the physical boxes that unite different types of resources and instead organizes resources into constitutive units, such as resource blades, racks, or clusters [21].
• Fully Disaggregated: As previously discussed, a fully disaggregated architecture allows DC operators to individually replace or upgrade resources as needed, improving resource utilization and portability. However, disaggregated DC designs have tended to disregard the limited transmission capacity available when communicating different types of data between resources such as the CPU and the memory. Furthermore, in recent computer architectures, the latency required for communication varies across different types of computer resources, ranging from milliseconds down to nanoseconds. Additionally, the maximum bandwidth, which represents the amount of data that can be processed, ranges from a few bits to hundreds of gigabits per second. The performance of the running process is severely degraded if these requirements are not met. An optical communication system can provide bandwidth in the corresponding range, but this bandwidth is not unlimited. Thus, research on satisfying the above bandwidth and latency requirements is currently in progress [21].
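The utilization mismatch mentioned in the list above can be illustrated with a toy placement exercise. The Python sketch below admits the same set of workloads once onto integrated servers, where a workload's CPU and memory must come from the same box, and once onto disaggregated pools with the same total capacity. The first-fit policy, the workload mix, the capacities, and the function names are all illustrative assumptions, and the bandwidth and latency limits of the interconnect discussed above are deliberately ignored.

```python
def place_integrated(workloads, num_servers, cpu_per_server, mem_per_server):
    """First-fit placement: a workload's CPU and memory must come from one server."""
    free = [[cpu_per_server, mem_per_server] for _ in range(num_servers)]
    placed = 0
    for cpu, mem in workloads:
        for server in free:
            if server[0] >= cpu and server[1] >= mem:
                server[0] -= cpu
                server[1] -= mem
                placed += 1
                break
    return placed


def place_disaggregated(workloads, total_cpu, total_mem):
    """Pooled placement: CPU and memory are drawn from independent resource pools."""
    placed = 0
    for cpu, mem in workloads:
        if total_cpu >= cpu and total_mem >= mem:
            total_cpu -= cpu
            total_mem -= mem
            placed += 1
    return placed


# Hypothetical demands as (CPU cores, memory in GB): three CPU-heavy, one memory-heavy.
workloads = [(10, 10), (10, 10), (10, 10), (2, 90)]

# Two integrated servers of 16 cores / 64 GB each versus pools of 32 cores / 128 GB.
integrated = place_integrated(workloads, num_servers=2, cpu_per_server=16, mem_per_server=64)
pooled = place_disaggregated(workloads, total_cpu=32, total_mem=128)
print("integrated servers admit", integrated, "workloads")
print("disaggregated pools admit", pooled, "workloads")
```

In this example the integrated servers strand 12 cores across two boxes and cannot host a workload that needs more memory than a single chassis holds, whereas the pools admit all four workloads with the same total hardware.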
3.2 Categorized by Sizes
In a partially disaggregated network, the CPU and the memory are still consolidated into computing nodes; therefore, their usage is restricted. Recently, a fully disaggregated network architecture has emerged to compensate for this shortcoming. In a fully disaggregated network, resources of the same type are organized into units (i.e., resource blades, racks, and clusters) to communicate with each other without the physical boxes that aggregate different types of resources [22]. Disaggregation of server-specific resource components can be implemented at various scales [23]. These range from the rack scale to the pod (cluster) scale with multiple racks to the DC scale with multiple clusters within a DC [21].
• Rack-Scale: In the rack scale, functional disaggregation is often considered first since the integrated servers are replaced by resource blades. Due to the short distance, low propagation delay and large capacity can be easily realized
Fig. 5 Rack scale with all optical interconnection
Fig. 6 Rack scale with hybrid optical interconnection
with functional disaggregation. Figures 5 and 6 show two architectures for implementing a rack-scale fully disaggregated DC. These two architectures are distinguished by the type of interconnection: all-optical or hybrid. As shown in the figures, the different types of resources are completely separated. While server blades consist of all types of resources, resource blades interconnected
Fig. 7 DC architecture in rack scale and pod scale
by the optical interfaces (OIs) are composed of only one type. In a rack scale, different resource blades are connected through optical interconnections or electronic switches to communicate with each other. In addition, minimizing the latency of the OIs is necessary. Interconnected nodes in the rack scale must supply sufficiently high bandwidth with minimal latency. To form a computing system, physically disaggregated resources in the rack scale are integrated into nodes, and the nodes are allocated to racks, as shown in Fig. 7. In this case, resources can only be used by other resources in the same rack. When physical resource disaggregation is implemented at rack scale, each node placed in a rack-scale DC comprises homogeneous resources, but different nodes in the rack can still hold different resource types. In each rack, three nodes hold one resource type (CPU), and the other three hold another type (memory). Finally, a rack-scale DC consists of two heterogeneous resource racks assigned to one pod. Low bandwidth traffic traverses a high layer of the network topology, while high bandwidth traffic is limited to the rack. In the traditional DC, high bandwidth inter-component traffic is node-limited (i.e., it is restricted to the onboard backplane of servers). By contrast, inbound and outbound traffic of the DC (traffic between DC resources and remote systems) flows through higher tiers of each network topology. However, high bandwidth
traffic between the CPU and memory components is rack-limited in the rack-scale DC. Low bandwidth DC north-south communication traverses higher tiers of the adopted network topology. Using an optical network topology for a rack-scale DC is energy efficient [22].
• Pod Scale: The pod scale is a cluster unit consisting of multiple racks and has a computer system architecture in which nodes of the same resource type are allocated to a rack. Then, the racks of different types of resources are allocated to a pod level, as shown in Fig. 7. Resources in each rack in a pod are only accessible inside the domain of the pod. In the physical disaggregation implemented at the pod scale, each rack of a pod-scale DC is composed of only one type of resource: CPU or memory. This means that a rack would have six identical CPU or memory resource nodes. Finally, each pod would consist of one CPU resource rack and one memory resource rack. Therefore, a logical server can only be formed at the pod level. In a pod-scale DC, traffic between CPU and memory is handled differently depending on bandwidth. High-bandwidth traffic is restricted to pods, while low-bandwidth traffic is unrestricted and traverses all layers of the network topology. Therefore, the power consumption of the network topology in a pod-scale DC is relatively high compared to the rack-scale DCs discussed previously. However, compared to traditional and rack-scale DCs, pod-scale DCs have the advantage that the network topology does not inhibit the optimal selection of CPU and memory resources. This is because pod-scale DCs use racks of homogeneous resources, as in Fig. 7b, so all CPU-memory traffic must traverse the inter-rack fabric regardless of which resources are selected. On the other hand, as observed in the rack-scale DC, network power consumption resulting from shuffling memory data between workloads is also significantly limited in the pod-scale DC. This can be achieved by placing the memory resource demands of workloads of the same workload group in the same component or node. However, clustering of workload memory resource demands into memory components or nodes also depends on the memory capacity constraint, the power consumption of memory components, and their corresponding impact on the total DC power consumption [22].
• DC Scale: DC scale computing with multiple clusters is characterized by large volumes of data ranging from petabytes to exabytes, latencies below a few hundred microseconds, and various data stores and processing frameworks such as NoSQL, MapReduce, and Spark/Hadoop in cloud environments. However, these cloud environments are limited by the dynamic characteristics of workloads, different innovation cycles for different system hardware components, and rapidly changing system configuration requirements. A disaggregated DC consists of a collection of individual resources such as CPUs, HDDs, and memory, which can also be organized into units of workload execution depending on the user’s needs.
These disaggregated DCs are evolving technologies that can overcome the aforementioned constraints [24].
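The placement rules for the rack, pod, and DC scales described above can be sketched in a few lines of Python. This is only an illustrative model, not the formulation used in [22]: the `Node` type, the `free` capacity units, and the helper names are assumptions introduced here, and the two layouts simply mirror the six-node racks described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    pod: int
    rack: int
    kind: str   # "cpu" or "memory"
    free: int   # free capacity in arbitrary units (hypothetical)


def can_pair(cpu, mem, scale):
    """Return True if a CPU node and a memory node may form one logical server."""
    if cpu.kind != "cpu" or mem.kind != "memory":
        return False
    if scale == "rack":   # resources are only usable inside their own rack
        return (cpu.pod, cpu.rack) == (mem.pod, mem.rack)
    if scale == "pod":    # resources are usable anywhere inside their pod
        return cpu.pod == mem.pod
    if scale == "dc":     # resources are usable anywhere in the DC
        return True
    raise ValueError(f"unknown scale: {scale}")


def form_logical_server(nodes, cpu_demand, mem_demand, scale):
    """Return the first (cpu, memory) pair that satisfies the demand, or None."""
    cpus = [n for n in nodes if n.kind == "cpu" and n.free >= cpu_demand]
    mems = [n for n in nodes if n.kind == "memory" and n.free >= mem_demand]
    for cpu in cpus:
        for mem in mems:
            if can_pair(cpu, mem, scale):
                return cpu, mem
    return None


def rack_scale_layout():
    """Two racks in one pod, each with three CPU nodes and three memory nodes."""
    kinds = ["cpu"] * 3 + ["memory"] * 3
    return [Node(0, rack, kind, 8) for rack in (0, 1) for kind in kinds]


def pod_scale_layout():
    """One pod with a rack of six CPU nodes and a rack of six memory nodes."""
    return [Node(0, 0, "cpu", 8)] * 6 + [Node(0, 1, "memory", 8)] * 6
```

In the pod-scale layout, any pair returned for `scale="pod"` spans two racks, which reflects why all CPU-memory traffic crosses the inter-rack fabric in that design.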
4 Open Issues and Summary
In this section, we discuss research directions that can extend network disaggregation technology.
4.1 Combining SDN and Disaggregation
Among the technologies that meet the requirements of optical communication for 5G and beyond, the core technologies are SDN and disaggregated networks. An S-BVT controlled through SDN offers large capacity and high scalability, and it enables disaggregation through SDN. Implementing novel, spectrally efficient, and adaptive transmission solutions based on modular programmable S-BVTs enhances capacity, flexibility, and scalability. Multiple slices/signals can be transmitted either as a single high-capacity data flow or as independent flows reaching different destination nodes, which enhances the dynamic nature of the network. The sliceability and modularity of the transceivers are essential for adapting the system to network requirements following a pay-as-you-grow approach: capacity can be increased by enabling more slices, which fosters network agility. The S-BVT is particularly suitable for distributed optical networks in 5G environments because it can be configured with different BVTs, which facilitates a fully disaggregated environment. SDN agents can integrate with disaggregated transceivers, remarkably reducing the pre-configuration time on fiber-optic channels. Therefore, the S-BVT architecture is crucial for optical communication [20]. Moreover, a soft failure can be addressed without the help of an expert who can repair the device. Detecting soft failures is crucial because it can prevent network devices from malfunctioning. Recently, various studies have been conducted to effectively detect soft failures, even in disaggregated networks [25, 26]. SDN is used to remotely detect soft failures in disaggregated networks, where network components such as reconfigurable optical add-drop multiplexers (ROADMs) or optical amplifiers come from different suppliers but are subject to the same SDN control within their transparent optical domains.
The fully disaggregated architecture shown in Fig. 8 performs remote soft-failure detection across three hierarchical levels defined according to the OpenROADM framework: (1) an SDN-based network controller (NC) that configures and manages the entire network area; (2) a network element controller (NEC) composed of a white box; and (3) a device agent (DA) that controls a disaggregated optical device, such as a BVWSS, line power amplifier, or transponder. Inspection and control of the SDN state
Fig. 8 Remote detection service for fully disaggregated network
occur along this NC-NEC-DA chain. At each specified time interval, the average performance parameters of the LP and the signal are measured. If the value of a parameter deviates from its normal operating value by more than a certain threshold, the system determines that a fatal error is likely to occur and delivers a warning notification asynchronously. After the notification is sent, the following actions occur in order: (1) the OAM handler creates a set of telemetry subscription instances for granular live monitoring; (2) the NEC uses its gRPC server and NE handler to send subscription requests to the target DA through NE gRPC clients; and (3) the gRPC server of the DA provides a real-time streaming service based on monitoring data received directly from the hardware drivers [7].
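The detection-then-subscription flow described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation in [7]: the class and function names are hypothetical, plain Python generators stand in for the gRPC streaming service, and the monitored parameter (received optical power in dBm) and its threshold are invented for the example.

```python
from statistics import mean


def deviates(samples, nominal, threshold):
    """Return True if the interval average strays from nominal by more than threshold."""
    return abs(mean(samples) - nominal) > threshold


class DeviceAgent:
    """Stand-in for a DA; a real agent would stream over gRPC from hardware drivers."""

    def __init__(self, device_id, readings):
        self.device_id = device_id
        self._readings = list(readings)

    def stream(self):
        # A generator is used here in place of a server-side streaming RPC.
        for value in self._readings:
            yield {"device": self.device_id, "value": value}


class NetworkElementController:
    """Stand-in for the NEC: forwards subscription requests to the target DA."""

    def __init__(self, agents):
        self._agents = {agent.device_id: agent for agent in agents}

    def subscribe(self, device_id):
        return self._agents[device_id].stream()


def handle_interval(nec, device_id, samples, nominal, threshold):
    """Check one measurement interval; on a warning, open a live telemetry stream."""
    if not deviates(samples, nominal, threshold):
        return None
    return nec.subscribe(device_id)


# Hypothetical example: nominal power of -10 dBm with a 1 dB tolerance.
nec = NetworkElementController([DeviceAgent("amp-1", [-12.1, -12.3, -12.2])])
healthy = handle_interval(nec, "amp-1", [-10.1, -9.9, -10.0], -10.0, 1.0)
degraded = handle_interval(nec, "amp-1", [-12.0, -11.5, -12.5], -10.0, 1.0)
live_values = [message["value"] for message in degraded]
```

Coarse interval averages are cheap to collect continuously, while the fine-grained stream is only opened once a deviation is seen, which is the point of separating the two stages.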
4.2 Security for Disaggregated Network
Security for partially and fully disaggregated non-volatile memory (NVM) is an important consideration in the edge cloud. This is because disaggregated networks are dynamic multi-tenant software architectures, where cryptographic acceleration, policy engines, and enforcement control are integrated with networking. Distributed ID management infrastructure is also important. Since data are persistently stored in memory, new mechanisms for controlling user access are required to avoid the possibility of crosstalk caused by an electric or magnetic field in a communication
signal that affects other signals on adjacent lines. Additionally, ubiquitous data encryption that is accelerated on large-scale hardware requires new technologies for local services and distributed ID management [6]. Companies that provide real-time disaggregation capabilities use cloud services that can help protect their data. Because disaggregation algorithms are complex, future research should develop relatively simple algorithms that provide security in a real-time disaggregation environment. Furthermore, there is an emerging need for research on personal information protection technologies suited to disaggregation algorithms that use cloud services [27].
References
1. Ju-Hyung Lee, Ki-Hong Park, Mohamed-Slim Alouini, and Young-Chai Ko. Free space optical communication on UAV-assisted backhaul networks: Optimization for service time. In 2019 IEEE Globecom Workshops (GC Wkshps), pages 1–6. IEEE, 2019.
2. Wanmei Feng, Nan Zhao, Shaopeng Ao, Jie Tang, XiuYin Zhang, Yuli Fu, Daniel KC So, and Kai-Kit Wong. Joint 3D trajectory design and time allocation for UAV-enabled wireless power transfer networks. IEEE Transactions on Vehicular Technology, 69(9):9265–9278, 2020.
3. Bin Li, Zesong Fei, and Yan Zhang. UAV communications for 5G and beyond: Recent advances and future trends. IEEE Internet of Things Journal, 6(2):2241–2263, 2018.
4. E. Yaacoub and M. Alouini. A key 6G challenge and Opportunity–Connecting the base of the pyramid: A survey on rural connectivity. Proceedings of the IEEE, 108(4):533–582, Apr. 2020.
5. Zhengquan Zhang, Yue Xiao, Zheng Ma, Ming Xiao, Zhiguo Ding, Xianfu Lei, George K Karagiannidis, and Pingzhi Fan. 6G wireless networks: Vision, requirements, architecture, and key technologies. IEEE Vehicular Technology Magazine, 14(3):28–41, 2019.
6. Luis M Vaquero, Yehia Elkhatib, and Felix Cuadrado. Disaggregated memory at the edge. arXiv preprint arXiv:2102.03124, 2021.
7. Francesco Paolucci, Andrea Sgambelluri, Filippo Cugini, and Piero Castoldi. Network telemetry streaming services in SDN-based disaggregated optical networks. Journal of Lightwave Technology, 36(15):3142–3149, 2018.
8. Sangjin Han, Norbert Egi, Aurojit Panda, Sylvia Ratnasamy, Guangyu Shi, and Scott Shenker. Network support for resource disaggregation in next-generation datacenters. In Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks, pages 1–7, 2013.
9. Seongah Jeong, Osvaldo Simeone, and Joonhyuk Kang. Mobile edge computing via a UAV-mounted cloudlet: Optimization of bit allocation and path planning. IEEE Transactions on Vehicular Technology, 67(3):2049–2063, 2017.
10. Evangelos K Markakis, Kimon Karras, Anargyros Sideris, George Alexiou, and Evangelos Pallis. Computing, caching, and communication at the edge: The cornerstone for building a versatile 5G ecosystem. IEEE Communications Magazine, 55(11):152–157, 2017.
11. Leucio Antonio Cutillo, Refik Molva, and Thorsten Strufe. Privacy preserving social networking through decentralization. In 2009 Sixth International Conference on Wireless On-Demand Network Systems and Services, pages 145–152, 2009.
12. S. Jeong, O. Simeone, and J. Kang. Mobile edge computing via a UAV-mounted cloudlet: Optimization of bit allocation and path planning. IEEE Trans. Veh. Technol., 67(3):2049–2062, Mar. 2018.
13. Won Joon Yun, Soyi Jung, Joongheon Kim, and Jae-Hyun Kim. Distributed deep reinforcement learning for autonomous aerial eVTOL mobility in drone taxi applications. ICT Express, 7(1):1–4, 2021.
14. Shinyoung Ahn, Joongheon Kim, Eunji Lim, and Sungwon Kang. Soft memory box: A virtual shared memory framework for fast deep neural network training in distributed high performance computing. IEEE Access, 6:26493–26504, 2018.
15. Minseok Choi, Myungjae Shin, and Joongheon Kim. Dynamic video delivery using deep reinforcement learning for device-to-device underlaid cache-enabled internet-of-vehicle networks. Journal of Communications and Networks, 23(2):117–128, 2021.
16. Adeel Malik, Joongheon Kim, Kwang Soon Kim, and Won-Yong Shin. A personalized preference learning framework for caching in mobile networks. IEEE Transactions on Mobile Computing, 20(6):2124–2139, 2020.
17. Joongheon Kim, Giuseppe Caire, and Andreas F Molisch. Quality-aware streaming and scheduling for device-to-device video delivery. IEEE/ACM Transactions on Networking, 24(4):2319–2331, 2015.
18. Minseok Choi, Joongheon Kim, and Jaekyun Moon. Wireless video caching and dynamic streaming under differentiated quality requirements. IEEE Journal on Selected Areas in Communications, 36(6):1245–1257, 2018.
19. Andrea D'Amico, Elliot London, Emanuele Virgillito, Antonio Napoli, and Vittorio Curri. Quality of transmission estimation for planning of disaggregated optical networks. In 2020 International Conference on Optical Network Design and Modeling (ONDM), pages 1–3. IEEE, 2020.
20. Laia Nadal, Michela Svaluto Moreolo, José Alberto Hernández, Josep M Fabrega, Ramon Casellas, Raul Muñoz, Ricard Vilalta, Laura Rodríguez, F Javier Vílchez, and Ricardo Martínez. SDN-enabled S-BVT for disaggregated networks: design, implementation and cost analysis. Journal of Lightwave Technology, 38(11):3037–3043, 2020.
21. Rui Lin, Yuxin Cheng, Marilet De Andrade, Lena Wosinska, and Jiajia Chen. Disaggregated data centers: Challenges and trade-offs. IEEE Communications Magazine, 58(2):20–26, 2020.
22. Opeyemi O Ajibola, Taisir EH El-Gorashi, and Jaafar MH Elmirghani. Energy efficient placement of workloads in composable data center networks. Journal of Lightwave Technology, 39(10):3037–3063, 2021.
23. Opeyemi O Ajibola, Taisir EH El-Gorashi, and Jaafar MH Elmirghani. Disaggregation for energy efficient fog in future 6G networks. arXiv preprint arXiv:2102.01195, 2021.
24. Chung-Sheng Li, Hubertus Franke, Colin Parris, Bulent Abali, Mukil Kesavan, and Victor Chang. Composable architecture for rack scale big data computing. Future Generation Computer Systems, 67:180–193, 2017.
25. Patrick Iannone, Alan H Gnauck, Michael Straub, Jörg Hehmann, Lothar Jentsch, Thomas Pfeiffer, and Mark Earnshaw. An 8 × 10-Gb/s 42-km high-split TWDM PON featuring distributed Raman amplification and a remotely powered intelligent splitter. Journal of Lightwave Technology, 35(7):1328–1332, 2017.
26. Zhenhua Dong, Faisal Nadeem Khan, Qi Sui, Kangping Zhong, Chao Lu, and Alan Pak Tao Lau. Optical performance monitoring: A review of current and future technologies. Journal of Lightwave Technology, 34(2):525–543, 2016.
27. Anthony Faustine, Nerey Henry Mvungi, Shubi Kaijage, and Kisangiri Michael. A survey on non-intrusive load monitoring methodies and techniques for energy disaggregation problem. arXiv preprint arXiv:1703.00785, 2017.
AI-Native Communications
Hankyul Baek, Haemin Lee, Soohyun Park, Hyunsoo Lee, Jihong Park, and Joongheon Kim
1 Introduction
Over the last few years, we have moved beyond 4G into the era of 5G. In addition, due to the rapid development of communication networks, many researchers in academia and industry aim to establish standards for 6G, the next-generation network. 6G has many properties, but the most important is that AI is used to build the communication network. As the 6G network evolves into a programmable and flexible cloud-native implementation, ML/AI-based network optimization will be a key feature of the 6G network. In this chapter, we review these methods and analyze them to identify a suitable optimization method for each communication system. As 5G technology matures, studies looking beyond 5G toward 6G are being actively conducted. Existing communication methods are far faster than their predecessors. However, data sizes have increased accordingly, and new environments have emerged. Thus, communication optimization for 6G is essential. The major characteristic of 6G communication is that AI is applied to the communication system. From optimizing a specific communication system module to replacing the entire communication system with AI, AI-driven innovation of the communication system is advancing steadily on the basis of many studies. Note that AI does not always guarantee better or safer results than existing algorithms in terms of performance
H. Baek · H. Lee · S. Park · H. Lee · J. Kim, Korea University, Seoul, South Korea
J. Park, Deakin University, Geelong, VIC, Australia
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_21
Fig. 1 The structure of deep residual learning-based channel estimator
or cost. However, numerous studies have corroborated that AI-based techniques perform on par with legacy model-based ones, even though the degree to which AI is engaged has so far varied with the methodology. In this section, we go one step beyond the AI network itself, covered in previous sections, and focus on AI-based communication. First, we describe AI-based communication by dividing it into two stages, i.e., AI-aided and AI-native. Several studies show that the AI-aided method is practical and feasible: with it, we can optimize an individual communication module through an AI optimization process. However, optimizing only a specific module is far from optimizing communication as a whole. With the AI-native method, we can optimize the entire communication pipeline, but it cannot yet be asserted that this is practical with current technology. Many studies are trying different approaches to make AI-native communication feasible. Therefore, we discuss what kind of research is being done and what challenges must be solved to achieve AI-native communication.
2 Two Steps Toward the AI-Native
First, we describe the difference between the AI-aided and AI-native methods. An AI-aided method aims to optimize specific communication modules. An AI-native method optimizes the entire network structure by replacing it with an AI structure. Figure 1 shows the structure of a simple communication example. Each module can be optimized via AI, so AI-aided methods can be broadly implemented in this structure. Therefore, when we choose the AI-aided method, we can optimize the system by optimizing each module. However, this method is inefficient because it is hard to guarantee that the resulting overall system is optimal. Therefore, we consider the AI-native method. Note that it is essential to ensure that the
AI-aided optimization method is in place before the AI-native method is used. After successfully implementing the first step, complex signal processing blocks in the network, such as joint channel estimation, data detection, and synchronization, can be replaced or optimized by AI. At this point, if the AI-aided optimization method performs well, one may ask, “Is the AI-native optimization method essential?” The AI-native method remains relevant because AI-aided optimization may face limitations in solving some problems, such as data processing, data collection, and the convergence of training algorithms. To overcome these limitations, the AI-native optimization method is attracting much attention. AI-native optimization methods replace more modules, even entire communication layers, with AI. In a general communication chain, AI-native networks can replace the encoding, mapping, and modulation modules on the transmitter side and the synchronization, channel estimation, equalization, demapping, and decoding modules on the receiver side with AI. End-to-end training is one of the most efficient ways to replace an entire network with AI rather than just replacing multiple modules. Through end-to-end learning, AI learns communication environment parameters (e.g., small-scale fading and large-scale fading) and module functions such as modulation methods to replace the entire network. Therefore, to utilize these methods properly, this section describes previous studies on both AI-aided and AI-native communication for the physical (PHY) layer of the network.
2.1 AI-Aided/AI-Native Network for PHY Layer
The existing PHY layer is designed based on mathematical models. Accordingly, its optimization is hampered by the randomness of the channel environment. In particular, several essential technologies, such as massive multiple-input multiple-output (MIMO) and millimeter-wave (mmWave), aim to achieve a thousand times larger capacity, sub-millisecond latency, and massive connectivity. For these sophisticated techniques, the randomness of the channel and the fact that general mathematical models do not capture the channel characteristics are major problems. Accordingly, there is a need for a PHY layer that is robust to channel variation and can provide optimal performance both in specific environments and under fluctuating channel conditions. In addition, the following requirements must be satisfied to support large-capacity data and high-speed communication.
• Requirements for effective and fast signal processing
• Requirements for effective channel modeling
Machine learning (ML), deep learning (DL), and even AI-native methods are applied to the PHY layer to satisfy these requirements.
Fig. 2 The structure of deep residual learning-based channel estimator
2.1.1 AI-Aided Network for PHY Layer
Using a traditional PHY layer can introduce imperfections that require sophisticated receiver processing algorithms for tasks such as channel estimation and detection. Therefore, many researchers in academia and industry apply AI-aided networks to the PHY layer to overcome these disadvantages. Research in [1] is a good example of raising the performance of the PHY layer using the AI-aided optimization method. The research aims to mitigate the effects of pilot contamination in the presence of residual hardware impairments (RHWIs) using AI-aided approaches. The research in [1] proposed a conventional linear minimum mean square error (LMMSE)-based channel estimator. The channel estimator tries to alleviate the distortion noise caused by pilot contamination by extracting and learning the distortion noise from the received signal. Figure 2 illustrates the structure of the proposed AI-aided channel estimator. The estimator contains two parts: (i) a noise estimator and (ii) a denoising module. The noise estimator uses multiple convolutional layers with activation functions to extract and construct the feature space of the distortion noise. To this end, the research treats the signal as an image and adopts convolutional layers to extract its features. The denoising module in Fig. 2 then uses the output of the noise estimator: it approximates the channel by subtracting the estimated distortion noise from the preprocessed signal. With the AI-aided channel estimator, the estimated channel can thus be obtained. In addition, AI-aided optimization methods have been applied to many modules within the PHY layer, including modulation recognition [4–7] and channel decoding [8–10].
Beamforming refers to the technique of properly focusing a beam such that the user obtains high throughput. Deep learning methods can be applied to support beamforming for multiple users. Since there is a limitation of channel power, a method of configuring an orthogonal random beam and allocating a beam to a user is proposed [13].
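The two-part structure of the AI-aided channel estimator described in this subsection (a noise estimator followed by a denoising subtraction) can be illustrated with a short sketch. It shows only the residual data flow and is not the estimator of [1]: the trained convolutional network is replaced by a hypothetical moving-average placeholder, the signal is real-valued for simplicity, and all function names and toy values are assumptions.

```python
def noise_estimator(preprocessed):
    """Hypothetical placeholder for the trained convolutional noise estimator.

    Estimates the distortion at each sample as its deviation from a 3-tap
    moving average (the window is shorter at the two edges).
    """
    n = len(preprocessed)
    estimate = []
    for i in range(n):
        window = preprocessed[max(0, i - 1):min(n, i + 2)]
        estimate.append(preprocessed[i] - sum(window) / len(window))
    return estimate


def denoise(preprocessed):
    """Residual step: channel estimate = preprocessed signal - estimated noise."""
    noise_hat = noise_estimator(preprocessed)
    return [y - n for y, n in zip(preprocessed, noise_hat)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy data: a flat channel gain of 1.0 with an alternating +/-0.2 distortion.
true_channel = [1.0] * 8
preprocessed = [h + (0.2 if i % 2 == 0 else -0.2) for i, h in enumerate(true_channel)]
estimate = denoise(preprocessed)
```

In a learned version, only `noise_estimator` would change; the subtraction in `denoise` is what makes the network learn the distortion rather than the channel itself.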
Fig. 3 Structure of autoencoder (AE)
Fig. 4 General structure of AE
2.1.2 AI-Native Network for PHY Layer
On the other hand, several studies have sought to optimize the PHY layer by implementing an AI-native module that replaces the network with AI. A common feature of AI-native network optimization research is that it treats the task as an end-to-end reconstruction problem. Many studies on AI-native networks use the autoencoder (AE) to solve the task. Research in [3] suggests that an AI-native communication system could be built by employing an AE in the network. Figure 3 illustrates a brief structure of an autoencoder. In a communication system that transmits the message s, the transmitter maps s to the transmitted signal x, and the channel turns x into the received signal y. Therefore, the network receiver receives the modulated message, y. During communication, the receiver can use the AE to reconstruct the original message with a low error rate. The entire AE-based communication system, as in Fig. 4, is trained to optimize end-to-end performance. The AE architecture can process IQ samples to solve other communication problems in the physical layer, such as pulse shaping and offset compensation. It can be applied to complex scenarios where the communication channel is unknown. Research in [2] proposed a training method for implementing AI-native networks in practice. The AE is again adopted to substitute the entire network, and the network is trained based on the concept of transfer learning [12]. Figure 5 shows a brief training process. First, the AE is trained on a stochastic channel model. Then, the receiver is fine-tuned to compensate for the mismatch between the channel model and the actual channel. With the AE and this training algorithm, the entire physical layer processing can be replaced with
Fig. 5 Two-phase training
DNNs. Research using AI-native methods in other ways is also being actively conducted. For example, research in [11] replaces the entire receiver module with AI. A CNN-based AI-native receiver transforms frequency-domain signals into decoded bits. With an AI-native receiver, a single receiver can cover large parts of the network and is robust across various scenarios and parameters. In addition, several operations in the PHY layer, such as equalization, channel estimation, interpolation, demodulation, and signal demapping, are replaced by the DNN. The conventional communication system comprises several artificially defined processing blocks (e.g., source coding, modulation, and detection) [14]. Some problems, such as bottlenecks in complex communication environments, can be managed using the conceptual simplicity of deep learning. However, the optimization is then limited to each block and is insufficient for optimizing the overall system [14]. By using AE networks, which consist of an encoding-decoding network with a class of trainable and differentiable transformation functions, the communication system with a transmitter and receiver can be substituted. Moreover, the complete system is optimized jointly for any differentiable end-to-end performance metric rather than following the conventional block structure [14]. Designing an appropriate channel model is crucial for this system optimization: if a channel model is not available, it is not easy to update the transmitter parameters. Since numerous impairments are present, such as additive noise, carrier frequency and phase offset, sampling frequency offset, and multi-path fading, determining the correct channel transfer function in a real communication system is challenging [15–21]. To address these issues, channel AEs can be categorized into two groups: (1) model-assumed channel AE and (2) model-free channel AE.
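The signal flow of an AE-based link described above (message s, transmitted signal x, received signal y, reconstructed message) can be sketched as a forward pass. This is an untrained illustration under stated assumptions, not the system of [2] or [3]: the encoder is a fixed lookup table instead of a learned network, the decoder is a nearest-codeword stand-in for a learned DNN, the channel is a simple AWGN model, and all names and values are hypothetical. In an actual channel AE, the encoder table and decoder would be learned jointly by minimizing an end-to-end loss.

```python
import math
import random


def normalize(vec):
    """Scale vec so that its energy equals its length (unit average power)."""
    energy = sum(v * v for v in vec)
    scale = math.sqrt(len(vec) / energy)
    return [v * scale for v in vec]


class Transmitter:
    """Encoder stand-in: maps a message index s to a power-normalized signal x."""

    def __init__(self, table):
        self.codebook = [normalize(row) for row in table]

    def encode(self, s):
        return self.codebook[s]


def awgn(x, sigma, rng):
    """Channel model: y = x + Gaussian noise with standard deviation sigma."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


class Receiver:
    """Decoder stand-in: returns the index of the codeword closest to y."""

    def __init__(self, codebook):
        self.codebook = codebook

    def decode(self, y):
        dists = [sum((yi - ci) ** 2 for yi, ci in zip(y, c)) for c in self.codebook]
        return dists.index(min(dists))


def block_error_rate(tx, rx, sigma, trials, seed=0):
    """Simulate s -> x -> y -> s_hat and return the fraction of wrong decisions."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        s = rng.randrange(len(tx.codebook))
        s_hat = rx.decode(awgn(tx.encode(s), sigma, rng))
        errors += int(s_hat != s)
    return errors / trials


# Four messages sent over two real channel uses (hypothetical, untrained table).
tx = Transmitter([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
rx = Receiver(tx.codebook)
```

The power normalization plays the role of the transmitter's energy constraint; without it, a trained encoder could reduce the error rate simply by increasing transmit power.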
The model-assumed channel AE offers a general method to address the aforementioned impairments, based on the radio transformer network (RTN). The RTN upgrades the receiver by adding a parameter estimation network and a transformation layer before the decoding network. The transformation layer applies to the received signal a deterministic function, parameterized by the output of the parameter estimation network, to correct the modeled impairments. The parameter estimation network learns these parameters from the received signal [14]. Therefore, the decoding network can retrieve the original message more reliably from a wide range of received signals. Even though the model-assumed channel AE can model the aforementioned impairments, the simplified channel model has a limitation: its inconsistencies with the real world can mislead the trained system [14, 15, 17, 19]. Some real-world complexities cannot be accurately represented in a differentiable way. By using a two-phase training technique, the model-free channel AE addresses this problem of the model-assumed channel AE. In the model-free channel AE, we have to solve the issue that the transmitter parameters cannot be updated because the transmitter gradients are not available over the real channel [14]. Therefore, it is suggested that a gradient-generating network (GGN) propagate gradients to the transmitter.
• RL-based approximation: The transmitter acts as an agent, the action is the transmitted signal, and the state is the message [22]. The end-to-end loss calculated at the receiver is fed back as the reward from the environment, which the transmitter uses to adjust its policy. After the receiver has been trained under supervised training, reinforcement learning (RL) trains the transmitter. The policy added to the transmitter contains trainable parameters that are directly optimized through the reward [14, 22].
• SPSA-based approximation: Simultaneous perturbation stochastic approximation (SPSA) is a gradient-free optimization method that can be applied directly to channel AE optimization without knowledge of the precise channel model [23]. SPSA approximates the gradient of the loss function with respect to the AE parameters from messages delivered from the transmitter to the receiver across the channel. In this way, all the parameters of the encoding and decoding networks can be updated in a single step without knowing the channel transfer function [14, 23].
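The SPSA-based approximation in the list above can be made concrete with a small sketch. It is a generic SPSA loop under simplifying assumptions, not the procedure of [23]: the gains `a` and `c` are held constant (standard SPSA uses decaying gain sequences), and the black-box `loss` below is a hypothetical quadratic standing in for an end-to-end loss that would in practice be measured by sending messages over the real channel.

```python
import random


def spsa_minimize(loss, theta, steps=200, a=0.1, c=0.1, seed=0):
    """Minimize a black-box loss using two loss evaluations per step.

    All parameters are perturbed simultaneously by a random +/-1 vector, so
    the cost per step does not grow with the number of parameters.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(steps):
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Gradient estimate for every coordinate from the same two evaluations.
        grad = [diff / (2.0 * c * d) for d in delta]
        theta = [t - a * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical stand-in for an end-to-end loss measured at the receiver.
TARGET = [1.0, -2.0, 0.5]


def loss(params):
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


initial = [0.0, 0.0, 0.0]
tuned = spsa_minimize(loss, initial)
```

Because only loss values are needed, the same loop applies when the channel between transmitter and receiver cannot be differentiated, which is the situation the model-free channel AE targets.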
3 Concluding Remarks
As discussed above, beyond the sophisticated modeling of wireless communication networks, we are in a transitional stage: optimizing the communication network with AI-aided methods and, further, building the network on AI-native methods. The AI-native method is expected to bring about a radical paradigm shift, and groundbreaking AI-based technologies will play a crucial role in the 6G network. To achieve an AI-native network successfully,
many researchers in industry and academia have proposed research directions, which can be summarized as a two-step strategy. In addition, we introduced examples of network optimization research using AI-aided and AI-native methods in the communication system. Following the proposed direction, if the communication system can be optimized through the AI-aided method and the network can then be replaced with AI through the AI-native method, various technologies and advanced applications can be developed.
References
1. B. Lim, W. J. Yun, J. Kim, and Y.-C. Ko, “Joint Pilot Design and Channel Estimation using Deep Residual Learning for Multi-Cell Massive MIMO under Hardware Impairments,” IEEE Transactions on Vehicular Technology, 2022.
2. S. Dörner, S. Cammerer, J. Hoydis, and S. ten Brink, “Deep learning based communication over the air,” IEEE Journal of Selected Topics in Signal Processing, pp. 132–143, 2017.
3. T. J. O’Shea and J. Hoydis, “An Introduction to deep learning for the physical layer,” IEEE Transactions on Cognitive Communications and Networking, pp. 563–575, 2017.
4. N. E. West and T. O’Shea, “Deep architectures for modulation recognition,” IEEE International Symposium on Dynamic Spectrum Access Networks, pp. 1–6, 2017.
5. L. Ma, Y. Yang, and H. Wang, “DBN based automatic modulation recognition for ultra-low SNR RFID signals,” Proc. of the IEEE Chinese Control Conference (CCC), pp. 7054–7057, 2016.
6. Y. Zeng, M. Zhang, F. Han, Y. Gong, and J. Zhang, “Spectrum analysis and convolutional neural network for automatic modulation recognition,” IEEE Wireless Communications Letters, pp. 929–932, 2019.
7. F. Wang, Y. Wang, and X. Chen, “Graphic constellations and DBN based automatic modulation classification,” Proc. of the IEEE 85th Vehicular Technology Conference (VTC Spring), pp. 1–5, 2017.
8. T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, “On deep learning-based channel decoding,” Proc. of the IEEE Annual Conference on Information Sciences and Systems (CISS), pp. 1–6, 2017.
9. F. Liang, C. Shen, and F. Wu, “An iterative BP-CNN architecture for channel decoding,” IEEE Journal of Selected Topics in Signal Processing, pp. 144–159, 2018.
10. T. Wang, L. Zhang, and S. C. Liew, “Deep learning for joint MIMO detection and channel decoding,” Proc. of the IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 1–7, 2019.
11. M. Honkala, D. Korpi, and J. M. Huttunen, “DeepRx: Fully convolutional deep learning receiver,” IEEE Transactions on Wireless Communications, pp. 3925–3940, 2021.
12. S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Trans. Knowl. Data Eng., vol. 22, no. 10, pp. 1345–1369, 2010.
13. J. Choi, M. Yerzhanova, J. Park, and Y. H. Kim, “Deep Learning Driven Beam Selection for Orthogonal Beamforming with Limited Feedback,” ICT Express, vol. 8, no. 3, pp. 473–478, 2022.
14. C. Zou, F. Yang, J. Song, and Z. Han, “Channel Autoencoder for Wireless Communication: State of the Art, Challenges, and Trends,” IEEE Communications Magazine, vol. 59, no. 5, pp. 136–142, 2021.
15. T. J. O’Shea, K. Karra, and T. C. Clancy, “Learning to Communicate: Channel Auto-Encoders, Domain Specific Regularizers, and Attention,” Proc. of IEEE Int’l. Symp. Signal Processing and Info. Technology, Limassol, Cyprus, pp. 223–28, December 2016.
16. T. J. O’Shea, T. Erpek, and T. C. Clancy, “Deep Learning Based MIMO Communications,” CoRR, vol. abs/1707.07980, 2017; http://arxiv.org/abs/1707.07980.
17. A. Felix et al., “OFDM-Autoencoder for End-to-End Learning of Communications Systems,” Proc. of IEEE 19th Int’l. Wksp. Signal Processing Advances in Wireless Commun., Kalamata, Greece, June 2018.
18. S. Dörner et al., “Deep Learning Based Communication Over the Air,” IEEE J. Sel. Top. Sign. Process., vol. 12, no. 1, pp. 132–143, 2018.
19. H. Zhang et al., “Non-Coherent Energy-Modulated Massive SIMO in Multipath Channels: A Machine Learning Approach,” IEEE Internet of Things J., vol. 7, no. 9, pp. 8263–8270, 2020.
20. M. Stark, F. Ait Aoudia, and J. Hoydis, “Joint Learning of Geometric and Probabilistic Constellation Shaping,” Proc. of 2019 IEEE Globecom Workshops (GC Wkshps), Waikoloa, HI, USA, December 2019.
21. V. Raj and S. Kalyani, “Design of Communication Systems Using Deep Learning: A Variational Inference Perspective,” IEEE Transactions on Cognitive Communications and Networking, vol. 6, no. 4, pp. 1320–1334, 2020.
22. H. Ye, L. Liang, G. Y. Li, and B.-H. Juang, “Deep Learning-Based End-to-End Wireless Communication Systems With Conditional GANs as Unknown Channels,” IEEE Transactions on Wireless Communications, vol. 19, no. 5, pp. 3133–3143, 2020.
23. V. Raj and S. Kalyani, “Backpropagating Through the Air: Deep Learning at Physical Layer Without Channel Models,” IEEE Communications Letters, vol. 22, no. 11, pp. 2278–2281, 2018.
AI-Native Network Algorithms and Architectures Haemin Lee, Soohyun Park, Hankyul Baek, Chanyoung Park, Seokbin Son, Jihong Park, and Joongheon Kim
1 Introduction With the advent of diverse applications and intelligent devices, current networks (i.e., 4G and 5G networks) cannot meet user traffic demands. In light of this, research and development (R&D) in industry and academia has been tasked with establishing standards for the future-generation network, 6G. ML/AI-based network optimization has been applied to compose a 6G network architecture that realizes intelligent control and network adjustment. Therefore, this chapter reviews AI techniques for the 6G network, including AI-native network access and resource allocation/scheduling in the medium access control (MAC) layer. We first define the two-step meaning of “AI-native” and analyze these two approaches with the latest research trends as depicted in Fig. 1. As wireless technology advances, new environments are emerging (e.g., industrial IoT, 6G, and beyond), which call for new protocols. Key application-level drivers for future wireless access networks include the ability to handle emerging applications, such as AR/VR, intelligent sensors, and digital twins. At the same time, over the last few years, we have witnessed an unprecedented shift in communications research toward a data-driven approach. Although the applicability and performance of AI techniques vary across the applications above, recent research shows that AI-based approaches perform on par with legacy model-based ones. Therefore, next-generation wireless communication
H. Lee · S. Park · H. Baek · C. Park · S. Son · J. Kim () Korea University, South Korea e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected] J. Park Deakin University, Australia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_22
Fig. 1 AI-native
has to embrace the widespread use of AI within the network. AI techniques can be applied to the next-generation network in two ways, i.e., AI-aided and AI-native, depending on their purpose and scope. The AI-aided optimization method aims to leverage AI to optimize the performance of specific modules in an existing network. It focuses on the performance of a specific module itself, such as optimizing a classification task or determining the action of an agent. Since most AI research essentially amounts to learning the parameters of a deep neural network (DNN), the AI-aided approach aims to apply the DNN architecture to a specific module within the entire network. The second approach is the AI-native method, which goes one step beyond the AI-aided method. The AI-native method attempts to replace the functional modules constituting a communication network, such as the PHY and MAC layers, or even the entire communication network, with AI. This approach can adaptively optimize the network even in dynamic and undefined situations that cannot easily be covered by rule-based networks. However, to fully exploit the advantages of the AI-native method, many factors must be considered at an early stage of network design. Usability and scalability should also be considered when the method is applied in industry. To make the most of the AI-native method, it is essential to fully understand the characteristics and function of the module being replaced, and stable optimization using the AI-aided method should take precedence, backed by sufficient validation and simulation.
2 Two Steps Toward the AI-Native In this section, we dive into the two-step approach to AI-native networks. As mentioned above, AI-aided optimization is the first step, in which a module in the network is optimized using AI, and the following step, the AI-native optimization method, optimizes the entire network structure by replacing the existing rule-based network. To achieve significant performance and guarantee stable operation in the second step, it is essential to ensure that the first step is well implemented beforehand. After the successful implementation of the first step, complex signal processing blocks in the network, such as joint channel estimation, data detection, and synchronization, can be replaced or optimized within the AI-native method. At this point, one may raise a doubt such as “If the AI-aided
Fig. 2 The structure of deep residual learning-based channel estimator
optimization method shows good performance, then is the AI-native optimization method really necessary?” The answer is that AI-aided optimization may face limitations in data processing, data collection, and the convergence of training algorithms under uncertainty, and the AI-native optimization method can be one solution. It replaces a number of modules, or even entire communication layers, with AI. Figure 2 illustrates a simplified case of an AI-native network architecture where AI replaces most of the communication modules. AI-native networks can replace encoding, mapping, and modulation modules on the transmitter side and synchronization, channel estimation, equalization, demapping, and decoding modules on the receiver side with AI. In addition, many modules of the MAC layer in charge of the communication network can be replaced with AI, bringing significant performance gains. One of the most effective ways to replace an entire network with AI, rather than just replacing multiple modules in a network, is end-to-end training. End-to-end training gives AI more autonomy and lets it design the components of the communication layer itself. Through end-to-end learning, AI learns communication network parameters (i.e., small-scale fading and large-scale fading) and module functions such as modulation methods to substitute for the existing network.
2.1 AI-Aided/AI-Native Network for MAC Layer The existing resource allocation/channel access policy in the MAC layer is defined by a model-based approach, which is optimal when the underlying models are accurate. However, model-driven approaches rely on algorithms with relatively high computational complexity and require the acquisition of necessary side information. In addition, if the system works in a complex environment or has unknown properties, the traditional model-based method often fails to reflect the
actual system and to lead the given task to an optimal solution. AI techniques can overcome these deficiencies by learning the input-output relationships in a data-driven manner, which we also call an AI-aided approach. This approach allows the optimization of an individual component of the MAC layer and further enables joint optimization design. To take a step further for the future network, end-to-end learning, which we call an AI-native approach, is a potential research trend. Reinforcement learning frameworks learn and optimize under conditions that vary over time to perform resource management tasks with minimal supervision. Therefore, the AI techniques for learning the optimal network access control and resource allocation on the MAC layer are addressed in two parts below, both aimed at enhancing the performance of the communication system.
2.1.1 AI-Aided Network for MAC Layer
Many researchers apply AI-aided methods to MAC design under various objectives such as throughput, latency, and fairness. The related applications can be categorized into two parts: network access and resource allocation/scheduling. Random access suffers from difficult channel estimation and time slot selection caused by ultradense and highly dynamic networks. As a result, various ML-based approaches are being studied for fast and accurate channel estimation and random access in complex network environments. The authors in [8] provide a deep learning framework for massive grant-free random access in 6G cellular Internet of Things (IoT) networks. A model-driven deep learning algorithm for joint activity detection and channel estimation is proposed based on the principle of approximate message passing. It does not require any prior information about active probabilities and channel variance and can significantly improve the performance with a finite amount of training data. Simulation results validate the effectiveness of the proposed deep learning algorithm. The authors in [9] propose a reinforcement learning (RL)-based random-access scheme in non-orthogonal multiple access (NOMA)-based massive IoT networks. In the proposed scheme, it is assumed that the base station (BS) can observe channel state information and decides on the appropriate transmission power and random-access time slot based on the observation. To maximize the average sum rate, this work uses deep reinforcement learning (DRL) and state-action-reward-state-action (SARSA) learning. Conventional resource allocation problems are solved with methods such as greedy algorithms, game theory, or convex optimization. However, beyond the 5G network, the advent of dense networks, hyper-connectivity, fast mobility, and stringent quality of service (QoS) requirements challenges the existing optimization methods. As a result, AI-aided approaches are being actively studied [13].
The research in [10] focuses on radio resource allocation problems in multi-cell networks using deep learning (DL). The supervised DL model solves the subband and power allocation problem. The authors of [11] propose an energy-efficient
Fig. 3 Resource allocation using the RL framework
resource allocation strategy using a convolutional neural network (CNN) model. The model applies copula theory to learn and evaluate the utility and energy constraints. The trained model achieves efficient resource allocation that maximizes channel capacity and minimizes power consumption, targeting umMTC and ELPC scenarios in a massive 6G IoT environment. DRL has also been studied for resource management. In [11], a power control method is investigated in cognitive radios for spectrum sharing, guaranteeing the quality of service (QoS) requirements of the primary user (PU) and secondary user (SU). In the scenario, PUs and SUs share the same spectrum according to the adjusted amount of allocated power. The proposed scheme first constructs an information interaction model among the PU, SU, and wireless sensors, where the received signal strength (RSS) measurements of the spatially distributed wireless sensors help the PU obtain the power allocation information of the SU. Then, through the proposed A3C-based power control, multiple SUs learn the power allocation scheme simultaneously. The authors in [12] propose an RL-based time division duplex (TDD) uplink/downlink resource allocation technique for high-mobility UE environments as depicted in Fig. 3. In the RL model, the agent selects a TDD configuration that can maximize channel utility based on the observed channel state when UEs move at high speed. The research in [6] proposes a deep reinforcement learning (DRL)-based adaptive video streaming system that employs infrastructure on the road, i.e., small mmWave base stations (mBSs). The proposed streaming system learns a policy of proactively transmitting the desired video chunks from the macro base station (MBS) to mBSs before users actually request them, depending on time-varying network states.
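The SARSA learning mentioned earlier in this subsection for random access can be illustrated with a tabular update. The following is only a minimal sketch under stated assumptions: the state (a coarse channel level), the action set (a power level paired with a random-access slot), and the reward are hypothetical stand-ins and are not taken from [9].

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Hypothetical action set: (transmit power level, random-access slot index).
ACTIONS = [(p, slot) for p in (0, 1) for slot in (0, 1, 2)]


def toy_reward(state, action):
    """Assumed reward standing in for a sum rate: higher when the slot index
    matches the observed channel level and when the power level is high."""
    power, slot = action
    return (1.0 if slot == state else 0.0) + 0.5 * power


def train(episodes=2000, seed=0):
    """Run SARSA on the toy problem and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    s = rng.randrange(3)
    a = epsilon_greedy(q, s, ACTIONS, 0.1, rng)
    for _ in range(episodes):
        r = toy_reward(s, a)
        s_next = rng.randrange(3)  # assumed: channel level changes at random
        a_next = epsilon_greedy(q, s_next, ACTIONS, 0.1, rng)
        sarsa_update(q, s, a, r, s_next, a_next)
        s, a = s_next, a_next
    return q
```

The distinguishing feature of SARSA is that the update uses the action actually chosen in the next state (`a_next`), so the learned values reflect the exploring policy itself rather than a greedy one.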
2.1.2 AI-Native Network for MAC Layer
As discussed above, the idea of end-to-end learning can be extended to the MAC layer, where it would be desirable to formulate optimized resource allocation schemes and channel access policies depending on the use case and environment. In this context, a set of rules or guidelines for exchanging message data between radio nodes can be understood as a language to be learned [7]. Learning to communicate (L2C) is a technique that is being actively studied owing to significant developments in deep multi-agent reinforcement learning (MARL), and these techniques can also be used for training wireless devices to learn communication protocols. In other words, user equipment (UE) can be trained to learn the 5G MAC protocol. This includes learning to interpret the different control messages received from the base station (BS) (e.g., discontinuous reception [DRX], timing advance [TA]), as well as learning what to send in the uplink (e.g., buffer status report [BSR], power headroom report [PHR]) [1]. It can also solve the problem of optimally multiplexing resources for communication and detection. The research in [1] shows how agents use machine learning to automatically discover communication protocols to coordinate their decisions over limited-bandwidth channels. This research investigates a set of multi-agent benchmark tasks that require full cooperation for sequential decisions in a partially observable setting, which means that all agents share the goal of maximizing the same discounted sum of rewards, and no agent can observe the underlying Markov state. Instead, each of them receives a private partial observation correlated with that state. Agents are required to discover communication protocols that can coordinate their behavior and achieve a common goal under partial observability and limited channel capacity.
These settings are regarded as centralized training for decentralized execution (CTDE), where agents are trained offline using centralized information but execute in a decentralized manner online, which has gained popularity in the multi-agent reinforcement learning community. In particular, actor-critic methods with a centralized critic and decentralized actors are a common instance of this idea. According to [1], two approaches can be used to address this CTDE setting, as shown in Fig. 4. The first approach, reinforced interagent learning (RIAL), uses Q-learning with a recurrent network to address partial observability. It combines a deep recurrent Q-network (DRQN) with independent Q-learning for action and communication selection. The Q-network of agent $a$ is defined as $Q^{a}(o_{t}^{a}, m_{t-1}^{a}, h_{t-1}^{a}, u^{a})$, where $o_{t}^{a}$, $h_{t-1}^{a}$, and $m_{t-1}^{a}$ are the observation, the individual hidden state, and the messages from other agents, respectively, and $u^{a}$ is the action. The second approach is differentiable interagent learning (DIAL), which shares parameters and passes gradients from one agent to another through the communication channel. DIAL can be trained end to end across the agents. Similarly, the researchers in [2] address whether radios can learn a given target protocol in advance, against the backdrop of a research trend that mainly focuses on new channel access policies using machine learning. Cellular radios are trained to produce a channel access policy that performs optimally under the constraints of the target protocol. In other words, they are interested in which MAC protocol is optimal for a given use case
Fig. 4 RL-based communication. (a) RIAL-RL based communication. (b) DIAL-Differentiable communication
and whether it can be learned through experience. In [2], the MAC learner is a learning agent with two main functions (i.e., wireless channel access and automatic repeat request [ARQ]) that replaces the MAC layer in a mobile UE, whereas an expert MAC is one built with the traditional design-build-test-and-validate approach. Several learners are concurrently trained in a mobile cell to jointly learn a channel access policy and the MAC signaling to coordinate channel access with the BS. The BS uses an expert MAC that is not trained. The proposed learning framework uses multiagent reinforcement learning (MARL) and L2C techniques to achieve this goal with gains over expert systems. For any given UE, the observations differ from those of the others, which makes the problem a partially observable Markov decision process (POMDP). This POMDP is formulated as a cooperative Markov game, where all learners are cooperatively trained, receiving exactly the same reward from the environment. Figure 5 shows the MARL architecture with a set of all MAC learners. At every time step $t$, each MAC learner invokes an action on the environment, receiving a reward and an observation at time step $t + 1$. The actions of all learners are aggregated into a joint action vector, and then the environment executes the joint action and delivers the same reward with independent observations to all learners. The BS also receives a scalar observation following the execution of the joint learner action. The research in [3] also proposes a framework in which the agents learn the channel access policy and the signaling policy. The proposed framework exploits the multi-agent deep deterministic policy gradient (MADDPG) algorithm in a CTDE setting, enabling the BS and UEs to devise MAC protocols in multiple-access scenarios. In the same manner, the problem can be formulated as a cooperative MARL task, where the UEs and the BS become UE MAC agents and the BS MAC agent, respectively. These agents need to learn to cooperate in order to deliver data across the network.
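The interaction loop of the cooperative Markov game described for [2] (individual actions aggregated into a joint action, one common reward, independent observations) can be sketched with a toy shared channel. The collision model, reward values, and observation encoding below are assumptions made for illustration and do not reproduce the environment of [2].

```python
from typing import List, Tuple

IDLE, TRANSMIT = 0, 1


class ToySharedChannel:
    """Hypothetical slotted channel shared by several UE MAC learners."""

    def __init__(self, num_ues: int, queue_len: int = 3):
        self.queues = [queue_len] * num_ues  # packets waiting at each UE

    def step(self, joint_action: List[int]) -> Tuple[List[Tuple[int, int]], float]:
        """Execute the joint action; return per-UE observations and ONE reward."""
        senders = [i for i, a in enumerate(joint_action)
                   if a == TRANSMIT and self.queues[i] > 0]
        if len(senders) == 1:      # exactly one sender: packet delivered
            self.queues[senders[0]] -= 1
            reward = 1.0
        elif len(senders) > 1:     # several senders: collision
            reward = -1.0
        else:                      # nobody transmitted
            reward = 0.0
        # Each UE sees only its own queue and whether it transmitted, so the
        # observations differ across UEs (partial observability).
        observations = [(self.queues[i], int(i in senders))
                        for i in range(len(self.queues))]
        return observations, reward


env = ToySharedChannel(num_ues=2)
obs, shared_reward = env.step([TRANSMIT, IDLE])  # joint action vector
rewards = [shared_reward] * 2                    # same reward for every learner
```

Because every learner receives the same scalar reward, no learner can improve its own return at the expense of the others, which is what makes the game cooperative.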
Each agent has an actor network that relies only on its own agent’s state to learn a decentralized policy. During the training, each agent has a centralized critic that receives states and actions from all agents to learn a joint value function.
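The split between decentralized actors and centralized critics can be made concrete by looking only at what each component takes as input. The sketch below uses single linear units instead of neural networks and omits the MADDPG training updates entirely; the dimensions, weights, and class names are hypothetical.

```python
import random


def linear(weights, bias, x):
    """Single linear unit: w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


class Actor:
    """Decentralized policy: maps the agent's OWN state to an action."""

    def __init__(self, state_dim, rng):
        self.w = [rng.uniform(-1, 1) for _ in range(state_dim)]
        self.b = 0.0

    def act(self, own_state):
        # Clamping to [-1, 1] stands in for a bounded output activation.
        return max(-1.0, min(1.0, linear(self.w, self.b, own_state)))


class CentralCritic:
    """Centralized value estimate over ALL agents' states and actions."""

    def __init__(self, joint_dim, rng):
        self.w = [rng.uniform(-1, 1) for _ in range(joint_dim)]
        self.b = 0.0

    def q_value(self, all_states, all_actions):
        joint = [x for s in all_states for x in s] + list(all_actions)
        return linear(self.w, self.b, joint)


rng = random.Random(0)
num_agents, state_dim = 3, 2  # e.g., one BS MAC agent and two UE MAC agents
actors = [Actor(state_dim, rng) for _ in range(num_agents)]
critics = [CentralCritic(num_agents * state_dim + num_agents, rng)
           for _ in range(num_agents)]

states = [[0.2, -0.1], [0.5, 0.3], [-0.4, 0.9]]
actions = [actor.act(s) for actor, s in zip(actors, states)]  # used at execution
q_values = [c.q_value(states, actions) for c in critics]      # used in training only
```

At execution time only the `Actor` objects are needed, so each agent can run on its own device with local information; the critics exist solely to shape the actors during centralized training.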
Fig. 5 MARL system model with one BS and two UEs
There is another CTDE framework for novel distributed channel access (DCA) schemes. The proposed QMIX-advanced listen-before-talk (QLBT) MAC protocol exploits the overall information of all agents during training, while each agent decides the optimal channel access behavior based on its local information [4]. The centralized training is performed at the access point (AP) side based on experiences consisting of the joint action-observation history, joint action, global environment state, and reward. In this process, the authors leverage a well-known MARL algorithm, QMIX, and feed an extra individual Q-value for each agent into the mixing network in addition to the original total Q-value, which makes QLBT more stable. After training, the AP sends the agent network parameters to the corresponding stations. In decentralized execution, each station determines whether to access the channel independently based on its own action-observation history. The QLBT MAC protocol is reported to outperform CSMA/CA in various scenarios, including saturated traffic, unsaturated traffic, and delay-sensitive traffic.
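The key property of the QMIX mixing network is that the total Q-value is monotonically non-decreasing in every agent's individual Q-value, which is enforced by making the mixing weights non-negative. The sketch below is a single-layer simplification with hypothetical hypernetwork weights; it illustrates plain QMIX-style mixing only and does not include the extra individual Q-value input that QLBT adds.

```python
def mix(agent_qs, state, hyper_w, hyper_b):
    """Monotonic mixing: Q_tot = sum_i |w_i(state)| * Q_i + b(state).

    hyper_w holds one row per agent; each row maps the global state to that
    agent's mixing weight, and the absolute value keeps the weight non-negative.
    """
    weights = [abs(sum(h * s for h, s in zip(row, state))) for row in hyper_w]
    bias = sum(h * s for h, s in zip(hyper_b, state))
    return sum(w * q for w, q in zip(weights, agent_qs)) + bias


hyper_w = [[0.5, -1.0], [-0.3, 0.2], [1.0, 1.0]]  # hypothetical hypernetwork
hyper_b = [0.1, 0.1]
state = [1.0, 2.0]                                 # global environment state

q_low = mix([1.0, 2.0, 3.0], state, hyper_w, hyper_b)
q_high = mix([1.5, 2.0, 3.0], state, hyper_w, hyper_b)  # raise one agent's Q
```

Since raising any single agent's Q-value can never lower the mixed value, each station can pick its own greedy action locally and the result is still consistent with maximizing the total Q-value learned during centralized training.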
2.1.3 AI Application for MAC Layer
This section explores the UAV (unmanned aerial vehicle) coordination using MARL. UAV communication is one of the growing fields in future communication
Fig. 6 CommNet structure
networks. Communication for cooperative operation between unmanned vehicles is not limited to protocol design for channel access or signaling policy and can also be seen as an application of AI technology in the MAC layer. Most previous studies on UAV management solve a centralized optimization problem. However, given the highly dynamic and distributed nature of UAV-enabled networks, machine learning-based approaches are effective. In addition, most UAVs, satellites, etc. address cooperative behavior between multiple agents (autonomous trajectory optimization) through reinforcement learning. The work in [5] assumes that, in a partially observable UAV-based multi-agent system, each agent holds different information. Building on this, it proposes an algorithm that exchanges partially observed information from a particular UAV with other UAVs in an uncertain and constantly varying environment. The authors configure an autonomous, resilient surveillance system based on a multi-agent DRL (MADRL) scheme called communication neural network (CommNet), which is presented in Fig. 6. The proposed scheme aims to optimize the energy consumption of UAVs and the trajectories that strengthen surveillance performance, which means deploying the surveillance UAVs to cover the area with the largest number of users. CommNet enables communications among multiple agents using a single deep neural network. Unlike the conventional CommNet with a central server that collects the distributed agents’ information, [5] designates one of the UAV agents as a leader UAV that collects and distributes other agents’ information. In addition to existing works related to UAVs, the CommNet-based MADRL scheme can be applied to the wireless communication field. In an environment where multiple UAV base stations (UAV-BSs) provide wireless communication service to UEs in the air, the UAV-BSs need to cooperate for high-quality communication service.
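The way CommNet lets agents share partially observed information can be illustrated by a single communication layer, in which each agent combines its own hidden state with the average hidden state of the other agents. This is a minimal sketch, assuming scalar weights in place of the learned weight matrices; it does not model the leader-UAV collection scheme of [5].

```python
import math


def commnet_step(hidden, w_self, w_comm):
    """One CommNet-style layer: h_j' = tanh(w_self * h_j + w_comm * c_j),
    where c_j is the mean hidden state of all agents except agent j.
    Requires at least two agents.
    """
    n = len(hidden)
    dim = len(hidden[0])
    new_hidden = []
    for j in range(n):
        others = [hidden[k] for k in range(n) if k != j]
        comm = [sum(h[d] for h in others) / len(others) for d in range(dim)]
        new_hidden.append([math.tanh(w_self * hidden[j][d] + w_comm * comm[d])
                           for d in range(dim)])
    return new_hidden


# Hypothetical hidden states of three UAV agents (two features each).
hidden = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
updated = commnet_step(hidden, w_self=1.0, w_comm=1.0)
```

In the example, the third agent starts with an all-zero hidden state but ends with non-zero features after one step, because the communication term carries the other agents' information to it.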
All UAV-BSs also consider the uncertainty of the environment, and they can only observe the limited environment information due to the physical limitation of the UAV-BS model. As mentioned above, UAV-BS’s observations differ, and they need to learn extensive information about the environment. Therefore, CommNet
Fig. 7 RL-based COA attack
helps the UAV-BSs learn more by sharing information about other UAV-BSs. Another application is course-of-action (COA) attack search for network security, as illustrated in Fig. 7. With the emergence of large-scale complex networks and the possibility of multiple cyber threats, the importance of cybersecurity has grown. To prevent probable cyberattacks, it is essential to identify the attackers’ activities, and this process has been widely conducted in the security and privacy field under the name of attack COA. COA attack analysis techniques are either passive or automatic. Passive methods require expert participation and are limited in time and cost, whereas methods such as the attack tree [14], attack graph [15], and game theory [16] derive attack COAs automatically. However, the performance of existing automatic COA algorithms is not always guaranteed in uncertain network environments. To resolve this issue, RL-based methods have been suggested such that optimal attack paths can be learned even in a dynamic environment with uncertainty. Among the various RL formulations, the partially observable Markov decision process (POMDP) is considered an applicable candidate for automated COA. A POMDP is a decision-making model that considers how the agent should behave when the environment is not fully known. Using a POMDP for a COA attack has the benefit of naturally capturing various network elements, such as imperfect knowledge of the network configuration, relationships between firewalls, and multiple attack methods [17]. As a result, to mount an attack on the whole network, a POMDP-based planner can discover the most effective attacks to launch against specific devices [17].
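The way a POMDP handles imperfect knowledge of the network can be illustrated by its belief update: the agent keeps a probability distribution over hidden states and revises it after each observation. The sketch below is a generic Bayes filter for one fixed action; the two-state host model and the scanner accuracy figures are hypothetical and are not taken from [17].

```python
def belief_update(belief, transition, observation_prob, obs):
    """Bayes filter for a fixed action:
    b'(s2) is proportional to O(obs | s2) * sum_s T(s2 | s) * b(s).
    """
    states = list(belief)
    unnormalized = {}
    for s2 in states:
        predicted = sum(transition[s][s2] * belief[s] for s in states)
        unnormalized[s2] = observation_prob[s2][obs] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical example: is a target host vulnerable? Scanning does not change it.
belief = {"vulnerable": 0.5, "patched": 0.5}
transition = {"vulnerable": {"vulnerable": 1.0, "patched": 0.0},
              "patched": {"vulnerable": 0.0, "patched": 1.0}}
# Assumed scanner accuracy: P(scan result | true state).
observation_prob = {"vulnerable": {"open": 0.9, "closed": 0.1},
                    "patched": {"open": 0.2, "closed": 0.8}}

posterior = belief_update(belief, transition, observation_prob, "open")
```

A planner then chooses actions as a function of this belief rather than of the unknown true state, which is how uncertainty about the network configuration enters the decision.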
3 Future Opportunities This section explores the challenges and opportunities of AI-aided and AI-native methods in a 6G network. By simply learning the input-output relationships in a data-driven manner, the AI-aided/AI-native approach can overcome the difficulty that traditional model-based methods have in reflecting the complex real-world environment. In other words, the AI approach can reduce today’s signaling overhead, standardization, and development efforts for the highly complex radio technologies of the coming decades. Moreover, the ability to learn new communication protocols opens new pathways to radio systems that were previously highly tailored to the deployment environment [7], satisfying the higher requirements of 6G applications. However, in the AI approach, it is hard to interpret or analyze the performance, which hinders making the system more developable and scalable. It should be possible to prove that the performance does not change radically and always remains stable, which is an open issue to be considered.
4 Concluding Remarks We are in the transition stage of optimizing communication networks with AI-based methods and are aiming to design an AI-native network that departs from the traditional predefined modeling of wireless communication networks. AI will take center stage with the 6G network in the ongoing paradigm shift to enhance every aspect of our lives and the fabric of our society. Hence, researchers in industry and academia provide a two-step strategy, i.e., AI-aided and AI-native, as a direction for successfully transitioning to an AI-native network. Throughout this chapter, we reviewed the two-step strategy and introduced the network optimization research based on it, especially in the MAC layer. We also presented several AI-based applications, including UAV networks, wireless networks, and network security. Lastly, we explored the future opportunities and challenges of AI-native networks. In conclusion, we believe that the AI-aided/AI-native method enables the best performance in dynamic situations and will be a key component of future 6G networks.
References
1. J. Foerster, I. A. Assael, N. De Freitas and S. Whiteson, “Learning to communicate with deep multi-agent reinforcement learning,” Advances in Neural Information Processing Systems, pp. 2137–2145, 2016.
2. A. Valcarce and J. Hoydis, “Toward Joint Learning of Optimal MAC Signaling and Wireless Channel Access,” IEEE Transactions on Cognitive Communications and Networking, vol. 7, no. 4, pp. 1233–1243, 2021.
3. M. P. Mota, A. Valcarce, J. M. Gorce and J. Hoydis, “The emergence of wireless MAC protocols with multi-agent reinforcement learning,” in Proceedings of IEEE Globecom Workshops, pp. 1–6, December 2021.
4. Z. Guo, Z. Chen, P. Liu, J. Luo, X. Yang and X. Sun, “Multi-agent reinforcement learning-based distributed channel access for next generation wireless networks,” IEEE Journal on Selected Areas in Communications, vol. 40, no. 5, pp. 1587–1599, 2022.
5. W. J. Yun, S. Park, J. Kim, M. Shin, S. Jung, A. Mohaisen and J. H. Kim, “Cooperative Multi-Agent Deep Reinforcement Learning for Reliable Surveillance via Autonomous Multi-UAV Control,” IEEE Transactions on Industrial Informatics, 2022.
6. W. J. Yun, D. Kwon, M. Choi, J. Kim, G. Caire and A. F. Molisch, “Quality-aware deep reinforcement learning for streaming in infrastructure-assisted connected vehicles,” IEEE Transactions on Vehicular Technology, vol. 71, no. 2, pp. 2002–2017, 2021.
7. J. Hoydis, F. Ait Aoudia, A. Valcarce and H. Viswanathan, “Toward a 6G AI-native air interface,” IEEE Communications Magazine, pp. 76–81, 2021.
8. Y. Qiang, X. Shao and X. Chen, “A model-driven deep learning algorithm for joint activity detection and channel estimation,” IEEE Communications Letters, vol. 24, no. 11, pp. 2508–2512, 2020.
9. W. Ahsan, W. Yi, Z. Qin, Y. Liu and A. Nallanathan, “Resource allocation in uplink NOMA-IoT networks: A reinforcement-learning approach,” IEEE Transactions on Wireless Communications, vol. 20, no. 8, pp. 5083–5098, 2021.
10. K. I. Ahmed, H. Tabassum and E. Hossain, “Deep learning for radio resource allocation in multi-cell networks,” IEEE Network, vol. 33, no. 6, pp. 188–195, 2019.
11. A. Mukherjee, P. Goswami, M. A. Khan, L. Manman, L. Yang and P. Pillai, “Energy-efficient resource allocation strategy in massive IoT for industrial 6G applications,” IEEE Internet of Things Journal, vol. 9, no. 7, pp. 5194–5201, 2020.
12. F. Tang, Y. Zhou and N. Kato, “Deep reinforcement learning for dynamic uplink/downlink resource allocation in high mobility 5G HetNet,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 12, pp. 2773–2782, 2020.
13. Y. Kim, S. Ahn, C. You and S. Cho, “A Survey on Machine Learning-based Medium access control technology for 6G requirements,” Proc. of the IEEE Region 10 Symposium (TENSYMP), Jeju, Korea, Republic of, pp. 1–4, August 2021.
14. V. Nagaraju, L. Fiondella and T. Wandji, “A survey of fault and attack tree modeling and analysis for cyber risk management,” Proc. of the IEEE International Symposium on Technologies for Homeland Security (HST), MA, USA, pp. 1–6, April 2017.
15. N. Ghosh and S. K. Ghosh, “A planner-based approach to generate and analyze minimal attack graph,” Applied Intelligence, vol. 36, no. 2, pp. 369–390, 2012.
16. X. Liang and Y. Xiao, “Game theory for network security,” IEEE Communications Surveys & Tutorials, vol. 15, no. 1, pp. 472–486, 2012.
17. C. Sarraute, O. Buffet and J. Hoffmann, “POMDPs make better hackers: Accounting for uncertainty in penetration testing,” AAAI Conference on Artificial Intelligence, Ontario, Canada, July 2012.
Pareto Deterministic Policy Gradients and Its Application in 6G Networks Zhou Zhou, Yan Xin, Hao Chen, Charlie Zhang, Lingjia Liu, and Kai Yang
1 Introduction The global increase in mobile users as well as massive newly formed wireless connections (e.g., autonomous driving cars and the Internet of Things) have raised unprecedented challenges to cellular communications systems. Rolling out AI/ML-based driving forces for network management is the primary direction in the sixth-generation cellular system (6G) [1]. The concept of the self-organized network (SON) is summarized in [2] to address this technical trend, where network techniques are equipped with self-configuration, self-healing, and self-optimization functionalities. To achieve these goals, the techniques are anticipated to handle the coupled relations among resource allocation, handover management, interference cancellation, coverage optimization, etc. As a paradigm of network self-optimization, we consider an automatic way to jointly optimize cell load balance [3] and network throughput. Specifically, the realization is required to be as consistent as possible with the operation mode of current cellular systems (i.e., with a smooth evolution). We observe that the inter-cell handover (HO) criteria and massive MIMO [4] antenna angles are often static in 5G systems [5]. Motivated by this fact, we consider exploiting the freedom of changing these parameters, i.e., we aim to design an online policy for updating the inter-cell
Z. Zhou () · L. Liu ECE Department, Virginia Tech, Blacksburg, VA, USA e-mail: [email protected]; [email protected] Y. Xin · H. Chen · C. Zhang Samsung Research America, Plano, TX, USA e-mail: [email protected]; [email protected]; [email protected] K. Yang ECE Department, Tongji University, Shanghai, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_23
handover thresholds and massive MIMO antenna tilt angles. More importantly, the optimization objective is no longer a single metric but comprises multiple ones. In contrast, the non-self-optimization approach refers to static optimization approaches where the network parameters are configured according to a pre-assumed model or long-term statistics of the environment. These analytical methods often cannot be extended to online operation because of the bottlenecks in modeling the network dynamics and developing feasible solvers [6]. The experiments in [7] demonstrate that directly applying the results of a static optimization approach to a dynamic scenario can lead to 85% to 97% handover failures and connection outages. Also, the negative impacts of model mismatch (applying long-term measurement-based strategies to dynamic situations) are discussed in [8]. Accordingly, we can summarize the practical challenges of using these nonadaptive approaches as follows:
• Model-based utility optimization cannot extensively incorporate all hidden factors into a unique analytical framework.
• Even explicit factors cannot be precisely characterized or timely measured, such as user mobility and the relation between antenna tilt angles and reference signal received power (RSRP).
• Usually, a non-convex batch-based optimization problem has to be solved at the end because of the mixed-integer nature of the network parameters (here, the antenna tilt angles are discrete variables due to hardware constraints). This non-convexity makes the static formulation-based approaches intractable online.
To adaptively adjust network parameters with light effort, dynamic programming [9] as well as heuristic approaches [10] can be considered. However, these methods often rely on environment assumptions or offer no guarantee of optimality.
The randomness of user mobility often results in fluctuating cell loads and frequent handovers [11], which is the hardest part to characterize. To avoid any assumptions on the environment, we utilize a reinforcement learning (RL) framework for the online algorithm design. The success of RL has been demonstrated in different tasks, such as computer games [12], Atari [13] and StarCraft II [14], chemical reactions [15], and image captioning [16], which is our primary motivation for applying RL-based optimization approaches. In our RL approach, a central agent tracks the user measurements and reacts by adjusting cell individual offsets (CIOs) and BS antenna tilt angles in favor of the predefined objectives. More importantly, the RL actions are expected to be proactive rather than reactive, i.e., the actions take into account a prediction of user mobility. In this chapter, we introduce a multi-objective RL method and verify its effectiveness using our measurement data together with a comprehensive system-level simulator. The main contributions of this chapter are as follows:
Pareto Deterministic Policy Gradients and Its Application in 6G Networks
• A new operational mode: Our combination of network handover management and massive MIMO antenna tilt is a hierarchical approach to user mobility management. The tilt actions essentially allow a broadcast beam adaptation for downlink transmission. The handover management, which determines user association, is considered a small-scale adjustment since it mainly affects the cell edge users. This functional separation maps explicitly onto the network hardware components, which adds flexibility to the network configuration.
• A new algorithm: Most existing RL algorithms can only accumulate scalar rewards for single-objective optimization. In contrast, our method learns from vector rewards and can therefore handle multiple-objective optimization online. To the best of our knowledge, it is the first Pareto-optimization-based RL algorithm with a continuous action space. This extension allows the underlying policy network to generate mixed actions (i.e., both continuous and discrete actions, where the latter are obtained through quantization). Therefore, the policies controlling the handover threshold (continuous variable) and the antenna tilt angles (discrete variable) can be jointly optimized.
• Extensive evaluations: We evaluate our method on a comprehensive simulator designed according to 3GPP standards on air interface and access networks [17, 18]. The user-side power distribution is obtained from our measurement data in a real city environment. We also impose hardware constraints on the network parameters, such that the massive MIMO antenna tilt and handover parameters can only be tuned periodically. Moreover, for a self-contained evaluation, we formulate the multi-objective optimization problem from a static perspective.
Accordingly, brute-force search-based methods are introduced as the benchmark: we give these static methods full access to the user mobility and, at the same time, relax the periodic action constraints for them. Our results demonstrate that the RL method performs comparably to the static approaches. In addition, the introduced RL framework can be generalized to handle more than two objectives and supports vector rewards of arbitrary dimension. The action operation can also be extended to an asynchronous manner, which we consider a promising direction for future work. The rest of this chapter is organized as follows: In Sect. 2, we review network optimization methods, including both model-based static/online optimization and learning-based approaches. In Sect. 3, we introduce the background on our network configuration, including massive MIMO antenna tilt, handover management, and network utilities. Section 4 elaborates on our vector reward-based multiple-objective reinforcement learning method. Evaluation results are presented in Sect. 5, and the details of a static solver for the joint optimization, used as a comparison benchmark, are given in the Appendix. Finally, Sect. 6 concludes the chapter.
2 Related Work
Cell load balancing is often considered the primary objective in network optimization. Ye et al. [9] introduced a near-optimal distributed solver for heterogeneous networks. In [19], an optimizer of user association in massive MIMO systems is proposed. However, the resulting analytical solution assumes a perfect massive MIMO channel and does not include any specific MIMO operations, such as precoding/beamforming. Meanwhile, [20] introduced a downlink precoding method for mobility load balancing, but this method does not optimize the user association. Moreover, a stochastic geometry-based analytical approach is proposed in [21]. While this method can solve for a static user association distribution, it cannot be applied online, and it relies on assumptions about user mobility. Overall, the aforementioned approaches are limited to ideal scenarios and have high computational complexity. In current cellular systems, the user association is defined by handover events. Thus, the method in [22], which adaptively optimizes handover parameters using reference signal received power (RSRP) measurements, is more relevant. This method consists of load prediction and resource adjustment, and it offers a practical framework for user association optimization. However, it is a heuristic approach whose optimality cannot be justified. In contrast to the aforementioned single-objective or single-variable optimization, [23] considers optimizing load balance and maintaining throughput at the same time, where the solver is formulated as an integer optimization. To overcome the computational bottlenecks of this NP-complete problem, online multiple-objective optimizations are proposed in [24, 25]. Son et al. [24] introduced a joint solution for frequency reuse and load balance via changing the user association. In addition, the method in [25] combines the objectives of throughput maximization and load balance.
The result shows that it can achieve 60%–67% optimality. However, these methods often assume ideal RSRP measurements without handover delay. Although the previous methods offer certain strategies for network parameter optimization, their performance is vulnerable to environment variation due to model mismatch. Alternatively, learning-based approaches were introduced: [26] proposed a Q-learning-based algorithm to adaptively adjust cell individual offsets (CIOs), which is shown to be superior to a CIO adjustment strategy with a fixed step size. Mwanje et al. [27] proposed a mobility load balancing algorithm using generalized Q-learning. On the other hand, [28] introduced a framework that makes the handover decision purely through user-side calculation. Each user is configured with a reinforcement learning agent that aims to minimize transmission outage and cell load fluctuation. Xu et al. [29, 30] introduced RL for CIO control, where the RL observation consists of well-designed environment features that are shown to effectively handle the curse of dimensionality. Alternatively, an asynchronous multi-agent RL framework is proposed in [31] to deal with a large number of users. Rather than RL-based approaches, [32] introduced a feed-forward
neural network-based supervised learning approach, where the input is the point cloud formed by user locations and power measurements, and the network output is the ideal user association. However, the operation relies on the users' real-time feedback to the BSs, and predefined training labels for the optimal user association are required. Considering the pros and cons of previous methods, we employ a deterministic policy gradient-based RL method [33] to handle the CIO adjustment as well as the massive MIMO antenna tilting. Regarding multi-objective RL techniques, [34] extended scalar reward-based Q-learning to a vector reward-based approach. However, the action space is restricted to discrete values. In addition, a simple treatment of the vector reward, which transforms it into a scalar reward using a scalarization function, can be found in [35]. Nonetheless, this method relies on handcrafted scalarization functions and cross-validation over multiple policies. Van Seijen et al. [36] proposed an RL framework that decomposes a single reward into multiple rewards to accelerate the convergence of Q-learning. However, this method also targets discrete rather than continuous variables.
3 Network Scenario
In this section, we introduce our network layout and utilities. We consider N base stations (BSs) distributed in a city area. For ease of discussion, we refer to the coverage of one BS as a cell, where cells are geographically adjacent to each other; Fig. 1 depicts a case of four cells. Our RL agent controls the parameters of the adjacent BSs in a centralized way. We assume there are K users moving around this area. We let t represent the system time index, k the user index, and n the BS index.
Fig. 1 Network layout
Fig. 2 Massive MIMO antenna tilt
3.1 Massive MIMO with Antenna E-Tilt
Each BS is deployed with a 2D antenna array to facilitate massive MIMO operation, which offers an extra degree of freedom to the access network. Beyond beamforming towards narrow directions, each BS is equipped with an antenna electrical tilt (E-tilt) function as shown in Fig. 2, where the antenna tilt covers both elevation and azimuth directions. The tilt action essentially allows a broadcast beam adaptation for downlink transmission. Thus, adjusting the BSs' antenna tilt angles changes the cell coverage. Clearly, the power received by user k from BS n at time instant t is a function of the tilt angle, which we denote as $p_{n,k}(b_n(t))$. Here, $b_n(t)$ comes from a predefined dictionary $\{\theta_0, \theta_1, \cdots, \theta_{M-1}\}$, with $\theta_m$ being a combination of azimuth and elevation angles. For simplicity, we collect all BSs' tilt angles in a vector $b(t)$.
Remarks
• The downlink received power $p_{n,k}(b_n(t))$ can be characterized as a function of the channel realization. However, it is challenging to give an analytical expression of $p_{n,k}(b_n(t))$ in terms of both tilt angles and channel coefficients. In our simulation, $p_{n,k}(b_n(t))$ is determined via our RSRP measurement data set. The received signal-to-interference-plus-noise ratio (SINR) is defined as $\frac{p_{n,k}(b_n(t))}{\sum_{n' \neq n} p_{n',k}(b_{n'}(t))}$, where n is assumed to be the serving cell. It is jointly determined by the tilt angles of both the serving cell and the interfering cells. In our system, the SINR is first mapped to the channel quality indicator (CQI) and then reported to the BS periodically.
• Compared to downlink beamforming, the antenna tilt is a slower action due to hardware constraints. Meanwhile, the tilt action impacts a wider range of users because it adjusts the cell coverage.
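To make the dependence of this ratio on the tilt angles concrete, the following Python sketch evaluates it from a small lookup table. The table values, the function name `sinr`, and the `power[n][m][k]` layout are illustrative assumptions and not the chapter's measurement data; powers are taken in linear units.

```python
def sinr(power, tilts, serving, user):
    """Ratio of serving-cell power to the summed power of all other cells.

    power[n][m][k] is a hypothetical received power (linear units) at user k
    from BS n when BS n uses tilt index m; tilts[n] is the tilt index of BS n.
    """
    signal = power[serving][tilts[serving]][user]
    interference = sum(
        power[n][tilts[n]][user] for n in range(len(power)) if n != serving
    )
    return signal / interference


# Toy table: 2 BSs, 2 tilt settings each, 1 user (values are made up).
power = [
    [[4.0], [8.0]],  # BS 0: tilt 0 -> 4.0, tilt 1 -> 8.0
    [[2.0], [1.0]],  # BS 1: tilt 0 -> 2.0, tilt 1 -> 1.0
]

print(sinr(power, tilts=[0, 0], serving=0, user=0))  # 4.0 / 2.0
print(sinr(power, tilts=[1, 1], serving=0, user=0))  # 8.0 / 1.0
```

In this toy case, changing both tilt indices raises the serving power and lowers the interference at the same time, which is why the ratio depends on the tilt angles of all cells jointly.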
3.1.1 Handover Management
The handover process is introduced to switch the user association between cells. Due to user mobility, handover happens when the signal quality from neighbor
cells is better than that of the serving cell. In our system, users report their RSRP measurements to the serving BS; thereafter, the BS determines the handover event according to certain criteria. In particular, we consider the A3-event-based inter-cell handover, which is defined as
$$ p_{n',k}(t) - p_{n,k}(t) > O_{n',n}(t) + Hys \tag{1} $$
where n represents the serving cell and $n'$ represents the neighbor cell; $O_{n',n}(t)$ is the cell individual offset from cell n to $n'$, which is usually symmetric to $O_{n,n'}(t)$ such that $O_{n',n}(t) = -O_{n,n'}(t)$; $Hys$ is a hysteresis parameter, which is set as a constant to avoid frequent handovers. When we increase the value of $O_{n',n}(t)$, condition (1) becomes harder to satisfy and the number of users handed over is reduced; the cell load changes accordingly, and vice versa. Therefore, adjusting the CIO controls the cell load, which motivates us to consider it as a parameter in our optimization framework. For ease of discussion, we stack the CIOs as a matrix $O(t)$.
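A minimal sketch of the A3 condition in (1) is given below. The function names and the RSRP values are hypothetical, and all quantities are assumed to be expressed in the same unit.

```python
def a3_triggered(p_neighbor, p_serving, cio, hys):
    """Return True when condition (1) holds for one user."""
    return p_neighbor - p_serving > cio + hys


def handover_users(rsrp_serving, rsrp_neighbor, cio, hys):
    """Indices of the users whose reports satisfy the A3 condition."""
    return [
        k
        for k, (ps, pn) in enumerate(zip(rsrp_serving, rsrp_neighbor))
        if a3_triggered(pn, ps, cio, hys)
    ]


# Made-up RSRP reports for four users.
serving = [-90.0, -95.0, -100.0, -105.0]
neighbor = [-92.0, -93.0, -96.0, -98.0]

print(handover_users(serving, neighbor, cio=0.0, hys=1.0))  # [1, 2, 3]
print(handover_users(serving, neighbor, cio=3.0, hys=1.0))  # [3]
```

Raising the offset from 0 to 3 shrinks the set of users that are handed over from three to one, which is the load-control effect that the CIO provides.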
3.2 Network Utilities
Given the user SINR and association, we define the following network utilities.
3.2.1 Cell Load
We define $l_k(t) := \min\{C_k(t)/r_k(t), l_{limit}\}$ as the load (bandwidth occupation) of user k in units of physical resource blocks (PRBs), where $r_k(t)$ is the rate of user k, which is obtained by mapping through the CQI table adopted in the system; $C_k(t)$ is the traffic of user k, and $l_{limit}$ is the maximum load allowed for each user. Here, $C_k(t)$ is a given parameter rather than a variable to optimize. For cell n, the cell load is given by
$$ L_n(I(t), b(t)) = \sum_k I_{n,k}(t) \cdot l_k(t), \tag{2} $$
where $I_{n,k}(t)$ indicates the association of user k with BS n: $I_{n,k}(t) = 1$ when BS n is associated with user k, and $I_{n,k}(t) = 0$ otherwise. Note that although we define the user association rather than the CIOs as the variable of the cell load, we still choose the CIOs as the intermediate parameters in our online optimization. Moreover, the load metric is often normalized by the maximum number of PRBs per cell.
3.2.2 Cell Throughput
Given the user association, user rate, and user load, we can sum over the users to obtain the cell throughput,
$$ R_n(I(t), b(t)) = \sum_k I_{n,k}(t) \cdot r_k(t) \cdot l_k(t). \tag{3} $$
Comparing (3) to (2), we see that (3) is a weighted summation of the same terms in (2), which indicates a coupling between cell load and throughput. Intuitively, the coupling between cell load balancing and throughput maximization can be observed as follows: Suppose users are unevenly distributed in a bandwidth-limited system. To achieve a high throughput (an increase of $\sum_n R_n$), every BS tends to connect to high-rate users. Therefore, users are likely to be selected by their nearest BSs. However, this strategy can increase the peak load in some cells. Consequently, the available PRBs in the high-load cells can fall below the PRBs needed by the users, which limits the increase of network throughput. Conversely, if BSs hand over some high-rate users to neighbor cells to balance the cell load, a high network throughput can potentially be achieved. This is because neighbor cells can allocate more PRBs to low-rate users to compensate for the power loss without sacrificing the overall throughput. Thus, choosing a proper user association (via CIO adjustment) and power allocation (through antenna tilt) is important to both cell load balancing and network throughput maximization. It is also important to notice that choosing load balancing as the sole objective is not enough, because balancing the cell load can trigger dropped user links, which decreases the overall network throughput. Therefore, load balancing and throughput maximization are jointly considered as the objective in this chapter. Moreover, a network operating with a low peak load is more robust to user mobility anomalies. For ease of discussion, we summarize the introduced notations in Table 1.
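The utilities in (2) and (3) can be sketched in a few lines of Python. The traffic, rate, and association values below are made up for illustration, and `cell_utilities` is a hypothetical helper name.

```python
def user_load(traffic, rate, l_limit):
    """Per-user load l_k = min(C_k / r_k, l_limit), in PRBs."""
    return min(traffic / rate, l_limit)


def cell_utilities(assoc, traffic, rate, l_limit):
    """Return the per-cell load (2) and per-cell throughput (3).

    assoc[n][k] is 1 if user k is associated with BS n, and 0 otherwise.
    """
    loads = [user_load(c, r, l_limit) for c, r in zip(traffic, rate)]
    cell_load = [sum(i * l for i, l in zip(row, loads)) for row in assoc]
    cell_tput = [
        sum(i * r * l for i, r, l in zip(row, rate, loads)) for row in assoc
    ]
    return cell_load, cell_tput


traffic = [100.0, 100.0, 100.0]  # C_k in bit/s
rate = [50.0, 20.0, 5.0]         # r_k in bit/s/PRB
assoc = [[1, 1, 0], [0, 0, 1]]   # users 0 and 1 -> BS 0, user 2 -> BS 1

load, tput = cell_utilities(assoc, traffic, rate, l_limit=10.0)
print(load)  # [7.0, 10.0]
print(tput)  # [200.0, 50.0]
```

In this toy example, the low-rate user 2 would need 20 PRBs but is clipped to the limit of 10, so its cell delivers only half of the requested traffic. This is the kind of load-throughput coupling discussed above.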
4 The Introduced Approach
In this section, we first introduce some important concepts of reinforcement learning. Then, we introduce an RL formulation of the online joint optimization of cell loads and throughput. Finally, we present the details of our algorithm flow.
Table 1 Notations

| Symbol | Units | Definition |
|---|---|---|
| $N$ | N/A | The number of BSs; also the set of BSs |
| $K$ | N/A | The number of users; also the set of users |
| $T$ | h/s | The antenna tilt period; also the RL action period as well as the time index set |
| $b_n(t)$ | N/A | The antenna tilt index of cell n |
| $I_{n,k}(t)$ | N/A | Association indicator from user k to BS n: $I_{n,k}(t) = 1$ when BS n is associated with user k, $I_{n,k}(t) = 0$ otherwise |
| $I(t)$ | N/A | User association matrix |
| $b(t)$ | N/A | A vector stacking the tilt indices of all BSs |
| $L_n(I(t), b(t))$ or $L_n(t)$ | PRB | The load of cell n |
| $l(t)$ | PRB | A vector stacking the loads of all cells: $l(t) = [L_1(t), L_2(t), \cdots, L_N(t)]$ |
| $R_n(I(t), b(t))$ or $R_n(t)$ | bit/s | The throughput of cell n |
| $U(I(t), b(t))$ or $U(t)$ | N/A | The trade-off utility of the overall network |
| $p_{n,k}(b_n(t))$ or $p_{n,k}(t)$ | mW | The kth user's received power from the nth cell |
| $r_k(t)$ | bit/s/PRB | The rate of user k, a function of p |
| $l_k(t)$ | PRB | The load of user k, $l_k(t) := \min\{C_k(t)/r_k(t), l_{limit}\}$ |
| $l_{limit}$ | PRB | The maximum load constraint for each user |
| $C_k(t)$ | bit/s | The traffic of user k, also known as the bit rate requirement |
| $O_{n,n'}(t)$ | mW | The CIO parameter between cell n and cell $n'$ |
| $O(t)$ | mW | The CIO matrix |
| $e_n(t)$ | N/A | The percentage of cell edge users of cell n |
| $e(t)$ | N/A | A vector: $e(t) = [e_1(t), e_2(t), \cdots, e_N(t)]$ |

4.1 Preliminary on Reinforcement Learning
From the perspective of online optimization, reinforcement learning is considered an approximation to dynamic programming, where the major difference is that RL can operate without a mathematical model of the environment. There are two components in the RL framework: an environment and an agent. RL algorithms control the agent to interact with the environment and accumulate rewards from it. The RL operation can be generally described as a Markov decision process (MDP), which is a 4-tuple $\langle S, A, P_a(s(t) \mid s(t-1)), r_a(s(t) \mid s(t-1)) \rangle$, where
• $S$ represents the states, which are the interface from the environment to the RL agent.
• $A$ is the action set of the RL agent. The RL agent takes actions to affect the environment and then observes a change of the states. The action is taken according to a policy denoted as $\pi$.
• $P_a(s(t) \mid s(t-1))$ represents the state transition probability from $s(t-1) \in S$ to $s(t) \in S$ under action $a \in A$.
• $r_a(s(t) \mid s(t-1))$ is the immediate reward after action $a$ is conducted.
The Markov property of the states is described via the state transition probability $P_a(s(t) \mid s(t-1))$. The RL agent takes actions in discrete time steps: at time t, the agent observes a state $s(t)$ and selects an action $a(t) \in A$ according to a policy $\pi$. The state is then updated to $s(t+1)$, and the RL agent receives a feedback reward $r_{a(t)}(s(t+1) \mid s(t))$. This transition can be stacked as a 4-tuple $\langle s(t), a(t), r_{a(t)}(s(t+1) \mid s(t)), s(t+1) \rangle$, namely, a record from the RL experiments. More importantly, the underlying state transition $P_a(s(t) \mid s(t-1))$ and reward $r_a(s(t) \mid s(t-1))$ cannot be analytically characterized in most scenarios. Thus, RL-based approaches rely entirely on samples from these distributions. The objective of an RL agent is to accumulate as much reward as possible from the environment in the near future. To achieve this goal, an RL algorithm has two features: (1) it learns a policy that yields good rewards, and (2) it learns a criterion to evaluate whether the policy is good enough. This criterion is called the "value function", which estimates the future cumulative reward obtained by following policy $\pi$ from a state $s$,
$$ V_\pi(s) = E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_{a(t)}(s(t+1) \mid s(t)) \,\Big|\, s(0) = s \right], \tag{4} $$
where $0 \le \gamma \le 1$ is a discount factor introduced to characterize the uncertainty about the future environment and to avoid cyclic rewards, and $E_\pi$ is taken over the states, rewards, and policy, where the policy is usually formulated as a conditional probability $\pi(a(t) \mid s(t))$. The policy is stochastic because of the fundamental trade-off between exploration and exploitation in online learning: exploitation aims to take the best action based on current experience, whereas exploration gathers more information from the environment. Some standard techniques to achieve this trade-off are $\epsilon$-greedy, decayed $\epsilon$-greedy, and softmax [37]. With a slight modification to the value function, we can define the action-value function [13],
\begin{align}
Q_\pi(s, a) &= E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_{a(t)}(s(t+1) \mid s(t)) \,\Big|\, s(0) = s, a(0) = a \right] \tag{5} \\
&= E_\pi\left[ r_a(s(1) \mid s) + \sum_{t=1}^{\infty} \gamma^t r_{a(t)}(s(t+1) \mid s(t)) \right] \tag{6} \\
&= E\left[ r_a(s(1) \mid s) + \gamma Q_\pi(s(1), a(1)) \right], \tag{7}
\end{align}
where the expectation in (7) operates over the state $s(1)$ and action $a(1)$. We also see that (7), as a recursive form of the action value, gives an unbiased estimate of $Q_\pi(s, a)$. The definition of $Q_\pi(s, a)$ together with (7) is known as the Bellman equation. In general, when we set the starting time in (5) to t, we immediately obtain an expression for $Q_\pi(s(t), a(t))$.
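As a small numerical illustration of the recursion behind (7), the sketch below computes a finite-horizon discounted return in two ways: directly from the sum in (4), and backwards through the one-step recursion. The reward sequence and the function names are made up for illustration, and the expectation is omitted because a single fixed trajectory is used.

```python
def discounted_return(rewards, gamma):
    """Finite-horizon version of the sum in (4): sum_t gamma^t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def discounted_return_recursive(rewards, gamma):
    """Same quantity via the recursion G_t = r_t + gamma * G_{t+1}."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


rewards = [1.0, 0.0, 2.0, 4.0]  # one hypothetical reward trajectory
gamma = 0.5

print(discounted_return(rewards, gamma))            # 2.0
print(discounted_return_recursive(rewards, gamma))  # 2.0
```

Both routines return the same value, which is the property that allows the action value to be estimated from one-step transitions instead of complete future trajectories.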
Moreover, we can observe that the calculation of the Q function requires future statistics, which makes the formulation noncausal. Therefore, to evaluate the current states and actions, we first have to estimate the Q function. This is often achieved by solving the Bellman equation. In many RL algorithms, the solver is based on minimizing a temporal difference (TD) [38], which is defined as
$$ TD(Q) = l\left( Q_\pi(s(t), a(t)) - r_{a(t)}(s(t+1) \mid s(t)) - \gamma E_{\pi'}\left[ Q'_{\pi'}(s(t+1), a(t+1)) \right] \right) \tag{8} $$
where l represents a predefined loss function, such as $E|\cdot|^2$, and $\pi'$ and $Q'$ are called the target policy and target value function, respectively. In this chapter, we consider deterministic policy-based optimization. Therefore, the inner expectation $E_{\pi'}$ can be dropped. Then, the TD objective becomes
$$ TD(Q) = l\left( Q_\pi(s(t), a(t)) - r_{a(t)}(s(t+1) \mid s(t)) - \gamma Q'_{\pi'}(s(t+1), a(t+1)) \right). \tag{9} $$
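For a single recorded transition and a squared-error loss, the TD objective with a deterministic target policy can be sketched as follows. The linear functions `q`, `q_target`, and `policy_target` are hypothetical stand-ins for the value network, the target value network, and the target policy; they are not the chapter's networks.

```python
def td_loss(q, q_target, policy_target, transition, gamma):
    """Squared TD error for one transition (s, a, r, s_next)."""
    s, a, r, s_next = transition
    a_next = policy_target(s_next)            # deterministic target action
    y = r + gamma * q_target(s_next, a_next)  # bootstrapped target value
    return (q(s, a) - y) ** 2


def q(s, a):
    """Hypothetical online value function."""
    return 2.0 * s + a


def q_target(s, a):
    """Hypothetical target value function."""
    return s + a


def policy_target(s):
    """Hypothetical deterministic target policy."""
    return -s


loss = td_loss(q, q_target, policy_target, (1.0, 0.5, 1.0, 2.0), gamma=0.9)
print(loss)  # 2.25
```

Here the target action at the next state is -2.0, so the bootstrapped target is y = 1.0 + 0.9 * 0.0 = 1.0 and the squared error is (2.5 - 1.0)^2 = 2.25.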
We may further notice that the variable of $TD$ is $Q$ rather than $Q_\pi$. This is because we employ an off-policy formulation, which allows the value function and the policy to be learned in a decoupled way. In general, off-policy learning can significantly improve data efficiency and offers fast convergence [39]. On-policy learning, on the other hand, typically sets $Q'_{\pi'} = Q_\pi$ or draws $a(t+1) \sim \pi$ in (8). Multiple-objective reinforcement learning is defined through a total value function that decomposes into a linear combination of individual value functions. For example, in the case of two objectives,
$$ V_\pi(s) = w_1 V_\pi^{(1)}(s) + w_2 V_\pi^{(2)}(s) \tag{10} $$
where $V_\pi^{(1)}(s)$ and $V_\pi^{(2)}(s)$ are two separate value functions following the definition in (4), and $w_1$ and $w_2$ are weights satisfying $w_1 + w_2 = 1$.
4.2 RL Formulation
We follow the MDP framework to formulate our network parameter optimization.
4.2.1 Action Set A
The action set consists of the antenna tilt and CIO adjustments discussed in Sect. 3, i.e., at time t, $a(t) = [b(t), O(t)]$. $O(t)$ and $b(t)$ are characterized as random processes, i.e., $\int P(O(t))\,dO(t) = 1$ and $\int P(b(t))\,db(t) = 1$. The randomness of $O(t)$ and $b(t)$ stems from the fundamental trade-off between exploitation and exploration in online learning.
4.2.2 State Space S
We define the state vector as a stack of the cell loads and the ratios of cell edge users (denoted as $e(t)$), i.e., $s(t) = [l(t), e(t)]$. A user k is counted as an edge user if its throughput is smaller than a predefined threshold. This handcrafted state definition essentially captures the features of the user geometry: the load reveals the density of users inside the cell, and the edge user ratio reflects the number of potential users for handover events. Compared with directly using user locations as the states, this reduced state dimension can significantly lower the training cost of feature extraction.
4.2.3 Objective (Value Function)
For convenience, we first let $\sum_n F(L_n(I(t), b(t)))$ (abbreviated as $F(t)$) represent a function that measures the level of cell load balance. Then, we define the multiple-objective optimization as
$$
\begin{aligned}
\max_{\{I(t): t>0\}, \{b(t): t>0\}} \quad & E\left[ \sum_{t=0}^{\infty} \gamma^t R(t) \right] + \lambda E\left[ \sum_{t=0}^{\infty} \gamma^t F(t) \right] \\
\text{s.t.} \quad & I(t) = \{I_{n,k}(t) : I_{n,k}(t) \in \{0, 1\}, n \in N, k \in K\} \\
& b(t) = \{b_n(t) : b_n(t) \in \{\theta_0, \theta_1, \cdots, \theta_{L-1}\}, n \in N\} \\
& I(t) = I(t') \quad \forall t, t' \in (pT', (p+1)T'], \; p \in \mathbb{N} \\
& b(t) = b(t') \quad \forall t, t' \in (pT, (p+1)T], \; p \in \mathbb{N}
\end{aligned}
\tag{11}
$$
where the expectation $E$ takes into account all of the environmental randomness, $T$ is the time constraint on changing the antenna tilt, $T'$ represents the minimum handover time, and $t = 0$ stands for the initial operation time. To be consistent with our action space definition, we have an alternative formulation as follows:
$$
\begin{aligned}
\max_{\{a(t): t>0\} \sim \pi} \quad & E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t R(t) \right] + \lambda E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t F(t) \right] \\
\text{s.t.} \quad & O(t) = \{O_{n,n'}(t) : O_{n,n'}(t) \in [O_{min}, O_{max}], n \in N, n' \in N\} \\
& b(t) = \{b_n(t) : b_n(t) \in \{\theta_0, \theta_1, \cdots, \theta_{L-1}\}, n \in N\} \\
& O(t) = O(t'), \; b(t) = b(t') \quad \forall t, t' \in (pT, (p+1)T], \; p \in \mathbb{N}
\end{aligned}
\tag{12}
$$
Comparing (11) to (12), we make the following changes:
• The variable $I(t)$ is replaced by $O(t)$, which defines the user association indirectly via inter-cell handover. The algorithm is expected to automatically adjust $O(t)$ to proactively change the user association.
• The time constraint on the CIO is changed to $T$. We make this modification to synchronize the network adjustments for ease of RL-based action operations. However, the user association is still determined every $T'$ according to the definition of the A3 handover event.
• We use $a(t)$ to briefly represent the action policy on $O(t)$ and $b(t)$.
4.2.4 Rewards
We note that the objective in (12) can be merged as $E\left[ \sum_{t=0}^{\infty} \gamma^t U(t) \right]$, where $U(t) := R(t) + \lambda F(t)$. Following the definition of the single-objective value function in (4), we could directly set $U(t)$ as the reward. However, this simple treatment results in the following ambiguity: Suppose we obtain a reward $U$. Then $U$ can be composed as either $U = R_1 + \lambda F_1$ or $U = R_2 + \lambda F_2$ with $R_1 \neq R_2$ and $F_1 \neq F_2$. This gives an intuitive understanding that directly merging the two objectives into one scalar reward can cause convergence issues for RL-based approaches. To resolve the ambiguity, we define the reward as a vector:
$$ r_a(t) = \left[ r_a^{(1)}(t), r_a^{(2)}(t) \right] = \left[ \sum_n R_n(t), \sum_n F_n(t) \right]. \tag{13} $$
Now, we scale the objective and define $w = [w_1, w_2]$, where $w_1 = \frac{1}{1+\lambda}$ and $w_2 = \frac{\lambda}{1+\lambda}$. Then, we have the following action-value function:
$$ \max_{\{a(t): t>0\} \sim \pi} \; w_1 E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_a^{(1)}(t) \right] + w_2 E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_a^{(2)}(t) \right]. \tag{14} $$
We name the above formulation vector reward-based multiple-objective RL. Intuitively, we can consider $F(t)$ as the peak cell load, i.e., the maximum value of the average cell load over a period of $T$. Penalizing the peak load avoids overloading particular cells, which in turn distributes the user associations more evenly. Accordingly, the reward vector is defined as
$$ r_a(t) = \left[ \frac{1}{T} \sum_{t \in T} \sum_{n \in N} R_n(t), \; -\max_n \frac{1}{T} \sum_{t \in T} L_n(t) \right] \tag{15} $$
Moreover, we can consider a composite term to characterize the cell load balance. For instance, let $F(t) = -\max_n \frac{1}{T} \sum_{t \in T} L_n(t) - \sigma(\{L_n(t)\}_{n \in N, t \in T})$, where $\sigma$ is the standard deviation function. The two terms of $F(t)$ can be jointly treated as the last entry of the vector reward. More generally, however, the second term of $F(t)$ can be included in the vector reward as a new dimension. The optimization of the action-value function then becomes
$$ \max_{\{a(t): t>0\} \sim \pi} \; w_1 E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_a^{(1)}(t) \right] + w_2 E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_a^{(2)}(t) \right] + w_3 E_\pi\left[ \sum_{t=0}^{\infty} \gamma^t r_a^{(3)}(t) \right] \tag{16} $$
Thus, the vector reward becomes
$$ r_a(t) = \left[ \frac{1}{T} \sum_{t \in T} \sum_{n \in N} R_n(t), \; -\max_n \frac{1}{T} \sum_{t \in T} L_n(t), \; -\sigma(\{L_n(t)\}_{n \in N, t \in T}) \right] \tag{17} $$
Note that the above formulation still addresses the same multiple-objective optimization defined in (12) but using a vector reward with more entries. Therefore, the basic idea of vector reward-based RL is to map the features of multiple objectives to multiple dimensions of the vector reward.
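A minimal sketch of how such a reward vector could be computed from per-step, per-cell records over one action period is shown below. The input layout, the toy numbers, and the use of the population standard deviation (`statistics.pstdev`) are assumptions made for illustration; the chapter does not specify which estimator is used.

```python
from statistics import pstdev


def vector_reward(throughput, load, with_std=False):
    """Build the reward vector from records of one action period.

    throughput[t][n] and load[t][n] hold the value of cell n at step t.
    Returns [mean total throughput, -peak average load], optionally
    extended by the negative standard deviation of all load samples.
    """
    steps = len(load)
    cells = len(load[0])
    mean_tput = sum(sum(row) for row in throughput) / steps
    peak_load = max(
        sum(load[t][n] for t in range(steps)) / steps for n in range(cells)
    )
    reward = [mean_tput, -peak_load]
    if with_std:
        samples = [x for row in load for x in row]
        reward.append(-pstdev(samples))
    return reward


throughput = [[4.0, 2.0], [6.0, 4.0]]  # 2 steps, 2 cells (made-up values)
load = [[0.5, 0.25], [0.75, 0.5]]

print(vector_reward(throughput, load))  # [8.0, -0.625]
print(vector_reward(throughput, load, with_std=True))
```

Adding an objective only appends an entry to the returned list, which mirrors the idea of mapping each objective to its own reward dimension.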
4.3 Pareto Deterministic Policy Gradient
Now, we consider how to solve vector reward-based RL problems. For simplicity, we only elaborate on the case of two-dimensional rewards; the extension to reward vectors with more than two dimensions is straightforward. According to the definition of the action-value function in (7), we can rephrase (14) as
$$
\begin{aligned}
Q_\pi(s, a) &= w_1 Q_\pi^{(1)}(s, a) + w_2 Q_\pi^{(2)}(s, a) \\
&= w_1 E\left[ r_a^{(1)}(s(1) \mid s) + \gamma Q_\pi^{(1)}(s(1), a(1)) \right] + w_2 E\left[ r_a^{(2)}(s(1) \mid s) + \gamma Q_\pi^{(2)}(s(1), a(1)) \right].
\end{aligned}
$$
By rearranging the above equation, we have
$$
\begin{aligned}
& w_1 E\left[ Q_\pi^{(1)}(s, a) - r_a^{(1)}(s(1) \mid s) - \gamma Q_\pi^{(1)}(s(1), a(1)) \right] \\
& \quad = -w_2 E\left[ Q_\pi^{(2)}(s, a) - r_a^{(2)}(s(1) \mid s) - \gamma Q_\pi^{(2)}(s(1), a(1)) \right].
\end{aligned}
$$
This equation implies that the temporal differences on $Q^{(1)}$ and $Q^{(2)}$ are proportional to each other. Motivated by this fact, we define a new TD objective for the vector reward RL as
$$ TD(Q^{(1)}, Q^{(2)}) = w_1 TD^{(1)}(Q^{(1)}) + w_2 TD^{(2)}(Q^{(2)}), \tag{18} $$
where
$$
\begin{aligned}
TD^{(1)}(Q^{(1)}) &= l\left( Q_\pi^{(1)}(s(t), a(t)) - r_{a(t)}^{(1)}(s(t+1) \mid s(t)) - \gamma Q_{\pi'}^{\prime(1)}(s(t+1), a(t+1)) \right), \\
TD^{(2)}(Q^{(2)}) &= l\left( Q_\pi^{(2)}(s(t), a(t)) - r_{a(t)}^{(2)}(s(t+1) \mid s(t)) - \gamma Q_{\pi'}^{\prime(2)}(s(t+1), a(t+1)) \right).
\end{aligned}
$$
By minimizing $TD(Q^{(1)}, Q^{(2)})$, we simultaneously optimize the temporal differences $TD^{(1)}(Q^{(1)})$ and $TD^{(2)}(Q^{(2)})$ with weights $w_1$ and $w_2$, respectively. Moreover, we use neural networks to represent the Q function and the policy $\pi$, denoted as $Q(s, a \mid \theta_Q)$ and $A(s \mid \theta_A)$, respectively. Overall, the learning of $\theta_Q$ and $\theta_A$ is summarized as follows:
• Fix $\theta_A$ and alternately update $\theta_{Q^{(1)}}$ and $\theta_{Q^{(2)}}$ using the gradients of $TD(Q^{(1)}(\theta_{Q^{(1)}}), Q^{(2)}(\theta_{Q^{(2)}}))$.
• Fix $\theta_{Q^{(1)}}$ and $\theta_{Q^{(2)}}$, and update $\theta_A$ through the chain rule applied to $Q(s_t, A(s_t \mid \theta_A))$.
• Repeat the above two stages until convergence.
This alternating update rule is also called the actor-critic algorithm in the context of RL. As the name suggests, it operates in an actor-critic way: the actor (policy network) performs actions, and the critic (value network) evaluates the actions and criticizes them when low reward values are received. In addition, the output dimension of $A(s \mid \theta_A)$ equals the number of parameters defined in (12). Parameters with integer constraints are quantized afterward. The overall procedure is summarized in Algorithm 0. We name this method the Pareto deterministic policy gradient (PDPG) algorithm, as it optimizes a deterministic policy by gradient descent on the action-value objectives. The general framework for vector rewards with more than two dimensions is illustrated in Fig. 3. Note also that our method optimizes a single policy network rather than employing cross-validation over multiple policy networks.
Fig. 3 The framework of Pareto deterministic policy gradient
In addition, the following experimental techniques are included to stabilize the convergence of our algorithm:
• Experience replay: We use a replay buffer $R$, which is a finite-sized cache that records the transition tuples. The replay buffer is a queue structure that discards the oldest tuple when it is full. At each learning step, the networks are updated by sampling a minibatch uniformly from the buffer. This is possible because PDPG is an off-policy algorithm. A large replay buffer can store more uncorrelated transitions and thus improves the learning convergence.
• Feature scaling: We scale each entry of the state vector into a certain range and adjust the rewards through shifting and rescaling. Our experiments show that this technique makes the network learn more effectively.
• Noise exploration: We add an independent Gaussian process with zero mean and variance $\sigma$ to the output of the deterministic policy network to form an exploration policy, i.e., $a(t) = A(s(t)) + N(t)$. Other random processes, such as the Ornstein-Uhlenbeck process, are also widely used in the RL literature [33]. Note that the transition history in the replay buffer is generated by this exploration policy rather than by our target deterministic policy.
• Soft target update: In TD-based learning, it is important to choose an update rule for the target value function and policy that ensures stable convergence. We follow the "soft" target update strategy rather than directly copying the coefficients of the target networks from previous steps. The weight $\tau$ is chosen as a very small value to ensure that the target networks change slowly.
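The alternating critic/actor update and the stabilization techniques described above can be sketched with tiny linear models in place of neural networks. Everything in this block is an illustrative assumption rather than the chapter's implementation: scalar states and actions, the linear critic form, the actor `a = A * s`, the toy rewards, and the names `pdpg_step` and `soft_update`.

```python
import random
from collections import deque


def q_value(c, s, a):
    """Linear critic Q(s, a | c) = c[0]*s + c[1]*a + c[2]."""
    return c[0] * s + c[1] * a + c[2]


def soft_update(target, online, tau):
    """Soft target update: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t for o, t in zip(online, target)]


def pdpg_step(params, batch, w, gamma=0.9, lr=0.01, tau=0.01):
    """One alternating critic/actor update on a minibatch.

    params holds the actor gain 'A', the critics 'Q1' and 'Q2', and their
    target copies 'A_t', 'Q1_t', 'Q2_t'. Each transition is a tuple
    (s, a, r1, r2, s_next) with scalar state and action.
    """
    n = len(batch)
    # Critic stage: the actor is fixed; each critic follows its own TD gradient.
    for i, key in enumerate(("Q1", "Q2")):
        grad = [0.0, 0.0, 0.0]
        for s, a, r1, r2, s_next in batch:
            r = (r1, r2)[i]
            a_next = params["A_t"] * s_next  # target actor
            y = r + gamma * q_value(params[key + "_t"], s_next, a_next)
            err = q_value(params[key], s, a) - y
            for j, feat in enumerate((s, a, 1.0)):
                grad[j] += 2.0 * err * feat / n
        params[key] = [c - lr * g for c, g in zip(params[key], grad)]
    # Actor stage: the critics are fixed; chain rule through a = A * s.
    dq_da = w * params["Q1"][1] + (1.0 - w) * params["Q2"][1]
    actor_grad = sum(dq_da * s for s, *_ in batch) / n
    params["A"] += lr * actor_grad  # ascend the weighted action value
    # Soft target updates for the actor and both critics.
    params["A_t"] = tau * params["A"] + (1.0 - tau) * params["A_t"]
    for key in ("Q1", "Q2"):
        params[key + "_t"] = soft_update(params[key + "_t"], params[key], tau)
    return params


random.seed(0)
buffer = deque(maxlen=1000)  # the oldest transitions are dropped first
params = {"A": 0.0, "A_t": 0.0,
          "Q1": [0.0, 0.0, 0.0], "Q1_t": [0.0, 0.0, 0.0],
          "Q2": [0.0, 0.0, 0.0], "Q2_t": [0.0, 0.0, 0.0]}

for _ in range(200):
    s = random.uniform(-1.0, 1.0)
    a = params["A"] * s + random.gauss(0.0, 0.1)  # Gaussian exploration noise
    r1, r2 = -(a - s) ** 2, -abs(a)               # toy two-objective rewards
    buffer.append((s, a, r1, r2, random.uniform(-1.0, 1.0)))
    batch = random.sample(list(buffer), min(16, len(buffer)))
    params = pdpg_step(params, batch, w=0.5)

print(params["A"], params["Q1"], params["Q2"])
```

A single actor is updated with a weighted sum of the two critics' action gradients, so only one policy is maintained regardless of the number of objectives. A discrete parameter such as a tilt index would be obtained by quantizing the corresponding actor output after this step.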
Algorithm 0 Pareto deterministic policy gradient
Require: Initial coefficients of the critic and actor networks, $c_Q$, $c_A$; discount factor $\gamma$; soft update parameter $\tau$; exploration random process $N(t)$; normalized trade-off weight $w = \frac{\lambda}{1+\lambda}$
Ensure: Online action output: $a(t) = [\tilde{I}(t), B(t)]$
1: Initialization: replay buffer $R$; critic networks $Q_1(s, a \mid \theta_{Q_1} = c_Q)$ and $Q_2(s, a \mid \theta_{Q_2} = c_Q)$; actor network $A(s \mid \theta_A = c_A)$; target critic networks $Q'_1(s, a \mid \theta'_{Q_1} = c_Q)$ and $Q'_2(s, a \mid \theta'_{Q_2} = c_Q)$; target actor network $A'(s \mid \theta'_A = c_A)$; $t = 0$
2: while $t$ has not reached the end do
3:   Select action $a(t) = A(s(t) \mid \theta_A) + N(t)$
4:   Render the environment and obtain $r^{(1)}(t)$, $r^{(2)}(t)$, and $s(t+1)$
5:   Queue $(s(t), a(t), r^{(1)}(t), r^{(2)}(t), s(t+1))$ into the buffer $R$
6:   Sample a minibatch from $R$
7:   For every $t'$ in the minibatch, set $y^{(1)}(t') = r^{(1)}(t') + \gamma Q'_1(s(t'+1), A'(s(t'+1) \mid \theta'_A) \mid \theta'_{Q_1})$ and $y^{(2)}(t') = r^{(2)}(t') + \gamma Q'_2(s(t'+1), A'(s(t'+1) \mid \theta'_A) \mid \theta'_{Q_2})$
8:   Update the two critic networks' coefficients $\theta_{Q_1}$ and $\theta_{Q_2}$ individually by using the gradients $\nabla_{\theta_{Q_1}} \sum_{t'} l(y^{(1)}(t'), Q_1(s(t'), a(t') \mid \theta_{Q_1}))$ and $\nabla_{\theta_{Q_2}} \sum_{t'} l(y^{(2)}(t'), Q_2(s(t'), a(t') \mid \theta_{Q_2}))$, respectively
9:   Update the actor network coefficients $\theta_A$ using the gradient $w \sum_{t'} \nabla_a Q_1(s(t'), A(s(t') \mid \theta_A) \mid \theta_{Q_1}) \nabla_{\theta_A} A(s(t') \mid \theta_A) + (1 - w) \sum_{t'} \nabla_a Q_2(s(t'), A(s(t') \mid \theta_A) \mid \theta_{Q_2}) \nabla_{\theta_A} A(s(t') \mid \theta_A)$
10:  Update the target networks:
     $\theta'_{Q_1} \leftarrow \tau \theta_{Q_1} + (1 - \tau) \theta'_{Q_1}$, $\theta'_{Q_2} \leftarrow \tau \theta_{Q_2} + (1 - \tau) \theta'_{Q_2}$, $\theta'_A \leftarrow \tau \theta_A + (1 - \tau) \theta'_A$   (19)
11: end while

5 Evaluation
We evaluate our algorithm on a simulator developed based on 3GPP TS 38.331 and 3GPP TS 38.213 [17, 18]. The simulation environment is a city area with massive MIMO deployment at the BSs. In particular, we apply an RL agent to a 400 m × 400 m sub-area with 4 BSs. The RL control signals for the 4 cells are assumed to be synchronized, as the cells neighbor each other. In the coverage area of the 4 cells, the average number of users is set to 80. In the simulator, the user mobility data is generated according to predefined mobility models. Here, we use the random waypoint (RWP) model by default. The model assumes that users walk in random directions with random step sizes. Given the users' locations and the BS antenna tilt angles, the downlink RSRPs are calculated from our measurement data, which is stored in a lookup table as a 3-mode tensor whose three modes index the anchor locations, tilt angles, and BSs, respectively. The number of measurement anchor locations in the sub-area is 24,573. The number of tunable antenna tilt angles for each BS is 11. The maximum time length of the mobility data is set to 200 days. Given the SINR values on the user side, the corresponding CQI values can be obtained. The user rate is then determined according to a lookup table.
In our simulator, the traffic model for each user is set as a constant bit rate (CBR), i.e., $C_k = 1$ Mbps. The maximal load for each user is set to 6 PRBs. The total number of PRBs in one cell is 100. The CIO range is [−12 dB, 12 dB]. When the associated users require more PRBs than the maximal number of PRBs per cell, the simulator runs a resource scheduling algorithm afterward, where the eventual PRB allocation is proportional to the rate ranking of the scheduled users. The parameters in Algorithm 0 are configured in the intervals $\gamma \in [0.1, 0.6]$ and $\tau \in [0.001, 0.01]$. The variance of $N(t)$ is set to 0.1 at the beginning of the learning stage to conduct the exploration policy. When the number of iterations is beyond a predefined threshold (our experimental value varies from 100 to 500), it is set as a smaller value (such as 100) to promote the exploitation policy. Both the actor network and the critic network are fully connected feed-forward neural networks with three layers, where the activation functions of the intermediate layers are ReLU, the output activation function of the actor network is Tanh, and the output activation of the critic network is linear. The numbers of neurons in the two intermediate layers are set to 50 and 100, respectively. For the RL state vector, the throughput threshold for the edge user definition is set to 550 kbps. Finally, in the RL algorithm, the period of the antenna tilt and CIO change is set to 2 h in terms of the mobility model.
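To illustrate how a Tanh actor output can be turned into the control parameters used in this setup, the sketch below maps a value in [−1, 1] to a CIO in [−12 dB, 12 dB] and to one of the 11 tilt indices. The linear scaling, the clipping, and the rounding rule are assumptions made for illustration; the chapter only states that integer-constrained parameters are quantized afterward. The function names are hypothetical.

```python
def to_cio_db(x, cio_max=12.0):
    """Scale an actor output x to a CIO value in [-cio_max, cio_max] dB.

    x is clipped to [-1, 1] first, since exploration noise can push it outside.
    """
    x = max(-1.0, min(1.0, x))
    return cio_max * x


def to_tilt_index(x, num_tilts=11):
    """Quantize an actor output x to a tilt index in {0, ..., num_tilts - 1}."""
    x = max(-1.0, min(1.0, x))
    return int(round((x + 1.0) / 2.0 * (num_tilts - 1)))
```

For example, an output of 0.0 maps to a CIO of 0 dB and to the middle tilt index 5.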
5.1 Comparison to Different RL Rewards
We first evaluate the performance using different rewards in the RL algorithm. We consider the CDF curves of network throughput samples taken every 15 min over 200 days while the algorithms operate on the environment. As illustrated in Fig. 4, the x-axis is the normalized network throughput, where the denominator is chosen as 100 Mbps. The labels in the legend respectively represent:
• the DDPG [33] algorithm using the scalar reward $\frac{1}{T} \sum_{t \in T} \sum_{n \in N} R_n(t)$;
• the DDPG algorithm using the scalar reward $-\max_n \frac{1}{T} \sum_{t \in T} L_n(t)$;
• the DDPG algorithm using the scalar reward $w_1 \frac{1}{T} \sum_{t \in T} \sum_{n \in N} R_n(t) - w_2 \max_n \frac{1}{T} \sum_{t \in T} L_n(t)$;
• our introduced PDPG algorithm using the vector reward (15);
• the PDPG algorithm using the vector reward (17);
• the DDPG algorithm using the scalar reward $w_1 \frac{1}{T} \sum_{t \in T} \sum_{n \in N} R_n(t) - w_2 \max_n \frac{1}{T} \sum_{t \in T} L_n(t) - w_3 \sigma(\{L_n(t)\}_{n \in N, t \in T})$.
In this figure, all the weights inside the rewards are well cross-validated to optimize the throughput distribution. In our experience, the throughput distributions of the vector reward-based methods differ little under different combinations of $w_1$, $w_2$, and $w_3$. However, the scalar reward-based approaches require more cross-validation tests to obtain the resulting distribution. Figure 5 shows cell load box plots of the underlying 4 cells, where the corresponding throughput is the same as in Fig. 4. As we can see, the scalar reward with only the throughput term yields a less balanced cell load. Consequently, the unbalanced cell load affects the overall throughput optimization, and the algorithm can only converge to a local optimum with a smaller performance advantage. Meanwhile, the scalar reward based only on the maximum load sacrifices the throughput gain to favor a more balanced load, which also deviates from the joint optimization purpose. Using a linear combination of the previous scalar rewards to form another scalar reward can lead to convergence issues, as demonstrated in Fig. 6. This is because of the ambiguity caused by mixing scalar rewards, as discussed in Sect. 4.2. The convergence curves, together with the throughput and cell load distributions, corroborate the advantage of choosing the vector reward-based approaches. Overall, the vector reward-based RL approaches achieve a more balanced cell load and a high throughput, which yield the joint optimality of our objective.
Fig. 4 Throughput CDF of using different RL algorithms
Fig. 5 Cell load result of using different RL algorithms
Fig. 6 RL convergence curves of using different rewards
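The scalar rewards compared above can be computed from per-cell throughput and load samples as in the following sketch. It assumes that `R[t][n]` and `L[t][n]` hold $R_n(t)$ and $L_n(t)$ for time index `t` and cell `n`, and that $\sigma$ denotes the population standard deviation of all load samples; both the data layout and that reading of $\sigma$ are assumptions, and the function names are hypothetical.

```python
from statistics import pstdev


def throughput_reward(R):
    """(1/T) * sum_t sum_n R_n(t)."""
    return sum(sum(row) for row in R) / len(R)


def peak_load_reward(L):
    """-max_n (1/T) * sum_t L_n(t), i.e., the negated time-averaged peak cell load."""
    T, N = len(L), len(L[0])
    return -max(sum(L[t][n] for t in range(T)) / T for n in range(N))


def mixed_scalar_reward(R, L, w1, w2, w3=0.0):
    """w1 * throughput - w2 * peak load - w3 * spread of the load samples.

    peak_load_reward is already negative, so it is added with weight w2.
    """
    spread = pstdev([x for row in L for x in row])
    return w1 * throughput_reward(R) + w2 * peak_load_reward(L) - w3 * spread
```

The mixed reward collapses both objectives into one number, so very different throughput and load combinations can receive the same value; this is the ambiguity that the vector reward avoids by keeping the entries separate.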
5.2 Comparison to Static Approach
To make the evaluation self-contained, we add static optimization-based brute-force algorithms to the comparison; the details of these algorithms can be found in the Appendix (see Algorithms 1 and 2). The resulting CDF curves of the network throughput are presented in Fig. 7. Regarding the labels in this figure, “static optimization 1” stands for Algorithm 2 and “static optimization 2” stands for Algorithm 1, where for these two algorithms the antenna tilt angles are only allowed to change every 2 h in terms of the mobility model, while “static optimization 3” stands for Algorithm 2 using the best cross-validated $w$, with antenna tilt angles that can be changed every 15 min. We can observe that the CDF curve of our introduced RL method is very similar to that of static optimization 3. Here, the CDF curve of the RL algorithm is taken after the RL algorithm has converged (convergence is considered to be reached when the number of iterations exceeds 10,000, as shown in Fig. 8). Figure 9 shows that the RL approach achieves a balanced cell load similar to that of the ideal static case, static optimization 3. The cell loads obtained from the other static optimization approaches have high variance, and thus the corresponding throughput performance deteriorates. Moreover, it is important to note that the solvers we use for the static optimization problem are still sub-optimal. The computational complexity of finding the true global optimum of the static optimization problem is prohibitively high. Overall, we believe the RL approach achieves good performance on cell load balancing as well as throughput maximization.
Fig. 7 Throughput CDF of static optimization and vector reward-based RL algorithm
Fig. 8 The throughput changing over time for static optimization and vector reward-based RL algorithm
Fig. 9 Cell load distribution of static optimization and vector reward-based RL algorithm
Fig. 10 The time dynamic of RL convergence in SLAW model, where the metric for the upper one is the network peak load and the lower one is the throughput
5.3 Mobility Models
To fully evaluate our introduced RL algorithm, we test it using another mobility model, the self-similar least-action walk (SLAW) model [40], which assumes that user mobility is based on clusters (users only move among these clusters). Figure 10 shows the convergence of the vector reward-based RL algorithm under the SLAW model. Consistent with our previous findings, using a vector reward requires less parameter retuning across different mobility models. This is because the vector-formed reward conveys the objective features to the RL value network in a more efficient way. Overall, we conclude that our introduced vector reward-based RL algorithm is a robust learning approach with fewer cross-validation requirements.
6 Conclusion In this chapter, we introduced a vector reward-based multiple-objective reinforcement learning algorithm. It is utilized to jointly optimize the cell load balance and throughput in mobility management tasks. We choose massive MIMO antenna tilt and handover CIO adjustment as the RL action space. In addition, the RL reward is designed as a vector, where each entry represents a feature of the joint optimization
objective. Accordingly, to promote Pareto optimality, the introduced RL agent is configured with multiple value networks along with a single policy network. The weighting parameters on the multiple objectives are embedded into the value networks, which are used afterward to guide the learning of the policy network. Moreover, we developed a static formulation for the same joint optimization problem and compared it to our introduced RL algorithm. The algorithm evaluations are presented in different ways, including the cell load distribution, the throughput distribution, and the learning curves of the RL algorithms. For future work, we can consider asynchronous action-based multiple-objective learning. We can also include other handover protocols in the framework, such as A2- and A5-event-based inter-frequency handover. Moreover, other network features can be incorporated to test the performance of using high-dimensional vector rewards.
A Comparison: A Static Formulation
We now consider a static formulation of the joint optimization of load balancing and throughput maximization. To do so, we directly drop the expectation and the time constraint in (11) and set $\gamma = 0$. This yields the following formulation:

$$
\begin{aligned}
\max_{I(t),\, b(t)} \quad & \sum_{n} R_n(t) + \sum_{n} F_n(t) \\
\text{s.t.} \quad & I(t) = \{I_{n,k}(t) : I_{n,k}(t) \in \{0, 1\},\ n \in N,\ k \in K\}, \\
& b(t) = \{b_n(t) : b_n(t) \in \{\theta_0, \theta_1, \cdots, \theta_{M-1}\},\ n \in N\}.
\end{aligned}
\tag{20}
$$

Comparing the above formulation to (11), we can notice the following differences and practical limitations:
• Solving this static problem requires perfect knowledge of all $p_{n,k}(b_n(t))$ at every sample time, which imposes a large overhead on the user feedback.
• Due to the integer constraints, the complexity becomes very high. Moreover, the BSs are required to solve for $I(t)$ and $b(t)$ in every time slot.
• The formulation treats the user association as a variable. However, the user association cannot be directly translated into the values of the CIOs in A3 events. Therefore, it is not compatible with the handover operations in current cellular systems.
Therefore, we only consider the above formulation as an evaluation approach for our RL algorithm in the simulation. Ideally, the static approach can yield the optimal solution at every time. Thus, it can serve as an upper bound for the RL algorithm.
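To give a sense of why the integer constraints make the static problem expensive, the following sketch counts the candidate solutions for the simulation setup of Sect. 5 (4 BSs, 11 tilt angles per BS, 80 users). It assumes that every user is associated with exactly one BS, which is an illustrative simplification; the function name is hypothetical.

```python
def search_space_size(num_bs, num_tilts, num_users):
    """Return (number of tilt combinations, number of user association combinations)."""
    tilt_combinations = num_tilts ** num_bs          # M^N choices of b(t)
    association_combinations = num_bs ** num_users   # N^K choices of I(t), one BS per user
    return tilt_combinations, association_combinations


tilts, associations = search_space_size(num_bs=4, num_tilts=11, num_users=80)
print(tilts)                      # 11^4 tilt combinations
print(f"{associations:.3e}")      # 4^80 association combinations
```

The tilt combinations alone can be enumerated, but the number of associations per tilt combination is astronomically large, which motivates the heuristic association rules used by the brute-force solvers.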
A.1 Heuristic Brute-Force Solvers
Note that (20) can be equivalently written as

$$
\begin{aligned}
\max_{I(t),\, b(t)} \quad & \sum_{n} F_n(I(t), b(t)) \\
\text{s.t.} \quad & I(t) = \{I_{n,k}(t) : I_{n,k}(t) \in \{0, 1\},\ n \in N,\ k \in K\}, \\
& b(t) = \{b_n(t) : b_n(t) \in \{\theta_0, \theta_1, \cdots, \theta_{M-1}\},\ n \in N\}, \\
& R_n(I(t), b(t)) > \phi,\ n \in N,
\end{aligned}
\tag{21}
$$

where $\phi$ is a parameter to avoid trivial solutions, such as all users being disconnected from the BSs (in this case, all cell loads are zero). Since we do not have an analytical expression of the objective, brute-force search is considered as our primary approach. However, the number of user association combinations is prohibitively large. To narrow down the search region, we consider a heuristic approach to approximate

$$
\max_{I(t)} \ \sum_{n} F_n(I(t), b(t))
\tag{22}
$$

for a given $b(t)$. We determine the user association in a round-robin manner: each BS is allocated an equal number of associated users, where the users associated with each BS are chosen based on the ranking of user rates. Intuitively, this can be considered as a heuristic way to average the throughput $R_n(I(t), b(t))$ over all cells. Accordingly, the algorithm is summarized in Algorithm 1.
Algorithm 1 Relaxed brute force for a small $\lambda$
Require: $B(t)$ (the set of BSs' tilt angle combinations)
Ensure: $I(t)$ and $b(t)$
1: Initialization: $O = \emptyset$, $I(t) = \emptyset$
2: while $b(t) \in B(t)$ do
3:   while $\exists\, I_{n,k}(t) == \emptyset$ do
4:     while $n \in N$ do
5:       $I_{n,k}(t) = 1$ and $I_{n',k}(t) = 0$, where $k = \arg\max \{r_{n,k}(t) : I_{n,k}(t) = \emptyset\}$ and $n' \in N \setminus \{n\}$
6:     end while
7:   end while
8: end while
9: $I(t), b(t) = \arg\min O$
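The round-robin association step of Algorithm 1 (lines 3 to 7) can be sketched as follows for a single tilt combination. The sketch assumes that `rates[n][k]` holds $r_{n,k}(t)$ and represents the association as a list mapping each user to a BS index; this data layout and the function name are assumptions for illustration, and the outer search over $B(t)$ is omitted.

```python
def round_robin_association(rates):
    """Assign users to BSs in turn; each BS picks its best-rate unassigned user.

    rates[n][k] is the rate of user k when served by BS n.
    Returns assoc, where assoc[k] is the index of the BS serving user k.
    """
    num_bs, num_users = len(rates), len(rates[0])
    assoc = [None] * num_users
    while any(a is None for a in assoc):
        for n in range(num_bs):
            free = [k for k in range(num_users) if assoc[k] is None]
            if not free:
                break
            # BS n takes the unassigned user with the highest rate towards BS n.
            k_best = max(free, key=lambda k: rates[n][k])
            assoc[k_best] = n
    return assoc
```

Because the BSs pick in turn, the numbers of users per BS differ by at most one, which is what keeps the cell loads close to each other.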
Algorithm 2 Relaxed brute force for a fair $\lambda$
Require: The set of BSs' tilt angle combinations $B(t)$; a threshold for the association policy decision, $w = \frac{\lambda}{1+\lambda}$
Ensure: $I(t)$ and $b(t)$
1: Initialization: $O = \emptyset$, $I(t) = \emptyset$
2: while $b(t) \in B(t)$ do
3:   while $\exists\, I_{n,k}(t) == \emptyset$ do
4:     Draw a random number $c$ from [0, 1]
5:     if $c < w$ then
6:       while $n \in N$ do
7:         $I_{n,k}(t) = 1$ and $I_{n',k}(t) = 0$, where $k = \arg\max \{r_{n,k}(t) : I_{n,k}(t) = \emptyset\}$ and $n' \in N \setminus \{n\}$
8:       end while
9:     else
10:      Randomly choose $k \in \{k : I_{n,k} = \emptyset\}$
11:      $I_{n,k}(t) = 1$ and $I_{n',k}(t) = 0$, where $n = \arg\max_n p_{n,k}(b_n(t))$ and $n' \in N \setminus \{n\}$
12:    end if
13:  end while
14:  $O = O \cup U_n(I(t), b(t))$
15: end while
Alternatively, the inner loop for the user association assignment can be considered as solving

$$
\max_{I(t)} \ \sum_{n} R_n(I(t), b(t)).
\tag{23}
$$

A heuristic approach is to assign each user to the BS with the maximum transmission power. This association strategy may break the cell load balance but avoids failed user links. Therefore, we can combine the previous two heuristic association strategies and obtain Algorithm 2. In particular, the two user association strategies are mixed through a random binary decision, where the decision threshold is proportional to the weight ratio between the two objectives, i.e., $\frac{\lambda}{1+\lambda}$.
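The random mixing of the two association strategies in Algorithm 2 can be sketched as below for a single tilt combination. It assumes that `rates[n][k]` holds $r_{n,k}(t)$ and `powers[n][k]` holds $p_{n,k}(b_n(t))$; the data layout, the function name, and the use of Python's `random` module are assumptions, and the outer search over $B(t)$ and the objective bookkeeping in $O$ are omitted.

```python
import random


def mixed_association(rates, powers, lam, rng=random):
    """Mix round-robin and strongest-BS association with threshold w = lam / (1 + lam).

    Returns assoc, where assoc[k] is the index of the BS serving user k.
    """
    num_bs, num_users = len(rates), len(rates[0])
    w = lam / (1.0 + lam)
    assoc = [None] * num_users
    while any(a is None for a in assoc):
        if rng.random() < w:
            # Round-robin pass: every BS takes its best-rate unassigned user.
            for n in range(num_bs):
                free = [k for k in range(num_users) if assoc[k] is None]
                if not free:
                    break
                k_best = max(free, key=lambda k: rates[n][k])
                assoc[k_best] = n
        else:
            # Pick a random unassigned user and attach it to its strongest BS.
            free = [k for k in range(num_users) if assoc[k] is None]
            k = rng.choice(free)
            assoc[k] = max(range(num_bs), key=lambda n: powers[n][k])
    return assoc
```

With `lam = 0` the threshold is zero and every user is attached to its strongest BS, while a large `lam` makes almost every step a round-robin pass that favors load balance.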
References
1. Shafin, R., Liu, L., Chandrasekhar, V., Chen, H., Reed, J., Zhang, J.C.: Artificial intelligence-enabled cellular networks: A critical path to beyond-5G and 6G. IEEE Wireless Communications 27(2), 212–217 (2020)
2. Klaine, P.V., Imran, M.A., Onireti, O., Souza, R.D.: A survey of machine learning techniques applied to self-organizing cellular networks. IEEE Communications Surveys & Tutorials 19(4), 2392–2431 (2017)
3. Mao, Y., You, C., Zhang, J., Huang, K., Letaief, K.B.: A survey on mobile edge computing: The communication perspective. IEEE Communications Surveys & Tutorials 19(4), 2322–2358 (2017)
4. Nam, Y.-H., Ng, B.L., Sayana, K., Li, Y., Zhang, J., Kim, Y., Lee, J.: Full-dimension MIMO (FD-MIMO) for next generation cellular technology. IEEE Communications Magazine 51(6), 172–179 (2013)
5. Zhang, H., Liu, N., Chu, X., Long, K., Aghvami, A., Leung, V.C.M.: Network slicing based 5G and future mobile networks: Mobility, resource management, and challenges. IEEE Communications Magazine 55(8), 138–145 (2017)
6. Imran, A., Zoha, A., Abu-Dayya, A.: Challenges in 5G: how to empower SON with big data for enabling 5G. IEEE Network 28(6), 27–33 (2014)
7. Ruiz-Aviles, J.M., Toril, M., Luna-Ramírez, S., Buenestado, V., Regueira, M.: Analysis of limitations of mobility load balancing in a live LTE system. IEEE Wireless Communications Letters 4(4), 417–420 (2015)
8. Andrews, J.G., Singh, S., Ye, Q., Lin, X., Dhillon, H.S.: An overview of load balancing in HetNets: Old myths and open problems. IEEE Wireless Communications 21(2), 18–25 (2014)
9. Ye, Q., Rong, B., Chen, Y., Al-Shalash, M., Caramanis, C., Andrews, J.G.: User association for load balancing in heterogeneous cellular networks. IEEE Transactions on Wireless Communications 12(6), 2706–2716 (2013)
10. Damnjanovic, A., Montojo, J., Wei, Y., Ji, T., Luo, T., Vajapeyam, M., Yoo, T., Song, O., Malladi, D.: A survey on 3GPP heterogeneous networks. IEEE Wireless Communications 18(3), 10–21 (2011)
11. Lopez-Perez, D., Guvenc, I., Chu, X.: Mobility management challenges in 3GPP heterogeneous networks. IEEE Communications Magazine 50(12), 70–78 (2012)
12. Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., et al.: Mastering Atari, Go, chess and shogi by planning with a learned model. arXiv preprint arXiv:1911.08265 (2019)
13. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.: Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)
14. Vinyals, O., Babuschkin, I., Czarnecki, W.M., Mathieu, M., Dudzik, A., Chung, J., Choi, D.H., Powell, R., Ewalds, T., Georgiev, P., et al.: Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782), 350–354 (2019)
15. Zhou, Z., Li, X., Zare, R.N.: Optimizing chemical reactions with deep reinforcement learning. ACS Central Science 3(12), 1337–1344 (2017)
16. Mnih, V., Heess, N., Graves, A., et al.: Recurrent models of visual attention. In: Advances in Neural Information Processing Systems, pp. 2204–2212 (2014)
17. 3GPP TS 38.331: NR; Radio resource control (RRC); Protocol specification
18. 3GPP TS 38.213: NR; Physical layer procedures for control
19. Bethanabhotla, D., Bursalioglu, O.Y., Papadopoulos, H.C., Caire, G.: User association and load balancing for cellular massive MIMO. In: 2014 Information Theory and Applications Workshop (ITA), pp. 1–10 (2014). IEEE
20. Razaviyayn, M., Hong, M., Luo, Z.-Q.: Linear transceiver design for a MIMO interfering broadcast channel achieving max–min fairness. Signal Processing 93(12), 3327–3340 (2013)
21. Singh, S., Dhillon, H.S., Andrews, J.G.: Offloading in heterogeneous networks: Modeling, analysis, and design insights. IEEE Transactions on Wireless Communications 12(5), 2484–2497 (2013)
22. Hasan, M.M., Kwon, S., Na, J.: Adaptive mobility load balancing algorithm for LTE small-cell networks. IEEE Transactions on Wireless Communications 17(4), 2205–2217 (2018)
23. Wang, H., Ding, L., Wu, P., Pan, Z., Liu, N., You, X.: Dynamic load balancing and throughput optimization in 3GPP LTE networks. In: Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, pp. 939–943 (2010)
24. Son, K., Chong, S., De Veciana, G.: Dynamic association for load balancing and interference avoidance in multi-cell networks. IEEE Transactions on Wireless Communications 8(7), 3566–3576 (2009)
25. Ao, W.C., Psounis, K.: Approximation algorithms for online user association in multi-tier multi-cell mobile networks. IEEE/ACM Transactions on Networking 25(4), 2361–2374 (2017)
26. Mwanje, S.S., Mitschele-Thiel, A.: A Q-learning strategy for LTE mobility load balancing. In: 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), pp. 2154–2158 (2013). IEEE
27. Mwanje, S.S., Schmelz, L.C., Mitschele-Thiel, A.: Cognitive cellular networks: A Q-learning framework for self-organizing networks. IEEE Transactions on Network and Service Management 13(1), 85–98 (2016)
28. Kudo, T., Ohtsuki, T.: Q-learning based cell selection for UE outage reduction in heterogeneous networks. In: 2014 IEEE 80th Vehicular Technology Conference (VTC2014-Fall), pp. 1–5 (2014). IEEE
29. Xu, Y., Xu, W., Wang, Z., Lin, J., Cui, S.: Deep reinforcement learning based mobility load balancing under multiple behavior policies. In: ICC 2019–2019 IEEE International Conference on Communications (ICC), pp. 1–6 (2019). IEEE
30. Xu, Y., Xu, W., Wang, Z., Lin, J., Cui, S.: Load balancing for ultradense networks: A deep reinforcement learning-based approach. IEEE Internet of Things Journal 6(6), 9399–9412 (2019)
31. Wang, Z., Li, L., Xu, Y., Tian, H., Cui, S.: Handover control in wireless systems via asynchronous multiuser deep reinforcement learning. IEEE Internet of Things Journal 5(6), 4296–4307 (2018)
32. Zappone, A., Sanguinetti, L., Debbah, M.: User association and load balancing for massive MIMO through deep learning. In: 2018 52nd Asilomar Conference on Signals, Systems, and Computers, pp. 1262–1266 (2018). IEEE
33. Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
34. Van Moffaert, K., Nowé, A.: Multi-objective reinforcement learning using sets of Pareto dominating policies. The Journal of Machine Learning Research 15(1), 3483–3512 (2014)
35. Miettinen, K., Mäkelä, M.M.: On scalarizing functions in multiobjective optimization. OR Spectrum 24(2), 193–213 (2002)
36. Van Seijen, H., Fatemi, M., Romoff, J., Laroche, R., Barnes, T., Tsang, J.: Hybrid reward architecture for reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 5392–5402 (2017)
37. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
38. Sutton, R.S.: Learning to predict by the methods of temporal differences. Machine Learning 3(1), 9–44 (1988)
39. Munos, R., Stepleton, T., Harutyunyan, A., Bellemare, M.: Safe and efficient off-policy reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 1054–1062 (2016)
40. Lee, K., Hong, S., Kim, S.J., Rhee, I., Chong, S.: SLAW: A new mobility model for human walks. In: IEEE INFOCOM 2009, pp. 855–863 (2009)
Ultra-Reliable and Low-Latency Communications in 6G: Challenges, Solutions, and Future Directions Changyang She and Yonghui Li
This chapter is a part of [1] and [2].
C. She · Y. Li (✉), School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW, Australia, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_24

1 Introduction
One of the most challenging objectives in beyond-5th-generation (5G), or so-called 6th-generation (6G), networks is to achieve ultra-reliable and low-latency communications (URLLC), which are the foundation for enabling many mission-critical applications with stringent requirements on end-to-end (E2E) delay and reliability [3]. For example, autonomous vehicles, factory automation, the Tactile Internet (TI), and virtual/augmented reality (VR/AR) require a delay bound of $1 \sim 10$ ms and a packet loss probability of $10^{-5} \sim 10^{-7}$ [4]. Although the sum of the uplink and downlink transmission delays can be reduced to 1 ms in the coming 5G New Radio, the randomness in wireless networks, such as dynamic traffic loads and uncertain channel conditions, will result in serious network congestion. As a result, the E2E delay is much longer than the transmission delays in the air interface. Given this fact, it is natural to raise the following question:
Question 1: Is 5G Ready for URLLC Applications in Various Vertical Industries?
In addition to URLLC, applications in different vertical industries have their unique key performance indicators (KPIs), which cannot be fulfilled by 5G. The KPIs may include jitter in teleoperation, global connectivity for URLLC, high data rate URLLC, massive connection URLLC, high-mobility URLLC, and round-trip delay requirements. However, these KPIs have not been taken into account in URLLC in 5G New Radio, which considers three application scenarios separately, i.e., enhanced mobile broadband, massive machine-type communications, and URLLC. 6G mobile networks aim to achieve a packet loss probability of $10^{-9}$ and a delay of 0.1 ms, and to ensure extreme URLLC together with the requirements of high data rate, massive access, and high mobility. Therefore, we need to address the following question in 6G.
Question 2: What Are the Promising Technologies That Can Fulfill the Unique Requirements of URLLC Applications?
To this end, we introduce some technologies that have the potential to meet the specific requirements of typical URLLC applications. As shown in Table 1, implementing these technologies in URLLC applications is not straightforward. For instance, millimeter wave (mmWave) and terahertz (THz) communications can be deployed to provide larger data rates for VR/AR applications, but the signal strength of the mmWave and THz bands decreases considerably if there is no line-of-sight (LoS) path between the transmitter and the receiver. To support URLLC applications with mmWave and THz bands, the service interruption caused by blockage between the transmitter and the receiver should be avoided. For some other applications and technologies, as will be introduced in Sect. 2, some issues also remain open. To handle these issues, we need novel methodologies. As shown in [8], a promising approach to improving the KPIs of communication systems is to design them in an anticipatory manner. The basic idea is to predict the behavior of users and the status of the networks and to exploit the predicted information to make decisions accordingly. For example, in a teleoperation system with a communication delay of 50 ms, the observed location of the slave device at the controller side is delayed.
If the controller predicts the movement of the slave over a prediction horizon of 50 ms, then it can estimate the current location of the slave device and generate haptic feedback with zero latency [5]. Nevertheless, prediction errors have significant impacts on the KPIs of URLLC applications and hence introduce new research challenges. Inspired by [5, 6, 8], we further investigate the following question.
Question 3: What Are the Promising Methodologies in Communication System Design and Network Management?
The successful development of all the previous generations of mobile networks is mainly based on model-based methods, which are very useful in performance analysis and network optimization [9]. However, model-based methods alone cannot address the challenges in 6G networks. To obtain tractable results, some ideal assumptions and simplifications are inevitable in model-based methods. Thus, the obtained solutions cannot satisfy the quality-of-service (QoS) requirements in real-world networks. Moreover, the optimization problems for resource allocation, scheduler design, and network management are either nondeterministic or nonconvex. To optimize the related policies according to dynamic parameters in wireless networks, the system needs to execute searching algorithms every few milliseconds. This brings high computing overheads and long computing delays.
With recent advances in data-driven deep learning, it is possible to learn a wide range of policies for wireless networks [10]. However, data-driven deep learning methods need a long training phase and a large number of training samples. To evaluate whether a policy satisfies the reliability requirement (e.g., a packet loss probability of $10^{-5} \sim 10^{-7}$), a device needs to send more than $10^5$ packets in the real-world network. It takes a very long time to obtain enough training samples, and doing so will not be possible if the packet arrival rate of a device is low. To apply deep learning in URLLC, well-established models and theoretical formulas in communications and networking are helpful [11]. Merging model-based and data-driven methods is a promising approach in 6G networks.
In this chapter, we take the TI as an example application to illustrate the KPI requirements, promising technologies, open issues, and future directions. Unlike the existing Internet and communication systems, the TI aims to support haptic media, including tactile (sense of touch) and kinesthetic (muscle movement) interactions, and hence has the potential to revolutionize users' experience [12]. On the one hand, the TI will enable a variety of human-type communications, such as teleoperation, online gaming, and the Internet of Skills [5]. All these applications have stringent requirements on round-trip delay because haptic feedback is much more sensitive to latency than the auditory and visual senses. On the other hand, the TI is expected to provide reliable connectivity for mission-critical machine-type communications, such as industrial automation, automatic guided vehicles, and the Internet of drones [12]. For these applications, packet losses in communication systems may lead to severe accidents, and hence reliability is crucial.

Table 1 Requirements, technologies, and challenges of URLLC in 6G [5–7]
| Requirements | Typical applications | Promising technologies | Research challenges |
| --- | --- | --- | --- |
| Global connectivity | Internet of Skills | Satellite communications | Long propagation delay |
| High data rate; massive connections | VR/AR-enabled TI; factory automation | mmWave and THz bands, location-aware services | Control-plane overheads, positioning accuracy |
| High mobility | Aerial and terrestrial vehicular networks | Orthogonal time frequency space | Immature and only for physical layer |
| Round-trip delay | Control and automation systems | Edge intelligence | Partial and delayed observations |
| Jitter | Multimodal teleoperation | Intelligent network functions | No quality of service guarantee |
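The sample-size issue raised above can be made concrete with a short calculation. The sketch below computes how many packets must be sent so that at least one loss is observed with a given confidence, assuming independent packet losses; the 95% confidence criterion, the example arrival rate, and the function names are illustrative assumptions rather than values from the chapter.

```python
import math


def packets_needed(loss_prob, confidence=0.95):
    """Smallest n with 1 - (1 - loss_prob)^n >= confidence (independent losses)."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-loss_prob))


def observation_time_s(loss_prob, arrival_rate_pps, confidence=0.95):
    """Time in seconds needed to send that many packets at a given arrival rate."""
    return packets_needed(loss_prob, confidence) / arrival_rate_pps


for p in (1e-5, 1e-6, 1e-7):
    print(p, packets_needed(p), observation_time_s(p, arrival_rate_pps=100.0))
```

For small loss probabilities the count is roughly 3 divided by the loss probability, so each tenfold tightening of the reliability target multiplies the required observation time by ten.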
2 Unique Requirements and Promising Technologies In this section, we review the unique requirements of different TI applications and introduce promising technologies to fulfill the requirements. Furthermore, we discuss open problems of these technologies and summarize new research challenges when applying them in TI.
2.1 Global Connectivity for URLLC
2.1.1 Applications
Internet of Skills
The vision of the Internet of Skills is to reliably deliver physical skill sets across the world, enabled by ultra-low latency, AI, and robotics. To allow global skill set delivery, global connectivity is crucial. The applications of the Internet of Skills include assisting remote technical experts in conducting emergency repairs in factory automation, diagnosing medical conditions of a remote patient, and providing instructions from a remote educator. As such, the skill sets of experts like technicians, doctors, and educators can be delivered globally. However, supporting URLLC in long-distance communications is very challenging, because the propagation delay is a big hurdle. If the communication distance is ten thousand kilometers, the propagation delays of wireless communications and fiber communications are about 33 ms and 50 ms, respectively [13]. Thus, wireless communications can outperform wired communications in terms of propagation delay. Nevertheless, 5G terrestrial networks can hardly provide global connectivity for URLLC, and the propagation delay remains one of the bottlenecks.
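The two propagation delays quoted above follow directly from the distance and the propagation speed. The sketch below reproduces them, assuming free-space propagation at the speed of light for the wireless link and a refractive index of about 1.5 for optical fiber; the refractive index and the function name are assumptions for illustration.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def propagation_delay_ms(distance_km, refractive_index=1.0):
    """One-way propagation delay in milliseconds over a given distance."""
    return distance_km * 1e3 * refractive_index / C_VACUUM * 1e3


wireless_ms = propagation_delay_ms(10_000)                     # about 33 ms
fiber_ms = propagation_delay_ms(10_000, refractive_index=1.5)  # about 50 ms
print(round(wireless_ms, 1), round(fiber_ms, 1))
```

Either value alone already exceeds a 1 to 10 ms delay bound, which is why propagation delay is a bottleneck that lower-layer improvements cannot remove.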
2.1.2 Technology
Low Earth Orbit (LEO) Satellite Communications
Integrating satellite communications into 5G systems, i.e., 5G non-terrestrial networks, is considered a promising approach to global connectivity. Compared with geostationary earth orbit and medium earth orbit satellites, LEO satellites at altitudes of 200–2,000 km achieve lower propagation delays, but these delays are still unsatisfactory for URLLC applications [14]. In addition, LEO satellites travel around the earth at high velocity. Thus, high Doppler shifts, handovers among different satellites, and routing via inter-satellite links will result in delay violations and packet losses if not properly handled.
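To give a feeling for the orders of magnitude involved, the sketch below computes the shortest possible one-way delay to a LEO satellite (satellite directly overhead), its speed on a circular orbit, and a loose upper bound on the Doppler shift. The circular-orbit model, the physical constants, and the 2 GHz carrier used in the example are our assumptions for illustration and are not taken from [14].

```python
import math

C_M_S = 299_792_458.0      # speed of light in vacuum
MU_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2
R_EARTH_M = 6.371e6        # mean Earth radius, m


def min_one_way_delay_ms(altitude_km: float) -> float:
    """Delay when the satellite is directly overhead (shortest slant range)."""
    return altitude_km * 1e3 / C_M_S * 1e3


def orbital_speed_m_s(altitude_km: float) -> float:
    """Speed on a circular orbit: v = sqrt(mu / (R + h))."""
    return math.sqrt(MU_EARTH / (R_EARTH_M + altitude_km * 1e3))


def doppler_upper_bound_hz(altitude_km: float, carrier_hz: float) -> float:
    """Loose bound v / c * f_c; the real maximum is smaller because the
    satellite velocity is never fully aligned with the line of sight."""
    return orbital_speed_m_s(altitude_km) / C_M_S * carrier_hz


if __name__ == "__main__":
    for h_km in (200, 600, 2000):
        print(h_km, "km:",
              f"{min_one_way_delay_ms(h_km):.2f} ms,",
              f"{orbital_speed_m_s(h_km) / 1e3:.2f} km/s,",
              f"{doppler_upper_bound_hz(h_km, 2e9) / 1e3:.1f} kHz at 2 GHz (assumed carrier)")
```

A ground-to-ground link needs at least an uplink and a downlink hop, so the end-to-end propagation delay is at least twice the value printed here, before any inter-satellite routing is added.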
2.2 High Data Rate or Massive Connections for URLLC
2.2.1 Applications
VR/AR-Enabled TI
To provide an immersive experience in highly interactive online education, video games, and remote healthcare, VR/AR services are expected to be integrated into TI applications. Since the data rate of a VR/AR service is extremely high, e.g., a 4K or 8K $360^\circ \times 180^\circ$ video stream, future wireless networks should achieve high data rates and URLLC at the same time. In 5G, however, high data rate services and URLLC are treated as two separate application scenarios. Thus, novel technologies and design methodologies that can support VR/AR-enabled TI applications are needed.
2.2.2 Applications
Massive Connections in Factory Automation
The TI will also play a critical role in the fourth industrial revolution, i.e., Industry 4.0. It has the potential to enable timely and reliable status updates among massive numbers of sensors and actuators. The licensed bandwidth of 5G is limited compared with the density of devices in factory automation. To address this challenge, the governments of several countries plan to assign dedicated local spectrum to industries.
2.2.3 Technology
Above 6 GHz Band Communications
To provide sufficient bandwidth for the emerging TI applications, a potential solution is to use the mmWave band (100–300 GHz) and the THz band (300–3000 GHz) [15]. One of the major issues of high-frequency bands is that they are susceptible to blockage, molecular absorption, and rain attenuation. These uncertain factors in dynamic communication environments deteriorate the reliability and coverage of wireless links. A line-of-sight (LoS) path is critical for ensuring high reliability in THz communications [15]. Serving each device with multiple access points, i.e., multi-connectivity, raises the probability that at least one LoS path exists and thus improves reliability and coverage, but frequent handovers and control signaling lead to extra overheads.
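The benefit of multi-connectivity can be illustrated with a simple calculation. The sketch below assumes that each access point is blocked independently with the same probability; this independence is an idealization of ours (nearby links are often blocked by the same obstacle), and the function names and the numbers in the example are hypothetical.

```python
def prob_at_least_one_los(p_block: float, num_aps: int) -> float:
    """Probability that at least one of num_aps links is line-of-sight,
    assuming independent blockage with probability p_block per link."""
    return 1.0 - p_block ** num_aps


def min_aps_for_target(p_block: float, target_outage: float) -> int:
    """Smallest number of access points with all-blocked probability <= target."""
    if not 0.0 < p_block < 1.0 or not 0.0 < target_outage < 1.0:
        raise ValueError("probabilities must lie strictly between 0 and 1")
    num_aps = 1
    while p_block ** num_aps > target_outage:
        num_aps += 1
    return num_aps


if __name__ == "__main__":
    p_block = 0.2  # hypothetical per-link blockage probability
    print(prob_at_least_one_los(p_block, 2))
    print(min_aps_for_target(p_block, 1e-5))
```

Under this idealized model the number of access points grows only logarithmically with the reliability target, but every additional link adds the handover and signaling overheads mentioned above.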
2.2.4 Technology
Location-Aware Services
With shorter wavelengths and larger bandwidths, mmWave and THz signals can provide higher localization resolution, bringing new opportunities for accurate wireless localization. To further increase the number of connections in the above-6 GHz bands, location-aware communication services can be used, such as location-based beamforming, resource allocation, and mobility management. This concept can be implemented in a joint mmWave radar and communication system, which is capable of delivering data packets and sensing a large field of view of the environment simultaneously [16]. Intuitively, the performance of location-aware communication services depends strongly on localization accuracy. However, how to characterize the relationship between localization accuracy and KPIs in joint mmWave radar and communication systems deserves further investigation.
2.3 High-Mobility URLLC
2.3.1 Applications
Outdoor Mission-Critical Applications
Supporting mission-critical applications with high mobility (e.g., remote driving, fleet management of autonomous vehicles, and the Internet of drones) in wireless communication systems brings significant research challenges. Because the Doppler frequency shift is large and highly dynamic, it can cause severe inter-carrier interference in orthogonal frequency-division multiplexing (OFDM) systems. As such, OFDM-based 5G New Radio can hardly ensure ultra-low latency and ultra-high reliability in high-mobility scenarios. Moreover, frequent handovers lead to service outages in high-mobility scenarios. With dual connectivity in 5G, the handover interruption time can be reduced to zero, but the target base station after handover may not have enough resources to maintain the reliability and latency requirements of mission-critical applications.
2.3.2 Technology
Orthogonal Time Frequency Space (OTFS) Modulation
OTFS modulation is considered a promising physical-layer technology for high-mobility scenarios. Unlike OFDM, OTFS can make use of the Doppler shift by modulating the symbols in the delay-Doppler domain. In this way, the orthogonality of signals in the delay and Doppler domains is not affected by the Doppler shift. Thus, OTFS achieves better performance than OFDM in high-mobility scenarios [17]. It is worth noting that the development of OTFS systems is still in its infancy. Whether OTFS can ensure the other KPIs of TI applications remains unclear and deserves further study. Besides, as a physical-layer technology, OTFS is not designed to address issues in the upper layers.
2.4 Round-Trip Delay
2.4.1 Applications
Control and Automation Systems
Round-trip delay is one of the KPIs of several URLLC applications, such as closed-loop control and automation systems [12]. In closed-loop control systems, the long latency in backhaul and core networks remains the bottleneck for providing timely feedback and maintaining control stability. In automation systems, mobile devices have limited storage and computing resources and hence can hardly react to the environment in real time.
2.4.2 Technology
Edge Intelligence
As shown in [18], edge intelligence is a promising approach to enable distributed, low-latency, and reliable decision-making. To reduce the round-trip delay in a closed-loop control system, we can build a digital twin in the edge server near the controller. A digital twin is a simulation environment that mimics the behavior of the real-world system [19, 20]. Once the controller uploads its command, the edge server executes the command in the digital twin and generates a prompt response to the controller. Nevertheless, because the edge server can only monitor parts of the environment via delayed observations, the digital twin is not exactly the same as the real-world system. As a result, the server needs to make decisions with partial or outdated observations.
2.5 Jitters
2.5.1 Applications
Multimodal Teleoperation
Jitter is defined as the uncontrolled fluctuation of latency and was first considered in analog voice communications, because voice distortion is sensitive to jitter. In multimodal teleoperation, the controller sends kinetics of movement to the slave and receives visual, audio, tactile, and force feedback from the slave. If the jitter is large, the sequence of commands (or feedback) received at the slave device (or the controller) can differ from the sequence generated by the controller (or the slave device). If this happens, the behavior of the slave will differ from that of the controller, and the control system can become unstable. Therefore, jitter is a critical performance metric for teleoperation systems, but it has not been taken into account in 5G.
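The reordering effect described above can be made concrete with a toy example: commands are generated at fixed intervals, each experiences its own network delay, and the receiver sees them in order of arrival. The function names and the delay values below are hypothetical and only illustrate the mechanism.

```python
def arrival_order(send_times_ms, delays_ms):
    """Indices of packets sorted by arrival time (ties keep the send order)."""
    arrivals = [t + d for t, d in zip(send_times_ms, delays_ms)]
    return sorted(range(len(arrivals)), key=lambda i: arrivals[i])


def is_reordered(send_times_ms, delays_ms):
    """True if the receiver sees the packets in a different order than sent."""
    in_order = list(range(len(send_times_ms)))
    return arrival_order(send_times_ms, delays_ms) != in_order


if __name__ == "__main__":
    send_times = [0.0, 1.0, 2.0, 3.0]      # one command per millisecond
    small_jitter = [10.0, 10.2, 9.8, 10.1]  # fluctuation below the 1 ms spacing
    large_jitter = [10.0, 10.2, 8.5, 10.1]  # third command overtakes the second
    print(is_reordered(send_times, small_jitter))
    print(arrival_order(send_times, large_jitter))
```

In this toy model the order is preserved as long as the delay fluctuation between consecutive packets stays below the sending interval, which is why a tight jitter bound matters for commands generated every millisecond.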
2.5.2 Technology
Intelligent Network Functions in Software-Defined Networks (SDNs)
Most of the existing network functions (e.g., user association, routing, and radio resource management) are not tailored for TI applications. For instance, wireless schedulers such as proportional fair and maximum throughput are mainly developed to achieve higher throughput and cannot guarantee the latency, reliability, and jitter required for teleoperation. Because network operators need to reprogram these network functions according to the unique KPIs of specific applications, the SDN architecture has been adopted by 5G standards. However, even with programmable network functions, one can hardly obtain the optimal policies with traditional optimization algorithms. One promising approach is to use deep learning or deep reinforcement learning to modify network functions. Nevertheless, most of the existing learning algorithms do not guarantee the reliability of their outputs.
2.6 Summary of New Research Challenges
The new research challenges discussed in this section are summarized below.
• End-to-end (E2E) delay in long-distance communications is unsatisfactory for URLLC.
• Signaling overheads in mmWave and THz communications are high with multi-connectivity.
• Positioning for location-aware communication services may not be accurate enough.
• As a physical-layer technology, OTFS cannot address upper-layer issues.
• Partial or outdated observations in edge intelligence may result in poor performance in URLLC.
• How to guarantee quality of service with intelligent network functions remains unclear.
3 Methodologies in 6G
To address the research challenges in the above section, we introduce some methodologies for communication system design and network management.
3.1 Wireless AI
A policy in wireless networks can be described by a function that maps the network state to the decision on routing, scheduling, access control, resource allocation, and so on [21]. Such a function is denoted by $y = f(x)$, where $x$ and $y$ are the state and the decision, respectively. In general, the closed-form expression of $f(\cdot)$ is hard to derive, and hence the system needs to search for the optimal decision whenever the state (e.g., the channel state) varies. Searching algorithms usually have high complexity and cannot be implemented in real time.
To reduce the computing delay of executing searching algorithms, one can use a DNN, denoted by $\hat{y} = \Phi(x; \theta)$, to approximate the optimal policy [22], where $\theta$ represents the parameters of the DNN and $\hat{y}$ is the output of the DNN for a given input $x$. By training the parameters, we can obtain an accurate approximation $\Phi(x; \theta) \approx f(x)$. According to the universal approximation theorem, for a deterministic and continuous function $f(\cdot)$, the approximation error approaches zero as the scale of the DNN increases [22]. There are some critical issues when applying learning algorithms in wireless networks, especially for URLLC:
• No QoS guarantee. Unlike optimization problems, which ensure QoS requirements through explicit constraints, a DNN has no built-in constraints. Although we can design heuristic rewards and punishments that take QoS into account, whether the achieved QoS satisfies the requirements of URLLC is not clear [23].
• Scalability. In general, a communication system design or network management problem is non-convex. The complexity of the problem grows extremely fast as the scale of the system increases. Since the execution delay of a specific algorithm is not negligible for URLLC, scalability remains an open issue when applying deep learning in URLLC.
• Generalization. A DNN is usually trained offline. However, wireless networks are not stationary. As a result, the DNN may neither maximize the performance in terms of resource utilization efficiency nor guarantee the QoS of URLLC. The limited generalization ability of DNNs is a bottleneck for implementing deep learning algorithms in wireless networks.
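One common way to obtain such an approximation is supervised learning on labeled states. The formulation below is a sketch under our own assumptions: the policy is written as $f$, the DNN as $\Phi(x; \theta)$ with parameters $\theta$, the $N$ training pairs $(x_n, f(x_n))$ are assumed to come from an offline optimization algorithm, and the squared-error loss is only one possible choice.

```latex
\theta^{\star} = \arg\min_{\theta} \; \frac{1}{N} \sum_{n=1}^{N}
  \bigl\| f(x_n) - \Phi(x_n; \theta) \bigr\|_2^2 ,
\qquad
\hat{y} = \Phi(x; \theta^{\star}) \approx f(x).
```

Note that this objective contains no QoS constraint: a small average error over the training states does not by itself bound the delay or reliability achieved by $\hat{y}$ for an unseen state, which is the first issue listed above.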
3.2 Multi-Level Architecture
To address the above issues, we integrate deep learning algorithms into a multi-level architecture, as shown in Fig. 1. Specifically, with user-level intelligence and edge intelligence, we can avoid the propagation latency and routing latency in the core network. The computing and storage resources in the central cloud are mainly used for offline training, as the cloud server has diverse data from different APs. To better illustrate the multi-level architecture, we investigate mobility and traffic prediction for each MU, scheduler design at each AP, and user association in a multi-AP network.
3.2.1 Device Intelligence at User Level
With device intelligence, MUs are able to make decisions based on local predicted information, such as traffic state and mobility. Since the prediction reliability is crucial for making decisions, the prediction error probability should be extremely low.
Fig. 1 Multi-level Architecture in 6G. The figure shows device intelligence at user level (MUs with local DNNs; tasks: mobility prediction and traffic state prediction), edge intelligence at cell level (APs and MEC with edge DNNs; task: scheduler design), and cloud intelligence at network level (central cloud and core network with cloud DNNs; task: user association across low-density and high-density areas).
To analyze the prediction error probability, we can use a model-based method for prediction and derive the prediction error probability [13]. If data-driven methods outperform the model-based method, the prediction error probability achieved by the model-based method can serve as an upper bound on that of data-driven methods.
3.2.2 Edge Intelligence at Cell Level
A scheduler at an AP maps the channel state information and queue state information to the resource allocation among different MUs. With edge intelligence, DRL can be used to optimize the scheduler. The basic idea is to use two DNNs to approximate the optimal scheduler and the value function, respectively. With model-free DRL, the AP needs to evaluate the delay and reliability of a certain action by transmitting a large number of packets in the network. This leads to a long convergence time. To handle this issue, theoretical formulas can be used to evaluate the decoding error probability in the short blocklength regime and the queueing delay violation probability [24]. Besides, by exploring in a numerical environment, the exploration safety can be improved remarkably [23].
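As an example of such a theoretical formula, the sketch below evaluates the widely used normal approximation of the decoding error probability for a complex AWGN channel in the short blocklength regime. Whether [24] uses exactly this form is not stated here, so the expression, the function names, and the numbers in the example should be read as our assumptions; the higher-order logarithmic correction term of the approximation is omitted.

```python
import math


def q_function(x: float) -> float:
    """Tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def decoding_error_prob(snr_linear: float, blocklength: int, info_bits: int) -> float:
    """Normal approximation: eps ~ Q((C - R) * sqrt(n / V)) for complex AWGN."""
    capacity = math.log2(1.0 + snr_linear)  # bits per channel use
    dispersion = (1.0 - 1.0 / (1.0 + snr_linear) ** 2) * math.log2(math.e) ** 2
    rate = info_bits / blocklength          # bits per channel use
    return q_function((capacity - rate) * math.sqrt(blocklength / dispersion))


if __name__ == "__main__":
    # Hypothetical setting: 400 information bits sent over 200 channel uses.
    for snr_db in (6.0, 7.0, 8.0):
        snr = 10.0 ** (snr_db / 10.0)
        print(f"{snr_db} dB: {decoding_error_prob(snr, 200, 400):.3e}")
```

Because the formula can be evaluated instantly, a learning agent can score a candidate resource allocation without transmitting the very large number of packets that a direct measurement of a small error probability would require.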
3.2.3 Cloud Intelligence at Network Level
User association schemes depend on the large-scale channel gains from MUs to APs as well as the packet arrival rate of each MU. With cloud intelligence, a centralized control plane uses a DNN to approximate the optimal user association scheme, which maps the large-scale channel gains and the packet arrival rates of MUs to the user association decision [11]. With a strong computing capacity, the central cloud can build a numerical platform that mirrors the behavior of the whole network. On this numerical platform, the system can generate labeled training samples with optimization algorithms and then train the DNN with the optimal solutions. After the training phase, the DNN is saved at the control plane for online implementation. Considering that real networks are highly dynamic, reporting the states of all the network components to the central cloud would lead to unaffordable communication overheads. To avoid high overheads, the cloud only requires information that is static or predictable over a large area or a long prediction horizon (hours). For example, the topology of APs, backhauls, and core networks is static, and the density of MUs (or service requests) is predictable owing to spatial and temporal correlation.
3.3 Federated Learning for Scalable Training
To apply deep learning at MUs and MEC servers in the proposed architecture, two open problems must be addressed. First, the number of local training samples may not be enough to train a DNN. Second, the computing capacity of an MU or an MEC server is limited; thus, if the DNN is trained locally, the training time will be long. A straightforward approach is to train DNNs at the central cloud by collecting data from all MUs and MEC servers. However, this approach does not scale with the number of devices in the network. To enable device intelligence and edge intelligence, the edge-assisted hierarchical federated learning method proposed in [25] can be adopted in our multi-level architecture. As shown in Fig. 1, MUs upload the parameters of their local DNNs, $\theta^{\mathrm{L}}_k$, $k = 1, 2, 3, \ldots$, to MEC servers. The edge DNN at the $l$th MEC server, denoted by $\theta^{\mathrm{E}}_l$, is obtained from the weighted sum of local DNNs, $\theta^{\mathrm{E}}_l = \sum_k w^{\mathrm{L}}_k \theta^{\mathrm{L}}_k$. The definitions of the weight coefficients can be found in [25]. Then, the edge DNN is shared among all the MUs associated with this AP. Meanwhile, edge DNNs are sent to the central cloud periodically. In the central cloud, the global DNN is obtained by aggregating the parameters of the edge DNNs, i.e., $\theta^{\mathrm{G}} = \sum_l w^{\mathrm{E}}_l \theta^{\mathrm{E}}_l$. Finally, the global DNN is shared with all MUs and APs. With federated learning, the central cloud and MEC servers do not collect training samples from MUs and hence avoid the privacy issues caused by sharing data. Besides, the communication overheads for sharing DNNs are much lower than those for sharing data.
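The two aggregation steps can be sketched in a few lines. The weight coefficients are defined in [25] and are not reproduced here; as a stand-in, the sketch weights each model by its number of training samples, which is our assumption, and represents each DNN as a flat list of parameters. All function names are hypothetical.

```python
def weighted_average(param_sets, weights):
    """Element-wise weighted average of equally long parameter vectors."""
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    dim = len(param_sets[0])
    return [sum(w / total * p[i] for p, w in zip(param_sets, weights))
            for i in range(dim)]


def hierarchical_aggregate(local_params_per_edge, sample_counts_per_edge):
    """Local DNNs -> one edge DNN per MEC server -> one global DNN.

    Assumption: weights are proportional to sample counts (stand-in for [25]).
    """
    edge_models, edge_weights = [], []
    for params, counts in zip(local_params_per_edge, sample_counts_per_edge):
        edge_models.append(weighted_average(params, counts))
        edge_weights.append(sum(counts))
    global_model = weighted_average(edge_models, edge_weights)
    return edge_models, global_model


if __name__ == "__main__":
    local_params = [[[1.0, 2.0], [3.0, 4.0]],  # two MUs at MEC server 1
                    [[5.0, 6.0]]]              # one MU at MEC server 2
    sample_counts = [[1, 3], [4]]
    edges, global_dnn = hierarchical_aggregate(local_params, sample_counts)
    print(edges, global_dnn)
```

Only parameter vectors travel over the network in this scheme; the raw training samples never leave the MUs.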
3.4 Meta-Learning for Generalization
The performance of deep learning algorithms is very sensitive to hyper-parameters, including the initial values of parameters (e.g., weights and biases), learning rates, and the structures of DNNs. In traditional deep learning, these hyper-parameters are obtained by trial and error, which requires considerable human effort. As a result, a well-trained neural network cannot adjust itself fast enough in dynamic wireless networks, such as vehicular networks, asset tracking, and UAV communications. Even if a pre-trained DNN can guarantee the KPIs of URLLC in one communication environment, it may not work when the communication environment changes. To handle this issue, meta-learning has been applied in existing research to optimize hyper-parameters, including initial parameters, learning rates, and the structures of neural networks [26, 27]. With the help of meta-learning, pre-trained DNNs can be adjusted to dynamic communication environments by few-shot learning.
Taking scheduler design as an example, we illustrate the difference between meta-learning and traditional deep learning in Fig. 2. The target is to find an optimal mapping from channel and queue states (e.g., head-of-line delay in time-sensitive networks) to the amount of resources allocated to different mobile devices. Such a mapping is determined by both hyper-parameters and parameters. With meta-learning, the hyper-parameters are optimized by using existing tasks and their hyper-parameters as training samples. After the training phase, meta-learning generates proper hyper-parameters for a new task without human effort. The bottleneck of meta-learning is its requirement of vast computational resources. As shown in [27], hundreds of graphics processing units (GPUs) are used to train the meta-learning algorithm for image recognition over several days.
Fig. 2 Illustration of meta-learning for scheduler design with different applications. In traditional deep learning, proper hyper-parameters (initial values of parameters, learning rates, activation functions, numbers of layers/neurons) are found by trial and error with human effort for each task (Task 1: scheduling for V2X; Task 2: scheduling for XR; Task 3: scheduling for UAV). In meta-learning, these tasks and their hyper-parameters are the training inputs and outputs; for a new task (scheduling for IIoT), meta-learning outputs the hyper-parameters without human effort.
4 Experiments and Simulation
In this section, we carry out experiments and simulations to evaluate the performance of deep learning in URLLC. As shown in Fig. 3, we carry out user-level and cell-level experiments with a real tactile device and transceivers with long-term evolution (LTE) protocols. To further evaluate the performance in the coming 5G or 6G networks, we provide network-level simulations with 5G New Radio.
Fig. 3 Experiments and simulation at different levels. Panels: user-level experiment (mobility prediction of a tactile device); cell-level experiment (scheduler design in LTE systems); network-level simulation (user association in 5G New Radio networks, with MUs and APs in Region 1 and Region 2; labeled dimensions: 400 m, 200 m, 200 m).
4.1 User-Level Experiment: Mobility Prediction
A real 3D Systems Touch tactile device is used to control a virtual robotic arm. The location given by the device is updated every time slot with a duration of 1 ms (i.e., the same as the transmission time interval in LTE systems) and is recorded for training and testing. The performance of two prediction methods is evaluated. The first one is a model-based method based on Newton's laws of motion [13]. The second method uses a fully connected DNN to predict future locations from past locations. We use a fully connected DNN because it achieves good performance for a small-scale prediction problem, so there is no need to use other types of DNNs. The inputs are the locations of the device in the past 50 ms, and the outputs are the predicted locations in the coming 20 ms. There are two hidden layers, each of which has 100 neurons. The parameters of the DNN are trained with $10^4$ training samples. After the training phase, the DNN is used to predict the mobility of the tactile device in an experiment with a duration of $2 \times 10^6$ ms. The performance of the model-based and data-driven methods for mobility prediction is shown in Table 2, where the error probability is defined as the probability that the distance between the predicted location and the actual location is larger than the required prediction accuracy. The results in Table 2 show that it is possible to achieve high prediction reliability (i.e., a prediction error probability of $10^{-5}$) with either model-based or data-driven methods, and that the accuracy achieved by the data-driven method is better than that achieved by the model-based method. This is because the model used in the model-based method is not accurate enough to achieve high prediction accuracy [13].

Table 2 Prediction error probability (requirement on prediction accuracy: 2 cm)

Prediction horizon   5 ms                    10 ms                   20 ms
Model-based          $6.61 \times 10^{-6}$   $3.85 \times 10^{-5}$   $3.20 \times 10^{-3}$
Data-driven          $> 10^{-5}$
Data-driven          $4.59 \times 10^{-6}$   $6.86 \times 10^{-6}$   $2.25 \times 10^{-5}$
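To make the model-based idea concrete, the sketch below extrapolates a one-dimensional position with constant acceleration, using finite differences of the three most recent samples, and then counts how often the prediction misses by more than the required accuracy. It is a simplified stand-in written for illustration and is not the predictor of [13]; the function names and the 1D setting are our assumptions.

```python
def predict_position(p_prev2, p_prev1, p_now, dt_ms, horizon_ms):
    """Constant-acceleration extrapolation from three equally spaced samples."""
    # Second-order one-sided difference: velocity at the newest sample.
    velocity = (3.0 * p_now - 4.0 * p_prev1 + p_prev2) / (2.0 * dt_ms)
    # Central second difference: acceleration over the last two intervals.
    acceleration = (p_now - 2.0 * p_prev1 + p_prev2) / dt_ms ** 2
    return p_now + velocity * horizon_ms + 0.5 * acceleration * horizon_ms ** 2


def error_probability(predicted, actual, tolerance):
    """Fraction of predictions that miss the actual value by more than tolerance."""
    misses = sum(1 for p, a in zip(predicted, actual) if abs(p - a) > tolerance)
    return misses / len(predicted)


if __name__ == "__main__":
    # Position follows x(t) = t^2 (arbitrary units), sampled at t = -2, -1, 0 ms.
    print(predict_position(4.0, 1.0, 0.0, dt_ms=1.0, horizon_ms=5.0))
    print(error_probability([0.0, 1.0, 2.5], [0.0, 1.5, 0.0], tolerance=1.0))
```

The extrapolation is exact only while the acceleration stays constant over the prediction horizon, which is consistent with the model-based error probability in Table 2 growing quickly as the horizon increases.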
4.2 Cell-Level Experiment: Scheduler Design
In the cell-level experiment, we apply an actor-critic DRL algorithm for scheduler design, where one AP serves two MUs [28]. The radio transceivers are Universal Software Radio Peripheral (USRP) B210 devices. The AP has an Intel i7-8700 CPU with 6 cores, and each MU is equipped with an Intel i7-6700 CPU with 4 cores. The actor has two hidden layers, each of which has 40 neurons. The critic also has two hidden layers, each of which has 60 neurons. To reduce the training time and improve exploration safety, the actor and the critic are pre-trained in a simulation environment, where a digital model that mirrors the behavior of the real-world network is used to generate feedback to the DRL algorithm. The packet size is 200 bytes, and the average packet arrival rate is 100 packets/s. The total bandwidth is 5 MHz. The DRL algorithm minimizes the overall packet loss probability subject to the requirements on delay and jitter, which are characterized by two delay bounds, $D_{\min} = 9$ ms and $D_{\max} = 11$ ms. The E2E delay should be higher than $D_{\min}$ and lower than $D_{\max}$, which means that the jitter should be less than 2 ms. To avoid long feedback delays and high jitter, retransmission is not allowed.
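The delay and jitter requirement can be written as a simple window test on the E2E delay of each packet. The sketch below uses the two bounds and the traffic parameters given above; the function names and the sample delays are hypothetical.

```python
D_MIN_MS = 9.0   # lower delay bound from the experiment
D_MAX_MS = 11.0  # upper delay bound from the experiment


def in_delay_window(e2e_delay_ms, d_min=D_MIN_MS, d_max=D_MAX_MS):
    """A packet meets the requirement only if d_min < delay < d_max."""
    return d_min < e2e_delay_ms < d_max


def delay_violation_ratio(delays_ms):
    """Fraction of packets whose E2E delay falls outside the window."""
    violations = sum(1 for d in delays_ms if not in_delay_window(d))
    return violations / len(delays_ms)


def offered_load_kbps(packet_bytes=200, packets_per_s=100):
    """Average offered load of one MU for the stated packet size and rate."""
    return packet_bytes * 8 * packets_per_s / 1e3


if __name__ == "__main__":
    sample_delays = [9.5, 10.0, 11.2, 8.9]  # hypothetical measurements in ms
    print(delay_violation_ratio(sample_delays))
    print(offered_load_kbps())
```

A packet that arrives too early counts as a violation just like one that arrives too late, which is how the two bounds limit the jitter to $D_{\max} - D_{\min} = 2$ ms.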
Table 3 Reliability achieved by DRL

                                         Delay violation         Decoding error          Overall packet loss
Evaluated in simulation environment      $2.05 \times 10^{-5}$   $2.25 \times 10^{-4}$   $2.46 \times 10^{-4}$
Pre-trained actor in real-world system   $7.15 \times 10^{-2}$   $5.73 \times 10^{-3}$   $7.74 \times 10^{-2}$
Fine-tuned actor in real-world system    $1.29 \times 10^{-2}$   $4.55 \times 10^{-3}$   $1.75 \times 10^{-2}$
The overall packet loss probabilities achieved by DRL in both the simulation and the real-world network are provided in Table 3. To evaluate the packet loss probability in the simulation, both decoding errors and delay bound violations are taken into account. In addition to these two factors, hardware impairments in the real-world network cause packet losses. The results in Table 3 show that if the actor trained in the simulation environment is directly applied in the real-world system, the achieved reliability is worse than that evaluated in the simulation. After fine-tuning, the reliability can be improved, but it is still worse than that in the simulation environment. There are two reasons: (1) The scheduler cannot control the jitter caused by signal processing and data transmission in the practical system, while in the simulation environment this part of the delay is fixed. (2) The modulation and coding scheme in LTE systems cannot achieve the minimum decoding error probability, which is computed from the normal approximation in the simulation [13]. To reduce the user-experienced delay, we can combine user-level mobility prediction with cell-level scheduler design. According to the results in Table 2, the prediction horizon can be up to 10 ms with reliability better than $10^{-5}$. The latency evaluated in the real-world network lies in $[9, 11]$ ms with a packet loss probability of around $10^{-2}$. If the AP sends the predicted locations to the MUs 10 ms in advance, the user-experienced delay will be $[-1, 1]$ ms (a negative user-experienced delay means that the user can predict the mobility of the transmitter). However, to improve the overall reliability, 5G New Radio and more advanced computing systems are needed.
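Two pieces of arithmetic in this discussion can be checked directly: how delay violations and decoding errors combine into an overall loss probability, and how prediction shifts the user-experienced delay. The sketch assumes that the two loss events are independent, which is our simplification, and the function names are hypothetical.

```python
def overall_loss(delay_violation: float, decoding_error: float) -> float:
    """Loss probability when either event loses the packet.

    Assumes independent events; close to the plain sum when both are small.
    """
    return 1.0 - (1.0 - delay_violation) * (1.0 - decoding_error)


def user_experienced_delay_ms(e2e_window_ms, prediction_horizon_ms):
    """Shift the E2E delay window by the prediction horizon."""
    low, high = e2e_window_ms
    return (low - prediction_horizon_ms, high - prediction_horizon_ms)


if __name__ == "__main__":
    # Simulation row of Table 3: delay violation and decoding error.
    print(f"{overall_loss(2.05e-5, 2.25e-4):.3e}")
    # E2E delay in [9, 11] ms with locations predicted 10 ms ahead.
    print(user_experienced_delay_ms((9.0, 11.0), 10.0))
```

For the simulation row the combined value agrees with the reported overall packet loss up to rounding of the table entries; in the real-world rows hardware impairments add further losses, so the two components alone need not reproduce the overall figure.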
4.3 Network-Level Simulation: User Association
We consider a wireless network with 5G New Radio, where two APs serve five delay-tolerant MUs and five URLLC MUs [20]. The network topology is shown in the network-level panel of Fig. 3. To reflect the impact of nonstationary hidden variables on the performance of different schemes, we change the ratio of the number of MUs in Region 1 to that in Region 2 from $5:5$ to $9:1$ after 2,000 simulation trials. Packets generated by each MU are either processed at the local server of the MU or offloaded to an MEC server. Since the batteries of MUs have limited capacities, we minimize the maximal normalized energy consumption of MUs subject to QoS constraints. The normalized energy consumption of each MU is defined as the ratio of the energy consumption to the number of bits that have been processed. To train the DNN, 8000 training samples were generated on the numerical platform with the method in [11]. The DNN includes one input layer, one output layer, and four hidden layers, each of which consists of 100 neurons. The inputs of the DNN include the large-scale channel gains and average packet arrival rates of all the MUs. The output of the DNN is the user association scheme. To illustrate the advantages of deep learning, it is compared with three baselines: (1) each MU accesses the AP with the highest large-scale channel gain (with legend 'Highest SNR'); (2) a game theory approach developed in [29] (with legend 'Game'), which is one of the most recent works on user association with hybrid services; and (3) the optimal user association scheme obtained with the exhaustive searching method (with legend 'Optimal'). Simulation results in Fig. 4 show that the performance of the deep learning approach is better than that of the two existing schemes and is close to that of the optimal scheme. When the distribution of MU locations changes, 500 new training samples are used to fine-tune all the layers of the DNN. This number is much smaller than the number of training samples needed to train a new DNN, e.g., 8000 in our simulation.
Fig. 4 Maximal normalized energy consumption with two APs. Vertical axis: normalized energy consumption (J/Mbit), from 0.2 to 0.8; horizontal axis: number of tests, from 0 to 4000; curves: Highest SNR (5:5), Highest SNR (9:1), Game (5:5), Game (9:1), Deep learning (5:5->9:1), Optimal (5:5), Optimal (9:1).
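The difference between the 'Highest SNR' baseline and the exhaustive search can be illustrated on a toy instance. The cost model below (the cost of an MU equals the number of MUs sharing its AP divided by its channel gain) is purely hypothetical and is not the energy model of this chapter or of [11]; it only captures the idea that a crowded AP leaves fewer resources per MU. All names and gain values are illustrative.

```python
from itertools import product


def max_cost(assoc, gains):
    """Largest per-MU cost under the hypothetical model load / gain."""
    num_aps = len(gains[0])
    load = [assoc.count(a) for a in range(num_aps)]
    return max(load[a] / gains[k][a] for k, a in enumerate(assoc))


def highest_gain_association(gains):
    """Baseline: every MU picks the AP with its largest channel gain."""
    return tuple(max(range(len(g)), key=lambda a: g[a]) for g in gains)


def exhaustive_association(gains):
    """Try all num_aps ** num_mus associations and keep the min-max one."""
    num_aps = len(gains[0])
    candidates = product(range(num_aps), repeat=len(gains))
    return min(candidates, key=lambda assoc: max_cost(assoc, gains))


if __name__ == "__main__":
    # Rows: MUs, columns: APs (hypothetical large-scale channel gains).
    gains = [[1.0, 0.5], [1.0, 0.6], [1.0, 0.8], [0.9, 1.0]]
    baseline = highest_gain_association(gains)
    best = exhaustive_association(gains)
    print(baseline, max_cost(baseline, gains))
    print(best, max_cost(best, gains))
```

With two APs and ten MUs, as in the simulation above, the exhaustive search already has to evaluate $2^{10} = 1024$ associations, and this number doubles with every additional MU, which is the motivation for approximating the optimal scheme with a DNN.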
5 Future Directions
5.1 From Small-Scale to Large-Scale Networks
As the numbers of devices and APs increase in 6G networks, the number of parameters in a feed-forward DNN will become large. To avoid training a large number of parameters, one can use convolutional neural networks (CNNs). Since the number of parameters of a CNN does not increase with the dimension of the input, CNNs are very convenient in image processing. However, the topology of a wireless network is much more complicated than a two-dimensional image that can be represented by a matrix. To optimize radio resource allocation according to the topology of the network, we can use graph neural networks (GNNs) [30]. For example, in URLLC, the graph representation of the network could be the interference graph, and the input features could be the packet arrival rates of MUs. Based on the graph representation and the input features, the GNN outputs the optimal repetition policy that minimizes the packet loss probability of URLLC.
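The growth of a fully connected DNN with the network size can be quantified by counting its weights and biases. In the sketch below, the input and output encodings (one gain per MU-AP pair plus one arrival rate per MU as inputs, one score per MU-AP pair as outputs) are our assumptions for illustration; only the four hidden layers of 100 neurons follow the user association DNN described in this chapter.

```python
def fnn_parameter_count(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def user_association_layers(num_mus, num_aps, hidden=(100, 100, 100, 100)):
    """Layer sizes under an assumed input/output encoding (see lead-in)."""
    n_in = num_mus * num_aps + num_mus  # channel gains plus arrival rates
    n_out = num_mus * num_aps           # one score per MU-AP pair
    return [n_in, *hidden, n_out]


if __name__ == "__main__":
    for num_mus, num_aps in ((10, 2), (100, 10), (1000, 20)):
        layers = user_association_layers(num_mus, num_aps)
        print(num_mus, num_aps, fnn_parameter_count(layers))
```

Even with the hidden width held fixed, the first and last layers grow with the product of the numbers of MUs and APs; in practice the hidden layers would likely have to grow as well, which strengthens the case for architectures whose parameter count does not depend on the network size.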
5.2 From Centralized to Distributed Learning Algorithms
With a global view of the wireless network, centralized learning algorithms can achieve better performance than distributed learning algorithms in terms of E2E delay, reliability, and resource utilization efficiency. However, centralized learning algorithms need to collect information from all the APs and MUs and thus bring high overheads in backhauls and lead to long control-plane latency. With a distributed learning algorithm, each AP or MU can make its decisions according to local information and hence can reduce the response latency. Nevertheless, with only local information, it is hard to guarantee reliability. Ensuring the QoS requirements of URLLC with distributed algorithms deserves further study.
5.3 From Wireless Networks to Interdisciplinary Research
6G networks are expected to support applications in different vertical industries, such as vehicular networks, mission-critical Internet of Things, and VR/AR applications. The QoS requirements and traffic features of these applications differ considerably. Thus, interdisciplinary research is crucial for achieving the target requirements. By formulating theoretical models of both communication systems and specific applications in vertical industries, model-based methods enable us to understand, predict, and optimize the performance of an application from an interdisciplinary perspective. Based on the model-based analysis, we can design practical solutions with data-driven deep learning methods to approach fundamental limits and handle model mismatch problems, which have a significant impact on the latency and reliability of practical systems.
6 Conclusion
In this chapter, we investigated how to apply deep learning in URLLC by taking TI as an example. We first reviewed the unique requirements of different TI applications and the promising technologies for them. From these applications and technologies, we summarized the new research challenges. To address these challenges, we introduced novel methodologies, including wireless AI, a multi-level architecture, federated learning, and meta-learning. Some of these ideas were demonstrated in the experiments and simulation section. Finally, we discussed some future directions.
Changyang She (S’12-SM’23) received his B. Eng degree in Honors College (formerly School of Advanced Engineering) from Beihang University (formerly Beijing University of Aeronautics and Astronautics, BUAA), Beijing, China, in 2012 and a PhD degree in School of Electronics and Information Engineering from BUAA in 2017. From 2017 to 2018, he was a postdoctoral research fellow at Singapore University of Technology and Design. From 2018 to 2021, he was a postdoctoral research associate at the University of Sydney. Since 2021, he has served as the Australian Research Council (ARC) Discovery Early Career Researcher Award (DECRA) Fellow at the University of Sydney. He served as a Guest Editor of IEEE Journal on Selected Areas in Communications Special Issue on Next Generation Ultra-Reliable and Low-Latency Communications (URLLC) and a Guest Editor of IEEE Wireless Communications Special Issue on Intelligent URLLC. His research interests lie in the areas of ultra-reliable and low-latency communications, tactile Internet, Industrial IoT, deep learning in 5G, and beyond, and interdisciplinary research on Metaverse.
Yonghui Li (M’04-SM’09-F’19) received his PhD degree in November 2002 from Beijing University of Aeronautics and Astronautics. Since 2003, he has been with the Centre of Excellence in Telecommunications, the University of Sydney, Australia. He is now a Professor in the School of
Electrical and Information Engineering, University of Sydney. He is the recipient of the Australian Queen Elizabeth II Fellowship in 2008 and the Australian Future Fellowship in 2012. His research interests are in the area of wireless communications, with a particular focus on MIMO, millimeter wave communications, machine-to-machine communications, coding techniques, and cooperative communications. He holds a number of patents granted and pending in these fields. He received the best paper awards from IEEE International Conference on Communications (ICC) 2014, IEEE PIMRC 2017, and IEEE Wireless Days Conferences (WD) 2014. He is a Fellow of the IEEE.
Deterministic Network Jessie Hui Wang, Yipeng Zhou, and Yuedong Xu
The Internet provides best-effort service using packet switching and does not reserve resources for data flows in advance. Therefore, a packet of a data flow has to wait before being forwarded to its next hop if other packets are already in the queue when it arrives, and the queuing latency depends on the time needed to process those packets, which is nondeterministic. In the worst case, the packet is dropped because the queue is full. Applications on the Internet have long demanded QoS (quality of service) guarantees. However, providing QoS guarantees is a nontrivial task for the Internet, and its costs, such as making Internet operation prohibitively complex and reducing the scalability of the Internet, have made it an undesirable choice. In recent years, people proposed “the industrial Internet” and started to use the Internet for industry rather than only for consumers. Traditionally, there have been many fieldbus technologies for various industrial automation and control systems (IACSs). Each fieldbus technology is usually designed for a specific application and may require specific hardware. Researchers hope that these IACSs can be migrated to packet-switched networks, which can reduce cost, solve incompatibility issues, and improve the ability to accelerate growth. Furthermore, people would like to see the convergence of information technology (IT) and operational technology (OT) networks to reduce the complexity of maintaining the
J. H. Wang, Tsinghua University, Beijing, China. Y. Zhou, Macquarie University, Sydney, NSW, Australia. Y. Xu, Fudan University, Shanghai, China. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_25
networks. Beyond industrial applications, many emerging applications, such as telesurgery and interactive virtual reality, also demand ultralow latency (ULL). These demands suggest that deterministic forwarding capability is required not only at layer 2 but also at layer 3. Therefore, the IETF Deterministic Networking (DetNet) Working Group (WG) was established, and its charter was approved in October 2015 [20]. The WG aims to address layer 3 aspects in support of applications requiring deterministic networking and to provide bounds on latency, loss, and packet delay variation (jitter), as well as high reliability, so that applications with critical timing and reliability requirements can be migrated to packet networks and their data flows can coexist with traditional, statistically multiplexed data flows. Before that, in 2012, the IEEE 802.1 Audio Video Bridging (AVB) Task Group was renamed the IEEE Time-Sensitive Networking (TSN) Task Group (TG). It is responsible for layer 2 operations of deterministic services through IEEE 802 networks [9]. The charter of the DetNet WG states that it collaborates with the TSN TG to define a common architecture for both layer 2 and layer 3.
A deterministic network refers to a network that can provide deterministic quality of service in terms of, e.g., delay, loss, and jitter. Generally, there are two interpretations of “deterministic quality of service.” Taking delay as an example, the first interpretation is to guarantee bounded delay, which means the packet must arrive before a certain time, while the second interpretation is to guarantee fixed delay, which means the packet must arrive exactly at a certain time. Obviously, the requirement specified by the first interpretation is easier to satisfy. From this point of view, DetNet is closely related to, but not exactly the same as, ULL, since ULL aims to reduce latency as much as possible to satisfy real-time applications, whereas DetNet aims to bound it.
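The difference between the bounded-delay and fixed-delay interpretations can be made concrete with a small check over measured per-packet delays. The following is only a minimal sketch: the function names, the tolerance parameter, and the sample delay values are illustrative assumptions and do not come from any DetNet specification.

```python
def meets_bounded_delay(delays_us, bound_us):
    """Bounded delay: every packet arrives no later than the bound."""
    return all(d <= bound_us for d in delays_us)


def meets_fixed_delay(delays_us, target_us, tolerance_us=0):
    """Fixed delay: every packet arrives at the target time.

    A small tolerance is allowed here as an assumption, since arrival at
    exactly one instant cannot be measured in practice.
    """
    return all(abs(d - target_us) <= tolerance_us for d in delays_us)


# Hypothetical per-packet delays in microseconds.
delays = [180, 200, 195, 150]

print(meets_bounded_delay(delays, 200))   # True: no packet is later than 200 us
print(meets_fixed_delay(delays, 200, 5))  # False: 180 us and 150 us arrive too early
```

The same trace satisfies the bounded-delay guarantee but violates the fixed-delay guarantee, which illustrates why the first interpretation is easier to satisfy.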
Besides the two interpretations mentioned above, there are also cases in which applications would like to have probabilistic guarantees [3], e.g., the delay upper bound is met with a high probability, say 95% of the time within a timeslot. Such SLAs are widely used between real-time network (RTN) providers and live video streaming applications. Allowing a low probability of violating the performance requirement can significantly reduce the networking cost of these applications without significantly degrading the quality of experience of their users. Please note that DetNet concerns worst-case performance instead of average performance. Average performance can be improved by priority-based queuing schemes, but such schemes may not be suitable for DetNet.
When a data flow needs to be transmitted by the deterministic network, it sends a request with a specified performance requirement, and the network should reject the request if it determines that the requirement cannot be satisfied. Note that the required latency values of data flows from diverse applications can differ by orders of magnitude. Once DetNet admits a request, the data flow expects its performance to be guaranteed. In contrast, the Internet is a best-effort network, which means it admits all transmission requests but provides no performance guarantee.
The DetNet WG belongs to the IETF routing area, but the implementation of deterministic networks is not only a routing problem. Currently, the RFCs produced by the IETF DetNet WG mainly focus on problem statement, use cases, data plane framework, administration, and maintenance. The number of RFCs and active
Internet drafts within the DetNet WG is increasing quickly. The DetNet WG assumes that layer 2 networks have achieved deterministic QoS and that DetNet just needs to connect layer 2 segments to establish a multi-hop path over lower-layer technologies such as MPLS and time-sensitive networking (TSN) as defined by IEEE 802.1. Most requirements of DetNet are not unique, and the DetNet WG tries to reuse existing technologies and IETF solutions instead of proposing exclusive solutions. For technologies that are not exclusively used by DetNet, the DetNet WG only discusses the modifications or extensions required for those technologies to support DetNet, and the work is coordinated with the appropriate groups in IETF, IEEE, and other Standards Development Organizations (SDOs) responsible for the technology. Therefore, DetNet serves as a focal point to maintain the consistency of all relevant work. Many DetNet RFCs are collections of ideas and analyses; the DetNet WG currently aims only to specify frameworks and identify requirements, while the specific technical solutions are usually developed by other WGs. The DetNet WG currently focuses on providing deterministic QoS in networks that are under a single administrative domain, e.g., campus-wide networks and private WANs. Providing deterministic QoS in the open Internet is more complicated, as it involves multiple administrative domains, which makes resource reservation more challenging, and scalability is a more serious concern in the Internet. RFC 8578 lists representative DetNet use cases to help researchers understand the types of applications that can be supported by DetNet and to help ensure that the concerns that led to the establishment of DetNet are addressed by the WG. These use cases come from diverse industries, and the RFC describes the gap between their current capability and the expected capability in the future and briefly discusses how to achieve the goal.
At the time of this writing, there have been 14 RFCs, among which 6 are proposed standard RFCs and 8 are informational RFCs. There are also dozens of active Internet Drafts.
1 Overview and Framework to Enable DetNet
Over-provisioning is an easy and widely used solution for performance improvement in the Internet. However, massive over-provisioning is ruled out as a solution for DetNet: RFC 8557 explicitly states that DetNet must be able to support DetNet flows with demands of more than half of the network’s available bandwidth. Naturally, DetNet flows consume networking resources, which reduces the available bandwidth for non-DetNet flows, but starvation of non-DetNet traffic must be avoided. Furthermore, resource efficiency must be considered: resources that are reserved for but not currently used by DetNet flows should be made available to non-DetNet flows for best-effort service, so that DetNet flows that blindly reserve large quantities of bandwidth, or never release resources after use, cannot use up the networking resources and starve other flows.
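The resource-sharing principle above can be sketched as a simple per-link calculation. This is an illustrative sketch only: the function name, the dictionary-based bookkeeping, and the example numbers are assumptions rather than part of any DetNet specification.

```python
def best_effort_capacity(link_mbps, reservations_mbps, detnet_usage_mbps):
    """Bandwidth currently usable by non-DetNet traffic on one link.

    reservations_mbps: flow id -> reserved bandwidth
    detnet_usage_mbps: flow id -> bandwidth the flow is using right now
    """
    reserved = sum(reservations_mbps.values())
    if reserved > link_mbps:
        raise ValueError("reservations exceed link capacity")
    # Assumption: DetNet flows are policed, so usage never exceeds the reservation.
    used = sum(
        min(detnet_usage_mbps.get(flow, 0.0), rate)
        for flow, rate in reservations_mbps.items()
    )
    # Reserved-but-idle bandwidth is lent to best-effort traffic.
    return link_mbps - used


reservations = {"flow-a": 400, "flow-b": 200}   # 600 Mbit/s reserved in total
usage = {"flow-a": 100}                         # flow-b is currently idle

print(best_effort_capacity(1000, reservations, usage))  # 900.0
print(1000 - sum(reservations.values()))                # 400 with a strict partition
```

With a strict partition, best-effort traffic would be limited to 400 Mbit/s even though only 100 Mbit/s of the reserved bandwidth is in use.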
Fig. 1 The components of a hop latency in an end-to-end path
Resource reservation is necessary and unavoidable for guaranteeing deterministic QoS. DetNet satisfies the performance requirements of data flows by assigning queue priorities to flows, reserving link bandwidth and/or buffer space, and replicating packets along multiple paths. For example, as shown in Fig. 1, end-to-end latency in a packet-switched network is composed of the latency on each link of the path, and the latency on a link is composed of transmission latency, propagation latency, processing latency, and queue latency. The former three are relatively fixed, but the queue latency depends on how many packets are in the queue when the packet arrives, and it can fluctuate greatly. Therefore, guaranteeing end-to-end latency means we have to control the maximum length or the priority of the queues of a networking device. Taking another key performance metric, loss rate, as an example, almost all packet losses in wired networks are caused by active packet drops due to the lack of queue buffer; therefore, the key to guaranteeing a low packet loss rate is to ensure that the queue length never exceeds the buffer size. In summary, guaranteeing latency and packet loss rate in packet-switched wired networks essentially amounts to controlling the queue length by proper scheduling algorithms, including timing and priority, with the queue buffer size in mind.
A data flow that has given timing and throughput requirements and is to be forwarded along a multi-hop path is called a DetNet flow. Figure 2 shows an overview of the framework and key techniques towards deterministic networking. Since there can be multiple flows, either DetNet flows or traditional flows, in a DetNet, logically there must be a controller to coordinate/orchestrate these flows according to their performance requirements and the current networking state. The controller is also responsible for admission control, i.e., rejecting a flow if the resources are not sufficient to satisfy its requirement.
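The decomposition of hop latency described above can be written down as a short calculation. The sketch below is illustrative only: the class and function names, the assumed propagation speed, and the simplification that all queued packets have the same size and are served first-in-first-out are assumptions, not values taken from the chapter.

```python
from dataclasses import dataclass

PROPAGATION_MPS = 2e8  # assumed signal speed in fiber, roughly 2/3 of light speed


@dataclass
class Hop:
    link_bps: float       # link rate in bit/s
    length_m: float       # link length in meters
    processing_s: float   # per-packet processing time in seconds
    max_queue_pkts: int   # worst-case number of packets queued ahead


def worst_case_hop_latency(hop, pkt_bytes):
    """Sum of transmission, propagation, processing, and queue latency."""
    transmission = pkt_bytes * 8 / hop.link_bps
    propagation = hop.length_m / PROPAGATION_MPS
    # Assumption: every queued packet has the same size and is sent first (FIFO).
    queueing = hop.max_queue_pkts * transmission
    return transmission + propagation + hop.processing_s + queueing


def worst_case_path_latency(hops, pkt_bytes):
    """End-to-end bound as the sum of the per-hop bounds."""
    return sum(worst_case_hop_latency(h, pkt_bytes) for h in hops)


hop = Hop(link_bps=1e9, length_m=2000, processing_s=5e-6, max_queue_pkts=10)
# 10 us transmission + 10 us propagation + 5 us processing + 100 us queueing
print(round(worst_case_hop_latency(hop, 1250) * 1e6, 3))        # about 125 microseconds
print(round(worst_case_path_latency([hop, hop], 1250) * 1e6, 3))  # about 250 microseconds
```

In this example the queue term dominates the bound, which is why limiting the queue length is the main lever for guaranteeing latency.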
Logically, the controller should be a central control point, but it can be implemented in either a centralized or a distributed way, and RFC 8557 states that the centralized approach should be delivered first. From the operation and management perspective, DetNet should support the following capabilities. First, each DetNet flow should be assigned and tagged
Fig. 2 Overview of the framework and key techniques towards deterministic networking
with a unique flow identifier so that networking nodes can identify the flow. Second, DetNet service nodes should be able to describe the characteristics and specify the performance requirements of a DetNet flow and report this information to the controller for making control decisions. A DetNet service node can be the end node of the flow, which means the end node itself is DetNet-aware, or it can be an ingress node of the DetNet. Third, networking devices should be able to report their capability, available physical resources, and perceived dynamic topology to the controller, so that the controller can make admission control decisions.
In the control plane, the logical controller should support the following capabilities. First, before making an admission control decision, the controller needs to compute a path or multiple paths for the flow, which can be done in either a centralized or a distributed manner. The centralized way is preferred since distributed routing protocols suffer interruptions during routing convergence caused by network dynamics such as device failures or other topology changes. Second, DetNet should have a signaling protocol to enable the setup of the computed path. Third, together with the computed path, the controller also determines the latency contribution of each node/link and makes resource reservations on each node according to the node’s reported status and the determined latency contribution. In other words, DetNet guarantees end-to-end latency by controlling the latency of each hop. The resource reservation for each flow must be translated into parameters of the queueing/scheduling/shaping algorithms of the involved devices, and these parameters need to be delivered to the devices, e.g., the timeslots reserved for the transmission of the flow and also the buffer size if the loss rate needs to be guaranteed.
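The control-plane steps above (check the reported resources along a computed path, verify the summed per-hop latency contributions, then reserve) can be sketched as a toy admission-control routine. The data layout, field names, and numbers below are hypothetical and only illustrate the logic; real controllers also handle path computation and signaling.

```python
def admit(flow, path, nodes):
    """Admit the flow if every hop has enough free bandwidth and the summed
    per-hop latency bounds fit within the end-to-end requirement.

    Returns the per-hop latency budget on success, or None if rejected.
    """
    if any(nodes[n]["free_mbps"] < flow["mbps"] for n in path):
        return None  # not enough bandwidth on at least one hop
    if sum(nodes[n]["hop_bound_us"] for n in path) > flow["max_latency_us"]:
        return None  # end-to-end latency requirement cannot be met
    for n in path:   # commit the reservation on every hop
        nodes[n]["free_mbps"] -= flow["mbps"]
    return {n: nodes[n]["hop_bound_us"] for n in path}


# Hypothetical status reported by two nodes to the controller.
nodes = {
    "A": {"free_mbps": 100, "hop_bound_us": 50},
    "B": {"free_mbps": 100, "hop_bound_us": 80},
}

print(admit({"mbps": 60, "max_latency_us": 150}, ["A", "B"], nodes))  # {'A': 50, 'B': 80}
print(admit({"mbps": 60, "max_latency_us": 150}, ["A", "B"], nodes))  # None: bandwidth exhausted
print(admit({"mbps": 10, "max_latency_us": 100}, ["A", "B"], nodes))  # None: 130 us exceeds 100 us
```

Rejecting the second and third requests leaves the admitted flow's guarantee intact, which is the purpose of admission control.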
In the data plane, each device should support queueing, shaping, and scheduling algorithms that ensure the queueing latency stays within the specified range. Furthermore, to achieve better reliability, DetNet sometimes replicates packets and delivers the duplicates along multiple paths to cope with packet losses, in which case the relevant DetNet nodes should support duplication and deduplication of packets.
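Packet duplication and deduplication can be illustrated with a few lines of code. This is a simplified sketch under stated assumptions: packets are modeled as (sequence number, payload) pairs, the function names are hypothetical, and the deduplication node remembers every sequence number it has seen, whereas a practical implementation would keep only a bounded history window.

```python
def replicate(packets, paths):
    """Send a copy of every (seq, payload) packet on each path."""
    return {p: list(packets) for p in paths}


def eliminate(arrivals):
    """Forward the first copy of each sequence number and drop later copies."""
    seen, out = set(), []
    for seq, payload in arrivals:
        if seq not in seen:
            seen.add(seq)
            out.append((seq, payload))
    return out


packets = [(0, "a"), (1, "b"), (2, "c")]
copies = replicate(packets, ["upper", "lower"])

# Assume the upper path loses packet 1 and the lower path loses packet 2.
upper = [p for p in copies["upper"] if p[0] != 1]
lower = [p for p in copies["lower"] if p[0] != 2]

received = eliminate(upper + lower)
print(sorted(received) == packets)  # True: the flow is recovered despite the losses
```

Each path alone would have delivered an incomplete flow; the merged and deduplicated stream contains every packet exactly once.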
2 Protocol Stack and Packet Encapsulation
The protocol stack of DetNet consists of two sublayers [44]. The upper sublayer, i.e., the service sublayer, is responsible for service protection, such as duplication and deduplication for fault mitigation. The service sublayer receives data flows from applications and, if required, uses multiple paths to transmit multiple equivalent streams to protect the flow from being interrupted by buffer overflow, node failures, and link failures. The lower sublayer, i.e., the forwarding sublayer, is closely related to congestion protection, which ensures low loss, guaranteed latency, etc. through the use of queuing techniques and traffic engineering methods. Currently, DetNet has defined standards to make use of three forwarding sublayers (i.e., three network types): native IP (packet-switched network), MPLS (label-switched network), and TSN (layer 2). Note that DetNet does not require all these forwarding sublayers to have equivalent capabilities. The details of TSN are standardized by IEEE 802.1. The operations of nodes in DetNet IP [43] and DetNet MPLS [46] are being standardized by the IETF DetNet WG. Two DetNet nodes using the same technology can be connected by a DetNet subnetwork that uses a different DetNet technology. For example, two DetNet IP service nodes (and their corresponding subnetworks) might be connected using a DetNet MPLS domain or a TSN subnetwork, which are referred to as DetNet IP over MPLS and DetNet IP over TSN, respectively. It is also possible that two TSN subnetworks are connected by a DetNet IP domain. Figure 3 illustrates the network model and these possible scenarios.
No matter which network type is used, a packet of a DetNet flow must encode sufficient information in its packet header for a DetNet node to process the packet properly. First, a packet should carry its flow identifier so that a DetNet node can identify which DetNet flow it belongs to and hence know how to treat the packet.
Second, a packet should carry a sequence number that uniquely identifies it within its flow, so that the DetNet node responsible for packet deduplication can correctly recover the original data flow. Furthermore, when a packet traverses two DetNet domains with different forwarding technologies, the encapsulation of the packet needs to be modified accordingly. In the following, we introduce how these problems are solved in the current IETF DetNet specifications.
Fig. 3 Data plane network model: DetNet A over DetNet B, in which A and B can be IP, MPLS, or TSN (RFC 8655 [14])
2.1 DetNet IP
A simple way of operating the DetNet IP data plane is to use a native approach without any further encapsulation, i.e., encoding the flow ID using a 6-tuple-based flow identification approach and encoding the sequence number using the native sequence number of IP packets. The 6-tuple refers to six fields of IP packets, i.e., source address and port, destination address and port, protocol (next header in IPv6), and the DSCP field. The first five fields are the same fields used to identify an IP flow, and the DSCP field is used in differentiated services (Diffserv, RFC 7657), which was proposed to implement QoS in the Internet but has never been widely supported. In case some fields cannot be used (e.g., when IPsec/ESP is in use), a 3-tuple or even a 2-tuple (i.e., source and destination addresses) can be used, and it is also feasible to use the SPI field of the IPsec header with exact matching. IP fragmentation must be avoided in DetNet IP networks since non-initial fragments do not carry the transport-layer header and therefore have no port fields. Furthermore, splitting a packet into multiple fragments makes it more complex to guarantee the latency of the packet. In DetNet IPv6 networks, fragmentation must not be used, and in DetNet IPv4 networks, IPv4 packets must be sent with the DF bit set, which signals to the routers on the path that fragmentation of this packet is not allowed. Since fragmentation is not allowed, a DetNet sender must ensure that the size of each packet does not exceed the MTU of the DetNet path. Although in theory the path MTU (PMTU) can be discovered using ICMP packets, the result may not be correct because in practice ICMP packets can follow different paths from the data packets. In DetNet IPv6, besides the above method, there can be other ways to encode flow identifiers. For example, the suffix of an IPv6 address can be used to encode the information required by DetNet, since a node can have a /64 IPv6 address block. The other choice is to use the IPv6 flow label field as the flow identifier.
The way a flow is identified may limit the number of DetNet flows that can be supported by the DetNet. Due to this scalability concern, DetNet requires that flow aggregation be supported. In general, only flows with the same (or similar) performance requirements can be aggregated. In DetNet IP networks, DetNet flows can be aggregated using wildcards, masks, lists, prefixes, or value ranges. An IP tunnel is also a feasible way to support flow aggregation, in which an extra packet header is added to the packets of the flows. In summary, DetNet IP reuses existing header fields to encode information, so non-DetNet and DetNet IP packets have the same IP packet header format, but DetNet-enabled devices need to associate each IP packet with its flow to understand its performance requirement and treat the packet accordingly. RFC 8939 [43] provides a detailed specification of the data plane of DetNet IP.
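The 6-tuple identification and the aggregation by wildcards, prefixes, and value ranges described above can be sketched as a small matching function. The rule format, field names, and example addresses below are hypothetical; they are not the data model of RFC 8939 and only illustrate the idea using Python's standard ipaddress module.

```python
import ipaddress


def matches(rule, pkt):
    """Return True if the packet's 6-tuple matches the rule.

    A field that is missing from the rule (or set to None) is a wildcard.
    Addresses are matched by prefix, ports by inclusive (low, high) range,
    and protocol and DSCP by exact value.
    """
    for field in ("src", "dst"):
        prefix = rule.get(field)
        if prefix is not None:
            if ipaddress.ip_address(pkt[field]) not in ipaddress.ip_network(prefix):
                return False
    for field in ("src_port", "dst_port"):
        rng = rule.get(field)
        if rng is not None and not (rng[0] <= pkt[field] <= rng[1]):
            return False
    for field in ("proto", "dscp"):
        if rule.get(field) is not None and pkt[field] != rule[field]:
            return False
    return True


# One aggregate rule: any host in 10.1.0.0/16 sending UDP (17) with DSCP 46
# to ports 5000-5010 of 192.0.2.10; the source port is a wildcard.
rule = {"src": "10.1.0.0/16", "dst": "192.0.2.10/32",
        "dst_port": (5000, 5010), "proto": 17, "dscp": 46}

pkt = {"src": "10.1.2.3", "dst": "192.0.2.10", "src_port": 40000,
       "dst_port": 5004, "proto": 17, "dscp": 46}

print(matches(rule, pkt))                  # True
print(matches(rule, {**pkt, "dscp": 0}))   # False: DSCP differs
```

A single rule of this kind covers many individual flows, which is how aggregation reduces the amount of per-flow state a DetNet node must keep.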
2.2 DetNet MPLS
MPLS is a label-switched network technology, in which a data packet can carry one or more labels. The length of a label is fixed (32 bits per label stack entry), and its semantics must be defined before it is used. In DetNet MPLS, there are two types of labels, i.e., S-Label and F-Label. An S-Label is a “service” label, which is used between DetNet nodes to identify a DetNet flow at the DetNet service sublayer, while an F-Label is a “forwarding” label, which is a hop-by-hop label used between label switching routers (LSRs) at the DetNet forwarding sublayer to indicate the label switched path (LSP) used to forward a DetNet flow. To provide DetNet service over MPLS networks, an MPLS-encapsulated packet can use a d-CW (DetNet control word) and one or multiple S-Labels to encode the necessary information. An S-Label can be from the label space of the platform (i.e., a global label) or from the label space of the receiver. The S-Label is set by the sender to indicate the next hop (the receiver or a downstream node), the DetNet flow identity, and the MPLS payload type of the packet, so that the next hop knows how to deal with it. If an S-Label is from the platform label space, it is similar to a pseudowire (PW) label [5], and the label space needs to be maintained by the controller. If an S-Label is allocated from the receiver’s label space, it is used only locally by the receiver node, so the receiver should advertise its label space and the semantics of each label to the sender. The S-Label should be set to different values if the packet is replicated and sent to multiple downstream nodes. Unlike DetNet IP, DetNet MPLS explicitly adds header fields to encode the metadata required for DetNet processing. The DetNet Control Word (d-CW) conforms to the Generic PW MPLS Control Word (PWMCW) defined in RFC 4385. Its first 4 bits have to be set to zero, and the remaining 28 bits are used to carry the DetNet sequence number.
A 16-bit sequence number is also supported for legacy clients, but a 28-bit sequence number is better suited to high-speed networks because it wraps around far less often. The length of the sequence number field should be communicated by the controller to the DetNet nodes. The aggregation of DetNet MPLS flows can be implemented via hierarchical LSPs in MPLS, which are typically used for aggregating resources. The other way of aggregation is to define an aggregation label (A-Label) and insert the A-Label into the label stack, which is similar to an encapsulation technology. The A-Label can be viewed as a special case of the S-Label used for aggregation.
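The d-CW layout (4 zero bits followed by a 28-bit sequence number) and the wrapping concern can be illustrated as follows. This is a minimal sketch: the helper names are hypothetical, only the 28-bit format is shown, and the packet rate of one million packets per second is an arbitrary example value.

```python
import struct

SEQ_BITS = 28
SEQ_MASK = (1 << SEQ_BITS) - 1


def pack_dcw(seq):
    """Build the 4-byte d-CW: 4 zero bits followed by a 28-bit sequence number."""
    return struct.pack("!I", seq & SEQ_MASK)


def unpack_dcw(data):
    """Extract the sequence number and verify that the first 4 bits are zero."""
    (word,) = struct.unpack("!I", data[:4])
    if word >> SEQ_BITS != 0:
        raise ValueError("first 4 bits of the d-CW must be zero")
    return word & SEQ_MASK


def wrap_time_s(seq_bits, packets_per_second):
    """Time until the sequence number space wraps at a given packet rate."""
    return (1 << seq_bits) / packets_per_second


print(pack_dcw(1).hex())                 # 00000001
print(unpack_dcw(pack_dcw(123456)))      # 123456
print(wrap_time_s(16, 1_000_000))        # about 0.066 s for a 16-bit number
print(wrap_time_s(28, 1_000_000))        # about 268 s for a 28-bit number
```

At the same packet rate, the 28-bit space lasts 4096 times longer before wrapping than the 16-bit space, which gives the deduplication node far more room to distinguish late duplicates from new packets.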
2.3 Packet Forwarding over Different DetNet Domains
In general, DetNet domains should be able to provide DetNet services for any DetNet flow, no matter which protocol the DetNet end systems are using. In other words, two end systems with the same encapsulation format (say DetNet IP) can communicate with each other using DetNet services over a DetNet domain with
Fig. 4 DetNet IP over DetNet MPLS
different subnetwork technologies (say DetNet MPLS), and the DetNet domain that connects them is required to provide appropriate service to their DetNet flows, i.e., DetNet flows must be mapped to flows that use the flow semantics supported by the underlying subnetwork technology. Let us take DetNet IP over MPLS as an example to illustrate how the packet encapsulation is modified when the packet is carried over a different DetNet domain. A simple way is to view the DetNet domain as a tunnel (or an overlay) and the original flow as the payload of this domain, which means a new header for this domain is added to the packets of the original flow. Figure 4 illustrates the packet encapsulation for DetNet IP over DetNet MPLS. Here, “application flow” is the traffic generated by the applications that require deterministic performance guarantees, and it is the payload for DetNet IP networks. To carry this flow over DetNet IP networks, a DetNet IP packet header is added to encode its flow identifier, as described in Sect. 2.1. At the border between the DetNet IP domain and the DetNet MPLS domain, there must be a DetNet node that understands both DetNet technologies and makes the packets of this flow ready for transmission over the DetNet MPLS domain. According to Sect. 2.2, a d-CW and a stack of labels should be added as the DetNet MPLS header to indicate the sequence number, flow identifier, and forwarding path. The sequence number can be translated from the IP sequence number, and the flow identifier can be assigned according to the operation of this MPLS domain. The forwarding path within the MPLS domain, however, needs to be determined by control plane protocols, and F-Labels should be generated to indicate the selected path. The other way is to replace the original header with a new header for the DetNet domain that connects the two ends, which essentially can be viewed as a translation technology.
The “active stream identification function” in RFC 9023 [47] can be used to solve the DetNet IP over TSN mapping problem. TSN networks usually use the VLAN ID and PCP (priority code point) as flow identifiers, so the function replaces these fields of the original packets according to the ID of the mapped TSN stream, and it also needs to modify the destination MAC address accordingly.
Besides the modification of packet encapsulation, flow-related requirements and parameters have to be set or tuned accordingly in the control plane and the management plane. The underlying subnetwork that connects two DetNet ends/subnetworks is just a single hop from the perspective of the end-to-end DetNet path. The latency requirement of this overlay hop has to be converted to per-hop latencies within the underlying domain, and the bandwidth requirement, if any, also has to be adjusted for factors such as the encapsulation overhead. The IETF DetNet WG has standardized the data plane operations for various combinations of underlying and overlying DetNet technologies, including DetNet IP over MPLS [42], DetNet IP over TSN [47], DetNet MPLS over UDP/IP [45], DetNet MPLS over TSN [48], and DetNet TSN over MPLS [49].
3 Specifying Flow Characteristics and Requirements
When an application data flow uses DetNet services, it needs to specify its traffic characteristics and performance requirements, so that DetNet nodes can associate the identified data flow with its performance requirements and determine how to treat its data packets. The most important characteristics that have to be specified are packet size (maximum size and minimum size), transmission time interval (an integer number of nanoseconds), and maximum packets per time interval, which together define the worst-case scenario used for resource reservation and queueing algorithm selection. A data flow with random packets must be shaped into a flow with determined characteristics before it can be transmitted in a DetNet domain, and flows exceeding the traffic specification are regarded as malicious or malfunctioning and are prevented from entering the DetNet domain. In this way, resource contention among flows is avoided or mitigated and the performance can be guaranteed.
Besides these traffic specifications, an application flow also needs to specify its flow type (IP, MPLS, or TSN), flow status (ready, failed, etc.), flow rank (the priority of this flow relative to other flows), etc. Flows with low priority can be bumped if the resources of some links become insufficient due to unexpected failures.
The performance requirements of a data flow can be specified in terms of its desired minimum bandwidth, maximum latency, maximum latency variation, maximum packet loss rate, the maximum number of consecutive packets whose loss can be tolerated, and the maximum number of packets that can be received out of order. The flow can also specify whether it requires the same data path and physical path for both directions through the network. 
If the flow would like to have congruent paths in the two directions, the control plane needs to consider this requirement when making routing decisions for this flow, and the configuration module also needs to be aware of this requirement when it chooses the nodes for duplication and deduplication for two directions of the flow. Some of the above attributes to specify performance requirements are correlated. For example, if the delivery latency of a packet is too long, the effect is the same as a
packet loss for the application; if more bandwidth is allocated to a flow, the flow can exploit the bandwidth to replicate its packets and achieve a lower packet loss rate. Whether and how to extend these attributes to allow comprehensive performance requirement specification is still under discussion.
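As a minimal sketch of how the traffic characteristics in this section could feed resource reservation and ingress policing, the following Python fragment models the three parameters named above and derives the worst-case rate they imply. The class name `TrafficSpec`, its field names, and the `conforms` helper are illustrative assumptions and do not come from any DetNet data model.

```python
from dataclasses import dataclass

NS_PER_S = 1_000_000_000


@dataclass(frozen=True)
class TrafficSpec:
    """Hypothetical container for the traffic characteristics named in the text."""
    max_packet_bytes: int
    min_packet_bytes: int
    interval_ns: int               # transmission time interval, integer nanoseconds
    max_packets_per_interval: int

    def worst_case_rate_bps(self) -> float:
        # Worst case: every interval carries the maximum number of maximum-size packets.
        bits = 8 * self.max_packet_bytes * self.max_packets_per_interval
        return bits * NS_PER_S / self.interval_ns


def conforms(spec: TrafficSpec, packet_sizes: list[int]) -> bool:
    """True if the packets observed in one interval stay within the specification."""
    if len(packet_sizes) > spec.max_packets_per_interval:
        return False
    return all(spec.min_packet_bytes <= s <= spec.max_packet_bytes for s in packet_sizes)


# Example values are assumptions: up to 4 packets of 64..1500 bytes every 1 ms.
spec = TrafficSpec(max_packet_bytes=1500, min_packet_bytes=64,
                   interval_ns=1_000_000, max_packets_per_interval=4)
print(spec.worst_case_rate_bps())            # bandwidth to reserve, in bit/s
print(conforms(spec, [1500, 200, 64]))       # within the specification
print(conforms(spec, [1500] * 5))            # too many packets in one interval
```

A flow that fails `conforms` would be the kind of flow the text describes as malicious or malfunctioning and kept out of the DetNet domain.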
4 Shaping and Queueing
When a DetNet node receives a packet from its upstream nodes, it first checks whether the packet belongs to a configured DetNet flow. If no configured DetNet flow is matched, the packet must be dropped. If there is a matching DetNet flow, the node uses the resources reserved for this flow to process and forward the packet. If all nodes along the path have reserved sufficient queue buffer for this flow, DetNet achieves a zero packet loss rate for this flow. The latency a packet experiences in a node depends on the queueing mechanism the node uses and the transmission timeslots it reserves for the packet. Therefore, given a fixed path, we can calculate the latency a packet experiences under different queueing mechanisms and queueing parameters. Before the transmission of a DetNet flow, the logical controller can use the computation results to decide whether this path can be selected and how to set the parameters for this flow. In this section, we introduce the possible operations of DetNet nodes, including various shaping, queueing, and policing mechanisms. The IETF has published RFC 9320 [13] on this topic, and most of these algorithms were first proposed for time-sensitive networks in the IEEE TSN Task Group.
4.1 Operations of a DetNet Node
Figure 5 shows the operations that a DetNet node performs in the data plane to process and forward arriving packets. When a packet of a DetNet flow arrives, the queue assignment function shapes the packets of each flow and selects the queue in which the packet should be put. Within an individual queue, packets are usually served in a FIFO (first in, first out) manner. Priority queues, in which packets with higher priority can be served earlier even though they arrive later, are possible, but complex queues are difficult to implement and their queueing time is difficult to control; therefore, FIFO queues are widely used in practice. There can be multiple queues in a node, and these queues compete for transmission timeslots. In each transmission timeslot, the transmission selection function of the node decides which queue transfers the packet at its head to the output port. A DetNet node can use either a per-flow queueing mechanism or a per-class queueing mechanism. With the per-flow queueing mechanism, each DetNet flow
Fig. 5 Operations of a DetNet node
is assigned to a separate queue and the queue has its guaranteed data rate and a per-node latency bound; therefore, the performance of a DetNet flow is deterministic and can be calculated from its traffic specification (such as the data rate of the flow) and the guaranteed rates and per-node latencies of the queues of the nodes along its path. However, only a limited number of DetNet flows can be supported with this mechanism.
The per-class queueing mechanism is proposed to address this scalability concern. Multiple DetNet flows are aggregated into a “class” according to some rules and their packets are put in the same queue. In other words, a queue provides service for a class of flows instead of a single flow. The flows that share a queue compete for the resources allocated to the queue and may interfere with each other; therefore, a complete view of all flows is required to compute the delay bound of each DetNet flow. Once a flow joins or leaves the class, the delay bound of each flow changes and must be recomputed. Furthermore, the per-class queueing mechanism also brings a problem of “burstiness cascade”, i.e., a burst of a flow in a node A influences the flows that share a resource with this flow in A, which in turn causes bursts in the downstream nodes of these flows. Burstiness cascade must be avoided; otherwise, guaranteeing performance becomes extremely difficult, if not infeasible. A practical solution to this problem is to use an interleaved regulator to reshape the flows at every hop. With the interleaved regulator, the packet at the head of a queue is regulated based on the shaper constraints of its own flow.
4.2 Credit-Based Shaping (CBS)
Traffic bursts are the root challenge for deterministic performance. If a flow bursts during some period, some packets may not be queued because the queue is full, or there may be too many packets to finish transmitting in the allocated time cycle, which
makes it difficult for the scheduler to achieve a deterministic processing delay. The goal of shaping algorithms is to limit the resource consumption of a flow to its pre-configured bandwidth limit, so that extra packets are processed only when resources are available. The credit-based shaper described in IEEE 802.1Qav limits the number of packets/frames one node sends to the next downstream node, to avoid overflowing the downstream buffer or making its queue too long to be sent out within a single timeslot. It uses a parameter “credit” to control access to the transmission medium, which can be viewed as the quota for transmitting. Packets in the queue can be transmitted only when the credit is equal to or larger than zero and the channel is not occupied by conflicting traffic. The credit is increased at a specified rate (named idleSlope) when there are waiting packets, and it is decreased at the rate of sendSlope while queued packets are transmitted. idleSlope can be viewed as the maximum transmission data rate allowed for this queue. It must be smaller than the rate that can be supported by the port and the next downstream node associated with this queue. sendSlope is the rate at which the credit drains during transmission; in IEEE 802.1Qav it is determined by the port transmission rate and equals idleSlope minus that rate. The credit cannot be larger than a threshold hiCredit, which is considered “a high watermark.” When there are no waiting packets, any positive credit is reset to zero and cannot accumulate further. These two points ensure that the queue cannot accumulate too many credits, so the next downstream node does not receive bursty traffic.
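The credit rule above can be illustrated with a small Python sketch. It assumes that all frames are already waiting in the queue, that conflicting traffic occupies the port only for an initial `blocked_s` seconds, and that the credit drains at idleSlope minus the port rate during transmission, as in IEEE 802.1Qav. The function name and its parameters are hypothetical.

```python
def cbs_start_times(frame_bits, idle_slope_bps, port_rate_bps,
                    hi_credit_bits, blocked_s=0.0):
    """Start-of-transmission times (seconds) for a backlog of queued frames.

    Simplifying assumptions (not from the standard text): every frame is
    queued at t = 0, and the port is busy with other traffic until blocked_s.
    """
    send_slope_bps = idle_slope_bps - port_rate_bps   # negative: credit drains
    # While frames wait behind conflicting traffic, credit grows up to hiCredit.
    credit = min(hi_credit_bits, idle_slope_bps * blocked_s)
    t = blocked_s
    starts = []
    for bits in frame_bits:
        if credit < 0:
            # Transmission is allowed only when credit >= 0; wait for recovery.
            t += -credit / idle_slope_bps
            credit = 0.0
        starts.append(t)
        duration = bits / port_rate_bps
        credit += send_slope_bps * duration
        t += duration
    return starts


# Three 1500-byte frames, 25 Mbit/s idleSlope on a 100 Mbit/s port.
frames = [12_000, 12_000, 12_000]
print(cbs_start_times(frames, 25e6, 100e6, hi_credit_bits=12_000))
# Same backlog after 1 ms of blocking: the accumulated credit permits a short burst.
print(cbs_start_times(frames, 25e6, 100e6, hi_credit_bits=12_000, blocked_s=1e-3))
```

In the first call, successive frames start one frame length divided by idleSlope apart, so the long-run rate of the queue equals idleSlope. In the second call, the credit saved up during blocking lets two frames leave back to back, and hiCredit caps how large such a burst can become.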
4.3 Asynchronous Traffic Shaping (ATS)
Asynchronous traffic shaping is essentially an interleaved regulator. At each output port of a node, there is one interleaved regulator per input port and per class, and the packets received at an input port for a given class are enqueued in the respective interleaved regulator at the output port. Normally, there are eight queues for eight traffic classes, of which two queues are for DetNet flows. Other queues are for control-data traffic (which is the most critical) and best-effort traffic, and these queues do not employ shaping algorithms. The asynchronous traffic shaping method described in IEEE 802.1Qcr reshapes the flow at each node and aims to achieve low latency even at high link utilization. It is based on the urgency-based scheduler (UBS) proposed in [38] and [39]. ATS requires that all flows satisfy the following condition:
$$w_i(d) \le \hat{b}_i + d\,\hat{r}_i.$$
Basically, this condition is a leaky bucket constraint, in which $w_i(d)$ is the cumulative packet data in a time interval of duration $d$, $\hat{b}_i$ is the upper bound of the burstiness of the flow, and $\hat{r}_i$ is the upper bound of the leak rate. It means that for any time interval of duration $d$, the cumulative packet data is bounded by a function of the burstiness and the leak rate. Limiting the burstiness of each flow is the
key idea of ATS that allows us to ensure the upper bound of the per-hop delay of its flows, and the leak rate is the required bandwidth of the flow.
An ATS-enabled port has at least one mandatory queue for each incoming port to receive the packets. This ensures that an upstream node cannot attack the flows from other upstream nodes by sending a huge number of packets. Each mandatory queue contains packets of the same priority level. If the port has more physical queues, it can allow multiple priority levels from the same node. The packets are queued in multiple physical FIFO queues, which compete to send out packets. ATS introduces the concept of a “pseudo-queue” to resolve the contention among these queues. The queues for receiving packets are merged into a pseudo-queue for sending packets to the next hop. The scheduler decides which queue is allowed to send out its head packet. It uses strict priority scheduling and an interleaved shaping algorithm to ensure that each output flow satisfies the leaky bucket constraint. In other words, while a previous packet is under transmission, the scheduler decides the next packet for transmission by comparator networks or by linear iteration over the HQ (head-of-queue) packets of all queues. An HQ packet is considered eligible for transmission if it satisfies the leaky bucket constraint. The eligible packet with the highest priority level is selected and becomes the next packet to be transmitted. Figure 6 illustrates the whole procedure.
There are two representative types of interleaved shaping algorithms to judge whether a packet is eligible for transmission. The first one, length rate quotient (LRQ), is a packet-by-packet leaky bucket algorithm. The per-flow state of a flow $f_i$ contains a time stamp $t_i$ to store the eligibility time for the next packet of flow $f_i$. Once a packet $p$ of $f_i$ is sent out, $t_i$ is set to the sending time plus the quotient $l/\hat{r}_i$, where $l$ is the length of the packet $p$. The next packet of flow $f_i$ after $p$ is delayed at least until the time reaches $t_i$. It means that the burstiness parameter $\hat{b}_i$ is the maximum packet length of this flow. The second one, token bucket emulation (TBE), is a token-based leaky bucket algorithm. With this algorithm, the per-flow state of a flow $f_i$ contains
Fig. 6 Asynchronous traffic shaping (ATS) and pseudo-queues
a time stamp $t_i$ to store the output time of a packet $p$ and also contains a variable $b_i$, which can be interpreted as the “remaining burstiness” of $\hat{b}_i$. $b_i$ is decreased by the packet length $l$ when the packet is sent out, and it accumulates at a rate of $\hat{r}_i$. The packet $p$ of $f_i$ becomes eligible when the bucket level $b_i$ recovers to the length of the current head-of-queue packet $p$. The flows output by both interleaved shaping algorithms satisfy the leaky bucket constraint mentioned above.
ATS is a per-class queueing mechanism, and each queue contains packets from multiple flows of the same class. Both interleaved shaping algorithms introduce cross-dependencies among different flows. If the current HQ packet is not eligible and it is from the flow $f_i$, later packets in this queue cannot be transmitted even if they are from other flows and are already eligible for transmission. The authors of [38] show that the additional delays caused by cross-dependencies in these two algorithms are zero or rather small and, most importantly, do not accumulate over time. The authors also show that ATS can ensure the upper bound of per-hop latency. Taking the LRQ algorithm as an example, the bound of its per-hop latency, i.e., the period from the arrival at the receiving queue to the departure from the pseudo-queue, is
$$d_i^{PQ,SO} \le \max_{j \in I}\left(\frac{\hat{b}_H + \hat{b}_{C(j)} + \hat{l}_L}{r - \hat{r}_H} + \frac{\hat{l}_j}{r}\right).$$
Here, $j \in I$ indicates all flows that share the queue with the flow $f_i$, and $\hat{b}_H$ is the total burstiness of all flows with a higher priority than $f_i$. Similarly, $\hat{b}_{C(j)}$ is the total burstiness of all flows that are competing with $f_j$ in the sending pseudo-queue, and $\hat{l}_L$ is the upper bound of the length of packets in all flows with a lower priority than $f_j$. As in [38], $r$ denotes the link rate and $\hat{r}_H$ the total leak rate of the higher-priority flows. The first term can be viewed as the latency to handle and clear away all packets with higher priority ($\hat{b}_H$), earlier packets from competing flows in the sending pseudo-queue ($\hat{b}_{C(j)}$), and a single packet from flows with a lower priority ($\hat{l}_L$). The second term is the latency to send out the packet itself to the next hop. Please note that this bound could be slightly different if other leaky-bucket shaping algorithms are used, because the shaping algorithm affects the time at which a packet becomes eligible to be sent. Since the output flow of this node satisfies the leaky bucket constraint, the next downstream node can also use ATS to ensure its per-hop latency. In ATS, the scheduler is easy to implement and it can achieve good scalability, but how to configure the system, such as the rate, burstiness, and priority, to satisfy the performance requirements of DetNet flows is a nontrivial problem.
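To make the LRQ rule and the per-hop bound of this subsection concrete, the following Python sketch computes LRQ eligibility times for one flow and evaluates the bound for a small set of flows. It assumes that the bound consists of the two terms described in the text, with the link rate $r$ and the aggregate higher-priority rate $\hat{r}_H$ as inputs. The function names and all numeric values are illustrative assumptions.

```python
def lrq_send_times(arrivals_s, lengths_bits, leak_rate_bps):
    """Earliest transmission times of one flow's packets under LRQ.

    After a packet of length l is sent at time t, the next packet of the
    same flow is not eligible before t + l / leak_rate.
    """
    eligible_at = 0.0
    sends = []
    for arrival, length in zip(arrivals_s, lengths_bits):
        send = max(arrival, eligible_at)
        sends.append(send)
        eligible_at = send + length / leak_rate_bps
    return sends


def lrq_per_hop_bound(flows, b_high, l_low, link_rate, r_high):
    """Evaluate the per-hop latency bound (seconds).

    flows  : list of (b_C_j, l_j) pairs, one per flow j sharing the queue,
             where b_C_j is the burstiness of flows competing with j (bits)
             and l_j is the maximum packet length of j (bits).
    b_high : total burstiness of higher-priority flows (bits).
    l_low  : maximum packet length among lower-priority flows (bits).
    """
    return max((b_high + b_c + l_low) / (link_rate - r_high) + l_j / link_rate
               for b_c, l_j in flows)


# A burst of three 1000-byte packets is spaced out to the 8 Mbit/s leak rate.
print(lrq_send_times([0.0, 0.0, 0.0], [8000, 8000, 8000], 8e6))

# Bound for two flows sharing a queue on a 1 Gbit/s link (values are made up).
bound = lrq_per_hop_bound(flows=[(30_000, 12_000), (10_000, 8_000)],
                          b_high=20_000, l_low=12_000,
                          link_rate=1e9, r_high=2e8)
print(f"{bound * 1e6:.1f} us")
```

The first call shows how LRQ turns a burst into packets spaced by $l/\hat{r}_i$, which is why the burstiness of an LRQ-shaped flow is one maximum-size packet.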
4.4 Time-Aware Shaping (TAS)
ATS can be viewed as an event-triggered system [35], in which an HQ packet becomes eligible for transmission once the flow associated with this packet satisfies
Fig. 7 An example of gate control list
the leaky bucket constraint. IEEE 802.1Qbv describes time-aware shaping (TAS), which is a time-triggered shaper. It works in a way similar to TDMA (time division multiple access). Time is split into timeslots and each queue is controlled by a “gate.” Whether the packets in a queue are eligible for transmission is determined by the status of the gate in the current timeslot: “open” means the queue is eligible for consideration by the transmission selection algorithm, and “closed” means packets in the queue have to wait. Each port can have up to eight queues and is associated with a gate control list (GCL), which consists of a list of gate control entries (GCEs). Each entry specifies the status of all gates in a timeslot. Figure 7 shows a gate control list. In the first timeslot, queues 7 and 4 are closed, while all other queues are open. In the second timeslot, queues 6 and 4 are closed. The gate control list is usually cyclic. Assume the gate control list in Fig. 7 is a cyclic list with only four entries. If the length of a timeslot is 125 μs, then a complete cycle takes 500 μs; the gate states for queues 7 and 6 are switched every 125 μs, and the gate states for queues 5 and 4 are switched every 250 μs.
To prevent low-priority flows from affecting high-priority flows, there must be a guard band in which no low-priority transmission is allowed to start. In other words, a low-priority packet is allowed to be transmitted only when the transmission can be completed before the start of the scheduled window for high-priority flows. If the frame preemption specified in IEEE 802.1Qbu is not enabled, the guard band should be long enough to complete the transmission of a maximum-size packet. Otherwise, with frame preemption, the guard band can be reduced to the time needed to transmit the smallest fragment. Frame preemption can thus improve the performance of high-priority flows with smaller guard band overhead.
As the gate control list can be cyclic, this scheme is especially friendly to periodic traffic flows, such as the flows of data regularly collected by the monitors
in some industrial applications or sensors. Irregular traffic can be shaped at the monitors or devices to follow a regular pattern, which facilitates subsequent processing in the DetNet domain. If a packet arrives in a timeslot in which the gate is closed, it has to wait for the next open timeslot. The sender of the packet and all nodes along the path should be time-synchronized and orchestrated to achieve the lowest latency. The gate control list needs to be configured by the control/management plane according to the delay requirements of data flows. It is not easy to implement a dynamic control list, and most existing works simply assume that the control list has been determined in advance by a global controller to satisfy the performance requirements of all data flows. However, solving the scheduling problem for all DetNet flows traversing the network is NP-hard [8].
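The gate mechanism of this subsection can be sketched in a few lines of Python. The four-entry list below is reconstructed only from the prose description of Fig. 7 (queues 7 and 6 toggle every 125 μs, queues 5 and 4 every 250 μs); the gates of queues 3 to 0 are assumed to stay open, and the guard band estimate ignores preamble and inter-frame gap. All names are hypothetical.

```python
SLOT_US = 125

# One entry per timeslot: the set of queues whose gate is CLOSED.
# Assumption: queues 0-3 are open in every slot.
GCL = [
    {7, 4},   # slot 0: queues 7 and 4 closed
    {6, 4},   # slot 1: queues 6 and 4 closed
    {7, 5},   # slot 2
    {6, 5},   # slot 3
]
CYCLE_US = SLOT_US * len(GCL)


def gate_open(queue: int, t_us: int) -> bool:
    """Is the gate of `queue` open at time t_us (cyclic list)?"""
    slot = (t_us // SLOT_US) % len(GCL)
    return queue not in GCL[slot]


def wait_until_open(queue: int, t_us: int) -> int:
    """Microseconds a packet arriving at t_us waits for an open gate."""
    for k in range(len(GCL) + 1):
        slot_start = (t_us // SLOT_US + k) * SLOT_US
        probe = max(t_us, slot_start)
        if gate_open(queue, probe):
            return probe - t_us
    raise ValueError("gate never opens within one cycle")


def guard_band_us(max_frame_bytes: int, link_rate_bps: int) -> float:
    """Time to finish one maximum-size frame, without frame preemption."""
    return 8 * max_frame_bytes * 1_000_000 / link_rate_bps


print(CYCLE_US)                            # length of one cycle in microseconds
print(wait_until_open(4, 30))              # queue 4 is closed in slots 0 and 1
print(guard_band_us(1500, 1_000_000_000))  # guard band on a 1 Gbit/s link
```

A packet for queue 4 that arrives 30 μs into the cycle waits until slot 2 begins at 250 μs, which illustrates why senders and nodes need to be synchronized to the gate schedule to achieve low latency.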
4.5 Cyclic Queuing and Forwarding (CQF)
IEEE 802.1Qch describes the cyclic queuing and forwarding (CQF) method, which delivers deterministic and easily calculated latency for time-sensitive data flows. The essence of this method is that each packet that arrives in cycle $i$ is guaranteed to be delivered to its next hop in cycle $i + 1$. As a result of these per-node guarantees, the maximum end-to-end latency is $(h + 1) \times T_c$, in which $h$ is the number of hops on the path and $T_c$ is the length of a cycle for this flow class; the minimum end-to-end latency is $(h - 1) \times T_c + DT$, in which $DT$ is the time taken for a packet to travel from the output buffer of a node A to a buffer of the next node B, which includes the output delay of A, the delay of the link from A to B, the preemption delay if preemption is allowed, and the processing delay of B.
With CQF, a given class of DetNet flows has two queues. We can assume that one queue, named the “even queue,” is allowed to transmit packets only in even cycles, and the other queue, named the “odd queue,” is allowed to transmit packets only in odd cycles. In other words, in an even (odd) cycle, the odd (even) queue collects packets from its upstream nodes, and these packets will be transmitted in the next odd (even) cycle. Therefore, all packets have a deterministic per-node latency, which is approximately one cycle.
CQF requires that all packets be sent, transmitted, and received as scheduled. This means that all nodes in a DetNet domain need to be strictly synchronized to have a common understanding of the start time of each cycle, which is nontrivial in large-scale networks. Note that ATS, described in Sect. 4.3, is an asynchronous scheduler that does not require time synchronization. CQF also assumes that a cycle is long enough to transmit all data packets collected since the last open cycle, which means the cycle time $T_c$ should be carefully designed. Furthermore, if the link between two nodes is so long that the transmission cannot be completed within a single timeslot, the packets cannot arrive in the scheduled timeslot. Researchers have proposed CQF-3, which introduces a
third packet queue to act as a buffer to avoid packet drops when packets arrive in the wrong cycle. CQF is only suitable for small-scale, lightly loaded networks, as it cannot deal with topology changes and path changes. Its robustness is also a serious problem: if there are traffic micro-bursts or traffic overload, the jitter performance can deteriorate sharply. As a synchronized scheduler, it requires that packets be transmitted within their scheduled timeslots, which wastes bandwidth. It is unsuitable for low-rate real-time traffic, as its latency is related to the assigned rate (bandwidth). For any flow that desires low latency, the scheduler must allocate more frequent timeslots, which means more bandwidth.
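The CQF latency bounds and the even/odd queue rotation described in this subsection are simple enough to compute directly. The sketch below is illustrative only; the function names and the example values for the hop count, cycle length, and $DT$ are assumptions.

```python
def cqf_latency_bounds(hops: int, cycle_us: float, dt_us: float):
    """Return (minimum, maximum) end-to-end latency in microseconds.

    minimum = (h - 1) * Tc + DT, maximum = (h + 1) * Tc, as given in the text.
    """
    return (hops - 1) * cycle_us + dt_us, (hops + 1) * cycle_us


def transmitting_queue(cycle: int) -> str:
    """The even queue transmits in even cycles, the odd queue in odd cycles."""
    return "even" if cycle % 2 == 0 else "odd"


def receiving_queue(cycle: int) -> str:
    """The queue that is not transmitting collects packets from upstream."""
    return "odd" if cycle % 2 == 0 else "even"


lo, hi = cqf_latency_bounds(hops=5, cycle_us=125, dt_us=10)
print(lo, hi)                     # latency window for a 5-hop path
print(hi - lo)                    # width of the window, i.e. worst-case jitter
print(receiving_queue(4), transmitting_queue(4))
```

The width of the window is $2 T_c - DT$ regardless of the hop count, which is why CQF latency is easy to calculate but shrinks only if the cycle time, and with it the bandwidth granularity, is reduced.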
4.6 Shaping Algorithms for Large-Scale DetNets
As mentioned above, CQF is capable of providing an easily calculable latency bound, but it is only suitable for small-scale networks. There are some ongoing research efforts to propose shaping algorithms for large-scale deterministic networks [27], such as large-scale deterministic network (LDN) [36], Cycle Specified Queuing and Forwarding (CSQF) [7], Tagged Cyclic Queuing and Forwarding (TCQF) [10], Global Cyclic Queuing and Forwarding 3-Queue (GCQF-3) [33], and the asynchronous DetNet framework [23]. Most of these algorithms are variants of CQF. Without changing the fundamental logic of providing deterministic latency, they enhance CQF to support higher-speed long links by exploiting more than two queues and cycle identification. Nodes in a DetNet domain synchronize frequency (cycles) among themselves instead of synchronizing time.
Take CSQF as an example. Each packet explicitly specifies the identifier of the cycle in which each node along the path should forward it to the next hop. Figure 8 shows an example of a packet with CSQF. Using Segment Routing segment identifiers (SIDs), the packet X specifies a cycle list $\langle 1, 2, 5, 6, 8 \rangle$, which means the first hop, i.e., $R_1$, is required to forward the packet to $R_2$ in the first cycle, $R_2$ is required to forward the packet to $R_3$ in the second cycle, and so on. It can be seen that CSQF alleviates the dependency on strict time synchronization in CQF. Furthermore, it does not require that packets sent by upstream nodes be received in a single cycle, so long links can be supported. An egress port is associated with three queues, i.e., one queue for receiving packets from upstream nodes, one queue for sending packets to the next node, and one queue for tolerating the packets that arrive in “wrong” cycles. The roles of these queues rotate at each cycle. SDF is slightly different from CSQF. 
In SDF, each node maintains a cycle mapping relationship table that maps incoming packets with a cycle identifier to the cycle in which the packets should be forwarded. The relationship tables are pre-computed and configured in advance by the control plane. A packet carries the identifier of the cycle in which the packet is sent out by the upstream node, so the downstream node gets the instruction to determine the queue in which the packet
Fig. 8 Cycle Specified Queuing and Forwarding (CSQF)
should be put. In other words, no matter in which cycle the packet arrives, its packet header indicates the sending cycle at the upstream node, so the downstream node can perform the right actions according to the cycle identifier in the packet header and the mapping table. In these variants, the data plane has to be enhanced to allow the packet header to carry cycle identifiers. A new IPv6 option that can work with some of these variants for DetNet is defined in [26]. It leverages the IPv6 HbH (hop-by-hop) options header or DOH (destination options header) to carry a cycle identifier. When a node receives a packet with this type of IPv6 option, it extracts the cycle identifier and uses it to enqueue the packet in the correct queue. The cycle identifier needs to change at every hop, and a local cycle identifier should be assigned to the packet before the packet is forwarded. Although it is not explicitly explained, this option can also be used with CSQF to carry multiple cycle identifiers in a single packet header.
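The cycle mapping idea described above can be sketched as a table lookup followed by relabeling. The fragment below is a hypothetical illustration: the dictionary-based packet, the three-cycle identifier space, and the fixed offset used to fill the table are assumptions made for the example, whereas in practice the table is pre-computed by the control plane.

```python
def build_cycle_map(num_cycles: int, offset: int) -> dict[int, int]:
    """Toy mapping table: upstream sending cycle -> local forwarding cycle.

    A constant offset is assumed here purely for illustration.
    """
    return {c: (c + offset) % num_cycles for c in range(num_cycles)}


def enqueue(queues: dict[int, list], cycle_map: dict[int, int], packet: dict) -> int:
    """Place a packet in the queue selected by the cycle identifier it carries.

    The packet is relabeled with the local cycle identifier, because the
    identifier has to change at every hop.
    """
    out_cycle = cycle_map[packet["cycle_id"]]
    queues[out_cycle].append(dict(packet, cycle_id=out_cycle))
    return out_cycle


cycle_map = build_cycle_map(num_cycles=3, offset=2)
queues = {c: [] for c in range(3)}
# The arrival time does not matter: only the carried identifier selects the queue.
print(enqueue(queues, cycle_map, {"seq": 7, "cycle_id": 1}))
print(queues)
```

Because the queue is chosen from the carried identifier rather than from the arrival time, the node does not need a clock that is phase-aligned with its upstream neighbor.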
5 Redundancy for Service Protection
The algorithms described in Sect. 4 assume that each queue has sufficient buffer and focus only on controlling latency. However, if the queue buffer is used up, packets have to be discarded and packet loss occurs. Besides, there is also packet loss that is not caused by contention, such as loss due to wireless signal interference, cable breakage, and node failures. For some DetNet applications, the consequences of packet loss can be extraordinarily serious. The applications that cannot tolerate packet loss require a mechanism to handle packet loss and achieve high reliability. The key idea is to trade redundancy for reliability, i.e., replicate the packets and transmit them on multiple separate/disjoint paths. It is implemented at the DetNet service sublayer by the Packet Replication, Elimination, and Ordering Functions (PREOF), which are similar to Frame Replication and Elimination for Reliability (FRER) described in IEEE 802.1CB. PREOF includes three functions, i.e., PRF (packet replication function), PEF (packet elimination function), and POF (packet ordering function).
5.1 Packet Replication and Elimination
Figure 9 shows an example DetNet flow with PREOF protection, whose packets are replicated and transmitted on multiple disjoint paths for service protection. “R” indicates a replication node, which replicates the packets of this flow for each outgoing path. “E” indicates an elimination point, which is responsible for eliminating any duplicated packets of this flow that it receives. If $n$ copies are sent on $n$ disjoint paths, the flow is protected against link failures on up to $n - 1$ paths. The receiver node may receive multiple duplicated packets as a result of the replication, so there should be a node that is responsible for removing duplicated packets, and the last elimination node should be on all paths used for transmission to make sure the receiver is not confused by duplicated packets. To remove duplicated packets, each original packet should be assigned a packet identifier or a sequence number, and its duplicates keep the service sublayer identifier and packet identifier unchanged. On the other hand, each flow of duplicated packets should have its own forwarding sublayer flow identifier; otherwise, relay nodes cannot forward them on different paths according to the precomputed configuration. Balazs et al. [50] present an example of how the DetNet IP data plane supports PREOF.
The different paths taken by the replicated flows of an original application flow can take different times to reach the elimination point. An elimination point needs to remember which packets have arrived and have been submitted to the next hop or the upper layer, which consumes the memory of the elimination point. To reduce memory consumption, an entry that remembers the identifier of a received packet can expire after a configured history window. If the latency difference among the paths is too large and exceeds the history window size, the elimination node may submit duplicated packets. Therefore, the history window size must be larger than the maximum latency difference among all used paths, or more advanced algorithms must be used to retain the information about received packets.
The PREOF configuration for a DetNet flow can be much more complex than the example in Fig. 9. Figure 10 shows an example in which the DetNet flow has three pairs of replication points and elimination points. This example is used in multiple documents related to PREOF. It can protect the flow from failures of multiple links. Even when all dotted links are broken, the destination can still receive the DetNet flow.
Fig. 9 Transmitting replicated packets on multiple disjoint paths for service protection
Fig. 10 A complex PREOF configuration for a DetNet flow (replicating three times can defeat the failures of three links in different segments)
Configuring the replication points and elimination points for a DetNet flow to satisfy the requirement to protect the communication from failures of particular links is a nontrivial task. The information of the global network is needed, so centralized solutions might be preferred, similar to the path computation and resource reservation decision problem. With this kind of duplication and elimination, the network can provide seamless reliable service and fast recovery proactively at the cost of additional network resources for time-sensitive applications.
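The role of the history window at an elimination point, discussed earlier in this subsection, can be sketched as follows. This is a minimal illustration that assumes a time-based window and non-decreasing arrival times; the class name `Eliminator` and its interface are hypothetical and are not taken from any PREOF specification.

```python
from collections import OrderedDict


class Eliminator:
    """Toy packet elimination function with a time-based history window."""

    def __init__(self, history_window_s: float):
        self.window = history_window_s
        self.seen = OrderedDict()   # sequence number -> first arrival time

    def accept(self, seq: int, now: float) -> bool:
        """Return True if the packet should be forwarded, False if it is a duplicate."""
        # Expire entries older than the history window (oldest entries first).
        while self.seen:
            oldest_seq, first_seen = next(iter(self.seen.items()))
            if now - first_seen > self.window:
                self.seen.popitem(last=False)
            else:
                break
        if seq in self.seen:
            return False
        self.seen[seq] = now
        return True


pef = Eliminator(history_window_s=0.010)
print(pef.accept(1, now=0.000))   # first copy: forwarded
print(pef.accept(1, now=0.004))   # copy from the slower path: eliminated
print(pef.accept(2, now=0.005))   # next packet: forwarded
print(pef.accept(1, now=0.020))   # very late copy: entry expired, duplicate leaks
```

The last call shows the failure mode described in the text: when the latency difference between paths (20 ms here) exceeds the history window (10 ms), the elimination point has already forgotten the packet and forwards the duplicate.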
5.2 Packet Ordering
Let $p_n$ denote the packet with sequence number $n$ of a DetNet flow that is served by two disjoint paths for PREOF. If the copy of $p_n$ on the faster path is lost and the copy of $p_{n+1}$ on the faster path arrives at the elimination point successfully, out-of-order packet delivery occurs, which is intolerable for most DetNet applications. Therefore, the packet ordering function needs to be executed after the elimination point, or on the elimination point before packets are sent to the receiver of the DetNet flow. It is assumed that no duplicated packets are sent to the POF. The POF should have a buffer to cache all out-of-order packets. It remembers the sequence number of the last forwarded packet. A buffered packet is forwarded when all packets before it have been forwarded, or when the packet has been buffered for a predefined time threshold “POFMaxDelay,” or when no packet has been received for a predefined time “POFTakeAnyTime.” POFMaxDelay must be larger than the delay difference among the paths used by this flow. If $p_{n+1}$ arrives at time $t$ and $p_n$ has not arrived by $t + \text{POFMaxDelay}$, it suggests that $p_n$ has been lost on all paths, so $p_{n+1}$ can be forwarded.
The POF buffer increases the latency experienced by packets. To reduce this negative influence, the POF can set the value of POFMaxDelay for each individual path according to the latency of this path and the total latency budget of this flow. In this way, the POF regards meeting the deadline as more important than strict in-order delivery. A flow can specify its quality
654
J. H. Wang et al.
Fig. 11 Traffic burst caused by PREOF. (a) The situation when a failure occurs. (b) The situation some time after path recovery
requirement such as the maximum number of consecutive out-of-order packets. The other problem is that POF cannot know whether the first received packet is in order or out of order. The POF algorithm has to make a trade-off between the experienced latency and the number of out-of-order packets. An analysis of how to determine the required buffer size can be found in [34], and the authors also analyze the effect of POF buffer on worst-case delay, jitter, and propagation of arrival curves. Another negative influence of POF is that POF may increase the burstiness of the flow, because some packets may experience buffering latency at POF, while other packets may not. A case is illustrated in Fig. 11, in which the data rate of the flow doubles in a period. In this case, two paths with different path latencies are used. The source emits packets to these two paths synchronously. Obviously, the number of packets in flight on a path is the traffic rate times the path latency, which means there are more packets in flight on the longer path. Let us assume the longer path has 40 more in-flight packets. Assume that the short path fails when the destination receives the packet with sequence number 100 from this path, which suggests the destination is receiving the packet 60. After the short path fails, the receiver cannot get any valid packets until 40 packets later, because the packets from 60 to 100 on the long path have been received from the short path. Let us assume that the short path recovers when the source emits the packet 200. Later, the receiver receives the packet 200 from the short path, and at the same time, it also receives the packet 160 from the long path. Both packets are valid, which means the destination is receiving packets at twice the sending rate. This situation continues until the 40 more packets on the long path are all received. 
In this procedure, the destination experiences a silent period of $40/r$ and a burst of 80 packets within a period of $40/r$, where $r$ is the emitting data rate at the source. The delay variation and bursts caused by packet ordering can be removed by placing a de-jitter buffer or flow shaper after the POF node. However, a de-jitter buffer or flow shaper further increases communication latency.
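The numbers in this scenario can be checked with a short calculation. The sketch below measures time in packet intervals (multiples of $1/r$) and assumes a short-path latency of 10 intervals, which is an arbitrary illustrative value; the function names are hypothetical. Only the first copy of each packet counts as valid at the destination.

```python
from collections import Counter


def first_copy_arrivals(n_packets, short_delay, extra_in_flight,
                        last_before_failure, first_after_recovery):
    """Arrival time of the first valid copy of every packet.

    Packet n is emitted at time n on both paths. The long path is
    `extra_in_flight` intervals slower than the short one.
    """
    long_delay = short_delay + extra_in_flight
    arrivals = {}
    for n in range(n_packets):
        candidates = [n + long_delay]          # the long path never fails
        if n <= last_before_failure or n >= first_after_recovery:
            candidates.append(n + short_delay)  # short path is usable
        arrivals[n] = min(candidates)
    return arrivals


def summarize(arrivals):
    """Return (extra silent intervals, packets delivered at double rate)."""
    times = sorted(arrivals.values())
    longest_gap = max(b - a for a, b in zip(times, times[1:]))
    per_instant = Counter(times)
    burst_packets = sum(c for c in per_instant.values() if c > 1)
    # Subtract the normal spacing of one interval between packets.
    return longest_gap - 1, burst_packets


silent, burst = summarize(first_copy_arrivals(300, 10, 40, 100, 200))
print(silent, burst)
```

Under these assumptions the script reports 40 silent intervals and 80 packets arriving two per interval, matching the silent period of $40/r$ and the 80-packet burst described above.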
Deterministic Network
Since the POF increases the latency and jitter experienced by flows, some researchers have tried to remove it from the above service protection scheme. In [37], the authors formulate the routing and scheduling problem for $1+1$ protected DetNet flows and achieve reliable in-order packet delivery by carefully scheduling packet transmissions on the two paths based on CSQF. In their scheme, each node on the two paths forwards packets in carefully scheduled cycles, which are computed in advance by solving the scheduling problem.
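The buffering rule of the POF described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of any specification: time is passed in explicitly, `take_any_time` is assumed to be at least `max_delay`, a late packet whose successors were already forwarded is passed through immediately, and an expired packet also releases any buffered packets with smaller sequence numbers. The class and method names are hypothetical.

```python
import heapq


class PacketOrderingFunction:
    def __init__(self, max_delay, take_any_time):
        self.max_delay = max_delay          # plays the role of POFMaxDelay
        self.take_any_time = take_any_time  # plays the role of POFTakeAnyTime
        self.last_sent = None               # last forwarded sequence number
        self.last_rx_time = None
        self.buffer = []                    # min-heap of (seq, arrival_time)

    def _flush(self, now):
        """Release buffered packets that are in order or have timed out."""
        expired = [s for s, t in self.buffer if now - t >= self.max_delay]
        deadline_seq = max(expired, default=None)
        out = []
        while self.buffer:
            seq, _ = self.buffer[0]
            in_order = seq == self.last_sent + 1
            timed_out = deadline_seq is not None and seq <= deadline_seq
            if not (in_order or timed_out):
                break
            heapq.heappop(self.buffer)
            self.last_sent = seq
            out.append(seq)
        return out

    def tick(self, now):
        """Called periodically so that timers fire without new arrivals."""
        return self._flush(now)

    def receive(self, seq, now):
        """Handle one arriving packet; return the packets forwarded now."""
        out = self._flush(now)
        idle = (self.last_rx_time is not None
                and now - self.last_rx_time >= self.take_any_time)
        self.last_rx_time = now
        if self.last_sent is None or idle:
            out.append(seq)                 # take any packet as the start
            self.last_sent = seq
        elif seq <= self.last_sent + 1:
            out.append(seq)                 # in order, or late: no buffering
            self.last_sent = max(self.last_sent, seq)
        else:
            heapq.heappush(self.buffer, (seq, now))
        return out + self._flush(now)
```

A caller would invoke `receive` for every packet leaving the elimination point and `tick` from a timer. The trade-off discussed above appears directly in `max_delay`: a larger value tolerates a larger path delay difference but holds packets longer.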
5.3 Network Coding
RFC 8655 mentions network coding techniques as an alternative way to provide service protection. Similar to PREOF, network coding consumes extra bandwidth to achieve reliable packet delivery without any packet retransmission. The source, or a node close to the source, encodes original packets into a sufficient number of coded packets. At the destination, coded packets are decoded to recover the original packets. Network coding is often used in wireless networks, where packet loss is more common. With network coding, a sender encodes original packets into coded packets and sends more coded packets than original ones to protect the flow against packet loss. As long as the receiver receives a sufficient number of coded packets, no matter which ones, it can decode them and recover the original packets. The coded packets can be sent over a single path or multiple paths. Unlike PREOF, in which packets are replicated and the destination must receive a copy of each original packet, network coding only cares about the number of packets the destination receives, so it is more powerful and efficient in protecting data flows from packet loss. However, encoding and decoding require computing resources and incur additional computing latency.
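The property that any sufficiently large subset of coded packets suffices can be illustrated with the simplest possible code: $k$ original packets plus one XOR parity packet, so that any $k$ of the $k+1$ coded packets recover the originals. This is a deliberately minimal sketch chosen for illustration, not a scheme prescribed by RFC 8655; it assumes equal-length payloads, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length payloads."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(originals):
    """Turn k original packets into k+1 coded packets (originals + parity)."""
    parity = reduce(xor_bytes, originals)
    return list(originals) + [parity]


def decode(received, k):
    """Recover the k originals from any k of the k+1 coded packets.

    `received` maps the coded-packet index (0..k, where k is the parity
    packet) to its payload.
    """
    if len(received) < k:
        raise ValueError("not enough coded packets to decode")
    missing = [i for i in range(k) if i not in received]
    if not missing:
        return [received[i] for i in range(k)]
    # Exactly one original is missing and the parity packet is present:
    # XOR-ing everything that arrived yields the missing payload.
    rebuilt = reduce(xor_bytes, received.values())
    recovered = dict(received)
    recovered[missing[0]] = rebuilt
    return [recovered[i] for i in range(k)]
```

Losing any single coded packet, on whichever path it travelled, is tolerated at the cost of one extra packet per group of $k$. Codes that tolerate more losses follow the same idea with more redundancy and more decoding work.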
6 Computing and Establishing Paths
In the previous sections, we assumed that the path for a DetNet flow has been determined and focused on how to implement latency control on the nodes along the path, which are data plane operations. In this section, we briefly discuss how to compute and establish paths for a DetNet flow, which are considered control plane tasks.
6.1 Conventional Routing Paradigms
The Internet, a typical packet-switched network, uses a hop-by-hop routing paradigm, and routing paths are derived via dynamic routing protocols. When the
network topology changes due to various reasons such as failures or configuration changes, dynamic routing protocols help nodes exchange information about the new topology, and then a new path is computed for the new network topology. Dynamic routing lets routing in the Internet adapt to such changes automatically, without manual intervention by network operators. Once all nodes in the network learn the relevant information via dynamic routing protocols, such as the topology or configured link weights, they can compute a new usable path for any source-destination pair.
There are two conventional types of dynamic routing protocols. One type is the link-state protocol. Each node broadcasts information about each of its neighboring links to all other nodes and receives information about each link in the network from other nodes. Therefore, each node has a complete global view of the whole network, based on which it can compute paths for any node pair using any shortest path algorithm, such as the Dijkstra algorithm. With this type of routing protocol, the information about network configuration and topology is exchanged among nodes in a distributed manner, but the routing computation is completed by each node individually, which can be viewed as working in a centralized manner. OSPF (Open Shortest Path First) and IS-IS (Intermediate System to Intermediate System) [12] are two representative protocols of this type, and they are widely used in the Internet to solve the intradomain routing problem.
The other type of routing protocol, named distance vector, exploits a fully distributed paradigm. In this paradigm, each node advertises its knowledge of the routing in the whole network (instead of only neighboring links, as in link-state protocols) to its neighbors (instead of to all nodes, as in link-state protocols).
For example, suppose a node A knows that it can reach a destination $d$ within $h$ hops (or with a cost of $c$). Under a distance-vector protocol, it advertises this capability to its neighbors, say node B. B receives this information and then knows that it can reach $d$ within $h+1$ hops (or with a cost of $c + c_{AB}$, where $c_{AB}$ is the cost of transmission on the link between A and B). BGP (Border Gateway Protocol) and RIP (Routing Information Protocol) are two representative protocols of this type. RIP is no longer commonly used because it is suitable only for small networks. BGP is the de facto interdomain routing protocol in the Internet.
When the network changes, no matter which dynamic routing protocol is used, each node must advertise the changes. In link-state algorithms, the node close to the change broadcasts it to all nodes, and all nodes recompute paths. In distance-vector algorithms, the node close to the change advertises it to its neighbors, and each neighbor then updates its knowledge and must further advertise the updated knowledge to its own neighbors. It therefore takes time for the nodes in the network to update their routing computation results and reach a new stable routing state, which is referred to as the routing convergence problem. During routing convergence, the nodes in the network do not have consistent routing decisions; therefore, packets might be lost during hop-by-hop forwarding. This is an important classical problem that remains unsolved despite extensive research efforts.
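The two paradigms described above can be contrasted in a short sketch. The first function runs Dijkstra over a full link-state database, as each node would in a link-state protocol; the second applies one neighbor advertisement to a distance-vector table, mirroring the A and B example with cost $c + c_{AB}$. The data layouts (`lsdb` as a nested dictionary, tables mapping a destination to a cost and next hop) and all names are assumptions made for illustration.

```python
import heapq


def shortest_paths(lsdb, source):
    """Link-state style: Dijkstra over the complete topology view.

    lsdb maps node -> {neighbor: link_cost}.
    Returns (distance per node, predecessor per node).
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in lsdb.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def path_to(prev, source, dest):
    """Rebuild the node sequence from the predecessor map."""
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def dv_update(own_table, neighbor, link_cost, advertised):
    """Distance-vector style: merge one neighbor's advertisement.

    own_table maps destination -> (cost, next_hop); advertised maps
    destination -> cost as seen by the neighbor. Returns True if the
    table changed, in which case the node must re-advertise.
    """
    changed = False
    for dest, cost in advertised.items():
        candidate = link_cost + cost
        current = own_table.get(dest)
        better = current is None or candidate < current[0]
        # A route already using this neighbor must follow its new cost.
        same_hop = current is not None and current[1] == neighbor
        if better or (same_hop and candidate != current[0]):
            own_table[dest] = (candidate, neighbor)
            changed = True
    return changed
```

The return value of `dv_update` hints at why convergence takes time: every change triggers further advertisements that ripple outward hop by hop, whereas `shortest_paths` is rerun locally once the flooded link-state information has arrived.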
In recent decades, a centralized routing paradigm, i.e., SDN (software-defined networking) [6, 21], was proposed. Its original motivation is that inexpensive devices can be used for forwarding packets once the forwarding process of network packets (data plane) is separated from the routing process (control plane). In an SDN domain, a controller is responsible for the routing computation of all source-destination pairs. It distributes its routing decisions to the nodes, and these nodes simply forward packets according to the instructions from the controller. Although researchers still have serious concerns about its security, scalability, and elasticity, SDN, as a centralized solution, solves, or at least significantly mitigates, the routing convergence problem.
6.2 Routing Extensions for DetNet
DetNet requires that explicit paths be used, i.e., paths that do not change with topology changes, or at least do not change immediately. Explicit paths are established and resources are reserved before the traffic of each DetNet flow is transmitted, and these paths are fixed during the lifetime of the DetNet flow to avoid the impact of network convergence on DetNet flows. The PREOF mechanism, instead of dynamic routing, can be used to handle the interruptions due to network changes. The IETF DetNet WG does not specify any particular way to compute explicit paths. RFC 8938 and the IETF draft [28] mention that multiple ways can be used to provide explicit routes and resource allocation in the DetNet forwarding sublayer, e.g., via a centralized approach similar to SDN, via distributed approaches that extend current routing protocols, or via hybrid approaches.
The goal of conventional routing is to solve the reachability issue and find a usable path between each source and each destination. In general, all flows between the same source and destination node share a single path, and the shortest path, in terms of the number of hops or the sum of configured weights of all links on the path, is selected as the best path to be used. In 1998, the IETF published RFC 2386 to discuss the framework for QoS-based routing in the Internet. At that time, the motivation for proposing QoS-based routing was primarily to enable multiple paths between a source-destination pair to mitigate the limitation of single best path routing. It argues that spreading traffic among multiple paths can help improve the total network throughput and utilize networking resources efficiently. Furthermore, when the best path between the source and destination changes, the flow can choose to keep using the previous path instead of switching to the new path as long as the previous path can satisfy the performance requirement of this flow.
Although the primary motivation is different from the goal of DetNet routing, which is to provide paths with deterministic performance guarantees, the framework and enabling techniques are similar, since they all require computing feasible paths according to the capability or resource availability of nodes, as well as the performance requirement of flows. Unfortunately, RFC 2386
just summarizes a framework and discusses the necessary considerations. No specific mechanisms are defined or implemented to achieve the goal.
It can be seen that at least two problems arise when using conventional routing for DetNet or QoS-based routing: nodes should be able to describe their own capabilities or available resources, and each individual flow should send its performance requirement to the entity responsible for routing computation. Several IETF drafts extend current routing protocols to define messages that can convey such information. Note that these are ongoing works, and we only briefly introduce their basic ideas below.
In addition to reachability information, routing announcements exchanged between nodes must also carry information about nodes’ resources or capabilities to enable DetNet. The IETF draft [15] extends RFC 7752 [18], which defines a network layer reachability information (NLRI) encoding format for using the BGP routing protocol to collect link-state and TE information from networks and share it with other nodes. Specifically, draft [15] defines four new TLVs (type-length-value triplets) to encode one node attribute and three link attributes. The node attribute encodes and announces the maximum and minimum processing delay of the node and the maximum processing delay variation. The three link attributes carry the maximum bandwidth that may be reserved for DetNet, the bandwidth currently available for reservation on this link, and its time resource. The expression of the time resource of a node is a little complex, and it differs according to the shaping algorithms used by nodes. For example, if the underlying technology is a time-scheduling-based mechanism such as time-aware shaping, CQF, or CSQF, the TLV is defined to include the length, the start time, and the end time of timeslots in microseconds.
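To make the TLV idea concrete, the sketch below packs and parses a generic type-length-value triplet with 16-bit type and length fields in network byte order. The type code `0xFF01` and the value layout (three unsigned 32-bit microsecond fields for minimum delay, maximum delay, and delay variation) are hypothetical placeholders chosen for illustration; they are not the code points or field formats of draft [15], and protocol-specific padding rules are not modeled.

```python
import struct

# Hypothetical placeholder, not an assigned code point.
HYPOTHETICAL_NODE_DELAY_TLV = 0xFF01


def pack_tlv(tlv_type, value):
    """Prefix a value with a 2-byte type and a 2-byte length."""
    return struct.pack("!HH", tlv_type, len(value)) + value


def unpack_tlvs(data):
    """Split a byte string into a list of (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if offset + length > len(data):
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, data[offset:offset + length]))
        offset += length
    return tlvs


def node_delay_value(min_us, max_us, variation_us):
    """Assumed value layout: three unsigned 32-bit microsecond fields."""
    return struct.pack("!III", min_us, max_us, variation_us)


blob = pack_tlv(HYPOTHETICAL_NODE_DELAY_TLV, node_delay_value(5, 20, 15))
for tlv_type, value in unpack_tlvs(blob):
    print(hex(tlv_type), struct.unpack("!III", value))
```

Because the length is carried explicitly, a receiver that does not understand a given type can skip it and continue parsing, which is what makes TLVs a convenient way to extend existing routing protocols.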
With the TLVs defined in [15], nodes can exchange their resource information among themselves (in distributed paradigms) or report the information to the controller (in centralized paradigms) using the BGP routing protocol. Similarly, the draft [16] by the same authors proposes to extend the IS-IS protocol by defining the same four TLVs to encode link and node attributes.
In RFC 5440 [51], the Path Computation Element Protocol (PCEP) is defined for the communication between a Path Computation Element (PCE) and a Path Computation Client (PCC) (or another PCE). To carry deterministic latency constraints and distribute deterministic paths for end-to-end path computation in DetNets, researchers have proposed various extensions to PCEP to include more message types. The IETF draft [53] proposes two types of metric objects, i.e., an end-to-end bounded delay metric and an end-to-end bounded jitter metric, to specify performance requirements in path computation request (PCReq) messages. They can also be used in path computation reply (PCRep) messages to describe the characteristics of computed paths. The draft also defines a deterministic path object that includes deadline sub-TLVs and cycle sub-TLVs to indicate the deadline time or the cycle ID for a node to forward a DetNet flow. This extension allows the PCE to distribute the information about computed paths to the corresponding nodes to establish the paths. Similarly, the IETF draft [54] defines the RP (request parameter) object, which includes bounded latency information (BLI) TLVs to send performance requirements to the PCE, and proposes to use a list of BLI TLVs to express the paths
resulting from the computation. Furthermore, [54] also proposes bounded latency capability (BLC) TLVs for a PCEP speaker to advertise its support for bounded latency features and proposes a traffic model object for a PCC to specify the traffic features in its PCReq messages.
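Once node and link attributes have been collected and a request carries a bandwidth demand and a delay bound, the computing entity has to find a feasible path. One common simplification, shown below as a sketch rather than as the method of any cited draft, is to prune links whose available bandwidth is below the demand and then run Dijkstra on worst-case delay, rejecting the result if it exceeds the bound. The attribute names (`delay`, `avail_bw`) and the choice to charge each node's processing delay when a packet enters it are assumptions.

```python
import heapq


def feasible_path(links, node_delay, src, dst, bandwidth, delay_bound):
    """Return (path, worst_case_delay), or None if no path is feasible.

    links maps (u, v) -> {"delay": ..., "avail_bw": ...} for directed
    links; node_delay maps node -> maximum processing delay.
    """
    adj = {}
    for (u, v), attr in links.items():
        if attr["avail_bw"] >= bandwidth:  # prune links that cannot carry it
            cost = attr["delay"] + node_delay.get(v, 0)
            adj.setdefault(u, []).append((v, cost))

    heap = [(0, src, [src])]
    done = set()
    while heap:
        delay, u, path = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            # The minimum-delay path is found; it is feasible only if it
            # also meets the requested bound.
            return (path, delay) if delay <= delay_bound else None
        for v, cost in adj.get(u, []):
            if v not in done:
                heapq.heappush(heap, (delay + cost, v, path + [v]))
    return None
```

A request in the spirit of a PCReq with a bounded delay metric would map to the `bandwidth` and `delay_bound` arguments. Jitter bounds, cycle assignment, and joint optimization across many flows make the real problem considerably harder than this single-flow sketch suggests.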
6.3 Resource Reservation
As mentioned above, finding a path for a DetNet flow can be done in a decentralized or a centralized manner, but centralized approaches are preferred since they do not suffer from the performance degradation caused by routing convergence. A path computed for a DetNet flow includes not only a series of relay nodes from the source to the destination but also the amount of resources that must be allocated to guarantee the performance of this flow. How to specify the requested resources depends on the queueing and latency control techniques used by these nodes, e.g., allocating the number and index of cycles if CSQF is used and/or specifying the buffer size for this flow. After the path is computed, the nodes on the path need to be notified to reserve the requested resources. If all nodes can reserve the requested resources, the path establishment completes successfully; otherwise, a failure message needs to be delivered and the path computation procedure is conducted again to get a new path. Routing and resource allocation can be done by a single entity via the same communication protocol, or they can be done separately by different entities using different approaches.
There are two ways to notify and configure the nodes to establish the path. One way is a centralized approach in which a controller (usually the same controller used for path computation) sends instructions to these nodes directly. It can use the extended communication protocols introduced in the last subsection, such as BGP, IS-IS, or PCEP, or use RESTCONF [1] or NETCONF (Network Configuration Protocol) [11] as the communication protocol to distribute the node configurations specified according to the YANG model [2]. YANG is a data modeling language widely used for conveying information in various network management tasks, such as specifying network configuration and reporting network state.
Researchers have proposed the IETF draft [17], which extends the YANG model to satisfy the needs of providing DetNet services.
The other way is a decentralized approach that modifies the traditional resource reservation protocol, i.e., RSVP (Resource ReSerVation Protocol) [4], which was designed for IntServ (Integrated Services) in the Internet. In RSVP, after the path from the sender to the receiver has been determined, the sender transmits RSVP “Path” messages downstream along the path to ask all nodes along the path to store the “path state” for this flow, which at least includes the address of the previous hop node. Once the receiver receives the “Path” message, it initiates a reverse procedure, i.e., it transmits RSVP “Resv” messages upstream towards the sender. The “Resv” messages follow exactly the reverse of the path and request the nodes along the
path to reserve a specific amount of resources. It is a receiver-initiated procedure, which is not suitable for DetNet. Furthermore, it only considers the reservation of bandwidth resources. Therefore, RSVP needs to be modified and extended to be used for DetNet. There has been a draft [41] in which more RSVP objects are defined for using RSVP for resource reservation in TSN. But it is still ongoing work and only focuses on a specific layer 2 technology, i.e., TSN.
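The establishment rule described in this subsection, where a path is set up only if every node on it can reserve the requested resources and a failure otherwise triggers a new path computation, can be sketched as an all-or-nothing reservation with rollback. The sketch models a controller-driven procedure and tracks bandwidth as the only resource; real DetNet nodes would also reserve cycles or buffers. The `Node` class and `establish_path` function are hypothetical.

```python
class Node:
    """A relay node with a fixed amount of reservable bandwidth."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.reserved = {}  # flow id -> reserved bandwidth

    def available(self):
        return self.capacity - sum(self.reserved.values())

    def reserve(self, flow_id, bandwidth):
        """Reserve if possible; report success or failure."""
        if bandwidth > self.available():
            return False
        self.reserved[flow_id] = bandwidth
        return True

    def release(self, flow_id):
        self.reserved.pop(flow_id, None)


def establish_path(path, flow_id, bandwidth):
    """Reserve on every node of the path, or on none of them."""
    done = []
    for node in path:
        if not node.reserve(flow_id, bandwidth):
            # Roll back partial reservations so that no resources leak;
            # the caller is expected to compute a different path.
            for earlier in done:
                earlier.release(flow_id)
            return False
        done.append(node)
    return True
```

In use, a caller would retry `establish_path` with an alternative node list whenever it returns `False`. A decentralized variant would carry the same reserve and release steps in signaling messages exchanged hop by hop instead of direct calls from a controller.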
7 Management and Security of DetNet
Network management is of primary importance for operating and maintaining networks as expected. It is a traditional problem in packet-switched networks, and many protocols and techniques have been proposed. Network operators need solutions for various network management tasks, including but not limited to the discovery of DetNet-capable nodes and their capabilities, performance measurement of networks and services, status monitoring of nodes and services, and fault detection and localization. For example, operators need a toolset that can track the route of a specific DetNet flow to monitor whether the flow is transmitted over the planned path and to validate whether the service effectively achieves the requested performance guarantees.
Most of these management tasks are also required in the Internet, but DetNet requires that they be completed at extremely high quality, because the Internet is essentially a best-effort network, while DetNet needs to guarantee service performance. Thus, the solutions for traditional management tasks may not solve the DetNet management problem, and new challenges arise. Let us take connectivity verification and performance measurement of a DetNet flow as an example. In the Internet, a tool named traceroute, based on ICMP (Internet Control Message Protocol), can be used to track the path between a source and destination pair. But it is a well-known problem that network nodes may not treat ICMP packets in exactly the same way as the packets of the data flow, which means that the measurement result may not reflect the true service performance. This is intolerable in DetNet, and it must be guaranteed that packets for management tasks share fate with the monitored data traffic without introducing congestion. In-band telemetry, in which measurement results are injected directly into data packets of the monitored flow, can solve this issue.
However, the amount of information that can be encoded into data packets is limited, and additional resources must be reserved to accommodate such information to ensure that the performance of the monitored data flow is not affected.
The IETF DetNet WG has some ongoing work on the OAM (operations, administration, and maintenance) issues of DetNet. The draft [31] describes the specific OAM requirements recommended to maintain a deterministic network, and how to encapsulate and carry OAM packets with the MPLS data plane and the IP data plane is briefly discussed in [30] and [29], respectively. An extension to the echo request/reply
mechanisms is proposed to allow the ping initiator to discover DetNet relay nodes, collect configuration information from DetNet relay nodes, and discover the locations of PREOF functions [40]. In [22], it is proposed to extend BFD (bidirectional forwarding detection [24]) to realize accurate and timely on-path remote defect detection in DetNet. Basically, it adds three value-name pairs to the existing diagnostic codes specified in BFD and requires that messages be triggered once the measured latency, the ratio of lost packets, or the ratio of out-of-order packets reaches the specified limit.
Security is of particularly high importance among the various management tasks; thus, it is often discussed separately from network management. Security in the context of DetNet is especially significant because DetNet is designed to enable industrial control applications, and its failures in the control of physical devices (e.g., power grid devices) can have high operational costs. Therefore, DetNet has the potential to become an attractive target for cyberattackers. All techniques that can be exploited to attack the underlying networks of DetNet can have an impact on the security of DetNet. On the other hand, existing techniques to defend networking applications from attacks, such as authentication, authorization, encryption, and DDoS mitigation mechanisms, are all useful for DetNet security. Besides that, there are some security concerns that are specific to DetNet, and we list some of them below. First, complete resource isolation among DetNet flows is essential for avoiding the risk of flows interacting and interfering with one another, so that deterministic performance can be provided, and physical resources must be effectively associated with a given flow at a given point in time. Otherwise, a malfunctioning device or a malicious flow can disrupt or subvert other relevant real-time flows without having to crack the application flows.
Complete resource isolation is a challenging task and might be extremely costly in some cases. Second, DetNet relies on packet replication and elimination to mitigate the negative influence of unexpected failures, but transmitting a single DetNet flow on multiple paths also increases the attack surface of the network, i.e., more nodes are exposed to attacks. Third, precise time synchronization is critically important for many DetNet shaping algorithms and the DetNet architecture, and the accuracy, availability, and integrity of time synchronization have significant implications for DetNet security. The security requirements of time protocols in packet-switched networks have been discussed in RFC 7384 [32].
DetNet also raises extra privacy concerns. DetNet requires that a data flow expose additional information for a DetNet node to identify the data flow, and knowledge about the delivery time or scheduling calendar of a flow can also facilitate potential attacks. This is an inherent property of DetNet and cannot be avoided. Therefore, whether DetNet technology is suitable for a given use case needs to be carefully analyzed. A detailed discussion of the security impact of different attacks on some common use cases can be found in [19].
8 Summary
Deterministic networking is considered to be a promising and preferred way to implement network slicing in 6G. In this chapter, we introduced the work towards providing deterministic performance guarantees at layer 3 proposed by the network community in the IETF. Basically, there are two evolution paths. The first evolution path is to gain insight from the algorithms and mechanisms proposed by the IEEE TSN TG. The TSN TG aims to provide deterministic performance at layer 2. It has a longer history and is more mature than DetNet. Since DetNet must provide guaranteed service performance in large-scale networks rather than only in the small local networks that are the focus of TSN, the DetNet WG is trying to develop new queueing algorithms and revise existing ones, based on the insights from TSN, to make them suitable for high-speed long-distance links and a large number of flows. The second evolution path is to inherit the protocols developed by the computer network community. The goal of providing quality of service in packet-switched networks (e.g., IntServ and DiffServ) was proposed two decades ago and has always been pursued by network operators. The difference between QoS and DetNet is that QoS is still a best-effort goal, while the goal of DetNet must be guaranteed. This difference brings challenges to existing protocols for QoS, and the DetNet WG tries to solve them by developing new techniques and extending existing protocols.
Some gaps between the goal of DetNet and the capability of existing technologies are summarized and briefly analyzed in [52]. Some challenges are classical problems in packet-switched networks that have been studied for a long time, e.g., the computational complexity of solving the optimization problem to derive scheduling and routing decisions [25] and the capability of fast adaptation to network changes.
Some challenges are unique to DetNet, e.g., the theoretical analysis and simulation-based performance evaluation of various queueing algorithms and latency control techniques in large-scale networks. Although it is a promising and important technique for many use cases, DetNet still has a long way to go before it achieves mature implementation and wide deployment in real-world networks.
References
1. A. Bierman, M. Bjorklund, and K. Watsen. RESTCONF Protocol. RFC 8040, RFC Editor, January 2017. http://www.rfc-editor.org/rfc/rfc8040.txt.
2. M. Bjorklund. The YANG 1.1 Data Modeling Language. RFC 7950, RFC Editor, August 2016. http://www.rfc-editor.org/rfc/rfc7950.txt.
3. Massieh Kordi Boroujeny and Brian L. Mark. Design of a stochastic traffic regulator for end-to-end network delay guarantees. IEEE/ACM Transactions on Networking, 30(6):2531–2543, 2022.
4. Bob Braden, Lixia Zhang, Steve Berson, Shai Herzog, and Sugih Jamin. Resource ReSerVation Protocol (RSVP) – Version 1 Functional Specification. RFC 2205, RFC Editor, September 1997. http://www.rfc-editor.org/rfc/rfc2205.txt.
5. S. Bryant and P. Pate. Pseudo Wire Emulation Edge-to-Edge (PWE3) Architecture. RFC 3985, RFC Editor, March 2005. http://www.rfc-editor.org/rfc/rfc3985.txt.
6. D. Ceccarelli and Y. Lee. Framework for Abstraction and Control of TE Networks (ACTN). RFC 8453, RFC Editor, August 2018. http://www.rfc-editor.org/rfc/rfc8453.txt.
7. Mach Chen, Xuesong Geng, and Zhenqiang Li. Segment Routing (SR) Based Bounded Latency. Internet-draft, May 2019. https://datatracker.ietf.org/doc/html/draft-chen-detnet-sr-based-bounded-latency.
8. Silviu S. Craciunas, Ramon Serna Oliver, Martin Chmelík, and Wilfried Steiner. Scheduling Real-Time Communication in IEEE 802.1Qbv Time Sensitive Networks. In Proceedings of the 24th International Conference on Real-Time Networks and Systems, RTNS ’16, pages 183–192, New York, NY, USA, 2016. Association for Computing Machinery.
9. Libing Deng, Guoqi Xie, Hong Liu, Yunbo Han, Renfa Li, and Keqin Li. A Survey of Real-Time Ethernet Modeling and Design Methodologies: From AVB to TSN. ACM Comput. Surv., 55(2), Jan. 2022.
10. Toerless Eckert, Stewart Bryant, Andrew G. Malis, and Guangpeng Li. Deterministic Networking (DetNet) Data Plane - Tagged Cyclic Queuing and Forwarding (TCQF) for bounded latency with low jitter in large scale DetNets. Internet-draft, Nov. 2022. https://datatracker.ietf.org/doc/draft-eckert-detnet-tcqf/.
11. R. Enns, M. Bjorklund, J. Schoenwaelder, and A. Bierman. Network Configuration Protocol (NETCONF). RFC 6241, RFC Editor, June 2011. http://www.rfc-editor.org/rfc/rfc6241.txt.
12. J. Farkas, N. Bragg, P. Unbehagen, G. Parsons, P. Ashwood-Smith, and C. Bowers. IS-IS Path Control and Reservation. RFC 7813, RFC Editor, June 2016. http://www.rfc-editor.org/rfc/rfc7813.txt.
13. N. Finn, J.-Y. Le Boudec, E. Mohammadpour, J. Zhang, and B. Varga. Deterministic Networking (DetNet) Bounded Latency. RFC 9320, RFC Editor, November 2022. https://www.rfc-editor.org/rfc/rfc9320.txt.
14. N. Finn, P. Thubert, B. Varga, and J. Farkas. Deterministic Networking Architecture. RFC 8655, RFC Editor, October 2019.
15. Xuesong Geng, Zhenbin Li, and Tianran Zhou. BGP - Link State (BGP-LS) Advertisement of IGP DetNet Extensions. Internet-draft, Jul. 2022. https://datatracker.ietf.org/doc/draft-geng-idr-bgp-ls-enhanced-detnet/.
16. Xuesong Geng, Zhenbin Li, and Tianran Zhou. ISIS-TE Extensions for Enhanced DetNet. Internet-draft, Jul. 2022. https://datatracker.ietf.org/doc/draft-geng-lsr-isis-te-extension-enhanced-detnet/.
17. Xuesong Geng, Yeoncheol Ryoo, Don Fedyk, Reshad Rahman, and Zhenqiang Li. Deterministic Networking (DetNet) YANG Model. Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-ietf-detnet-yang/.
18. H. Gredler, J. Medved, S. Previdi, A. Farrel, and S. Ray. North-Bound Distribution of Link-State and Traffic Engineering (TE) Information Using BGP. RFC 7752, RFC Editor, March 2016. http://www.rfc-editor.org/rfc/rfc7752.txt.
19. E. Grossman, T. Mizrahi, and A. Hacker. Deterministic Networking (DetNet) Security Considerations. RFC 9055, RFC Editor, June 2021. https://www.rfc-editor.org/rfc/rfc9055.txt.
20. IETF DetNet Working Group. Deterministic Networking (detnet). https://datatracker.ietf.org/wg/detnet/history/, 2022. [Online; accessed 19-Nov-2022].
21. E. Haleplidis, K. Pentikousis, S. Denazis, J. Hadi Salim, D. Meyer, and O. Koufopavlou. Software-Defined Networking (SDN): Layers and Architecture Terminology. RFC 7426, RFC Editor, January 2015. http://www.rfc-editor.org/rfc/rfc7426.txt.
22. Hongyi Huang, Ren Tan, and Tianran Zhou. BFD Extension for DetNet Remote Defect Indication (RDI). Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-huang-detnet-rdi/.
23. Jinoo Joung, Jeong dong Ryoo, Tae sik Cheung, Yizhou Li, and Peng Liu. Asynchronous Deterministic Networking Framework for Large-Scale Networks. Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-joung-detnet-asynch-detnet-framework/.
24. D. Katz and D. Ward. Bidirectional Forwarding Detection (BFD). RFC 5880, RFC Editor, June 2010. http://www.rfc-editor.org/rfc/rfc5880.txt.
25. Jonatan Krolikowski, Sébastien Martin, Paolo Medagliani, Jérémie Leguay, Shuang Chen, Xiaodong Chang, and Xuesong Geng. Joint routing and scheduling for large-scale deterministic IP networks. Computer Communications, 165:33–42, 2021.
26. Yizhou Li, Shoushou Ren, Guangpeng Li, Fan Yang, Jeong dong Ryoo, and Peng Liu. IPv6 Options for Cyclic Queuing and Forwarding Variants. Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-yizhou-detnet-ipv6-options-for-cqf-variant/.
27. Bingyang Liu, Shoushou Ren, Chuang Wang, Vincent Angilella, Paolo Medagliani, Sebastien Martin, and Jeremie Leguay. Towards Large-Scale Deterministic IP Networks. In 2021 IFIP Networking Conference (IFIP Networking), pages 1–9, 2021.
28. Andrew G. Malis, Xuesong Geng, Mach Chen, Fengwei Qin, and Balazs Varga. Deterministic Networking (DetNet) Controller Plane Framework. Internet-draft, Dec. 2022. https://datatracker.ietf.org/doc/draft-ietf-detnet-controller-plane-framework/.
29. Greg Mirsky, Mach Chen, and David L. Black. Operations, Administration and Maintenance (OAM) for Deterministic Networks (DetNet) with IP Data Plane. Internet-draft, Aug. 2022. https://datatracker.ietf.org/doc/draft-ietf-detnet-ip-oam/.
30. Greg Mirsky, Mach Chen, and Balazs Varga. Operations, Administration and Maintenance (OAM) for Deterministic Networks (DetNet) with MPLS Data Plane. Internet-draft, Dec. 2022. https://datatracker.ietf.org/doc/draft-ietf-detnet-mpls-oam/.
31. Greg Mirsky, Fabrice Theoleyre, Georgios Z. Papadopoulos, Carlos J. Bernardos, Balazs Varga, and János Farkas. Framework of Operations, Administration and Maintenance (OAM) for Deterministic Networking (DetNet). Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-ietf-detnet-oam-framework/.
32. T. Mizrahi. Security Requirements of Time Protocols in Packet Switched Networks. RFC 7384, RFC Editor, October 2014. http://www.rfc-editor.org/rfc/rfc7384.txt.
33. Yijun Mo, Zihan Yang, Huiyu Liu, and Tianliu He. Global Cyclic Queuing and Forwarding Mechanism for Large-Scale Deterministic Networks. In 2021 IEEE 23rd Int Conf on High Performance Computing and Communications; 7th Int Conf on Data Science and Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud, Big Data Systems and Application (HPCC/DSS/SmartCity/DependSys), pages 275–282, 2021.
34. Ehsan Mohammadpour and Jean-Yves Le Boudec. On Packet Reordering in Time-Sensitive Networks. IEEE/ACM Transactions on Networking, 30(3):1045–1057, 2022.
35. Ahmed Nasrallah, Akhilesh S. Thyagaturu, Ziyad Alharbi, Cuixiang Wang, Xing Shao, Martin Reisslein, and Hesham ElBakoury. Ultra-Low Latency (ULL) Networks: The IEEE TSN and IETF DetNet Standards and Related 5G ULL Research. IEEE Communications Surveys & Tutorials, 21(1):88–145, 2019.
36. Li Qiang, Xuesong Geng, Bingyang Liu, Toerless Eckert, Liang Geng, and Guangpeng Li. Large-Scale Deterministic IP Network. Internet-Draft draft-qiang-detnet-large-scale-detnet-05, IETF Secretariat, September 2019. https://datatracker.ietf.org/doc/html/draft-qiang-detnet-large-scale-detnet.
37. Gourav Prateek Sharma, Wouter Tavernier, Didier Colle, and Mario Pickavet. Routing and scheduling for 1+1 protected DetNet flows. Computer Networks, 211:108960, 2022.
38. Johannes Specht and Soheil Samii. Urgency-Based Scheduler for Time-Sensitive Switched Ethernet Networks. In 2016 28th Euromicro Conference on Real-Time Systems (ECRTS), pages 75–85, 2016.
39. Johannes Specht and Soheil Samii. Synthesis of Queue and Priority Assignment for Asynchronous Traffic Shaping in Switched Ethernet. In 2017 IEEE Real-Time Systems Symposium (RTSS), pages 178–187, 2017.
40. Ren Tan, Hongyi Huang, and Tianran Zhou. Echo Request/Reply for DetNet Capability Discovery. Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-tan-detnet-cap-discovery/.
41. Dirk Trossen and Jürgen Schmitt. RSVP for TSN Networks. Internet-draft, Jul. 2022. https://datatracker.ietf.org/doc/draft-trossen-detnet-rsvp-tsn/.
42. B. Varga, L. Berger, D. Fedyk, S. Bryant, and J. Korhonen. Deterministic Networking (DetNet) Data Plane: IP over MPLS. RFC 9056, RFC Editor, October 2021. https://www.rfc-editor.org/ rfc/rfc9056.txt. 43. B. Varga, J. Farkas, L. Berger, D. Fedyk, and S. Bryant. Deterministic Networking (DetNet) Data Plane: IP. RFC 8939, RFC Editor, November 2020. http://www.rfc-editor.org/rfc/rfc8939. txt. 44. B. Varga, J. Farkas, L. Berger, A. Malis, and S. Bryant. Deterministic Networking (DetNet) Data Plane Framework. RFC 8938, RFC Editor, November 2020. http://www.rfc-editor.org/ rfc/rfc8938.txt. 45. B. Varga, J. Farkas, L. Berger, A. Malis, and S. Bryant. Deterministic Networking (DetNet) Data Plane: MPLS over UDP/IP. RFC 9025, RFC Editor, April 2021. http://www.rfc-editor. org/rfc/rfc9025.txt. 46. B. Varga, J. Farkas, L. Berger, A. Malis, S. Bryant, and J. Korhonen. Deterministic Networking (DetNet) Data Plane: MPLS. RFC 8964, RFC Editor, January 2021. http://www.rfc-editor.org/ rfc/rfc8964.txt. 47. B. Varga, J. Farkas, A. Malis, and S. Bryant. Deterministic Networking (DetNet) Data Plane: IP over IEEE 802.1 Time-Sensitive Networking (TSN). RFC 9023, RFC Editor, June 2021. http://www.rfc-editor.org/rfc/rfc9023.txt. 48. B. Varga, J. Farkas, A. Malis, and S. Bryant. Deterministic Networking (DetNet) Data Plane: MPLS over IEEE 802.1 Time-Sensitive Networking (TSN). RFC 9037, RFC Editor, June 2021. https://www.rfc-editor.org/rfc/rfc9037.txt. 49. B. Varga, J. Farkas, A. Malis, S. Bryant, and D. Fedyk. Deterministic Networking (DetNet) Data Plane: IEEE 802.1 Time-Sensitive Networking over MPLS. RFC 9024, RFC Editor, June 2021. http://www.rfc-editor.org/rfc/rfc9024.txt. 50. Balazs Varga, János Farkas, and Andrew G. Malis. Deterministic Networking (DetNet): DetNet PREOF via MPLS over UDP/IP. Internet-draft, Nov. 2022. https://datatracker.ietf.org/doc/ draft-ietf-detnet-mpls-over-ip-preof/. 51. JP. Vasseur and JL. Le Roux. 
Path Computation Element (PCE) Communication Protocol (PCEP). RFC 5440, RFC Editor, March 2009. http://www.rfc-editor.org/rfc/rfc5440.txt. 52. Quan Xiong. Gap Analysis for Enhanced DetNet Data Plane. Internet-draft, Dec. 2022. https:// datatracker.ietf.org/doc/draft-xiong-detnet-enhanced-detnet-gap-analysis/. 53. Quan Xiong and Peng Liu. PCEP Extension for DetNet Bounded Latency. Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-xiong-pce-detnet-bounded-latency/. 54. Li Zhang, Xuesong Geng, and Tianran Zhou. PCEP for Enhanced DetNet. Internet-draft, Oct. 2022. https://datatracker.ietf.org/doc/draft-zhang-pce-enhanced-detnet/.
UAV Communications and Networks
Soohyun Park, Ju-Hyung Lee, Soyi Jung, and Joongheon Kim

S. Park · J. Kim, Korea University, Seoul, South Korea, e-mail: [email protected]; [email protected]
J.-H. Lee, University of Southern California, Los Angeles, CA, USA, e-mail: [email protected]
S. Jung, Ajou University, Chuncheon, South Korea, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_26

1 Introduction
Witnessing the recent developments of UAV-assisted networks, such as Loon by Loon LLC [1], Aquila by Facebook [2], and HAWK30 by HAPSMobile [3], we are at the cusp of a communication revolution in which non-terrestrial networks (NTN) are envisaged to meet terrestrial networks through the sky. Wireless connectivity is already extending towards the sky through the integration of unmanned aerial vehicles (UAVs). Thanks to its flexible maneuverability, UAV-assisted communication can efficiently secure and exploit favorable channel conditions. Applications of such a mobile platform include, but are not limited to, supporting existing network infrastructure (i.e., data offloading), disaster management, and data collection for IoT. However, to fully utilize such a flexible but energy-limited UAV platform, many considerations and challenging issues must be addressed, which are elucidated throughout this chapter.

The rest of this chapter introduces the motivation for UAV communication and explores the essential research topics of UAV-assisted networks in beyond-5G network scenarios. Secs. 2 and 3 describe solutions to the aforementioned challenges from two perspectives: (i) the physical layer and (ii) the network layer. Last, Sec. 4 presents open issues of non-terrestrial networks (NTN), including UAV-assisted networks as well as low-Earth orbit (LEO) satellite (SAT) networks, and concludes this chapter.
1.1 Motivation
As mentioned above, UAVs are an essential platform for configuring wireless networks: they can extend wireless coverage and support enhanced mobile broadband (eMBB) [4, 5]. The ongoing development of wireless communication technology, such as eMBB in beyond-5G, is further boosted by UAV-based networks in 6G network scenarios. Besides, intelligent network management, which is vital for massive machine-type communications (mMTC) in 6G networks that demand ten times more seamless connections than 5G requires, can also be aided by UAVs. As mobile communication platforms in the air [6], UAVs can offload existing network infrastructure in high-traffic situations, enabling it to provide better communication services to end users. In emergency situations, such as fires or earthquakes, UAVs can also play a key role. Moreover, thanks to its high mobility and flexible service coverage [7], a UAV-assisted network can address connectivity problems in rural areas, where the operating and deployment costs of network infrastructure are high.
1.1.1 Characteristics of UAV-Based Networks
However, to fully leverage the potential of UAVs in 6G networks, their particular characteristics must be considered. In the following, we introduce the characteristics that make UAVs a desirable alternative or complement to existing ground networks.
• LoS links: UAVs are connected to users on the ground via line-of-sight (LoS) links. With LoS communication, stable transmission can be ensured even though the distance between the UAV and the user greatly affects communication quality [8]. The channel state information (CSI), which is determined by the locations of the two communicating nodes, determines the characteristics of the LoS link and thereby affects the design of high-rate communication systems. For this reason, hovering positioning is very important for maintaining communication quality over LoS channels in UAV networks [8].
• Dynamic deployment ability: When UAVs are used as aerial base stations (BSs), they can generally be rearranged dynamically according to changes in real-time requirements, unlike infrastructure built on the ground. Because UAV mobility allows a more flexible network to be constructed, this is much more cost-effective than building and maintaining a new conventional terrestrial BS [8–10]. In addition, UAV-based networks can quickly establish communication infrastructure and respond to dynamic network changes that are difficult to handle with fixed infrastructure, in environments such as disaster rescue, remote sensing, and firefighting.
• UAV-based swarm network: Multi-UAV networks can be configured thanks to UAV flexibility, rapid provisioning, and other features. A swarm consisting of multiple UAVs provides ubiquitous connectivity to users on the ground while expanding communication services quickly and effectively. In an environment with multiple UAVs, the interference arising from UAV-to-UAV connections and from connections between UAVs and ground networks is one of the problems to be solved in UAV swarm networks [11, 12].
• Limited resources: Although UAVs have several characteristics that enable dynamic network configurations suitable for 6G environments, they are subject to physical constraints, as noted in many existing studies. These constraints are usually divided into (i) the power required to maintain the network while performing tasks and (ii) the communication, computation, and storage capabilities required for UAVs to perform the actual tasks [13]. The available energy is determined by the body size of the UAV, which in turn determines its weight and flight capability. Therefore, the balance between power and weight is a very important consideration in UAV design [13].
1.2 Trends
The UAV characteristics discussed above are directly related to various problems that occur in UAV-based networks. We introduce several problems that need to be considered in UAV-based networks, together with examples of research on them.
1.2.1 Charging Scheduling
To address the power limitation, which is one of the UAV characteristics discussed in the previous section, research on optimizing UAV charging is important. To overcome the power limit of UAVs, batteries must be replaced or charged frequently. However, since charging stations are usually fixed on the ground, the movement of UAVs for frequent charging causes unexpected changes in the structure of the already configured network and additional system adjustment tasks [14]. For this reason, along with methods for using UAV power efficiently, charging scheduling that efficiently supplies energy to multiple UAVs and the placement of mobile charging stations have been the main subjects of study.

One line of research aims to provide appropriate service to UAVs that need charging while reducing the time and power the UAVs consume in moving to charge. By using a vehicle as a mobile charging station, a UAV can receive the necessary power nearby without moving to the terrestrial infrastructure. In this case, the scheduling problem is solved to maximize the overall utility of the system, using an economic approach to select which of the many UAVs are to be provided with charging service [14–18]. In addition, other research has introduced UAVs as mobile charging stations to extend the operating time of UAV-based networks in which UAVs serve as mobile base stations (MBSs) [19, 20]. However, since charging drones also operate on limited batteries, a charging infrastructure consisting of ground-mounted charging towers is considered as well [21, 22]. As shown in Fig. 1, a two-stage charging matching operation has been proposed to provide charging power to the MBS drones in a network system composed of three elements: charging towers, charging drones, and MBS drones [23]. The scheduling of (i) charging towers and charging drones and (ii) charging drones and MBS drones is optimized in the first and second stages, respectively. The proposed method ensures that each charging drone obtains energy efficiently from a charging tower and increases the energy efficiency of transferring power to the MBS drones. Furthermore, by supplying appropriate power to the MBS drones, it extends the coverage time of the UAV-based network.

Fig. 1 Schematic diagram of two-stage charging scheduling between drones
1.2.2 Trajectory Optimization
When UAVs are used as mobile base stations to provide Internet access to ground users, it is important to optimize the UAV path in consideration of UAV-enabled free space optical communication (FSOC) [24]. This problem must be addressed in order to use the limited power of UAVs efficiently [25, 26]. To this end, a new approach was proposed to cope with the unbounded number of variables that arises as the UAV moves. It takes the UAV energy efficiency over discretized regions into account and optimizes the UAV trajectory by representing the UAV-based network in three-dimensional coordinates, as shown in Fig. 2 [24]. In this study, only an LoS link is assumed for FSOC, and neither multipath nor fading is considered. Energy efficiency is defined in terms of quantities related to signal attenuation due to weather conditions and to UAV energy, and it is maximized using FSOC. The optimization problem is formulated using a Taylor approximation of the UAV velocity and acceleration and is solved using a high signal-to-noise ratio (SNR) approximation, slack variables, and a first-order Taylor approximation [25]. The results show that as the beam divergence effect increases, the channel condition deteriorates and the radius of the orbit narrows, and an appropriate UAV path can be designed with this in mind.

Fig. 2 Illustration of UAV flight time maximization

In another study, UAVs are used in urban air mobility systems such as drone taxis [27]. A trajectory optimization problem is defined using deep reinforcement learning (DRL), which suits UAV mobile networks that must operate in real time in uncertain environments. Multi-agent deep reinforcement learning (MADRL) is used so that each UAV can cooperatively determine a globally optimal trajectory in networks where multiple UAVs exist. QMIX is one of the distributed MADRL algorithms that ensures good performance in real-time distributed computing situations. The problem is represented as a Markov decision process (MDP) [28] $\{S, A, R, T, \gamma\}$, whose elements are the state space, the action space available to the agent, the reward for individual actions, the transition probability, and the discount factor, respectively. The purpose of reinforcement learning (RL) is to maximize the cumulative reward until the scenario ends. The QMIX-based MADRL algorithm optimizes routes by considering the information of the passengers who are to board the selected UAV and the battery conditions. The performance evaluation shows that the QMIX-based algorithm flies over a wider range of regions by deriving various routes and thus serves more passengers [29].
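The cumulative reward that RL maximizes can be made concrete with a small sketch. It assumes the standard discounted return $G = \sum_t \gamma^t r_t$ over one finite episode; the function name and the reward values are hypothetical and are not taken from [27–29], and this is not the QMIX algorithm itself.

```python
def discounted_return(rewards, gamma):
    """Discounted cumulative reward G = sum_t gamma**t * rewards[t].

    Assumes a finite episode (the scenario ends) and 0 <= gamma <= 1.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    g = 0.0
    # Accumulate backwards so each step is G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical per-step rewards of one agent (e.g., passengers served).
episode_rewards = [1.0, 1.0, 1.0]
episode_return = discounted_return(episode_rewards, gamma=0.5)
print(episode_return)
```

A smaller discount factor makes the agent value near-term rewards more strongly, which is one reason $\gamma$ is listed as part of the MDP tuple.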
1.2.3 Throughput Maximization
Using a UAV as a mobile relay node has several advantages: (i) it can be deployed simply when responding to unexpected events; (ii) its high mobility provides a new approach for applications with communication delays; and (iii) it provides new degrees of freedom (DoF) for performance enhancement over conventional approaches [30]. In UAV-based mobile relay systems, free space optical (FSO) communication is attracting attention as a solution, and systems that use both RF and FSO links have also been studied. In particular, in dual-hop mixed FSO/RF communication for air-to-ground (A2G) links, signal attenuation and the rate imbalance that arises from using different types of links are major concerns. A dual-hop mixed FSO/RF backhaul with a buffer may be a solution to this problem [31, 32]. The architecture is shown in Fig. 3 [33]. Since the transmission rate imbalance between the FSO and RF links depends on the location of the relay node, the throughput, the delay, and the UAV mobility are controlled by considering the buffer and average delay constraints of the UAV relay node. Depending on the average delay constraint, transmission schemes are divided into those that allow delays and those that limit them. As the received SNR increases, the UAV trajectory moves from the user terminal towards the backhaul. The optimal average throughput is determined by the average delay limit: as the allowed average delay increases, the average throughput increases. In other words, there is a tradeoff between the two. The optimal buffer size and optimal delay requirement depend on the given system, the weather conditions, and the FSO and RF channel parameters. In addition, the greater the delay requirement, the lower the transmission delay [33].

Fig. 3 Illustration of UAV-enabled mobile relays using dual-hop mixed FSO/RF communication
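The role of the relay buffer under rate imbalance can be illustrated with a toy slot-based model. This is only a sketch under simplifying assumptions that are not from [31–33]: time is slotted, the per-slot rates of the two hops are given as made-up numbers, the buffered relay forwards only data received in earlier slots, and the bufferless baseline relays instantaneously at the smaller of the two hop rates.

```python
def relay_without_buffer(hop1_rates, hop2_rates):
    """Bufferless baseline: each slot delivers min(first-hop, second-hop) rate."""
    return sum(min(r1, r2) for r1, r2 in zip(hop1_rates, hop2_rates))


def relay_with_buffer(hop1_rates, hop2_rates, buffer_size):
    """Buffer-aided relay: store first-hop data, forward it in later slots."""
    queue = 0.0      # data currently stored at the relay
    delivered = 0.0  # data that reached the destination
    for r1, r2 in zip(hop1_rates, hop2_rates):
        # Forward only what is already buffered (received in earlier slots).
        sent = min(queue, r2)
        queue -= sent
        delivered += sent
        # Accept new first-hop data up to the remaining buffer space.
        queue += min(r1, buffer_size - queue)
    return delivered


# Hypothetical per-slot rates: the first hop is bursty, the second is steady.
hop1 = [4, 0, 4, 0]
hop2 = [2, 2, 2, 2]
print(relay_without_buffer(hop1, hop2))
print(relay_with_buffer(hop1, hop2, buffer_size=10))
```

In this toy setting the buffer absorbs the bursts of the first hop at the price of extra queuing delay, which mirrors the throughput-delay tradeoff described above.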
1.3 Potential Challenges
1.3.1 Precise Air-to-Ground Channel Model
While UAV communication systems are widely popular and deployed, many fundamental questions remain open, in particular the reliability of wireless connections between ground stations and drones, which is critical for several applications (e.g., offloading and disaster management) as well as for the coexistence of UAV and terrestrial networks. The essential prerequisite for investigating all of these questions is an understanding of the wireless propagation channel between drones and ground stations. Thus, to assess how reliable such systems will be, it is critical to perform extensive measurement campaigns that provide not only field strength but also the directional properties of such channels and that consider effects such as shadowing by a time-varying environment.
1.3.2 Interference and Limited Backhaul Condition
Several studies, for instance [34–39], have carried out optimization for multi-UAV communications. These works improve various objectives of multi-UAV networks over several optimization variables. In [34], the authors investigate two main optimization variables, cell association and deployment, for multi-UAV clusters with the aim of minimizing latency. In [35, 36], path planning and resource allocation for multi-UAV networks are jointly optimized. Existing works on UAV networks, however, have mostly focused on access links, although the backhaul link is clearly also crucial. A few works [40, 41] consider the backhaul link while optimizing UAV networks; however, they do not deal with the backhaul link requirement in depth. Moreover, most studies have focused only on low-altitude UAV networks without fully leveraging large-scale 3D connectivity. The optimization problem in multi-UAV networks that encounter ground-to-air uplink (UL) and air-to-ground downlink (DL) interference is still not well understood. Thus, for each of the DL and UL in UAV networks, the interference-aware network throughput maximization problem should be jointly addressed with respect to several aspects, e.g., association, power allocation, deployment, and altitude.
2 Solution: Physical Layer Perspective
How can we tackle such challenges in UAV communications? Here, we introduce some underlying solutions through the lens of the physical layer (PHY). A first step is more precise channel modeling, as it enables us to understand the link budget, outage, and communication performance metrics. Another step is resource allocation optimization that takes into account the association, the transmit power for DL and UL, and the UAV trajectory (or deployment) in practical UAV network scenarios.
2.1 Practical UAV Channel Modeling
UAVs are an important part of future wireless communications, either as user equipment that needs to communicate with a ground base station or as a base station in a 3D network scenario. For both the analysis of the useful links and the investigation of possible interference to other ground-based nodes, an understanding of the A2G channel is crucial (see the survey [42] and references therein). Measurements of propagation channels in real-world environments form the basis of all realistic system performance evaluations and are the foundation of statistical channel models. This is also true for the analysis of UAV networks. However, such experimental data are difficult to obtain because of the complexity and expense of deploying tens or hundreds of channel sounder nodes across the wide area that UAV networks are expected to cover. Here, we introduce the following two approaches to UAV channel sounding:
1. Channel sounding specifically for the air-to-ground link [43]
2. Channel sounding for wide-coverage terrestrial networks (e.g., cell-free massive multiple-input multiple-output (MIMO) systems) [44]
2.1.1 Channel Sounding for Air-to-Ground Link
As discussed earlier, extensive channel sounding is required before the eventual deployment of UAV networks. Therefore, we discuss the characterization of the air-to-ground (A2G) communication channel using a directional massive MIMO channel sounder. To sound the A2G channel, a remote drone-mounted communication end is used together with components of our existing real-time massive MIMO sounder as the ground-based communication end. Here, the drone-assisted A2G channel sounder is discussed, in which single-input multiple-output (SIMO) massive MIMO scenarios are measured. The channel sounder is composed of an aerial transmitter (TX) on a drone and a ground receiver (RX) with a 64-antenna dual-polarized massive MIMO cylindrical array. Due to channel reciprocity, measurements taken in this configuration are equally valid for a ground (TX)-to-drone (RX) link. The major sounder components (the aerial transmitter on the drone and the ground receiver with its 64-antenna dual-polarized cylindrical array) are shown in Fig. 4.

Fig. 4 UAV channel sounding systems. (a) Drone-assisted aerial transmitter. (b) Ground receiver with 64-antenna dual-polarized cylindrical array

Sounding Signal Design and Processing of the System
A summary of the sounder configuration parameters is given in Table II [43]. The procedure used to generate the sounding signal is explained in [45, 46]. To characterize the frequency response of the system, back-to-back (B2B) calibration must be performed: the connections to the antennas are replaced with an RF cable, and the response of the system is measured. To obtain the final "antenna + channel" response for post-processing, the procedure in [13] is followed:
$$
H(f) = \frac{Y_m(f)}{Y_r(f)} \times G_a(f), \tag{1}
$$
where $H(f)$ is the "antenna + channel" response, $Y_m(f)$ is the measured transfer function, and $G_a(f)$ is the frequency response of the attenuator used during the B2B measurement. Note that $Y_r(f)$ is the final system calibration result and depends on the gain setting of the RX. With the UAV channel sounder, a sample measurement campaign was executed in an urban scenario; see [43] for the detailed results. These results provide a platform for further channel measurements of drone-based extensions of massive MIMO systems and their eventual large-scale development, which is elaborated in the following subsection.
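The calibration step in (1) amounts to a per-frequency-bin division and multiplication. The following minimal sketch applies it to synthetic data with NumPy; the function name, the number of bins, the flat 30 dB attenuator, and the assumption that the B2B result equals the system response times the attenuator response are illustrative choices, not details taken from [43].

```python
import numpy as np


def antenna_channel_response(y_m, y_r, g_a):
    """Eq. (1): H(f) = Y_m(f) / Y_r(f) * G_a(f), evaluated per frequency bin."""
    y_m = np.asarray(y_m, dtype=complex)
    y_r = np.asarray(y_r, dtype=complex)
    g_a = np.asarray(g_a, dtype=complex)
    if np.any(y_r == 0):
        raise ValueError("B2B calibration response contains a zero bin")
    return y_m / y_r * g_a


# Synthetic check with made-up values (all names below are hypothetical).
rng = np.random.default_rng(0)
n_bins = 8
system = rng.normal(size=n_bins) + 1j * rng.normal(size=n_bins)  # TX/RX chain
h_true = rng.normal(size=n_bins) + 1j * rng.normal(size=n_bins)  # antenna + channel
g_a = np.full(n_bins, 10 ** (-30 / 20), dtype=complex)           # flat 30 dB attenuator

y_r = system * g_a     # assumed B2B measurement: chain and attenuator only
y_m = system * h_true  # assumed field measurement: chain and channel

h_est = antenna_channel_response(y_m, y_r, g_a)
print(np.allclose(h_est, h_true))
```

Under these assumptions the system response cancels in the ratio, which is why the attenuator response has to be multiplied back in.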
2.1.2 UAV-Assisted Channel Sounder
Extensive channel sounding is required before a wireless system can eventually be deployed. Therefore, [44] introduces the development of an A2G directional massive MIMO channel sounder. In particular, the authors provide a novel method for obtaining channel data for cell-free massive multiple-input multiple-output (CF-mMIMO) systems using a drone channel sounder. The method is efficient, flexible, simple, and low-cost, capturing channel data from thousands of different access point (AP) locations within minutes. In [44], the channel sounder consists of (1) a transmitter (TX) mounted on a drone, with a lightweight software-defined radio (SDR) and a single dipole antenna, and (2) a receiver (RX) on the ground, with a cylindrical antenna array, digitizer, and storage, which are heavier and bulkier than the TX. UAV-assisted channel sounding proceeds as follows. In a selected environment, the locations of the APs and user equipments (UEs) are decided first. The RX is fixed at the location of the first UE, and the TX is flown along a trajectory that includes the locations of all APs. The channels measured along a single trajectory thus contain the channels between all APs and a single UE. UAV-assisted channel measurement has the following advantages:
(1) Boundless AP locations: The drone, with its small body, can reach any position at any height flexibly and quickly. This capability is especially useful when measurements for several different CF-mMIMO systems are conducted in multiple environments located far apart, since cumbersome installations of AP antennas on multiple rooftops and masts are not necessary.
(2) Fast measurement speed and a large dataset: Thousands of AP locations can be swept within several minutes by a drone. In the measurement, the RX captures channel data every 50 [ms] and the TX moves at 4 [m/s]. At this measurement speed, channel data from 1200 AP locations distributed across a 240 [m] range can be measured per minute for one UE. The UAV sounder can either sample some of the spatial points along the drone trajectory to place a selected number of APs for a considered CF-mMIMO system, or use the ample dataset for data-hungry statistical analyses or machine learning applications.
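The figures quoted above follow from simple arithmetic, sketched below. The function name is hypothetical, and the spacing between consecutive captures is a derived value that assumes constant speed along the trajectory; it is not stated in [44].

```python
def sounding_budget(capture_period_ms, drone_speed_mps, duration_s=60):
    """Return (number of captures, distance flown in m, spacing between captures in m)."""
    captures = duration_s * 1000 // capture_period_ms
    distance_m = drone_speed_mps * duration_s
    spacing_m = drone_speed_mps * capture_period_ms / 1000
    return captures, distance_m, spacing_m


# Values from the text: one capture every 50 ms, drone speed 4 m/s, one minute.
captures, distance_m, spacing_m = sounding_budget(50, 4)
print(captures, distance_m, spacing_m)
```

One minute of flight therefore yields 1200 captures over 240 m, i.e., one candidate AP location roughly every 0.2 m under the constant-speed assumption.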
Such a drone-based channel sounding method [44] provides a large dataset in a short period of time and costs little compared with full setups that have many antennas distributed across a wide area. In [44], a sample measurement campaign is demonstrated and analyzed in terms of crucial parameters such as channel gain, SNR, and SINR, providing insights into realizing CF-mMIMO systems, such as the height of APs, the number of APs, and the combinations that APs may use. The real-world channel data are also publicly distributed as open source for the analysis of various other wireless systems.
2.2 Optimal Joint Trajectory and Resource Allocation Design
Integrating terrestrial networks with UAV networks, especially given the burgeoning number of UAVs, will be a disruptive challenge for beyond-5G systems provisioning large-scale three-dimensional connectivity. The architecture of UAV networks consists of backhaul-UAV and UAV-BS links. To design efficient UAV-assisted relaying networks, several variables must be optimized:
• Association: Which base station should be connected to which UAV?
• Power: How much DL and UL power should be allocated to each link?
• Deployment: Where should each UAV be deployed (not only the optimal 2D position but also the optimal altitude)?
Besides, there are several considerations in dual-hop (or multi-hop) networks when optimizing these variables. In the following, the practical considerations for UAV relaying networks are introduced first, and then the optimization problem for dual-hop multi-UAV-assisted relaying networks is tackled.

Optimization of Multi-UAV-Assisted Relaying Networks
Consider packets that are forwarded between a faraway backhaul and BSs, relayed through UAVs, e.g., balloons, airships, and fixed-wing UAVs. In multi-UAV-assisted relaying networks, there are two main practical considerations:
• Limited backhaul condition
• Interference effect on DL/UL in the access link
In addition to these two practical considerations, the following constraints should be considered in the end-to-end (E2E) optimization of such multi-hop UAV relaying networks:

$$
\text{(Association)} \quad a_{ij} \in \{0, 1\},\ \forall i, j, \qquad \sum_{i \in I} a_{ij} \le 1,\ j \in J, \tag{2}
$$

$$
\text{(Power)} \quad 0 \le p_{ij}^{D} \le P_{\max}^{D},\quad 0 \le p_{ij}^{U} \le P_{\max}^{U},\ \forall i, j, \qquad P_{i}^{D} \le P_{\max}^{D},\ \forall i, \tag{3}
$$

$$
\text{(Deployment)} \quad H_{\min} \le H_i \le H_{\max},\ i \in I, \tag{4}
$$

where $a$, $p$, $q$, and $H$ denote the association, power allocation, xy-position, and altitude variables, respectively; the subscripts $i$ and $j$ index the UAVs and BSs, respectively; and the superscripts $D$ and $U$ denote DL and UL, respectively. The association constraints in (2) enforce that each BS is supported by at most one UAV. The power constraints in (3) limit each DL and UL transmit power to a predetermined value. The altitude constraints in (4) keep the altitude of each UAV within the permitted range. The following problem corresponds to E2E (i.e., backhaul-UAV-BS) network throughput maximization under constraints related to the practical conditions of the access and backhaul links:
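\begin{align}
\max_{a,\,p,\,q,\,H} \quad & \sum_{j \in J} R_{\mathrm{Access},j} \notag \\
\text{s.t.} \quad & \text{(2)–(3)}, \notag \\
& \mathrm{SINR}_{ij}^{D} \ge a_{ij}\,\mathrm{SINR}_{\min,j}^{D}, \quad \mathrm{SINR}_{ij}^{U} \ge a_{ij}\,\mathrm{SINR}_{\min,j}^{U}, \quad \forall i, j, \tag{5} \\
& \sum_{j \in J} R_{\mathrm{Access},ij} \le R_{\mathrm{Backhaul},i}, \quad \forall i, \tag{6}
\end{align}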
where $\mathrm{SINR}$ is the signal-to-interference-plus-noise ratio, and $R_{\mathrm{Backhaul}}$ and $R_{\mathrm{Access}}$ represent the throughput of the backhaul link and the access link, respectively. Here, (5) expresses the minimum SINR constraints for DL and UL, which ensure that every BS meets a certain SINR requirement. These SINR constraints guarantee minimum fairness among BSs and UAVs; otherwise, some BSs or UAVs might support only poor data rates even though the network throughput is high. Besides, (6) represents the information-causality constraints, which ensure information causality in relaying for both DL and UL. To tackle this non-convex problem, the authors in [39, 47] introduce a surrogate function (SGF)-based block coordinate descent (BCD) algorithm. In [39, 47], after showing that the proposed SGF-based BCD algorithm outperforms other baselines (e.g., k-nearest neighbors and Lloyd’s algorithms with equal power allocation), the authors answer three questions: (1) What are the wireless backhaul requirements? (2) How does the number of UAVs and terrestrial terminals impact the network? (3) What if a certain user terminal has a particular demand on the data rate? See [39, 47] for the details of the algorithm and the simulation results.
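To make the constraint set concrete, the following sketch checks whether a candidate solution satisfies the association, power, and deployment constraints (2)–(4) and the information-causality constraint (6). It is a feasibility check only, not the SGF-based BCD algorithm of [39, 47]. All names and numbers are hypothetical. Two assumptions go beyond the text: the total DL power $P_i^D$ is taken to be the sum of the DL powers of the links that UAV $i$ serves, and the access throughput of an unassociated UAV-BS pair is taken to be zero.

```python
import numpy as np


def check_constraints(a, p_dl, p_ul, h, p_dl_max, p_ul_max, h_min, h_max):
    """Check (2)-(4). Arrays a, p_dl, p_ul are indexed [UAV i, BS j]; h is [UAV i]."""
    a, p_dl, p_ul, h = (np.asarray(x, dtype=float) for x in (a, p_dl, p_ul, h))
    # (2): binary association, each BS supported by at most one UAV.
    association = bool(np.isin(a, (0, 1)).all() and (a.sum(axis=0) <= 1).all())
    # (3): per-link DL/UL power limits.
    per_link = ((p_dl >= 0) & (p_dl <= p_dl_max)).all() and \
               ((p_ul >= 0) & (p_ul <= p_ul_max)).all()
    # ASSUMPTION: P_i^D is the sum of the DL powers of the links UAV i serves.
    total_dl = (a * p_dl).sum(axis=1)
    power = bool(per_link and (total_dl <= p_dl_max).all())
    # (4): altitude within the permitted range.
    deployment = bool(((h >= h_min) & (h <= h_max)).all())
    return {"association": association, "power": power, "deployment": deployment}


def information_causality_ok(r_access, r_backhaul):
    """Check (6): summed access throughput of each UAV is within its backhaul throughput.

    ASSUMPTION: r_access[i, j] is zero for unassociated UAV-BS pairs.
    """
    r_access = np.asarray(r_access, dtype=float)
    r_backhaul = np.asarray(r_backhaul, dtype=float)
    return bool((r_access.sum(axis=1) <= r_backhaul).all())


# Hypothetical example with 2 UAVs and 3 BSs.
a = [[1, 0, 1], [0, 1, 0]]
p_dl = [[0.5, 0.0, 0.4], [0.0, 0.8, 0.0]]
p_ul = [[0.1, 0.0, 0.1], [0.0, 0.1, 0.0]]
h = [300.0, 500.0]
result = check_constraints(a, p_dl, p_ul, h, p_dl_max=1.0, p_ul_max=0.2,
                           h_min=100.0, h_max=1000.0)
print(result)
print(information_causality_ok([[10.0, 0.0, 5.0], [0.0, 8.0, 0.0]], [20.0, 6.0]))
```

In the example, the second UAV violates (6) because its access throughput exceeds its backhaul throughput, which is the kind of limited-backhaul situation the formulation is meant to rule out.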
3 Solution: Network Layer Perspective
In this section, we introduce some important problems of UAV systems and present solutions from a network layer (NET) perspective [4]. Basically, a given UAV-based problem assumes an uncertain and constantly changing environment in which partially observed information is exchanged from one UAV to another. A MADRL scheme is an effective approach to dealing with such highly dynamic problems. Here, each UAV can be an agent, and MADRL is a type of RL in which multiple agents (UAVs) cooperate or compete in a particular environment to achieve their common goals.
3.1 Cloud-Assisted Multi-UAV Charging Systems
UAVs are essential for realizing flexible and reliable network infrastructure in advanced networks [48]. Despite these benefits, UAVs are power-hungry platforms, which is a significant problem that must be addressed. Therefore, a UAV-related system should be energy-efficient. The use of charging infrastructure/towers is one of the major research directions for energy-efficient UAV network operation and management. In order to supply energy to the devices from charging towers, wireless energy transfer (WET) technologies can be used over short distances. Therefore, charging towers (i.e., energy edges) equipped with WET functionalities are considered. The charging towers are also equipped with photovoltaic (PV) and energy storage system (ESS) functionalities in order to conduct self-configurable energy management [14]. Note that the energy stored in the ESSs can be shared among charging towers. Instead of sharing energy, the charging towers can also purchase energy from their associated utility company if that is more cost-effective. The considered reference system has three major components, i.e., the orchestration manager (cloud), the charging towers (energy edges), and the UAVs (ends), and is designed to conduct two major energy-optimal intelligent system maintenance operations as follows:
• Convex-optimal charging scheduling: This decides the charging schedule between UAVs and charging towers, together with the amount of energy allocated to each scheduled pair, by convex optimization. This operation needs to gather information from UAVs and charging towers and requires high-performance computing resources, i.e., the orchestration manager needs to utilize cloud computing resources.
• Cooperative MADRL-based learning for charging towers: This is for cooperative MADRL-based intelligent management of energy resources among charging towers. The CommNet-based cooperative MADRL algorithm is used because it enables cost-efficient energy trading and sharing [49]. MADRL needs data-intensive deep learning computation on huge datasets in real time, so a cloud-assisted orchestration manager is required.
In order to handle these two separate but correlated methods efficiently, a centralized administrator for the orchestration is desired. In the cloud, the orchestration manager is able to compute the two methods for large numbers of UAVs and charging towers, which requires high-performance computing functionalities such as cloud platforms. The main objectives of an energy management system operated by the orchestration manager can be summarized as follows:
• The orchestration manager minimizes the amount of energy purchased from the utility company, since each charging tower has energy generation components such as PV for self-charging, and energy trading/sharing is possible by sharing ESS resources among charging towers.
• The energy level in each ESS should stay between $E_{\min}$ and $E_{\max}$. Thus, if the energy produced by PV exceeds the residual capacity of the ESS, the excess energy is discarded.
• The orchestration manager should be able to monitor the states of all charging towers. Based on this, the manager is able to define rewards that account for minimizing the residual energy and minimizing the energy purchased from the utility company.
For this workload-intensive MADRL computation, cloud-based computing resources are desired.
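The ESS bounds and the purchase-minimization objective above can be illustrated with a minimal Python sketch of one energy-update step at a single charging tower. The function name `ess_step`, its arguments, and the rule that any demand the ESS cannot cover is purchased from the utility are assumptions made for illustration; energy sharing among towers and the MADRL policy itself are omitted.

```python
def ess_step(level, pv_energy, demand, e_min, e_max):
    """One hypothetical energy-update step for a charging tower's ESS.

    Returns (new_level, purchased, discarded), all in the same energy unit.
    """
    # Store PV energy; anything above E_max cannot be stored and is discarded.
    charged = level + pv_energy
    discarded = max(0.0, charged - e_max)
    level = min(charged, e_max)

    # Serve the UAV charging demand from the ESS without dropping below E_min.
    from_ess = min(demand, max(0.0, level - e_min))
    level -= from_ess

    # Assumption: the remaining demand is bought from the utility company.
    purchased = demand - from_ess
    return level, purchased, discarded


if __name__ == "__main__":
    print(ess_step(level=80.0, pv_energy=30.0, demand=50.0, e_min=10.0, e_max=100.0))
    print(ess_step(level=20.0, pv_energy=0.0, demand=50.0, e_min=10.0, e_max=100.0))
```

A reward that penalizes `purchased` and `discarded` would then reflect the two minimization objectives listed above.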
3.2 Reliable Surveillance Autonomous Multi-UAV Systems
An autonomous surveillance UAV management system is essential for making surveillance services in UAV-based network systems more robust and resilient. It is essential to jointly optimize the energy consumption and the reliability of the network surveillance services under uncertainty in the movements/deployments of the target objects and of neighboring UAVs. The considered system consists of two units: the UAVs and the targets of surveillance [50]. Cooperative UAVs exchange information for reliable monitoring services, while uncooperative UAVs do not. The objective of a surveillance UAV is to autonomously maneuver itself to the area with the highest number of users. Hence, it is essential to configure a system in which UAVs automatically derive the optimal trajectories and coverage to achieve this objective. In this process, the autonomous optimization system needs to take into account the characteristics of UAVs (e.g., the on-board battery). Suppose that N users, M cooperative UAV agents, and K non-agent UAVs are deployed. The reference system assumes a leader UAV agent that handles communications between UAV agents and receives the information needed for multi-agent cooperation. In addition, each user is assumed to be associated with only one UAV, whereas each UAV can be associated with multiple users. A camera is embedded on each UAV. Each UAV captures its surveillance coverage with the resolution $q \in Q$, where $Q$ stands for the set of resolutions. This work presupposes that each camera sensor is not affected by zooming in, zooming out, or the UAV's movement [51]. These assumptions are reasonable since the camera sensor dynamically controls the sizes of micro pixels, so that the resolution is not degraded even when zooming in. Therefore, the surveillance resolution has an inverse relationship with the surveillance area. Each UAV can conduct multi-agent cooperation to manage its surveillance coverage autonomously.
The objective of the autonomous coordination for reliability (ACR) is to enlarge the monitored area at high resolution for surveillance reliability. Therefore, the algorithm tries to achieve reliable surveillance when the number of users to be observed changes, while considering the UAVs' conditions, e.g., the possibility that UAVs are dropped, malfunction, or run out of energy. At the same time, each UAV tries to optimize its energy consumption to combat the power-hungry nature of UAVs. For the MADRL, the state space of ACR consists of location information (absolute position information and the relative position or distance information with respect to other users/UAVs), energy information, and surveillance information (whether a user is monitored or not and which UAV monitors it with which resolution). The action space of ACR consists of discrete actions (moving action and surveillance resolution level) [51]. The rewards of the proposed ACR are classified into two groups, i.e., UAV rewards and cooperation rewards.
• UAV reward. For defining the rewards of individual UAVs, this chapter considers energy consumption, battery discharge, and the number of users. The UAV is power-hungry by nature, and the energy-related UAV reward is a scaled sum of its basic energy consumption.
• Cooperation reward. In order to define the rewards for the cooperation among UAVs, we consider the overlapped area among UAVs and the number of users. A large overlap in surveillance coverage is inefficient in terms of energy and resources. Thus, the reward formulation also aims to minimize overlapped areas.
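As a rough illustration of the cooperation reward, the sketch below penalizes the pairwise overlap of surveillance footprints and rewards the number of covered users. The circular footprint shape, the function names, and the weights `w_users` and `w_overlap` are all assumptions chosen for illustration; the source does not specify the reward formula.

```python
import math
from itertools import combinations


def circle_overlap_area(c1, r1, c2, r2):
    """Intersection area of two circles with centers c1, c2 and radii r1, r2."""
    d = math.dist(c1, c2)
    if d >= r1 + r2:          # disjoint footprints
        return 0.0
    if d <= abs(r1 - r2):     # one footprint lies inside the other
        return math.pi * min(r1, r2) ** 2
    alpha = math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
    beta = math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
    tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                          * (d - r1 + r2) * (d + r1 + r2))
    return r1 * r1 * alpha + r2 * r2 * beta - tri


def cooperation_reward(footprints, users_covered, w_users=1.0, w_overlap=0.01):
    """Hypothetical reward: covered users minus weighted total pairwise overlap.

    footprints is a list of ((x, y), radius) tuples, one per UAV.
    """
    overlap = sum(circle_overlap_area(c1, r1, c2, r2)
                  for (c1, r1), (c2, r2) in combinations(footprints, 2))
    return w_users * users_covered - w_overlap * overlap


if __name__ == "__main__":
    print(cooperation_reward([((0.0, 0.0), 10.0), ((12.0, 0.0), 10.0)], users_covered=5))
```

With this shaping, two UAVs covering the same users from nearly the same position receive a lower reward than two UAVs covering them with disjoint footprints.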
4 Open Issues and Concluding Remarks
What is more, space is also emerging as the new frontier beyond fifth-generation (5G) communication. This trend has been encouraged by the recent launches of low Earth orbit (LEO) satellite (SAT) mega-constellations. In particular, thousands of SpaceX's Starlink SATs have already been provisioning high throughput (>100 Mbps) with low latency ( [...]

[...] 0 are modeling parameters. Furthermore, to capture the fact that the propagation environment varies even when the UAV moves at a fixed altitude, the approach proposed in [35] models the channel parameters as functions of the elevation angle.
(3) Probabilistic LoS Channel Model
Considering that the LoS link may occasionally be blocked, one approach is to model the LoS and NLoS propagation separately, based on statistical modelling of the urban environment (e.g., the density and height of buildings). One of the most well-known models is the 3GPP GBS-UAV channel model [36]. Its LoS probability is expressed as a piecewise
Non-terrestrial Network
function distinguished by the specified thresholds $H_1$ and $H_2$ in different scenarios, i.e.,
$$P_{\mathrm{LoS}} = \begin{cases} P_{\mathrm{LoS,ter}}, & 1.5\ \mathrm{m} \le H_U \le H_1, \\ P_{\mathrm{LoS},U}(d_{2D}, H_U), & H_1 \le H_U \le H_2, \\ 1, & H_2 \le H_U \le 300\ \mathrm{m}, \end{cases} \tag{12}$$
where $P_{\mathrm{LoS,ter}}$ is the LoS probability for conventional terrestrial GBS-UE channels [37], and $P_{\mathrm{LoS},U}(d_{2D}, H_U)$ is expressed as
$$P_{\mathrm{LoS},U}(d_{2D}, H_U) = \begin{cases} 1, & d_{2D} \le d_1, \\ \dfrac{d_1}{d_{2D}} + \exp\left(-\dfrac{d_{2D}}{p_1}\right)\left(1 - \dfrac{d_1}{d_{2D}}\right), & d_{2D} > d_1, \end{cases} \tag{13}$$
where $d_{2D}$ is the two-dimensional (2D) distance between the GBS and the UAV, and $p_1$ and $d_1$ are logarithmically increasing functions of $H_U$ as specified in [36]. Overall, the above models provide distinct tradeoffs between analytical tractability and modelling accuracy, and the selection of a channel model depends on the communication scenario and the objective of the study.
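The piecewise LoS probability in (12) and (13) can be evaluated with a short Python sketch. The function names are hypothetical, and the scenario-dependent quantities ($H_1$, $H_2$, $d_1$, $p_1$, and the terrestrial LoS probability) are passed in as plain arguments because their values are specified in [36] and [37] rather than here; the choice of which branch applies exactly at $H_U = H_1$ and $H_U = H_2$ is also an assumption.

```python
import math


def p_los_uav(d2d, d1, p1):
    """LoS probability of (13) for a 2D distance d2d (same unit as d1 and p1)."""
    if d2d <= d1:
        return 1.0
    return d1 / d2d + math.exp(-d2d / p1) * (1.0 - d1 / d2d)


def p_los(d2d, h_u, h1, h2, d1, p1, p_los_ter):
    """Piecewise LoS probability of (12) for a UAV at height h_u (metres).

    p_los_ter is the terrestrial LoS probability, supplied by the caller.
    """
    if not 1.5 <= h_u <= 300.0:
        raise ValueError("model is defined for 1.5 m <= H_U <= 300 m")
    if h_u <= h1:
        return p_los_ter
    if h_u <= h2:
        return p_los_uav(d2d, d1, p1)
    return 1.0


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from [36].
    print(p_los(d2d=100.0, h_u=50.0, h1=22.5, h2=100.0, d1=50.0, p1=200.0, p_los_ter=0.3))
```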
2.3 Link Budget
The main objective of the link budget is to ensure the availability of NTN communication links. As shown in Fig. 3, the received power can be expressed as [38]
$$P_r = \frac{P_t G_t G_r}{L_f L_t L_r}, \tag{14}$$
where $L_f = L_b L_g L_s L_e$ is the path loss, $P_t$ is the transmit power, $L_t$ is the transmit feeder loss, $L_r$ is the receive feeder loss, $G_t$ is the transmit antenna gain, and $G_r$ is the receive antenna gain.
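A small Python sketch shows how (14) is used in practice, where link budgets are usually summed in decibels. The helper names and the example numbers are assumptions for illustration; the dB form is simply the logarithm of (14).

```python
import math


def received_power_w(pt_w, gt, gr, lf, lt, lr):
    """Received power in watts from (14); gains and losses are linear ratios."""
    return pt_w * gt * gr / (lf * lt * lr)


def received_power_dbm(pt_dbm, gt_db, gr_db, lf_db, lt_db, lr_db):
    """Same budget in decibels: gains add, losses subtract."""
    return pt_dbm + gt_db + gr_db - lf_db - lt_db - lr_db


def w_to_dbm(p_w):
    """Convert watts to dBm."""
    return 10.0 * math.log10(p_w) + 30.0


if __name__ == "__main__":
    # Illustrative values: 10 W transmitter, 20 dB and 10 dB antenna gains,
    # 160 dB path loss, 1 dB feeder loss on each side.
    linear = received_power_w(10.0, 100.0, 10.0, 1e16, 10 ** 0.1, 10 ** 0.1)
    print(w_to_dbm(linear))
    print(received_power_dbm(40.0, 20.0, 10.0, 160.0, 1.0, 1.0))
```

Both calls describe the same link, so the two printed values should agree up to floating-point rounding.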
Fig. 3 Link power analysis for NTN: transmitter ($P_t$), transmit feeder ($L_t$), transmit antenna ($G_t$), free space ($L_f$), receive antenna ($G_r$), receive feeder ($L_r$), receiver ($P_r$)
X. Li et al.
2.3.1 CNIR
The carrier-to-noise-and-interference ratio (CNIR) is an important indicator of the quality of NTN link transmission. Following the 3GPP studies on NTN, the CNIR of the transmission link between the satellite and the UE can be calculated from the carrier-to-noise ratio (CNR) and the carrier-to-interference ratio (CIR) [39]. Specifically, the CNR is defined as
$$\frac{C}{N} = \frac{\mathrm{EIRP}\, G_r}{L_p N}, \tag{15}$$
where $\mathrm{EIRP} = P_t G_t$ is the effective isotropic radiated power (EIRP), and $L_p = L_f L_t L_r$ denotes the overall NTN link loss. $N = k T_t B$ denotes the receive noise power, with the Boltzmann constant $k = 1.38 \times 10^{-23}$ J/K, the equivalent noise temperature of the antenna $T_t$, and the equivalent noise bandwidth $B$. The CNR can also be expressed in decibels (dB) as [40]
$$\frac{C}{N}[\mathrm{dB}] = \mathrm{EIRP}[\mathrm{dB}] - L_p[\mathrm{dB}] + G_r[\mathrm{dB}] - 10\lg(k T_t B). \tag{16}$$
Besides, the CIR can be expressed as
$$\frac{C}{I}[\mathrm{dB}] = \mathrm{EIRP}[\mathrm{dB}] - L_p[\mathrm{dB}] + G_r[\mathrm{dB}] - 10\lg\left(\sum_{i=1}^{N_i} 10^{0.1 I_i[\mathrm{dB}]}\right), \tag{17}$$
where $I_i$ is the interference from the $i$th interfering beam, with $i = 1, ..., N_i$ and $N_i$ being the number of co-channel beams [41]. Based on the CNR and the CIR, the CNIR of the transmission link between the satellite and the UE can be expressed as
$$\frac{C}{N+I}[\mathrm{dB}] = -10\lg\left(10^{-0.1\frac{C}{N}[\mathrm{dB}]} + 10^{-0.1\frac{C}{I}[\mathrm{dB}]}\right). \tag{18}$$
2.3.2 CTR
In order to compare systems with different bandwidths, the overall carrier-to-noise-temperature ratio (CTR), also called the figure of merit, is defined in decibels (dB) as [42]
$$\left[\frac{C}{T}\right] = [G_r] - [N_f] - 10\lg\left(T_0 + (T_a - T_0)\,10^{-0.1[N_f]}\right), \tag{19}$$
where $G_r$ is the receive antenna gain, $N_f$ is the receive noise figure, $T_0$ is the ambient temperature, and $T_a$ is the antenna temperature.
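The decibel expressions (16)–(18) combine in a straightforward way, as the following Python sketch suggests. The function names and the example values are assumptions for illustration; the interferer powers are taken as received powers in the same dB unit as the carrier.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K, value quoted in the text


def cnr_db(eirp_db, lp_db, gr_db, t_kelvin, b_hz):
    """Carrier-to-noise ratio of (16), in dB."""
    return eirp_db - lp_db + gr_db - 10.0 * math.log10(K_BOLTZMANN * t_kelvin * b_hz)


def cir_db(eirp_db, lp_db, gr_db, interferers_db):
    """Carrier-to-interference ratio of (17); interferers_db lists I_i in dB."""
    aggregate = sum(10.0 ** (0.1 * i_db) for i_db in interferers_db)
    return eirp_db - lp_db + gr_db - 10.0 * math.log10(aggregate)


def cnir_db(cnr, cir):
    """Carrier-to-noise-and-interference ratio of (18), in dB."""
    return -10.0 * math.log10(10.0 ** (-0.1 * cnr) + 10.0 ** (-0.1 * cir))


if __name__ == "__main__":
    # Illustrative numbers only.
    cnr = cnr_db(eirp_db=34.0, lp_db=160.0, gr_db=0.0, t_kelvin=290.0, b_hz=360e3)
    cir = cir_db(eirp_db=34.0, lp_db=160.0, gr_db=0.0, interferers_db=[-140.0, -143.0])
    print(cnr, cir, cnir_db(cnr, cir))
```

Because noise and interference powers add, the CNIR is always below the smaller of the CNR and the CIR; when the two are equal, it is about 3 dB below them.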
2.4 Mobility Modelling
2.4.1 Orbit Characteristics of Satellites
A GEO satellite orbits the earth in a circular orbit above the equator, with an orbital period equal to the earth's rotation period, while MEO and LEO satellites orbit the earth in circular orbits with shorter periods. The motion of all satellites obeys Kepler's three laws [43]:
(1) The First Law (Orbital Law) The satellite moves in an elliptical orbit with the center of the earth at one focus, and we have
$$r = \frac{P}{1 + e\cos\theta}, \tag{20}$$
where $r$ is the distance from the satellite to the center of the earth; $\theta$ is the central angle; $e = \sqrt{1 - (b/a)^2}$ is the eccentricity; $a$ and $b$ represent the semimajor and semiminor axes of the ellipse, respectively; and $P = a(1 - e^2)$ denotes the parameter (semi-latus rectum) of the conic section. As shown in Fig. 4, when $0 < e < 1$, the orbit of the satellite is elliptical, while when $e = 0$, the orbit is circular.
(2) The Second Law (Area Law) As the satellite moves in orbit, the line joining it to the center of the earth sweeps equal areas in equal time intervals. As such, in an elliptical orbit, the satellite moves fastest at perigee and slowest at apogee. The instantaneous velocity of a satellite is
$$V = \sqrt{\mu\left(\frac{2}{r} - \frac{1}{a}\right)}, \tag{21}$$
where $\mu$ is Kepler's constant.
Fig. 4 Earth satellite orbit [44]. (a) Elliptical orbit. (b) Circular orbit
(3) The Third Law (Orbital Period Law) The square of the satellite's orbital period is proportional to the third power of the semimajor axis of the orbit. According to the third law, the period $T$ of a satellite revolving around the earth is
$$T = 2\pi\sqrt{\frac{a^3}{\mu}}. \tag{22}$$
However, Kepler orbits are ideal orbits, in which the earth is assumed to be a sphere with uniform mass distribution, so that the satellite is subject only to the earth's central gravitational attraction. In practice, other important forces also act on the satellite, such as the gravitational effects of the sun and the moon and the resistance of the atmosphere. To overcome the influence of these perturbations, the satellite trajectory needs to be controlled, including position maintenance and attitude control. Specifically, position maintenance keeps the position of the satellite on the orbital plane unchanged. Attitude control keeps the satellite in a certain attitude, so that the antenna beam of the satellite always points to the service area on the earth's surface and the solar panel always faces the sun.
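Equations (20)–(22) are easy to evaluate numerically. The Python sketch below does so for a circular LEO orbit and a GEO orbit; the function names, the mean earth radius of 6371 km, and the value used for Kepler's constant $\mu$ (the earth's standard gravitational parameter, about $3.986 \times 10^{14}$ m³/s²) are assumptions introduced here, not values given in the text.

```python
import math

MU = 3.986004418e14   # m^3/s^2, earth's gravitational parameter (assumed for mu)
R_EARTH = 6371e3      # m, mean earth radius (assumed)


def eccentricity(a, b):
    """Eccentricity from the semimajor axis a and semiminor axis b."""
    return math.sqrt(1.0 - (b / a) ** 2)


def orbit_radius(a, e, theta):
    """Distance to the earth's center from (20), with P = a(1 - e^2)."""
    return a * (1.0 - e ** 2) / (1.0 + e * math.cos(theta))


def orbital_velocity(a, r):
    """Instantaneous velocity from (21), in m/s."""
    return math.sqrt(MU * (2.0 / r - 1.0 / a))


def orbital_period(a):
    """Orbital period from (22), in seconds."""
    return 2.0 * math.pi * math.sqrt(a ** 3 / MU)


if __name__ == "__main__":
    a_leo = R_EARTH + 600e3                   # circular orbit at 600 km altitude
    print(orbital_velocity(a_leo, a_leo))     # roughly 7.56 km/s
    print(orbital_period(a_leo) / 60.0)       # roughly 96.5 minutes
    print(orbital_period(42164e3) / 3600.0)   # GEO: close to one sidereal day
```

For an elliptical orbit, calling `orbital_velocity(a, orbit_radius(a, e, theta))` at $\theta = 0$ and $\theta = \pi$ gives the perigee and apogee speeds, which reproduces the fastest and slowest points described by the second law.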
2.4.2 Movement Characteristics of HAPS and Low-Altitude UAVs
Due to the unique properties of the stratosphere, as shown in Fig. 5a, HAPS stay at quasi-stationary positions and can provide ubiquitous connectivity [4]. Therefore, HAPS can be employed as static stations or relays in the sky without mobility control. It is worth mentioning that balloons, as a main type of HAPS, need a long time to inflate and deflate and place stricter requirements on weather conditions.
Fig. 5 Movement of HAPS or UAVs. (a) The fixed positions of HAPS. (b) The flexible movement of UAVs
As shown in Fig. 5b, different from HAPS, the UAV's position and trajectory can be flexibly designed in a mobile communication network, and this design is closely coupled with communication resource allocation. Specifically, as communication equipment, a rotary-wing UAV can hover over a terrestrial node or fly along a controllable trajectory, while a fixed-wing UAV requires a minimum flying speed. Besides, for both types of UAVs, the trajectory $\mathbf{q}(t)$ should satisfy the altitude constraint $H_{\min} \le [\mathbf{q}(t)]_3 \le H_{\max}, \forall t$, the speed constraint $V_{\min} \le v(t) \le V_{\max}, \forall t$ ($V_{\min} = 0$ for rotary-wing UAVs), and the maximum acceleration constraint $\|\mathbf{a}(t)\| \le a_{\max}, \forall t$ [6]. For a multi-UAV system, it is also important to take into consideration collision avoidance among the UAVs, which can be expressed as
$$\|\mathbf{q}_k(t) - \mathbf{q}_{k'}(t)\| \ge D_1, \quad \forall t,\ \forall k \ne k', \tag{23}$$
where $k$ and $k'$ represent different UAV indices, and $D_1$ is the minimum safety distance.
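The trajectory constraints above, including the collision-avoidance condition (23), can be checked on sampled trajectories with a few lines of Python. This is only a sketch: the function names are hypothetical, the trajectories are assumed to be sampled at a uniform step `dt`, and velocity and acceleration are approximated by finite differences.

```python
import math


def _diff(p, q, dt):
    """Finite-difference rate of change between two consecutive samples."""
    return tuple((b - a) / dt for a, b in zip(p, q))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def trajectory_feasible(q, dt, h_min, h_max, v_min, v_max, a_max):
    """Check altitude, speed, and acceleration limits on samples q[n] = (x, y, z)."""
    if any(not h_min <= p[2] <= h_max for p in q):
        return False
    vel = [_diff(q[n], q[n + 1], dt) for n in range(len(q) - 1)]
    if any(not v_min <= _norm(v) <= v_max for v in vel):
        return False
    acc = [_diff(vel[n], vel[n + 1], dt) for n in range(len(vel) - 1)]
    return all(_norm(a) <= a_max for a in acc)


def collision_free(trajectories, d_min):
    """Check (23): every pair of UAVs stays at least d_min apart at each sample."""
    for n in range(len(trajectories[0])):
        for k in range(len(trajectories)):
            for j in range(k + 1, len(trajectories)):
                if math.dist(trajectories[k][n], trajectories[j][n]) < d_min:
                    return False
    return True


if __name__ == "__main__":
    uav1 = [(0.0, 0.0, 100.0), (10.0, 0.0, 100.0), (20.0, 0.0, 100.0)]
    uav2 = [(0.0, 30.0, 100.0), (10.0, 30.0, 100.0), (20.0, 30.0, 100.0)]
    print(trajectory_feasible(uav1, 1.0, 50.0, 150.0, 0.0, 20.0, 5.0))
    print(collision_free([uav1, uav2], d_min=10.0))
```

In a trajectory optimization problem these checks would appear as constraints rather than as a post-hoc test; the sketch only makes their meaning concrete.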
2.4.3 2D Single-Mobility Modelling
This section summarizes the approach used to calculate the Doppler shift for NGSO satellites. A Cartesian coordinate system is adopted such that the moving satellite and the receiver lie in the y–z plane, as shown in Fig. 6. The Doppler shift experienced by a stationary receiver can be calculated as a function of time [3]
$$\mu(t) = \frac{f_0}{c}\,\frac{\mathbf{d}(t)}{|\mathbf{d}(t)|} \cdot \frac{\partial \mathbf{x}(t)}{\partial t}, \tag{24}$$
where $f_0$ is the carrier frequency, $c$ is the speed of light, $\mathbf{d}(t)$ is the distance vector between the satellite and the receiver, and $\mathbf{x}(t)$ is the vector of the satellite position. These vectors can be calculated as
$$\mathbf{d}(t) = [0,\ (R + h)\sin(\omega t),\ (R + h)\cos(\omega t) - R]^T, \tag{25}$$
and
$$\mathbf{x}(t) = [0,\ (R + h)\sin(\omega t),\ (R + h)\cos(\omega t)]^T, \tag{26}$$
where $R$ is the earth radius, $h$ is the satellite altitude, and $\omega$ is the satellite angular velocity.
Fig. 6 2D single-mobility model [3]
(1) The Doppler shift is expressed as a function of the elevation angle $\theta(t)$, i.e.,
$$\mu(t) = \frac{f_0}{c}\,\omega R\cos\theta(t), \tag{27}$$
where the angular velocity is $\omega = \sqrt{G M_E/(R + h)^3}$, with $G$ the gravitational constant and $M_E$ the earth mass.
(2) The delay at instant $t$ is updated as
$$\tau(t) = \frac{|\mathbf{d}(t)|}{c}. \tag{28}$$
However, this result is only applicable to the 2D NGSO-ground system. For high-speed UAVs and mobile users on the ground, it is necessary to design a new modelling method to characterize 3D double mobility for NTN.
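As a numerical check, the Python sketch below evaluates the Doppler shift both from the vector form (24)–(26) and from the elevation-angle form (27), together with the delay (28). The function names and constants are assumptions; in particular, the angular velocity is taken as the circular-orbit value $\omega = \sqrt{GM_E/(R+h)^3}$, consistent with (22), and the sketch evaluates the expressions as written without addressing the sign convention for approaching versus receding satellites.

```python
import math

C = 299_792_458.0     # m/s, speed of light
GM_E = 3.986004418e14  # m^3/s^2, product G * M_E (assumed value)
R_E = 6371e3          # m, mean earth radius (assumed value)


def angular_velocity(h):
    """Circular-orbit angular velocity at altitude h (metres)."""
    return math.sqrt(GM_E / (R_E + h) ** 3)


def _geometry(t, h):
    w = angular_velocity(h)
    a = R_E + h
    d = (0.0, a * math.sin(w * t), a * math.cos(w * t) - R_E)    # (25)
    dx = (0.0, a * w * math.cos(w * t), -a * w * math.sin(w * t))  # d/dt of (26)
    return w, d, dx


def doppler_vector_form(t, f0, h):
    """Doppler shift from (24), using the vectors (25) and (26)."""
    _, d, dx = _geometry(t, h)
    d_norm = math.sqrt(sum(c * c for c in d))
    return f0 / C * sum(p * q for p, q in zip(d, dx)) / d_norm


def doppler_elevation_form(t, f0, h):
    """Doppler shift from (27); cos(theta) is the horizontal share of d(t)."""
    w, d, _ = _geometry(t, h)
    cos_theta = d[1] / math.hypot(d[1], d[2])
    return f0 / C * w * R_E * cos_theta


def delay(t, h):
    """Propagation delay from (28), in seconds."""
    _, d, _ = _geometry(t, h)
    return math.sqrt(sum(c * c for c in d)) / C


if __name__ == "__main__":
    f0, h = 2e9, 600e3   # illustrative: 2 GHz carrier, 600 km altitude
    for t in (0.0, 60.0, 120.0):
        print(t, doppler_vector_form(t, f0, h), doppler_elevation_form(t, f0, h), delay(t, h))
```

The two Doppler columns should coincide, since substituting (25) and (26) into (24) reduces to (27); at $t = 0$ the satellite is at zenith, the shift is zero, and the delay equals $h/c$.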
2.4.4 3D Double-Mobility Modelling
If the user is on an aircraft or a high-speed train, there is an additional Doppler shift term resulting from the user's own velocity, and the delay depends on the time-varying link distance. As shown in Fig. 7, we consider 3D double-mobility modelling for NTN. We assume that the distance from the user to the NTN node at instant $t_k$ is $d(t_k)$, and the velocities of the NTN node and the user at instant $t_k$ are denoted as $\mathbf{v}_t(t_k) = v_t(t_k)[\Psi_{v,t}(t_k), \Phi_{v,t}(t_k), \Omega_{v,t}(t_k)]^T$ and $\mathbf{v}_r(t_k) = v_r(t_k)[\Psi_{v,r}(t_k), \Phi_{v,r}(t_k), \Omega_{v,r}(t_k)]^T$, respectively. The transmit steering direction at instant $t_k$ is $\mathbf{q}_t(t_k) = [\Psi_{q,t}(t_k), \Phi_{q,t}(t_k), \Omega_{q,t}(t_k)]^T$. Similarly, denote by $\mathbf{q}_r(t_k) = [\Psi_{q,r}(t_k), \Phi_{q,r}(t_k), \Omega_{q,r}(t_k)]^T$ the receive steering direction at instant $t_k$. At $t_0 = 0$, an NTN node is dropped into the network, and the distance an NTN node moves per update should be limited to within 1 m, i.e., $v\Delta t < 1$. The $k$th time instant is given as $t_k = t_{k-1} + \Delta t$.
Fig. 7 3D double-mobility model [3]
(1) The Doppler frequency at instant $t_k$ is updated as [45]
$$\mu(t_k) = \frac{\mathbf{q}_r^T(t_k)\mathbf{v}_r(t_k) + \mathbf{q}_t^T(t_k)\mathbf{v}_t(t_k)}{\lambda}, \tag{29}$$
where $\lambda$ is the carrier wavelength.
(2) The delay at instant $t_k$ is updated as [45]
$$\tau(t_k) = \begin{cases} \tau(t_{k-1}) + \dfrac{\mathbf{q}_r^T(t_{k-1})\mathbf{v}_r(t_{k-1}) + \mathbf{q}_t^T(t_{k-1})\mathbf{v}_t(t_{k-1})}{c}\,\Delta t, & k > 0, \\ \dfrac{d(t_0)}{c}, & k = 0. \end{cases} \tag{30}$$
These results are suitable for all NTN nodes flying in 3D space.
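The updates (29) and (30) translate directly into a short Python sketch. The function names and the representation of vectors as 3-tuples are assumptions; the sign of the delay increment is implemented exactly as written in (30).

```python
C = 299_792_458.0  # m/s, speed of light


def dot(a, b):
    """Inner product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))


def doppler(q_r, v_r, q_t, v_t, wavelength):
    """Doppler frequency of (29) for one time instant, in Hz."""
    return (dot(q_r, v_r) + dot(q_t, v_t)) / wavelength


def delay_trace(d0, q_r, v_r, q_t, v_t, dt):
    """Delays tau(t_0), ..., tau(t_K) from (30).

    q_r, v_r, q_t, v_t are lists holding the steering directions and velocity
    vectors at t_0, ..., t_{K-1}; d0 is the initial link distance in metres.
    """
    taus = [d0 / C]
    for k in range(len(q_r)):
        rate = (dot(q_r[k], v_r[k]) + dot(q_t[k], v_t[k])) / C
        taus.append(taus[-1] + rate * dt)
    return taus


if __name__ == "__main__":
    # Illustrative values: user moving at 30 m/s along its steering direction,
    # NTN node moving at 7 km/s against its steering direction, 15 cm wavelength.
    print(doppler((1.0, 0.0, 0.0), (30.0, 0.0, 0.0),
                  (0.0, 0.0, 1.0), (0.0, 0.0, -7000.0), 0.15))
```

When choosing `dt`, the condition $v\Delta t < 1$ from the text means that a node moving at 7 km/s needs a step shorter than about 0.14 ms.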
3 Unique Challenges in NTN
In this section, we discuss some important design issues, including multiple access, handover management, interference control, and mobility management.
3.1 Multiple Access
Future NTN is expected to provide high-quality service for global users with requirements of low latency, high spectral efficiency, and high reliability. Multiple-access technologies, which remain an open challenge, are essential for realizing the full potential of NTN. There are four main multiple-access technologies in NTN, i.e., frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), and space division multiple access (SDMA) [46].
(1) FDMA refers to a scheme in which the NTN node divides the spacecraft transponder bandwidth into multiple different working carrier frequencies. Due to the presence of high-power amplifiers (HPAs) in the PHY-layer chain, the nonlinear characteristics of NTN channels are a hindrance to multicarrier modulation and cause severe interference [47].
(2) Different from FDMA, TDMA is a method in which all concerned users share the same carrier frequency and bandwidth in nonoverlapping time intervals. As such, there is no intermodulation interference problem in TDMA, since only one carrier occupies the satellite transponder at any time. Another advantage of TDMA is that it is based on digital transmission, so information can be easily processed in the time domain and information exchange is flexible. However, TDMA may suffer from the complexity of its computer procedures and hardware, as well as the need for automated synchronization among users. Besides, owing to the high burst bit rate, the peak power and bandwidth of individual users in TDMA are larger than those in FDMA [48].
(3) Different from FDMA and TDMA, CDMA is a scheme in which all concerned users simultaneously share the same bandwidth and recognize their signals by various processes, such as code identification. In effect, all users share both the frequency and time resources using a set of mutually orthogonal codes. CDMA has strong anti-interference capability, high security, and high flexibility. Therefore, CDMA is widely used in military satellite communication systems. However, compared with FDMA and TDMA, CDMA has relatively low bandwidth efficiency and is only applicable to low-rate data transmission, such as very small aperture terminal (VSAT) systems [49].
(4) SDMA refers to a scheme in which all concerned users can use the same frequency and time resources via a separate spatial channel for each link. There are two different beamforming approaches in SDMA for NTN communications: (a) multiple-spot beam antennas, which are the fundamental way of applying SDMA in large satellite systems, and (b) adaptive array antennas, which dynamically adapt to the number of users [46]. However, it is difficult to allocate independent satellite antennas to each earth station. Thus, SDMA must be used in combination with other multiple-access methods, such as TDMA and FDMA.
Besides the four main multiple-access methods, a relatively novel multiple-access approach is the non-orthogonal multiple-access (NOMA) scheme. In this scheme, multiple users with different channel conditions are multiplexed in the power domain on the transmitter side, which requires multiuser signal separation on the receiver side. From an information-theoretic point of view, with superposition coding at the transmitter and successive interference cancellation (SIC) at the receiver, non-orthogonal user multiplexing not only outperforms orthogonal multiplexing, but also achieves the capacity region of the downlink broadcast channel. Importantly, NOMA can also be applied to uplink communications. Although the capacity of the channel can be achieved under orthogonal multiple access (OMA), NOMA achieves a better trade-off between system capacity and user fairness [4]. Therefore, higher spectrum efficiency can be achieved with NOMA compared to TDMA or FDMA [50]. However, for NOMA-based NTN systems, the performance analysis is nontrivial, as more complicated time scheduling strategies should be designed to optimize the system's effective capacity performance [51]. Besides, it is a challenge to jointly design resource allocation and UAV trajectory with NOMA in practice [52].
To sum up, it is difficult for existing access technologies to simultaneously improve spectrum efficiency, realize precise synchronization, reduce computational complexity, and strengthen anti-interference capability in future heterogeneous 3D NTN. Besides, the choice of multiple-access technology depends on the application scenario and demand.
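To make the comparison between NOMA and OMA concrete, the Python sketch below computes the achievable rates of a textbook two-user downlink. It assumes superposition coding with perfect SIC at the user with the stronger channel, and compares against an OMA baseline that gives each user full power on an equal share of the time; the function names, the power split, and the channel gains are illustrative assumptions rather than parameters from the text.

```python
import math


def noma_rates(p, a_weak, g_weak, g_strong, n0):
    """Two-user downlink NOMA rates in bit/s/Hz.

    a_weak is the fraction of the total power p given to the weak user;
    g_weak <= g_strong are channel power gains; n0 is the noise power.
    """
    a_strong = 1.0 - a_weak
    # The weak user decodes its own signal, treating the other signal as noise.
    r_weak = math.log2(1.0 + a_weak * p * g_weak / (a_strong * p * g_weak + n0))
    # The strong user removes the weak user's signal by SIC (assumed perfect).
    r_strong = math.log2(1.0 + a_strong * p * g_strong / n0)
    return r_weak, r_strong


def oma_rates(p, g_weak, g_strong, n0, share=0.5):
    """OMA baseline: each user gets full power on its share of the time."""
    r_weak = share * math.log2(1.0 + p * g_weak / n0)
    r_strong = (1.0 - share) * math.log2(1.0 + p * g_strong / n0)
    return r_weak, r_strong


if __name__ == "__main__":
    # Illustrative case with a 20 dB gap between the two channel gains.
    print(noma_rates(p=1.0, a_weak=0.8, g_weak=1.0, g_strong=100.0, n0=1.0))
    print(oma_rates(p=1.0, g_weak=1.0, g_strong=100.0, n0=1.0))
```

In this particular example both users obtain a higher rate with NOMA than with equal-share OMA; the gain depends on the disparity between the channel gains and on the power split, so other parameter choices can give a different picture.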
3.2 Handover Management
The handover (HO) mechanism is designed to enable users to move from one cell to another while maintaining service connectivity without noticeable interruption. One of the most important challenges of NTN is to handle frequent handovers without increasing the number of radio link failures (RLFs), HO failures (HOFs), and ping-pongs (PPs) [53]. Unfortunately, terrestrial handover management methods are inadequate for NTN, since NTN nodes are mobile, unlike fixed BSs on the ground. Many studies have analyzed handover management for UAVs or HAPS [4, 54]. In addition, various companies have conducted field trials to evaluate the handover performance of aerial UEs in a commercial LTE network, compared to ground users [36]. In this section, we focus on handover issues in satellite communications, especially LEO.
The nature of LEO satellite links has important implications for handover. First, the LEO footprint can move at speeds of up to 7.56 km/s, which is significantly faster than terrestrial user mobility. Even with NTNs' larger footprints, this speed may lead to an increased number of handovers [55]. Second, the longer propagation delay leads to outdated UE measurements [3], late handover decisions, and, eventually, handover procedure failures [56]. The work [57] classified the handover mechanisms in mobile satellite networks. Specifically, from the link layer perspective, handover can be divided into three categories: spot-beam HO/intra-satellite HO, in which the end user is reallocated to another spot beam; satellite HO/inter-satellite HO, in which the end user is transferred from the previously connected satellite to another satellite; and inter-satellite link (ISL) handover, which happens when interplane ISLs are temporarily switched off. From the network layer perspective, handover can also be divided into three schemes, i.e., the hard handover scheme, the soft handover scheme, and the signaling-diversity scheme, which are distinguished by the relationship with the original link when the new link is established. The work [58] discussed the different effects of earth fixed cells and earth moving cells on LEO handover and provided some valuable observations, e.g., the earth fixed cell scenario may lead to periodic peaks (bursts) in the number of handovers.
The conventional handover procedure includes the measurement event trigger, HO decision, admission control, and UE reconfiguration. A commonly used trigger is the NR event A3, which is triggered based on reference signal received power (RSRP) measurements when the RSRP of a neighboring cell becomes better than that of the serving cell by a handover margin (HOM, in dB) for a time-to-trigger (TTT, in ms). According to simulations in [56], the traditional 5G NR HO algorithm fails to provide seamless connectivity in LEO-based NTN. However, a comparison between measurement-based handover and alternative handover mechanisms (namely, distance-based handover, elevation angle-based handover, and timer-based handover) in [55] indicated that the conventional HO mechanism can still outperform these alternatives with proper parameter selection for a homogeneous NTN. Note that this conclusion was drawn without considering feasibility or possible errors in, e.g., elevation angles, locations, or distances. Moreover, because the coverage radius is much smaller than the propagation distance, the signal strength varies little across cells, which makes it difficult for the UE to distinguish the better cell [3, 55]. For LEO satellite-based NTN, adaptation of the NR handover protocols needs further study that takes advantage of knowledge of the UE location and satellite ephemeris information [13]. Recently, handover decision algorithms using various decision criteria have been proposed, such as reinforcement learning [59] and optimization criteria [60]. More characteristics may be considered in future handover research [57], including UAV-codesign-related characteristics, millimeter-wave channel characteristics, mobile device characteristics, etc.
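The event A3 condition described above can be sketched in a few lines of Python. This is a simplified illustration: the function name and the periodic-sampling model are assumptions, and the additional offsets, hysteresis, and measurement filtering that an actual NR configuration may apply are left out.

```python
def a3_trigger_index(serving_rsrp, neighbor_rsrp, hom_db, ttt_ms, period_ms):
    """Return the first sample index at which the simplified A3 event fires.

    serving_rsrp and neighbor_rsrp are RSRP samples in dBm taken every
    period_ms. The event fires once the neighbor has been more than hom_db
    better than the serving cell continuously for at least ttt_ms.
    Returns None if the event never fires.
    """
    run = 0  # number of consecutive samples satisfying the entry condition
    for n, (s, nb) in enumerate(zip(serving_rsrp, neighbor_rsrp)):
        run = run + 1 if nb > s + hom_db else 0
        # 'run' consecutive samples span (run - 1) * period_ms of time.
        if run > 0 and (run - 1) * period_ms >= ttt_ms:
            return n
    return None


if __name__ == "__main__":
    serving = [-100.0] * 6
    neighbor = [-99.0, -96.0, -96.0, -96.0, -96.0, -95.0]
    print(a3_trigger_index(serving, neighbor, hom_db=3.0, ttt_ms=80.0, period_ms=40.0))
```

A larger HOM or TTT suppresses ping-pongs but delays the handover, which is the trade-off that becomes critical when the two RSRP traces differ only slightly, as in the LEO case discussed above.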
3.3 Interference Control
Because the spectral resources of 5G are scarce, it is of great importance to adopt interference management for the spectrum shared between NTN and terrestrial equipment [61]. In the past, due to the small number of aerial platforms, interference avoidance was generally adopted. However, with the increase of NTN types and numbers in 3D space, more serious interference arises in ground-air and air-air communications. Furthermore, the LoS-dominant links of NTN lead to strong air-to-ground interference, severely limiting cellular network capacity for ground users. In addition, the mobility of transmission equipment in NTN causes the Doppler effect, which depends on the relative velocity as well as the frequency band. Even quasi-static GEO satellites exhibit some Doppler effects due to perturbations from the sun and moon, as well as the nonspherical component of the earth's attraction. The Doppler effect continually changes the carrier frequency, phase, and spacing, which potentially leads to inter-carrier interference (ICI) [13].
Interference control for HAPS, UAVs, and satellites has been studied extensively. For HAPS, the management of on-board radio power and frequencies, antenna beams, and time scheduling influences interference management and determines the signal-to-interference-plus-noise ratio (SINR) levels of HAPS. Figure 7 in [4] also summarizes the research on interference management for HAPS and is a useful reference for researchers in this area. For UAVs, 3GPP proposed several interference management methods, which we summarize in Table 4.
Table 4 Summary of the UAV interference management methods
Method | Description
FD-MIMO | BS transmitters/receivers are equipped with full-dimensional (FD) antenna arrays to achieve flexible beamforming in both the azimuth and elevation dimensions.
Directional antenna | Aerial UEs are equipped with a directional antenna instead of an omnidirectional antenna, decreasing the interference power coming from a broad range of angles.
Receive beamforming | Aerial UEs are equipped with more than two receive antennas to mitigate the interference in the downlink.
Intra-site JT CoMP | With intra-site joint transmission CoMP, multiple cells from the same site are coordinated, and data is delivered to the UEs jointly.
Coverage extension | Enhances synchronization and initial access for aerial UEs.
Power control | Targets a heterogeneous network that includes both terrestrial and aerial users, where the present UL power control method might be enhanced by introducing UE-specific power control parameters.
Coordinated resource allocation | Intercell interference can be reduced by jointly optimizing communication resources across cells, which may include channel assignment, power allocation, BS association, and so on.
Path design | Uses the UAV mobility and known (or partially known) information about the environment to choose the path with the least interference.
Apart from HAPS and UAVs, various approaches have also been proposed to address interference challenges in satellite communications, such as interference coordination, interference cancellation, and interference mitigation. Specifically, interference coordination aims to effectively manage system resources in terms of space, time, and frequency. Typical technologies include soft frequency reuse, NOMA, and space protected areas [62, 63]. Interference cancellation aims to eliminate the interference between users via mathematical methods and to precode user signals based on CSI to suppress interference. Beamforming and smart antenna technology, as representatives of precoding [64], can reduce interference from harmful directions in a dense network. However, these technologies require more complex communication equipment and accurate location information of the interfering terminals. Interference mitigation generally uses power control to reduce the interference to a value below the acceptable level of the receiver [65]. The aim of power control is to maximize the throughput of the network subject to the SINR requirements of the satellite and terrestrial links. However, it is not conducive to the centralized control of the overall network, due to the complex process of information exchange.
To sum up, existing research on interference management in NTN has mainly focused on UAV/HAPS-ground, satellite-satellite (e.g., GSO-NGSO satellites) [66], or satellite-ground links [67]. In future 6G mobile communications, different networks in 3D space will be integrated, and advanced antenna technologies will emerge. As such, the existing interference management technologies may no longer be applicable to the heterogeneous and complex NTN deployments of the future.
3.4 Mobility Management
As is well known, satellites fly in predetermined orbits, which are usually fixed. Therefore, satellite mobility management generally uses the satellite ephemeris to predict satellite positions, which assists mechanisms such as handover. Similarly, it is usually difficult for balloons and airships to change their tracks flexibly, and they are strongly affected by wind speed and wind direction. We therefore focus on UAV mobility management in this subsection. For conventional UAVs controlled over a direct radio link, the pilot generally controls the flight trajectory through the UAV controller, or pre-plans the cruise trajectory at the ground station before takeoff and then adjusts the route according to actual needs during the flight. However, for more complex beyond visual line of sight (BVLOS) flight scenarios, it is important to design intelligent, adaptive UAV path planning. Because ubiquitous 3D cellular coverage cannot be guaranteed in general, and SWAP constraints limit the flight time and distance of UAVs, predictable and controllable path planning provides a new degree of freedom for UAVs to improve communication performance and mission completion efficiency. The works [68, 69] studied the trajectory design problem of cellular-connected UAVs to minimize the mission completion time while ensuring certain connectivity requirements. In [70], the authors designed 3D trajectories and power allocation for the UAVs in order to maximize data flow while meeting the interference constraint. Besides, the work [71] managed interference by jointly optimizing the transmit power of all communication nodes and the UAV trajectory to maximize the sum throughput of the target sensors. Although there has been a lot of research on path planning with different optimization objectives, several important issues still need to be considered.
First, most works on path planning of cellular-connected UAVs ignore the obstacle avoidance issue by assuming that the UAV altitude is sufficiently high. Such an assumption does not generally hold for future applications of cellular-connected UAVs in complex environments, such as urban areas, where obstacle avoidance is critical for safe UAV operations. In fact, obstacle avoidance has been well studied in the field of robotics, and we believe that these methods can be applied to UAVs after proper extension. The work [72] gave an overview of path-planning and obstacle avoidance algorithms for
UAVs, including the potential field algorithm, genetic algorithms, the A* algorithm, Dijkstra's algorithm, and reinforcement learning. Second, some studies rely on strong assumptions such as a pure LoS link and an isotropic BS antenna pattern, which result in circular BS coverage in the sky at a given altitude. To overcome these limitations, trajectory design methods based on reinforcement learning and map construction can be considered. The work [73] proposed a new framework named simultaneous navigation and radio mapping (SNARM), which requires only raw signal measurements as input for path learning and radio map construction. Furthermore, the work [74] extended this approach to simultaneous environment sensing and channel knowledge mapping to improve the efficiency of reinforcement learning. Third, existing research does not cover the specific kinematics and dynamics of UAV flight, where energy is an important factor for actual path planning due to SWAP constraints. The work [75] validated the recently developed theoretical energy model for rotary-wing UAVs and then constructed a general model for intricate flying scenarios where a rigorous theoretical derivation is difficult. Such an energy model can be utilized in future path planning designs.
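Of the algorithms listed in the overview [72], A* is among the easiest to sketch. The example below is a minimal illustration on a 2D occupancy grid and is not a UAV-ready planner: the grid, the unit step cost, the four-neighbor movement model, and the function name `astar` are all assumptions made for this sketch, and a practical design would add altitude, kinematic, energy, and connectivity constraints.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid; grid[r][c] == 1 marks an obstacle.

    Returns the list of cells from start to goal, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates the cost of unit-step moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    came_from = {}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


# Hypothetical map: a wall of obstacles forces a detour through the right column.
grid = [[0, 0, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 0]]
path = astar(grid, start=(0, 0), goal=(2, 0))
print(path)
```

The same skeleton becomes Dijkstra's algorithm if the heuristic is replaced by zero, which is one way to see how the two methods named above relate.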
4 Future Research Directions of NTN
To support the ambitious goals of 6G and to construct a space-air-ground integrated network (SAGIN), we provide several future research directions of NTN, including flexible mobility management for heterogeneous networks, seamless and continuous coverage in 3D space, cross-layer resource management, and NTN integrated sensing and localization.
4.1 Flexible Mobility Management for Heterogeneous Network
NTN is a dynamic and heterogeneous network that includes satellites, HAPS, and UAS, which are located at different altitudes and have their own antenna and transmission characteristics [76]. Conventionally, each platform operates independently, without information exchange or link connections to the others. With the emergence of the industrial Internet of Things, cloud computing, and big data, NTN needs to provide more resources to handle the increasing traffic demand and massive numbers of services simultaneously. Therefore, it is of great significance to construct a SAGIN with a rational division of labor and effective cooperation among different platforms. Due to its dynamic topology and frequent reconfiguration, NTN suffers from high routing protocol overhead, poor link stability, and a high packet loss rate. To adapt to the changing flight environment and mission requirements, we ought to fully exploit the predictability of positions and paths, as well as study in depth routing selection and channel access technology
in 3D space. Consequently, it is necessary to utilize intelligent, adaptive, and robust mobility management to realize close connections among satellite constellations, UAV ad hoc networks, and HAPS. In the future, the heterogeneous NTN is expected to feature high reliability, high safety, massive connectivity, wide coverage, and cross-layer resource allocation.
4.2 Seamless and Continuous Coverage in 3D Space
Although satellites can provide wide coverage and support high-capacity communication, satellite communications still face many practical coverage challenges, such as large transmission delay, echo effects, communication blind spots, solar transit interruption, star eclipse, and rain attenuation. As such, it is crucial to utilize UAVs and HAPS to assist satellite communication, ensure seamless coverage, and improve communication performance. Besides, integrating NTN into terrestrial networks to realize highly dynamic, massively connected, and highly robust networking is also a future development direction of NTN. Therefore, we have to comprehensively consider the number of terrestrial and NTN nodes, the categories of NTN nodes, their deployments, and the number of antennas, to establish a space information highway. Moreover, NOMA, 3D multi-beamforming, ad hoc networking, and novel architectures, including reconfigurable intelligent surfaces (RIS) and modular antenna arrays, can be utilized to achieve seamless and continuous coverage of the dynamic 3D space.
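To give a sense of scale for the large transmission delay mentioned above, the following back-of-the-envelope sketch computes the one-way free-space propagation delay to a platform directly overhead. The altitudes used (about 35,786 km for a geostationary satellite, 600 km for a LEO satellite, and 20 km for a HAPS) are typical values assumed here for illustration and are not taken from this section; slant paths at low elevation angles are longer, and processing and queuing delays are ignored.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def one_way_delay_ms(altitude_km):
    """Free-space propagation delay in milliseconds for a platform at zenith."""
    return altitude_km * 1_000.0 / SPEED_OF_LIGHT_M_PER_S * 1_000.0


# Assumed, typical altitudes (hypothetical illustration values).
ASSUMED_ALTITUDES_KM = {
    "GEO satellite": 35_786.0,
    "LEO satellite": 600.0,
    "HAPS": 20.0,
}

delays_ms = {name: one_way_delay_ms(alt) for name, alt in ASSUMED_ALTITUDES_KM.items()}
for name, delay in delays_ms.items():
    print(f"{name}: {delay:.3f} ms one-way")
```

Under these assumptions the geostationary link is on the order of a hundred milliseconds each way, the LEO link a few milliseconds, and the HAPS link well under a millisecond, which is one reason lower-altitude platforms are attractive complements to satellites.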
4.3 Cross-Layer Resource Management
Most existing research on resource allocation in NTN communications has focused on a single layer, such as the physical layer or link layer [77]. To support the ambitious goals of 6G, a cross-layer design should be considered to jointly optimize the resource allocation of NTN. Information interaction between different platforms is one of the crucial problems in NTN. For example, a HAPS can usually serve as a controller to manage a large number of UAVs, and a UAV can be utilized to amplify and forward information from the ground. Moreover, the impact of massive MIMO channel characteristics in NTN remains a challenging research topic. The various intrinsic properties of space, aerial, and ground massive MIMO communication systems demand joint optimization of massive MIMO architectures, including time/frequency scheduling, beamforming design and interference mitigation, access management of different terminals or networks, and so on. Clearly, it is crucial to resort to cross-layer design to achieve better NTN performance.
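As a small illustration of the amplify-and-forward role mentioned above, the sketch below evaluates the standard end-to-end SNR expression for a two-hop variable-gain amplify-and-forward relay, γ = γ1γ2 / (γ1 + γ2 + 1), where γ1 and γ2 are the per-hop SNRs. The link labels and the 10 dB per-hop values are hypothetical; the section itself does not specify a relay model, so this should be read as one common textbook assumption rather than the chapter's own analysis.

```python
import math


def af_end_to_end_snr(snr_hop1, snr_hop2):
    """End-to-end SNR (linear) of a two-hop variable-gain amplify-and-forward relay."""
    return snr_hop1 * snr_hop2 / (snr_hop1 + snr_hop2 + 1.0)


def from_db(value_db):
    return 10.0 ** (value_db / 10.0)


def to_db(value_linear):
    return 10.0 * math.log10(value_linear)


# Hypothetical per-hop SNRs: ground-to-UAV and UAV-to-destination, 10 dB each.
ground_to_uav = from_db(10.0)
uav_to_destination = from_db(10.0)
end_to_end = af_end_to_end_snr(ground_to_uav, uav_to_destination)
print(f"end-to-end SNR: {to_db(end_to_end):.2f} dB")
```

The end-to-end SNR is always below the weaker hop because the relay amplifies its own received noise together with the signal, so power and scheduling decisions on one hop affect what is achievable on the other. This coupling is a simple example of why joint, rather than per-link, optimization is attractive.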
4.4 NTN Integrated Sensing and Localization
Sensing and positioning can provide location-based services such as navigation, mapping, and intelligent transportation. In addition, the accurate location information obtained through wireless sensing enables location-assisted (or location-aware) communication, which improves communication performance and network efficiency. However, since mobile terminals, such as vehicles, UAVs, or satellites, may move in various environments, electromagnetic waves experience Doppler shift, multipath interference, and transmission delay, which seriously degrade the communication quality. Therefore, it is a challenge to simultaneously track and communicate with mobile users in NTN. With the introduction of millimeter wave (mmWave) technology and the development of extremely large-scale antenna arrays (ELAA), future B5G and 6G communication networks can achieve ultrahigh throughput and ultrareliable data connections. The highly directional transmission also makes sensing and positioning possible in future 6G. It is also worth mentioning that, as a new array architecture, the reconfigurable holographic surface (RHS) has the potential to improve integrated sensing and communication (ISAC) performance and reduce hardware costs, owing to its much smaller element spacing and the absence of phase shifters. As such, RHS-aided NTN communications can help terrestrial or non-terrestrial equipment track mobile users in 3D space.
Acknowledgments This work was supported by the National Natural Science Foundation of China under grants 62071114 and 62172339.
References 1. X. You, C. Wang, J. Huang, X. Gao, Z. Zhang, and M. Wang, “Towards 6G wireless communication networks: Vision, enabling technologies, and new paradigm shifts,” Science China Information Sciences, vol. 64, no. 1, pp. 1–74, 2020. 2. B. Aazhang, P. Ahokangas, H. Alves, M.-S. Alouini, J. Beek, H. Benn, M. Bennis, J. Belfiore, E. Strinati, F. Chen, K. Chang, F. Clazzer, S. Dizit, D. Kwon, M. Giordiani, W. Haselmayr, J. Haapola, E. Hardouin, E. Harjula, and P. Zhu, Key drivers and research challenges for 6G ubiquitous wireless intelligence (white paper), 2019. 3. 3GPP, “Solutions for NR to support non-terrestrial networks (NTN),” 2019. 4. G. Karabulut Kurt, M. G. Khoshkholgh, S. Alfattani, A. Ibrahim, T. S. J. Darwish, M. S. Alam, H. Yanikomeroglu, and A. Yongacoglu, “A vision and framework for the high altitude platform station (HAPs) networks of the future,” IEEE Communications Surveys and Tutorials, vol. 23, no. 2, pp. 729–779, 2021. 5. Y. Maguire, “High altitude connectivity: The next chapter,” Facebook, 2018. 6. Y. Zeng, Q. Wu, and R. Zhang, “Accessing from the sky: A tutorial on UAV communications for 5G and beyond,” Proceedings of the IEEE, vol. 107, no. 12, pp. 2327–2375, 2019. 7. V. U. Pai and B. Sainath, “UAV selection and link switching policy for hybrid tethered UAVassisted communication,” IEEE Communications Letters, vol. 25, no. 7, pp. 2410–2414, 2021.
8. R. Giofrè, P. Colantonio, F. Giannini, L. Gonzalez, L. Cabria, and F. De Arriba, “Development of solid state power amplifier on GaN technology for Galileo satellite systems,” in 2016 21st International Conference on Microwave, Radar and Wireless Communications (MIKON), 2016, pp. 1–4. 9. M. Quintan, “Galileo - a European global satellite navigation system,” in 2005 The IEE Seminar on New Developments and Opportunities in Global Navigation Satellite Systems (Ref. No. 2005/10810), 2005, pp. 9–16. 10. “DJI-official website,” https://www.dji.com. 11. F. Rinaldi, H.-L. Maattanen, J. Torsner, S. Pizzi, S. Andreev, A. Iera, Y. Koucheryavy, and G. Araniti, “Non-terrestrial networks in 5G & beyond: A survey,” IEEE Access, vol. 8, pp. 165 178–165 200, 2020. 12. X. Lin, S. Rommer, S. Euler, E. A. Yavuz, and R. S. Karlsson, “5G from space: An overview of 3GPP non-terrestrial networks,” IEEE Communications Standards Magazine, 2021. 13. 3GPP, “Study on new radio (NR) to support non-terrestrial networks,” 2019. 14. A. Mohammed, A. Mehmood, F.-N. Pavlidou, and M. Mohorcic, “The role of high-altitude platforms (HAPs) in the global wireless connectivity,” Proceedings of the IEEE, vol. 99, no. 11, pp. 1939–1953, 2011. 15. Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: Opportunities and challenges,” IEEE Communications Magazine, vol. 54, no. 5, pp. 36–42, 2016. 16. Y. Zeng, J. Lyu, and R. Zhang, “Cellular-connected UAV: Potential, challenges, and promising technologies,” IEEE Wireless Communications, vol. 26, no. 1, pp. 120–127, 2018. 17. “3GPP specifications: active work programme,” https://www.3gpp.org/DynaReport/ GanttChart-Level-2.htm. 18. G. Geraci, A. Garcia-Rodriguez, M. M. Azari, A. Lozano, M. Mezzavilla, S. Chatzinotas, Y. Chen, S. Rangan, and M. Di Renzo, “What will the future of UAV cellular communications be? a flight from 5G to 6G,” arXiv preprint arXiv:2105.04842, 2021. 19. M. Giordani and M. 
Zorzi, “Non-terrestrial networks in the 6G era: Challenges and opportunities,” IEEE Network, vol. 35, no. 2, pp. 244–251, 2020. 20. S. Chen, S. Sun, and S. Kang, “System integration of terrestrial mobile communication and satellite communication—the trends, challenges and key technologies in B5G and 6G,” China Communications, vol. 17, no. 12, pp. 156–171, 2020. 21. A. Pino, Y. Rodriguez-Vaqueiro, B. Gonzalez-Valdes, M. Arias Acuña, D. Martinez-De-Rioja, J. A. Encinar, and G. Toso, “A multibeam parabolic reflectarray for onboard Tx and Rx satellite antennas at the Ka band,” in 2018 IEEE International Symposium on Antennas and Propagation USNC/URSI National Radio Science Meeting, 2018, pp. 1409–1410. 22. S.-M. Moon, S. Yun, I.-B. Yom, and H. L. Lee, “Phased array shaped-beam satellite antenna with boosted-beam control,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 12, pp. 7633–7636, 2019. 23. H. He, S. Zhang, Y. Zeng, and R. Zhang, “Joint altitude and beamwidth optimization for UAVenabled multiuser communications,” IEEE Communications Letters, vol. 22, no. 2, pp. 344– 347, 2018. 24. P.-D. Arapoglou, K. Liolis, M. Bertinelli, A. Panagopoulos, P. Cottis, and R. De Gaudenzi, “Mimo over satellite: A review,” IEEE Communications Surveys and Tutorials, vol. 13, no. 1, pp. 27–51, 2011. 25. L. You, K.-X. Li, J. Wang, X. Gao, X.-G. Xia, and B. Ottersten, “Massive MIMO transmission for LEO satellite communications,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 8, pp. 1851–1865, 2020. 26. G. C. Hess, “Land-mobile satellite excess path loss measurements,” IEEE Transactions on Vehicular Technology, vol. 29, no. 2, pp. 290–297, 1980. 27. C. Loo, “A statistical model for a land mobile satellite link,” IEEE Transactions on Vehicular Technology, vol. 34, no. 3, pp. 122–127, 1985.
28. L. Bai, C.-X. Wang, G. Goussetis, S. Wu, Q. Zhu, W. Zhou, and E.-H. M. Aggoune, “Channel modeling for satellite communication channels at Q-band in high latitude,” IEEE Access, vol. 7, pp. 137 691–137 703, 2019. 29. L. Bai, Q. Xu, Z. Huang, S. Wu, S. Ventouras, G. Goussetis, and X. Cheng, “An atmospheric data-driven Q-band satellite channel model with feature selection,” IEEE Transactions on Antennas and Propagation, vol. 70, no. 6, pp. 4002–4013, 2022. 30. S. Zheng, W. Liu, Z. Deng, K. Wang, W. Lin, J. Lei, Y. Jin, and H. Liu, “A modified s-band satellite channel simulation model,” in 2021 IEEE 4th International Conference on Electronics Technology (ICET), 2021, pp. 722–726. 31. I. Recommendation, “Propagation data required for the design of earth-space land mobile telecommunication systems,” International Telecommunication Union, pp. 681–686, 2009. 32. P. Series, “Propagation data and prediction methods required for the design of earth-space telecommunication systems,” Recommendation ITU-R, pp. 618–12, 2015. 33. Z. Ma, B. Ai, R. He, G. Wang, Y. Niu, and Z. Zhong, “A wideband non-stationary air-toair channel model for UAV communications,” IEEE Transactions on Vehicular Technology, vol. 69, no. 2, pp. 1214–1226, 2019. 34. R. Amorim, H. Nguyen, P. Mogensen, I. Z. Kovács, J. Wigard, and T. B. Sørensen, “Radio channel modeling for UAV communication over cellular networks,” IEEE Wireless Communications Letters, vol. 6, no. 4, pp. 514–517, 2017. 35. M. M. Azari, F. Rosas, K.-C. Chen, and S. Pollin, “Ultra reliable UAV communication using altitude and cooperation diversity,” IEEE Transactions on Communications, vol. 66, no. 1, pp. 330–344, 2017. 36. 3GPP, “Study on enhanced LTE support for aerial vehicles,” 2017. 37. ——, “Study on channel model for frequencies from 0.5 to 100 GHz,” 2017. 38. N. Saeed, H. Almorad, H. Dahrouj, T. Y. Al-Naffouri, J. S. Shamma, and M.-S. 
Alouini, “Pointto-point communication in integrated satellite-aerial 6G networks: State-of-the-art and future challenges,” IEEE Open Journal of the Communications Society, vol. 2, pp. 1505–1525, 2021. 39. 3GPP, “Solutions for NR to support non-terrestrial networks (NTN),” 2021. 40. S. Berrezzoug, F. T. Bendimerad, and A. Boudjemai, “Communication satellite link budget optimization using gravitational search algorithm,” in 2015 3rd International Conference on Control, Engineering Information Technology (CEIT), 2015, pp. 1–7. 41. A. Guidotti, A. Vanelli-Coralli, A. Mengali, and S. Cioni, “Non-terrestrial networks: Link budget analysis,” in ICC 2020 - 2020 IEEE International Conference on Communications (ICC), 2020, pp. 1–7. 42. 3GPP, “Study on new radio (NR) to support non-terrestrial networks,” 2020. 43. L. J. Ippolito, Satellite Orbits, 2008, pp. 19–36. 44. D. Roddy, Satellite Communications, McGraw Hill, New York, 2006. 45. 3GPP, “CR to TR 38.901 for remaining open issues in iiot channel modelling,” 2019. 46. S. D. Ilcev, “Space division multiple access (SDMA) applicable for mobile satellite communications,” in 2011 10th International Conference on Telecommunication in Modern Satellite Cable and Broadcasting Services (TELSIKS), vol. 2, 2011, pp. 693–696. 47. R. Mulinde, T. F. Rahman, and C. Sacchi, “Constant-envelope SC-FDMA for nonlinear satellite channels,” in 2013 IEEE Global Communications Conference (GLOBECOM), 2013, pp. 2939– 2944. 48. S. D. Lcev, “Time division multiple access (TDMA) applicable for mobile satellite communications,” in 2011 21st International Crimean Conference “Microwave Telecommunication Technology”, 2011, pp. 365–367. 49. P. Monsen, “Multiple-access capacity in mobile user satellite systems,” IEEE Journal on Selected Areas in Communications, vol. 13, no. 2, pp. 222–231, 1995. 50. S. M. R. Islam, N. Avazov, O. A. Dobre, and K.-s. 
Kwak, “Power-domain non-orthogonal multiple access (NOMA) in 5G systems: Potentials and challenges,” IEEE Communications Surveys and Tutorials, vol. 19, no. 2, pp. 721–742, 2017. 51. L. Bai, L. Zhu, X. Zhang, W. Zhang, and Q. Yu, “Multi-satellite relay transmission in 5G: Concepts, techniques, and challenges,” IEEE Network, vol. 32, no. 5, pp. 38–44, 2018.
52. P. Li and J. Xu, “Fundamental rate limits of UAV-enabled multiple access channel with trajectory optimization,” IEEE Transactions on Wireless Communications, vol. 19, no. 1, pp. 458–474, 2020. 53. 3GPP, “Radio resource control (RRC) protocol specification,” 2019. 54. L. Gupta, R. Jain, and G. Vaszkun, “Survey of important issues in UAV communication networks,” IEEE Communications Surveys and Tutorials, vol. 18, no. 2, pp. 1123–1152, 2015. 55. Y. I. Demir, M. S. J. Solaija, and H. Arslan, “On the performance of handover mechanisms for non-terrestrial networks,” arXiv preprint arXiv:2201.04904, 2022. 56. E. Juan, M. Lauridsen, J. Wigard, and P. E. Mogensen, “5G new radio mobility performance in LEO-based non-terrestrial networks,” in 2020 IEEE Globecom Workshops (GC Wkshps), 2020, pp. 1–6. 57. S. Park and J. Kim, “Trends in LEO satellite handover algorithms,” in 2021 Twelfth International Conference on Ubiquitous and Future Networks (ICUFN), 2021, pp. 422–425. 58. 3GPP, “Discussion on earth fixed vs. earth moving cells in NTN LEO,” 2019. 59. S. He, T. Wang, and S. Wang, “Load-aware satellite handover strategy based on multiagent reinforcement learning,” in GLOBECOM 2020-2020 IEEE Global Communications Conference, 2020, pp. 1–6. 60. Y. Li, W. Zhou, and S. Zhou, “Forecast based handover in an extensible multi-layer leo mobile satellite system,” IEEE Access, vol. 8, pp. 42 768–42 783, 2020. 61. C. Zhang, C. Jiang, L. Kuang, J. Jin, Y. He, and Z. Han, “Spatial spectrum sharing for satellite and terrestrial communication networks,” IEEE Transactions on Aerospace and Electronic Systems, vol. 55, no. 3, pp. 1075–1089, 2019. 62. E. Lagunas, C. G. Tsinos, S. K. Sharma, and S. Chatzinotas, “5G cellular and fixed satellite service spectrum coexistence in c-band,” IEEE Access, vol. 8, pp. 72 078–72 094, 2020. 63. A. Ugolini, Y. Zanettini, A. Piemontese, A. Vanelli-Coralli, and G. 
Colavolpe, “Efficient satellite systems based on interference management and exploitation,” in 2016 50th Asilomar Conference on Signals, Systems and Computers, 2016, pp. 492–496. 64. S. K. Sharma, S. Chatzinotas, and B. Ottersten, “Transmit beamforming for spectral coexistence of satellite and terrestrial networks,” in 8th International Conference on Cognitive Radio Oriented Wireless Networks, 2013, pp. 275–281. 65. E. Lagunas, S. K. Sharma, S. Maleki, S. Chatzinotas, and B. Ottersten, “Resource allocation for cognitive satellite communications with incumbent terrestrial networks,” IEEE Transactions on Cognitive Communications and Networking, vol. 1, no. 3, pp. 305–317, 2015. 66. S. Chatzinotas, S. K. Sharma, and B. Ottersten, “Frequency packing for interference alignmentbased cognitive dual satellite systems,” in 2013 IEEE 78th Vehicular Technology Conference (VTC Fall), 2013, pp. 1–7. 67. L. Zhong, D. Zhou, R. Liu, X. Wang, and X. Meng, “The feasibility of coexistence between IMT-2020 and inter-satellite service in 26GHz band,” in 2020 International Wireless Communications and Mobile Computing (IWCMC), 2020, pp. 1006–1011. 68. S. Zhang, Y. Zeng, and R. Zhang, “Cellular-enabled UAV communication: A connectivityconstrained trajectory optimization perspective,” IEEE Transactions on Communications, vol. 67, no. 3, pp. 2580–2604, 2018. 69. E. Bulut and I. Guevenc, “Trajectory optimization for cellular-connected UAVs with disconnectivity constraint,” in 2018 IEEE International Conference on Communications Workshops (ICC Workshops), 2018, pp. 1–6. 70. A. Rahmati, S. Hosseinalipour, Y. Yapıcı, X. He, I. Güvenç, H. Dai, and A. Bhuyan, “Dynamic interference management for UAV-assisted wireless networks,” IEEE Transactions on Wireless Communications, vol. 21, no. 4, pp. 2637–2653, 2021. 71. S. Zhang, S. Shi, S. Gu, and X. Gu, “Power control and trajectory planning based interference management for UAV-assisted wireless sensor networks,” IEEE Access, vol. 8, pp. 
3453–3464, 2019. 72. M. Radmanesh, M. Kumar, P. H. Guentert, and M. Sarim, “Overview of path-planning and obstacle avoidance algorithms for UAVs: A comparative study,” Unmanned systems, vol. 6, no. 02, pp. 95–118, 2018.
73. Y. Zeng, X. Xu, S. Jin, and R. Zhang, “Simultaneous navigation and radio mapping for cellular-connected UAV with deep reinforcement learning,” IEEE Transactions on Wireless Communications, vol. 20, no. 7, pp. 4205–4220, 2021. 74. Y. Huang and Y. Zeng, “Simultaneous environment sensing and channel knowledge mapping for cellular-connected UAV,” in 2021 IEEE Globecom Workshops (GC Wkshps), 2021, pp. 1–6. 75. N. Gao, Y. Zeng, J. Wang, D. Wu, C. Zhang, Q. Song, J. Qian, and S. Jin, “Energy model for UAV communications: experimental validation and model generalization,” China Communications, vol. 18, no. 7, pp. 253–264, 2021. 76. M. Giordani and M. Zorzi, “Non-terrestrial networks in the 6G era: Challenges and opportunities,” IEEE Network, vol. 35, no. 2, pp. 244–251, 2021. 77. F. Rinaldi, H.-L. Maattanen, J. Torsner, S. Pizzi, S. Andreev, A. Iera, Y. Koucheryavy, and G. Araniti, “Non-terrestrial networks in 5G beyond: A survey,” IEEE Access, vol. 8, pp. 165 178–165 200, 2020.
Convergence of 6G and Wi-Fi Networks
Hyunsoo Lee, Soohyun Park, Minjae Yoo, Chanyoung Park, Hankyul Baek, and Joongheon Kim
1 Background and Motivation
Prior to the widespread adoption of smartphones, WLAN was the only technology that provided wireless connectivity for laptops. However, since the introduction of Apple’s iPhone, smartphones have been equipped with both Wi-Fi modules and cellular chips. As a result, mobile users access the Internet using Wi-Fi when stationary and switch to cellular networks when on the move. With the increasing use of mobile communication, the amount of exchanged data has grown exponentially, leading to what is commonly referred to as the “flood of information.” To maximize the benefits of both cellular and Wi-Fi networks and increase network throughput, using both networks simultaneously has been considered. The convergence of cellular and Wi-Fi networks is expected to lead to new business models in various industrial fields, such as manufacturing, robotics, AR/VR, public Wi-Fi, home appliances, and edge computing. However, there are some challenges to overcome when aggregating cellular and Wi-Fi networks. One such issue is fairness between the two networks. When cellular and Wi-Fi attempt to transmit on the same channel, the transmission can be unfair to Wi-Fi, as LTE can occupy the communication channel. In the licensed band used by cellular networks, the base station schedules user transmissions, so users do not compete for channel access. In Wi-Fi networks using the unlicensed band, however, users must contend with one another for channel access. This chapter provides an overview of the convergence of cellular and Wi-Fi networks. Section 2 focuses on the development of modern Wi-Fi technology, highlighting the advancements that have led to the present-day Wi-Fi standards. In
H. Lee · S. Park · M. Yoo · C. Park · H. Baek · J. Kim Korea University, Seoul, South Korea © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_28
Sect. 3, the evolution of cellular communication from 1G to 5G is discussed. Section 4 explains current convergence technology in detail, and Sect. 5 describes future directions in terms of emerging trends and technologies. Finally, Sect. 6 concludes the chapter.
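The contention for channel access in the unlicensed band, described in this section, can be made concrete with a deliberately simplified model. Assume that each of n stations draws a backoff slot uniformly from a window of W slots, that the station with the smallest value transmits, and that a tie for the smallest value counts as a collision. This ignores carrier-sensing details, retransmissions, and the window adaptation used by real 802.11 devices; the model, the values of n and W, and the function name are assumptions made only for illustration.

```python
from fractions import Fraction


def success_probability(num_stations, window):
    """Probability that exactly one station holds the smallest backoff value."""
    total = Fraction(0)
    for slot in range(window):
        # A given station picks `slot` and every other station picks a later slot.
        others_later = Fraction(window - 1 - slot, window) ** (num_stations - 1)
        total += num_stations * Fraction(1, window) * others_later
    return total


for n in (2, 5, 10, 20):
    p_success = success_probability(n, window=16)
    print(f"{n:2d} stations: collision probability = {float(1 - p_success):.3f}")
```

In this toy model the collision probability rises as more stations share the channel, whereas a scheduled licensed-band system avoids such collisions by construction. This difference in access rules is at the root of the fairness question raised above.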
2 Trends in Modern Wi-Fi Technology Evolution
This section describes the evolution of Wi-Fi systems. The progress of Wi-Fi technology is presented in Table 1. Wi-Fi technology, also known as WLAN, has undergone significant progress since the introduction of the legacy IEEE 802.11 standard in 1997. The 802.11 standards have since been developed to connect personal mobile devices such as smartphones, tablets, and laptops and have become an essential component of data networks. In addition to the IEEE standards, the Wi-Fi Alliance performs standard revision work, including Wi-Fi-based service standards and ensuring the compatibility of wireless connections between products. The initial 802.11 standard of 1997 supported a rate of up to 2 Mbps and adopted simple direct-sequence spread spectrum (DSSS) as the modulation method. Currently, it is only used to search for or connect Wi-Fi devices. The 802.11a and 802.11b standards were then released in 1999. The maximum speed of 802.11b was 11 Mbps, which is 5.5 times that of the legacy standard but lower than that of 802.11a. 802.11b compensated for this with the wide coverage of the 2.4-GHz band, whose signals diffract well. 802.11a supported up to 54 Mbps by applying OFDM for the first time. However, the narrow coverage that is characteristic of the 5-GHz band made 802.11a less popular than 802.11b. OFDM was subsequently applied to the 2.4-GHz band as well, becoming a core technology for Wi-Fi modulation, and the resulting 802.11g standard supports up to 54 Mbps. Laptops with built-in WLAN became common from 2004 onward, and smartphones emerged with the iPhone in 2007. The subsequent 802.11n (Wi-Fi 4) supported up to 600 Mbps, thanks to channel bonding and multiple-input multiple-output (MIMO) technology [1]. 802.11ac (Wi-Fi 5), published in 2013, uses only the 5-GHz band and supports data rates of up to 3.7 Gbps, which is about 68.5 times the 54 Mbps of 802.11a/g and roughly six times the 600 Mbps of Wi-Fi 4.
802.11ac is the first Wi-Fi standard to apply beamforming and 256-QAM [2]. By supporting up to 8 × 8 MIMO and channel bonding, it can make proper use of the wide 5-GHz band. 802.11ax (Wi-Fi 6) was developed to remedy the weak network throughput of 802.11ac and to guarantee the best QoS even when many devices connect simultaneously over a wide area [3]. Released in 2020, 802.11ax applied various new technologies, including OFDMA and MU-MIMO in both uplink and downlink, spatial reuse, and target wake time (TWT), supporting data rates of up to 9.6 Gbps. It also raised the highest modulation order from the 256-QAM of 802.11ac to 1024-QAM. Work on Wi-Fi 7 is still ongoing, with standardization planned for 2024 [4]. Wi-Fi 7 is expected to support up to 16 × 16 MIMO and to introduce full-duplex multiplexing technology through multi-link operation (MLO) to perform
Table 1 The progress of Wi-Fi technology
Standard | Year | Data rate | Frequency | Bandwidth | MIMO
802.11b | 1999 | 11 Mbps | 2.4 GHz | 20 MHz | 1 × 1
802.11a | 1999 | 54 Mbps | 5 GHz | 20 MHz | 1 × 1
802.11g | 2003 | 54 Mbps | 2.4 GHz | 20 MHz | 1 × 1
Wi-Fi 4 (802.11n) | 2009 | 600 Mbps | 2.4 GHz, 5 GHz | 20 MHz, 40 MHz | 4 × 4
Wi-Fi 5 (802.11ac) | 2013 | 6.8 Gbps | 5 GHz | 80 MHz, 160 MHz | 8 × 8
Wi-Fi 6/6E (802.11ax) | 2020 | 4.8 Gbps, 9.6 Gbps (6E) | 2.4 GHz, 5 GHz, 6 GHz (6E) | 80 MHz, 160 MHz | 8 × 8
Wi-Fi 7 (802.11be) | 2024 | 46 Gbps | 2.4 GHz, 5 GHz, 6 GHz | 160 MHz, 320 MHz | 16 × 16
multichannel/multiband operation. In January 2022, the fabless semiconductor company MediaTek announced the first live demo of its Wi-Fi 7 technology and said it could provide services that support multiplayer AR technology, 4K phone calls, 8K video streaming, and more. Wi-Fi 7 uses a 320-MHz bandwidth, supports 4K-QAM, and aims to provide more than twice the speed of Wi-Fi 6, even with the same number of antennas.
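The headline rates quoted in this section follow from simple OFDM arithmetic: the number of data subcarriers, times the bits carried per subcarrier, times the coding rate and the number of spatial streams, divided by the symbol duration. The sketch below applies this to two cases. The subcarrier counts, coding rates, and symbol durations are not given in this chapter; they are commonly quoted parameters for 802.11a and for 802.11ax at 160 MHz and should be treated as assumptions to be checked against the standards.

```python
import math


def phy_rate_bps(data_subcarriers, qam_order, coding_rate, spatial_streams,
                 symbol_duration_s):
    """Peak PHY rate of an OFDM link in bits per second."""
    bits_per_subcarrier = math.log2(qam_order)
    bits_per_symbol = (data_subcarriers * bits_per_subcarrier
                       * coding_rate * spatial_streams)
    return bits_per_symbol / symbol_duration_s


# Assumed 802.11a parameters: 48 data subcarriers, 64-QAM, rate 3/4,
# one stream, 4 us symbol (including guard interval).
rate_11a = phy_rate_bps(48, 64, 3 / 4, 1, 4e-6)

# Assumed 802.11ax parameters at 160 MHz: 1960 data subcarriers, 1024-QAM,
# rate 5/6, eight streams, 12.8 us symbol plus 0.8 us guard interval.
rate_11ax = phy_rate_bps(1960, 1024, 5 / 6, 8, 13.6e-6)

print(f"802.11a : {rate_11a / 1e6:.0f} Mbps")
print(f"802.11ax: {rate_11ax / 1e9:.2f} Gbps")
```

Under these assumptions the formula gives the 54 Mbps quoted for 802.11a and roughly the 9.6 Gbps quoted for 802.11ax, which shows that the generational gains come from wider channels, denser constellations, and more spatial streams rather than from any single change.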
3 Trends in Cellular Evolution
About every decade, a new generation of mobile communications emerges. First-generation (1G) analog FM cellular systems, introduced in 1981, were the first to provide mobile voice call services, using frequency division multiple access (FDMA) [6]. However, because no encryption was applied to the phone service, phone conversations and data transfers could not be made private or secure. As indicated in Fig. 1, the data rate of 1G service is 2.4 kbps, and the bandwidth is 30 kHz. Disadvantages such as hard handovers, low data transmission efficiency, and weak security drove rapid progress in cellular communication. 1G analog systems were replaced by second-generation (2G) digital cellular networks in the early 1990s. Code division multiple access (CDMA) and time division multiple access (TDMA) were the digital multiple access technologies applied in
Fig. 1 Evolution of cellular network [5]
2G networks. 2G enabled people around the world to enjoy the conveniences of mobile voice, internal roaming, conference calls, short text messaging, and low-cost data services. In Fig. 1, the data rate of the 2G service is 9.6 to 200 kbps, and the bandwidth is 200 kHz. From 2.5G onward, service providers could charge users for the data transferred rather than for connection time, thanks to techniques such as general packet radio service (GPRS) and enhanced data rates for GSM evolution (EDGE). In 2000, 3GPP announced the third-generation (3G) network for Internet access and communications at a speed of at least 2 Mbps. The 3G system offered upgraded security services compared with 2G. Wide-band CDMA (WCDMA) became the primary technology of the universal mobile telecommunication system (UMTS), the network defined by 3GPP. Although the data rate provided was insufficient to support advanced services such as video streaming, the 3G system upgraded security for voice calls by including bidirectional authentication and the authentication and key agreement (AKA) protocol. 3.5G achieved higher data rates than 3G with technologies such as high-speed downlink packet access (HSDPA). After that, commercial long-term evolution (LTE) networks were rolled out as completely IP-based systems, providing the world’s first fourth-generation (4G) mobile broadband services in December 2009. Because LTE uses a packet-switched architecture instead of circuit switching, service providers had to change much of their infrastructure. Powered by the combination of multiple-input multiple-output (MIMO) and orthogonal frequency-division multiplexing (OFDM), 4G systems accelerated the proliferation of smartphones and expanded the multi-trillion-dollar-per-year mobile Internet industry. Unlike previous generations, which focused only on improving network capacity, 5G is expanding mobile communication services from humans to things.
The potential scale of mobile subscriptions expands significantly, from the billions of people in the world’s population to a nearly uncountable number of interconnections between humans and things. 5G services provide data rates of over 20 Gbps and bandwidths of 0.25 to 1 GHz, as described in Fig. 1. 5G supports services ranging from traditional mobile broadband to Industry 4.0, virtual/augmented reality (VR/AR), Internet of Things (IoT), and ubiquitous on-demand coverage, and it opens new frequency bands (e.g., millimeter wave (mmWave) and optical spectra) [7]. 5G services have also played an essential role as digital infrastructure for running societies and connecting families, particularly through applications such as telesurgery, teleworking, online education, unmanned delivery, smart healthcare, and autonomous manufacturing. 5G is still being deployed and is widely used worldwide. However, 5G is not expected to keep up with the growing demand for wireless communication through 2030. To meet the future demand for information and communication technology (ICT), it is time to pay attention to sixth-generation (6G) systems [8]. The 6G system, a new paradigm for wireless communication, is expected to be deployed between 2027 and 2030 with full support from artificial intelligence. The basic challenges 6G needs to address are higher system capacity, data rates, and security, lower latency, and improved QoS compared to 5G systems. In detail, the peak data rate is expected to reach up to 1 Tbps, roughly 50 times the 20 Gbps cited above for 5G. By providing rapid transmission, 6G
H. Lee et al.
is likely to guarantee a reliability of $1 - 10^{-7}$, or 99.99999%, and a latency of 10 to 100 μs. 6G services will become key infrastructure thanks to state-of-the-art techniques such as artificial intelligence (AI), terahertz band communications, quantum communications, and wireless optical technologies, all while ensuring QoS. Various industrial fields, including blockchain, big data analytics, and urban air mobility (UAM), will gain momentum thanks to 6G. 6G systems require large-scale interfaces, ubiquitous computing between local devices and the cloud, multisensor data fusion to create diverse, complex reality experiences, and sensing and operational accuracy to control the physical world. Research on 6G is still in its infancy [9].
4 Major Technologies for Cellular and Wi-Fi Convergence
This section provides an overview of the major cellular and Wi-Fi convergence technologies developed to date. The relationship between cellular networks and Wi-Fi has moved beyond a complementary one to close collaboration. As both cellular and Wi-Fi traffic increase, cellular networks are improving their capacity and data rates, while Wi-Fi is providing stable service quality. This section introduces representative cellular-Wi-Fi integration technologies, including LTE-U/LAA, LWA, MPTCP proxy, and NR-U [10, 11].
Carrier aggregation (CA) is a technology that broadens the total frequency bandwidth and increases user data rates by combining multiple communication channels. While this technology is not directly related to Wi-Fi access, the concept can be applied to the convergence of cellular and Wi-Fi. Bonded channels in LTE-CA are divided into a single primary channel and one or more secondary channels. Control for carrier aggregation occurs only in the primary channel, with the secondary channels being added or released using the primary channel as an anchor. In 3GPP releases 10 to 12, LTE-A could bond up to five channels by adopting carrier aggregation and coordinated multipoint (CoMP). LTE-A Pro in 3GPP releases 13 and 14 could bond up to 32 LTE channels, achieving data rates from hundreds of Mbps to several Gbps with advanced technologies such as beamforming, multiple CA, and D2D communication.
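The bonding limits above can be turned into a quick bandwidth estimate. The sketch below is illustrative only: the 20 MHz ceiling per component carrier is the standard LTE maximum rather than a figure from this chapter, and the function and table names are hypothetical.

```python
# Illustrative sketch: total bandwidth of one primary plus secondary carriers.
MAX_CARRIERS = {"LTE-A": 5, "LTE-A Pro": 32}  # bonding limits quoted in the text
MAX_CARRIER_MHZ = 20  # assumption: widest LTE component carrier


def aggregate_bandwidth_mhz(carriers_mhz, system="LTE-A"):
    """Sum the component-carrier bandwidths; the first entry is the primary."""
    if not carriers_mhz:
        raise ValueError("at least the primary carrier is required")
    if len(carriers_mhz) > MAX_CARRIERS[system]:
        raise ValueError(f"{system} bonds at most {MAX_CARRIERS[system]} carriers")
    if any(bw > MAX_CARRIER_MHZ for bw in carriers_mhz):
        raise ValueError("component carrier wider than 20 MHz")
    return sum(carriers_mhz)
```

Under these assumptions, five 20 MHz carriers give 100 MHz in LTE-A, and 32 carriers give 640 MHz in LTE-A Pro.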
4.1 LTE-LAA
To improve the data rate of LTE, 3GPP studied extending LTE carrier aggregation to the unlicensed band [12]. In LTE-LAA, the LTE channel serves
Fig. 2 LTE-U/LAA deployment scenarios [12]
as the primary channel, and the unlicensed Wi-Fi channel carries data as a secondary channel and is not used on its own. The LTE-U/LAA deployment considers the four scenarios depicted in Fig. 2. Scenario 1 is CA between a licensed band macro base station and an unlicensed band small cell. Scenario 2 is CA between a licensed band small cell and an unlicensed band small cell. Scenario 3 is CA, within the licensed band macro coverage, between a small cell on the same frequency as the macro base station and an unlicensed band small cell. Scenario 4 is CA, within the macro coverage, between a small cell on a licensed band frequency different from that of the macro base station and an unlicensed band small cell. In scenario 4, if the macro base station and the small cell are connected through an ideal backhaul, CA that aggregates the macro base station and the small cell in the licensed band with the small cell in the unlicensed band is also possible. Because Wi-Fi users already occupy the 5-GHz band, there is a fairness issue between existing Wi-Fi users and LTE-LAA users. In other words, if the channels available to existing users shrink because of LTE-LAA, slower data transmission in the Wi-Fi network is inevitable. A method called “Listen Before Talk (LBT)” is required to solve this problem. Under country-specific regulations for the unlicensed band, LBT is required in Europe and Japan, whereas it is not required in Korea, China, and the United States. In the latter countries, the LTE-U standard based on the duty cycle was applied without LBT. An LTE-U cell searches for a clean channel through spectrum sensing. If there is a clean channel, data is transmitted with a full duty cycle. Otherwise, the cell selects the least crowded channel and transmits data using a dynamic duty cycle. On the other hand, in nations with LBT regulations, the LAA method standardized in 3GPP release 13 can be applied. 
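The LTE-U channel selection rule described above can be sketched as follows. This is a minimal illustration, not the LTE-U specification: the busy-fraction input, the rule that maps channel load to a duty cycle, and the `min_duty` floor are all assumptions made for the example.

```python
def select_lte_u_channel(busy_fraction, clean_threshold=0.0, min_duty=0.2):
    """Pick a channel and duty cycle from sensed occupancy.

    busy_fraction maps channel number -> fraction of time sensed busy (0..1).
    Returns (channel, duty_cycle).
    """
    if not busy_fraction:
        raise ValueError("no channels were sensed")
    # The least crowded channel is the clean one if any clean channel exists.
    channel = min(busy_fraction, key=busy_fraction.get)
    load = busy_fraction[channel]
    if load <= clean_threshold:
        return channel, 1.0  # clean channel: full duty cycle
    # Hypothetical dynamic rule: leave the sensed busy share to Wi-Fi.
    return channel, max(min_duty, 1.0 - load)
```

For example, with channel 36 idle the cell transmits on it continuously, while with loads of 0.5 and 0.25 it picks the lighter channel and transmits 75% of the time under the assumed rule.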
In Europe, frame-based equipment (FBE) and load-based equipment (LBE) are defined as LBT mechanisms in the 5-GHz band [13]. The difference between FBE-LBT and LBE-LBT is described in Fig. 3.
Fig. 3 FBE-LBT vs. LBE-LBT
• FBE-LBT: Tx and Rx have fixed timing based on a fixed frame period. A clear channel assessment (CCA) is performed once in every frame period, at the end of the idle period, to check whether the channel is clear. If the channel is occupied by other users, the CCA is repeated right after the frame.
• LBE-LBT: Unlike FBE-LBT, LBE-LBT has no fixed timing. It performs a CCA when there is data to transmit, and if the channel is clear, the data is transmitted immediately. If the channel is occupied by another user, the data is transmitted after a backoff during an extended CCA.
The LBE-LBT method is similar to Wi-Fi in that it has a backoff time, but unlike Wi-Fi’s exponential backoff, LBE applies a fixed linear backoff. In Wi-Fi, when a collision occurs, the contention window from which the random backoff is drawn grows, so the channel access opportunity decreases as the channel load increases. Conversely, the LBE method uses a fixed window size range, so the channel access opportunity remains constant even with higher channel loads. Meanwhile, 3GPP has standardized LBE-LBT with varying window sizes. For coexistence with Wi-Fi, the goal pursued by LTE-LAA is that its effect on existing Wi-Fi should not exceed that of adding one more Wi-Fi network. To achieve this, an LAA device must acquire spectrum access opportunities on the same basis as existing Wi-Fi devices. LBT in LTE-LAA is designed with the same philosophy as in Wi-Fi [14], and several additional conditions are considered to ensure fair coexistence. The contention window should be variable so that exponential backoff can be applied as in Wi-Fi. Also, LBT applies a defer period before transmission when the channel becomes idle, and it is recommended that this period match the Wi-Fi parameters.
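The difference between a fixed contention window and a Wi-Fi-style variable window can be sketched in a few lines. This is an illustrative model only: the function names are hypothetical, and the window bounds of 15 and 1023 slots are common Wi-Fi values assumed for the example rather than figures from this chapter.

```python
import random


def lbe_backoff_slots(rng, cw):
    """Draw a random backoff counter of 0..cw idle slots for the extended CCA."""
    return rng.randint(0, cw)


def next_contention_window(cw, collided, variable=True, cw_min=15, cw_max=1023):
    """Update the contention window after one transmission attempt.

    variable=False models the fixed-window LBE rule (window never changes).
    variable=True models exponential backoff: double on collision, reset on success.
    """
    if not variable:
        return cw
    if collided:
        return min(2 * (cw + 1) - 1, cw_max)
    return cw_min
```

With `variable=True`, repeated collisions move the window through 15, 31, 63, and so on up to the cap, which lowers the access probability as load rises; with `variable=False` the window, and hence the access opportunity, stays constant.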
Since LTE-U/LAA small cells require new 5-GHz LTE hardware, service providers must pay a high cost to build new access network infrastructure. On the terminal side, most recently released devices are equipped with LAA-capable hardware. LTE-LAA was standardized in 3GPP release 13. Release 13 defined fair LTE-Wi-Fi coexistence for the downlink (DL), and release 14 extended it to cover uplink (UL) transmission as well. A method that operates with a variable contention window size and random backoff has been proposed, in which the fixed window size of the LBE-LBT in Fig. 3 is made variable [10, 15]. Because this is closest to existing Wi-Fi channel access, it is the most suitable fair access solution.
4.2 LTE-LWA
LTE-LWA has a significant advantage over LTE-LAA in that it does not require 5-GHz LTE hardware in existing user terminals and small cells. Both existing user terminals and base stations can be used with a simple firmware update. LTE traffic is delivered through Wi-Fi APs, so no 5-GHz hardware is needed for LTE cells. Additionally, Wi-Fi APs can use LTE core network functions without an additional gateway and without affecting the performance of existing native Wi-Fi APs. The LWA framework comprises an LWA base station, a Wi-Fi AP, and user equipment, as described in Fig. 4. The LWA base station and the Wi-Fi AP can operate as an integrated unit or separately, with data transmitted through an IP tunnel. The LWA base station schedules packets at the packet data convergence protocol (PDCP) layer, which compresses and decompresses IP data, and splits the data transmission across LTE and Wi-Fi [16]. Some packets are transmitted over LTE, and others are encapsulated in Wi-Fi frames and sent through the Wi-Fi AP. The user terminal combines traffic from both LTE and Wi-Fi at the PDCP layer. The Wi-Fi AP reports its channel status to the LWA base station, which decides whether to operate the Wi-Fi AP as an LWA AP. This dynamic management of radio resources can improve LTE performance. When the Wi-Fi AP is not operating as an LWA AP, it can function as a native Wi-Fi AP.
Fig. 4 LTE-WLAN link aggregation
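The PDCP-level split at the base station and the recombination at the user terminal can be sketched as below. The rate-proportional scheduling policy and both function names are assumptions for illustration; the actual scheduler is left to the implementation.

```python
def split_pdcp_pdus(sequence_numbers, lte_rate_mbps, wifi_rate_mbps):
    """Assign each PDCP PDU to 'lte' or 'wifi' roughly in proportion to link rate.

    A Wi-Fi rate of 0 models an AP that is not operating as an LWA AP.
    """
    total = lte_rate_mbps + wifi_rate_mbps
    if total <= 0:
        raise ValueError("at least one link must have a positive rate")
    assignment = {}
    wifi_credit = 0.0
    for sn in sequence_numbers:
        wifi_credit += wifi_rate_mbps / total  # Wi-Fi earns its share per PDU
        if wifi_credit >= 1.0:
            assignment[sn] = "wifi"
            wifi_credit -= 1.0
        else:
            assignment[sn] = "lte"
    return assignment


def reorder_at_ue(lte_pdus, wifi_pdus):
    """Merge (sequence_number, payload) pairs from both links into PDCP order."""
    return sorted(lte_pdus + wifi_pdus, key=lambda pdu: pdu[0])
```

With equal link rates the sketch alternates PDUs between LTE and Wi-Fi, and the terminal restores the original order from the PDCP sequence numbers regardless of which link delivered each PDU.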
Telecommunication service providers need to collect billing information for LTE traffic transmitted over Wi-Fi and include it in their billing policy. The 3GPP release 13 standardizes the scenario of combining LTE and Wi-Fi at the radio link level. This includes the standardization of the communication protocol framework for LWA, as well as solutions for combining data transmitted from the PDCP layer, signaling, and the interface between eNB and Wi-Fi AP.
4.3 Multipath TCP (MPTCP) Proxy
Unlike LWA, which aggregates LTE and Wi-Fi at the radio link layer, MPTCP, standardized by the Internet Engineering Task Force (IETF) MPTCP working group, combines LTE and Wi-Fi at the TCP level [17]. The basic concept is to extend TCP, a transport layer protocol, to multiple paths. The main purpose of MPTCP is to transmit data over all available paths while operating stably in the existing environment, without changing the existing Internet infrastructure or applications. Unlike conventional TCP, which establishes only one TCP path between a terminal and a server, MPTCP configures multiple TCP paths between a terminal and a server to transmit data simultaneously. Since the MPTCP terminal and the MPTCP proxy operate over the IP network, they are not affected by whether the underlying network is LTE or Wi-Fi. Therefore, LTE and Wi-Fi can be combined by upgrading only the user terminal’s software, without replacing the existing network infrastructure. MPTCP proxy-based LTE-Wi-Fi convergence is very cost-effective in that it can be applied directly to commercial networks, reducing infrastructure costs and improving users’ QoS. The MPTCP proxy is regarded as the most realistic technology, one that can be commercialized and put into service immediately at minimal cost in the existing network. MPTCP consists of two layers: the MPTCP layer and multiple subflow layers beneath it. The MPTCP layer manages the connection and restores the order of packets delivered to the application, and each subflow layer handles reliable packet transmission and network congestion control. MPTCP may encounter several issues, such as using more network resources than single-path TCP, which raises fairness concerns. To address this problem, congestion control algorithms have been studied [18–20]. Additionally, it is crucial to ensure service continuity even if the MPTCP proxy experiences a service failure. 
In the event of all proxy servers failing, the UE bypasses the MPTCP proxy and communicates directly with the destination server to provide seamless service. From a telecommunication service provider’s point of view, since LTE and Wi-Fi have separate core networks, it is necessary to link the MPTCP proxy with the operator’s policy server to perform flow control of the traffic delivered to each network and to pass accounting information for traffic delivered through the Wi-Fi network.
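The role of the MPTCP layer in restoring packet order above the subflows can be sketched as a small reordering buffer. This is a simplified model, assuming one data sequence number per segment (real MPTCP numbers bytes) and a hypothetical class name.

```python
import heapq


class MptcpReceiver:
    """Connection-level reordering of segments arriving on several subflows."""

    def __init__(self):
        self.next_dsn = 0   # next data sequence number owed to the application
        self._pending = []  # min-heap of (dsn, payload) that arrived early

    def on_segment(self, dsn, payload):
        """Accept one segment from any subflow; return payloads deliverable in order."""
        heapq.heappush(self._pending, (dsn, payload))
        delivered = []
        while self._pending and self._pending[0][0] <= self.next_dsn:
            seg_dsn, data = heapq.heappop(self._pending)
            if seg_dsn == self.next_dsn:  # older numbers are duplicates and dropped
                delivered.append(data)
                self.next_dsn += 1
        return delivered
```

A segment that arrives early over the faster path (for example Wi-Fi) is held until the missing segment arrives over the slower path (for example LTE), after which both are released to the application in order.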
4.4 NR-U
To achieve high-performance mobile communication in the unlicensed band in 5G, 3GPP release 16 introduced NR-U [21]. NR-U is a 5G mobile communication standard that supports both LAA-style operation and standalone operation in the unlicensed band, and it can be seen as an extended version of the LTE-LAA discussed in Sect. 4.1. Spectrum sharing between next-generation Wi-Fi and NR-U is expected to occur in the 6-GHz band. Unlike LTE-LAA, which was designed to coexist fairly with the many Wi-Fi devices already deployed, there are no such restrictions in the currently unlicensed 6-GHz band. NR-U derives its PHY layer from 5G NR, so the PHY layer performance improvements made in 5G NR carry over directly. In NR-U, the same reservation signal used in LTE-LAA can be applied to reserve the channel during shared channel access. The use of reservation signals can be minimized by transmitting packets in mini-slots through mini-slot scheduling and flexible numerologies. In LTE-LAA, a reservation signal was required because the unlicensed transmission had to be synchronized with the sub-frame boundaries of the licensed carrier. For this reason, an LTE-LAA device could start data transmission only at a sub-frame boundary, and the reservation signal was transmitted from the moment channel access was acquired until the next sub-frame boundary. Another difference between NR-U and LTE-LAA is that the licensed primary carrier is mandatory in LTE-LAA but not in NR-U. In the standalone scenario, since NR-U is connected to the 5G core network and operates without a licensed primary carrier, an NR-U network can be deployed in much the same way as Wi-Fi APs.
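The airtime spent on reservation signals can be estimated with a short calculation. The sketch below assumes the standard values of a 1 ms LTE sub-frame and an NR slot of 1 ms divided by 2 to the power of the numerology index with 14 OFDM symbols per slot; it treats all symbols as equally long and assumes that transmission may start at any mini-slot boundary, which is a simplification. The function names are hypothetical.

```python
LTE_SUBFRAME_US = 1000.0  # LTE sub-frame length in microseconds


def reservation_time_us(access_time_us, boundary_period_us):
    """Time a reservation signal must fill until the next allowed start boundary."""
    remainder = access_time_us % boundary_period_us
    return 0.0 if remainder == 0 else boundary_period_us - remainder


def nr_mini_slot_us(numerology, symbols=2):
    """Approximate mini-slot duration for a given NR numerology index."""
    slot_us = 1000.0 / (2 ** numerology)  # slot shrinks as numerology grows
    return slot_us * symbols / 14         # 14 OFDM symbols per slot
```

Under these assumptions, a device that wins the channel 250 μs into an LTE sub-frame must send 750 μs of reservation signal, whereas with numerology 1 and 7-symbol mini-slots (250 μs each) the same access instant already falls on a boundary and needs none.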
5 Future Convergence
This section presents a vision of convergence between 6G and WLAN. In the future, 6G networks will be highly connected and rapidly changing. For instance, in a V2X network, the primary goal is to maintain URLLC even under constantly changing vehicle positions and speeds. Unstable communication in V2X may cause traffic congestion or accidents. Furthermore, a rapidly changing network can make spectrum sharing more challenging, particularly in cities with large-scale and complex heterogeneous network environments. To address this issue, a spectrum sharing algorithm that is compatible with existing transport protocols and specifically designed for various applications will be required [22]. In the 6G network, machine learning (ML) is a key solution to improve performance. In terms of handover and connection management, handover decisions are typically made based on signal strength. By using an ML framework, the management of handover and connectivity between WLANs can be enhanced by predicting user mobility and requirements. When WLAN deployment is performed through
ML, various APs can perform bidirectional communication with appropriate base stations, reducing channel collisions between stations and improving overall network throughput. The ML framework can also be applied to network slicing, a technique that virtually divides network resources to meet various application requirements. Allocating spectrum resources according to the heterogeneity of applications and devices can be challenging, but by applying the ML framework, the user requirements needed to optimize network performance can be predicted. The Wireless Broadband Alliance (WBA) and the Next Generation Mobile Networks (NGMN) Alliance have published a joint white paper on the network convergence of Wi-Fi 6 (802.11ax) and 5G (3GPP NR). This white paper defines access capabilities (latency, reliability, throughput, coverage, and availability) in various use cases, including manufacturing, public space, and smart city, as summarized in [23].
6 Conclusion
This chapter has explored the convergence between cellular networks and Wi-Fi. It covered the history of cellular and Wi-Fi communications development and the convergence technologies widely used today. Finally, it outlined the technologies that should be considered for 6G in the future. In the next chapter, we will discuss research to ensure network security and reliability.
References
1. M. Gast, 802.11n: A Survival Guide. Newton, MA, USA: O’Reilly Media, 2012.
2. M. S. Gast, 802.11ac: A Survival Guide. Newton, MA, USA: O’Reilly Media, 2013.
3. E. Khorov, A. Kiryanov, A. Lyakhov, and G. Bianchi, “A Tutorial on IEEE 802.11ax High Efficiency WLANs,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 197–216, 2019.
4. C. Deng, X. Fang, X. Han, X. Wang, L. Yan, R. He, Y. Long, and Y. Guo, “IEEE 802.11be WiFi 7: New Challenges and Opportunities,” IEEE Communications Surveys & Tutorials, vol. 22, no. 4, pp. 2136–2166, 2020.
5. K. Vaigandla, S. Bolla, and R. Karne, “A Survey on Future Generation Wireless Communications-6G: Requirements, Technologies, Challenges and Applications,” International Journal of Advanced Trends in Computer Science and Engineering, vol. 10, pp. 3067–3076, Oct. 2021.
6. M. H. Alsharif, A. H. Kelechi, M. A. Albreem, S. A. Chaudhry, M. S. Zia, and S. Kim, “Sixth generation (6G) wireless networks: Vision, research activities, challenges and potential solutions,” Symmetry, vol. 12, no. 4, p. 676, 2020.
7. 3GPP TS 22.261 V18.6.1, “Service requirements for the 5G system,” Tech. Spec. Group Services and System Aspects, Technical Specification (TS), March 2022.
8. W. Jiang, B. Han, M. A. Habibi, and H. D. Schotten, “The road towards 6G: A comprehensive survey,” IEEE Open Journal of the Communications Society, vol. 2, pp. 334–366, 2021.
9. M. Z. Chowdhury, M. Shahjalal, S. Ahmed, and Y. M. Jang, “6G wireless communication systems: Applications, requirements, technologies, challenges, and research directions,” IEEE Open Journal of the Communications Society, vol. 1, pp. 957–975, 2020.
10. Netmanias, “Analysis of LTE – Wi-Fi Aggregation Solutions,” Netmanias, March 2016, last accessed 06 March 2023. [Online]. Available: https://bit.ly/3x129T2
11. 3GPP TR 23.729 V15.0.0, “Study on Unlicensed Spectrum Offloading,” Tech. Spec. Group Services and System Aspects, Technical Report (TR), September 2017.
12. 3GPP TR 36.889 V13.0.0, “Study on Licensed-Assisted Access to Unlicensed Spectrum,” Tech. Spec. Group Radio Access Network, Technical Report (TR), July 2015.
13. G. Naik, J.-M. Park, J. Ashdown, and W. Lehr, “Next Generation Wi-Fi and 5G NR-U in the 6 GHz Bands: Opportunities and Challenges,” IEEE Access, vol. 8, pp. 153027–153056, August 2020.
14. ETSI 301 893 V1.6.1, “Broadband Radio Access Networks (BRAN); 5 GHz High Performance RLAN,” European Telecommunications Standards Institute (ETSI), Harmonized European Standard, November 2011.
15. M. Alhulayil and M. López-Benítez, “Novel LAA Waiting and Transmission Time Configuration Methods for Improved LTE-LAA/Wi-Fi Coexistence Over Unlicensed Bands,” IEEE Access, vol. 8, pp. 162373–162393, 2020.
16. 3GPP TS 36.360 V13.0.0, “LTE-WLAN Aggregation Adaptation Protocol (LWAAP) Specification,” Tech. Spec. Group Radio Access Network, Evolved Universal Terrestrial Radio Access (E-UTRA), Technical Specification (TS), March 2016.
17. M. Scharf and A. Ford, “Multipath TCP (MPTCP) Application Interface Considerations,” RFC 6897, March 2013.
18. M. Becke, T. Dreibholz, H. Adhari, and E. P. Rathgeb, “On the fairness of transport protocols in a multi-path environment,” in IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, June 2012, pp. 2666–2672.
19. A. Singh, M. Xiang, A. Konsgen, C. Goerg, and Y. Zaki, “Enhancing fairness and congestion control in multipath TCP,” in Proc. 6th Joint IFIP Wireless and Mobile Networking Conference (WMNC), Dubai, United Arab Emirates, April 2013, pp. 1–8.
20. D. Zhou, W. Song, and Y. Cheng, “A Study of Fair Bandwidth Sharing with AIMD-Based Multipath Congestion Control,” IEEE Wireless Communications Letters, vol. 2, no. 3, pp. 299–302, 2013.
21. 3GPP TR 38.889 V16.0.0, “Study on NR-based Access to Unlicensed Spectrum,” Tech. Spec. Group Radio Access Network, Technical Report (TR), December 2018.
22. P. Yang, L. Kong, and G. Chen, “Spectrum Sharing for 5G/6G URLLC: Research Frontiers and Standards,” IEEE Communications Standards Magazine, vol. 5, no. 2, pp. 120–125, 2021.
23. S. Lagen, N. Patriciello, and L. Giupponi, “Cellular and Wi-Fi in Unlicensed Spectrum: Competition leading to Convergence,” in Proc. 2020 2nd 6G Wireless Summit (6G SUMMIT), 2020, pp. 1–5.
Semantic Communications and Networking
Won Joon Yun, Soohyun Park, Rhoan Lee, Jihong Park, Young-Chai Ko, and Joongheon Kim
1 Introduction
1.1 Motivation
As deep learning technologies have advanced in recent decades, research on the second stage of the Shannon-Weaver communication model, which conveys meaning, is ongoing [1]. Semantic communication (SC) is designed on top of the existing Shannon-Weaver communication model. Specifically, SC has a structure in which a semantic encoder (SE) and a semantic decoder (SD), which consider meaning and utility, are added before and after the Shannon-Weaver communication model composed of a bit encoder and decoder. SC has the following advantages over Shannon-Weaver communication. First, by understanding the meaning of information with the SE in advance, it is possible to reduce the communication load by converting only the information that carries important meaning into bits. In addition, the SD makes it possible to achieve robustness against communication errors, since bit errors in unnecessary meanings can be tolerated while still achieving the purpose. Furthermore, it is possible to implement a new method of encoding meaning in consideration of the characteristics of the target task through bidirectional communication
W. J. Yun · S. Park · Y.-C. Ko · J. Kim () Korea University, Seoul, South Korea e-mail: [email protected]; [email protected]; [email protected]; [email protected] R. Lee Ewha Womans University, Seoul, South Korea e-mail: [email protected] J. Park Deakin University, Geelong, Australia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_29
W. J. Yun et al.
Fig. 1 The taxonomy of SCs (node labels in the figure: Semantic Communications; Artificial Intelligence; Variational AutoEncoder; Transformer; Semantic Networking; AI-Native; AI-Aided; Task-Oriented Communication; Goal-Oriented Communication; MARL; Joint Source Channel Coding; DeepSC/HARQ; NLP; Contextual/Semantic; Vision; Semantic Filtering; Video Conferencing; Value of Information; Age of Information; State-Update System)
between SE and SD. This enables future prediction and retrospective inference beyond the traditional communication task focused on accurate transmission and recovery of information. SC corresponds to the second and third levels of the Shannon-Weaver communication model. SC comprises the Shannon model together with an SE, an SD, and semantic knowledge. The SE produces a semantic representation (SR) using semantic knowledge, which is driven by deep neural networks (DNNs) and sufficient data. The SD then restores the original information by decoding the SR with semantic knowledge. In many studies, the throughput achieved with SRs is reported to surpass the Shannon capacity [2]. We survey SC papers and summarize the taxonomy in Fig. 1.
1.2 Background of SC
The Shannon-Weaver communication model comprises three levels. The first level transmits bits accurately, the second level transmits the meaning, and the third level increases the gain achieved from the meaning transmission [3].
1.2.1 Level 1 Shannon Communication
The level 1 Shannon-Weaver model consists of a source (e.g., transmitter), a channel (e.g., noisy channel), and a destination (e.g., receiver). Wireless communication, up to and including 5G mobile communication, has been designed to enable the transmitter to communicate information accurately and rapidly to the receiver [4]. According to information theory, the throughput increases as the bandwidth widens and the noise decreases, but it cannot exceed the Shannon capacity.
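The capacity limit mentioned above is the Shannon-Hartley formula for an AWGN channel, $C = B \log_2(1 + \mathrm{SNR})$, where $B$ is the bandwidth in Hz and SNR is the linear signal-to-noise ratio. The following minimal sketch evaluates it; the function name and the example numbers are illustrative assumptions.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Capacity in bit/s of an AWGN channel with the given bandwidth and linear SNR."""
    if bandwidth_hz < 0 or snr_linear < 0:
        raise ValueError("bandwidth and SNR must be non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)
```

For instance, a 1 MHz channel at a linear SNR of 3 supports at most 2 Mbit/s; doubling the bandwidth doubles the limit, whereas raising the SNR improves it only logarithmically.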
1.2.2 Level 2 Shannon Communication
As deep learning technologies have advanced in recent decades, research on the second level of the Shannon-Weaver communication model, which conveys meaning, is ongoing [5]. SC comprises the Shannon model together with an SE, an SD, and background knowledge [6]. The SE produces the SR with a DNN and sufficient data. The SD then restores the original information by decoding the SR with background knowledge. In many studies, the throughput achieved with SRs is reported to surpass the Shannon capacity [7]. Recent studies on SC mainly focus on encoding with a high compression rate and decoding with high accuracy. The authors of [8] have proposed attention-based SC for natural language processing under varying channel conditions, and the authors of [9] have proposed transformer-based SC.
1.2.3 Level 3 Shannon Communication
To achieve the objective of the third level of Shannon communication, task-oriented SC has been studied. One of the representative studies is emergent SC. In differentiable inter-agent learning (DIAL) [10], while taking actions, the agents exchange clean-slate messages passed through their actor models. During training, these messages are progressively turned into meaningful representations for better inter-agent collaboration, hereafter referred to as SRs, which is analogous to children developing language-based cues as they grow. The authors of [11] propose emergent SC in the multi-agent reinforcement learning framework. The authors of [12] have extended emergent SC to semantics-native communication for contextual reasoning with a contextual meaning space and a semantic representation space. In addition, the probabilistic logic approach has been studied for analyzing semantics. The authors of [13] have investigated semantic information and its communication with logic clauses by reducing the uncertainty of semantic information.
2 Preliminaries of Artificial Intelligence
2.1 Deep Neural Network as Universal Function Approximator
According to the universal approximation theorem [14], any continuous function on a compact domain can be approximated to arbitrary accuracy by a neural network with a nonlinear activation function. In addition, the convolutional neural network (CNN) can serve not only as a feature extractor but also as a universal function approximator. Thus, DNNs have shown good performance on machine learning tasks such as supervised learning, unsupervised learning, generative modeling, and reinforcement learning. These successes in AI enable applications to communication problems, e.g., successive interference
Fig. 2 The structure of variational autoencoder. This VAE structure is utilized as an architecture of joint source and channel coding
cancellation, denoising, pilot design, and many resource allocation problems. Many JSCC, AI-native, and AI-aided communication studies are based on this theorem.
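The idea behind universal approximation can be illustrated constructively in one dimension. The sketch below builds, by hand rather than by training, a one-hidden-layer ReLU network whose output is the piecewise-linear interpolant of a target function; adding hidden units shrinks the error. The function names are hypothetical, and this is only an illustration of the principle, not the proof in [14].

```python
import math


def relu(z):
    return max(0.0, z)


def build_relu_interpolator(f, lo, hi, pieces):
    """Return a one-hidden-layer ReLU network that interpolates f on [lo, hi]."""
    knots = [lo + (hi - lo) * k / pieces for k in range(pieces + 1)]
    slopes = [(f(knots[k + 1]) - f(knots[k])) / (knots[k + 1] - knots[k])
              for k in range(pieces)]
    # The output weight of hidden unit k is the change of slope at knot k.
    weights = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, pieces)]
    bias = f(lo)

    def network(x):
        # Hidden unit k computes relu(x - knot_k); the output layer is linear.
        return bias + sum(w * relu(x - t) for w, t in zip(weights, knots))

    return network
```

With 32 hidden units on the interval from 0 to 2π, the network tracks the sine function with a worst-case error below 0.01, and the error falls roughly fourfold each time the number of units is doubled.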
2.2 Variational Autoencoder
The variational autoencoder (VAE) [15] was proposed to train a deep generative model, as illustrated in Fig. 2. VAEs have been used in a variety of applications, including data generation, semi-supervised learning [16], data compression, and data disentanglement. The VAE is of special interest here for two reasons. First, it has the same encoder-decoder structure as a joint source and channel coding (JSCC) system. Second, VAE training entails a two-term optimization: a reconstruction error term and a Kullback-Leibler (KL) divergence term. The authors of [17] show that the second term can be reinterpreted as a power constraint term when viewed through the lens of JSCC. This insight also paves the way for using the extensive VAE literature to advance the design of JSCC over analog-channel systems for general sources.
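The two-term VAE objective can be written out for the common case of a diagonal Gaussian encoder and a standard normal prior. The sketch below is a minimal illustration under that assumption; the squared-error reconstruction term, the weighting factor `beta`, and the function name are choices made for the example and are not taken from [15] or [17].

```python
import math


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Return (total, reconstruction, kl) for one sample.

    reconstruction: mean squared error between x and x_hat.
    kl: KL divergence from N(mu, diag(exp(log_var))) to N(0, I).
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    kl = 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))
    return recon + beta * kl, recon, kl
```

The KL term grows with the squared latent means, that is, with the energy of the latent, which gives some intuition for why it can be read as a power constraint on the transmitted signal.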
2.3 Transformer
The transformer, which is built on self-attention, is widely used in language models [18]. Self-attention and multi-head attention are the basic techniques of the transformer. The strength of attention is its low inductive bias: few structural assumptions are built into the architecture, so the relationships are learned from data. Therefore, the transformer and attention can be utilized in multimodal tasks and scale to very large neural networks. Multilayer perceptrons (MLPs) likewise have little inductive bias, whereas CNN-based networks, e.g., VDSR, SRCNN, and SRGAN, have a high inductive bias. Thus, CNN-based networks are efficient in a single task such as a spatiotemporal task; however, they are weak at capturing long-range relationships. As a result, the transformer has shown its success in various fields, e.g., vision transformer (ViT) [19], multi-agent reinforcement learning (MARL) [20], reinforcement learning (RL) agent such as
Fig. 3 The structure of multi-head attention. The multi-head attention is utilized as a knowledge base for SC and networking
Fig. 4 The illustration of neural network-based JSCC
decision transformer [21], and Dall-E [22]. As shown in Fig. 3, the inputs are embedded into the query, key, and value through the knowledge base. Thus, the transformer enables AI-native SC, especially in the language model and image SC areas. DeepSC, the SC framework based on the transformer [23], utilizes the transformer as a knowledge base, which will be elaborated in Sects. 3 and 4.
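The core operation behind self-attention is scaled dot-product attention over queries, keys, and values. The sketch below implements a single head in plain Python for readability; it omits the learned projection matrices and the multi-head split, and the function names are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key, and value vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights
```

Each output is a weighted average of the values, with weights that sum to one and depend only on query-key similarity, which is why every position can attend to every other position regardless of distance.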
2.4 Joint Source and Channel Coding
We first describe JSCC, which is utilized in the SE and SD and has been discussed theoretically in [24–26]. The first principle is that the signal should be recovered without error even in the presence of noise [26]. To follow this principle, previous work models JSCC with hidden Markov sources, linear sparse-graph structures, double low-density parity-check (LDPC) codes, and belief propagation. Recent works have shown that neural networks can jointly learn both source and channel coding directly from raw data [27, 28]. Figure 4 shows the basic concepts of JSCC. First, an input data/signal is given to the neural encoder. The neural encoder produces a latent through its linear operations and nonlinear activations. In this process,
source coding and channel coding are both conducted by the neural encoder. In other words, the latent carries the information of not only source coding but also channel coding. The latent is transmitted from the neural encoder to the neural decoder. In the neural decoder, the latent is decoded into the source data/signal. A generative model or variational autoencoder is used as the neural decoder architecture. Farsad et al. [27] have developed a neural network for JSCC. Since error correction is an important objective in JSCC, techniques for restoring signals from the impairments caused by various channel fading have been studied using deep generative adversarial networks. Because neural networks can be used as universal approximators, they are feasible for JSCC. The model of [27] uses a recurrent neural network (RNN) encoder, a binarization layer, and a channel layer. In addition, a deep generative model is implemented on JSCC data [28]. An important principle of JSCC is to correct errors. Thus, feasibility studies have been conducted on neural network-based encoders that can actually compress and error-protect the source data. Here, autoencoder-based JSCC emerges due to its structural advantage. The encoder of a VAE compresses source data into the latent. From a latent sample, the decoder can restore the source data via an appropriate objective function, e.g., mean squared error. Thus, the VAE is selected as the major JSCC architecture. Autoencoders learn latent representations in an unsupervised manner. The latent is modeled with neural network-based channel coding and is sent as a message over noisy random channels. The channel between the transmitter and receiver randomly corrupts the codeword, and the original data is recovered from the noisy higher-dimensional representation. Near-optimal channel codes have impacted long-term evolution (LTE) and 5G. 
The traditional approach uses handcrafted, optimal decoding algorithms for additive white Gaussian noise (AWGN) channels, and such channel codes are optimal only as the block length approaches infinity [29]. The authors of [29] introduced TurboAE, a neural network-based over-complete autoencoder built from CNNs and inspired by turbo codes. TurboAE outperforms multiple capacity-approaching codes on AWGN channels when the block length is moderate. In a related direction, hybrid automatic repeat request (HARQ) has been combined with JSCC to improve the reliability of sentence transmission. Semantic coding HARQ (SCHARQ) is a transformer-based source and channel coding scheme [30]. Zhou et al. [31] investigated an end-to-end design of a practical JSCC scheme for adaptive semantic rate control with incremental knowledge-based HARQ (IK-HARQ). Their work proposes a simplified system architecture with only one encoder for retransmission. The system is equipped with a self-adaptive semantic bit rate control mechanism to improve efficiency and reduce cost, and the authors also introduce a self-adaptive denoising module to improve the reliability of semantic transmission. The encoder with adaptive bit rate control makes transmission more robust, and the decoder with IK-HARQ uses the retransmitted sentences more effectively.
The authors of [32] address the curse of dimensionality in learned encoding and decoding. A deep convolutional autoencoder can automatically encode and decode much longer codes, and using standard convolutional layers in both the encoder and the decoder overcomes the curse of dimensionality for learning-based decoders. Because the weights are learned and shared across all positions within each convolutional layer, the model generalizes well to unseen codewords even though only a tiny fraction of all possible codewords is seen during training. Existing deep learning (DL)-based JSCC methods assume that the channel conditions during network optimization and during deployment are the same, and a mismatch can degrade the performance of DL-based techniques. Xu et al. [33] design a DL-based JSCC method that works well across signal-to-noise ratios (SNRs). They propose an attention mechanism that weights the contributions of intermediate features according to the channel SNR, which makes the method more robust in the presence of channel mismatch. Their DL-based method also follows traditional JSCC design principles [33].
Communication and coding theory rest on mathematical foundations, and the space of possible encoders and decoders is too large and unstructured. To cope with this problem, the authors of [34] expand and strengthen the family of sequential codes. Deep learning approaches to reliable communication require intuitions and insights from information theory and coding theory. Kim et al. [34] broke the logjam by combining information theory with RNN encoders and decoders, obtaining codes that are reported to be three times more reliable than those of previous research. Saidutta et al. [17] suggested using VAEs to learn the encoders and decoders for JSCC of sources over additive noise channels that are independent of each other. They argued that a VAE closely resembles a JSCC system and that minimizing its loss function minimizes an upper bound on the objective from rate-distortion theory. In addition, discontinuous projections are an important part of bandwidth compression, which is realized with multiple encoders and a universal decoder: each encoder network produces a candidate encoding, but only one is selected for transmission. The VAE-based JSCC solutions tolerate changes in channel noise of up to 5 dB from the conditions for which they were designed. This robustness can be improved further by training a single system under a number of different channel conditions.
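The cost of a mismatch between design and deployment channel conditions can be illustrated with a closed-form toy case. The sketch below assumes a unit-variance source sent uncoded with unit power over an AWGN channel, and a linear receiver that scales the received sample by a gain tuned for a design SNR. For a gain a and noise variance s, the error is (1 - a)^2 + a^2 s. None of this models the networks of [17] or [33], and all function names are hypothetical.

```python
def noise_var(snr_db):
    """Noise variance for unit signal power at the given SNR in dB."""
    return 10.0 ** (-snr_db / 10.0)

def mmse_gain(design_snr_db):
    """Receiver gain that minimizes the MSE at the design SNR."""
    return 1.0 / (1.0 + noise_var(design_snr_db))

def mse(gain, actual_snr_db):
    """MSE of the estimate gain * (x + n) for a unit-variance source x."""
    return (1.0 - gain) ** 2 + gain ** 2 * noise_var(actual_snr_db)

if __name__ == "__main__":
    design = 10.0
    for actual in (5.0, 10.0, 15.0):
        mismatched = mse(mmse_gain(design), actual)
        matched = mse(mmse_gain(actual), actual)
        print(f"actual SNR {actual:4.1f} dB: "
              f"mismatched {mismatched:.4f}, matched {matched:.4f}")
```

The gap between the two printed values is the price of the mismatch. Training across several channel conditions aims to shrink that gap over the whole operating range.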
3 Semantic Communications
3.1 AI-Native SC
Applying JSCC in various areas gives rise to AI-native SC. For speech, the DL-enabled semantic communication system for speech signals (DeepSC-S) learns and extracts speech features and recovers the speech signal from the received features. DeepSC-S treats the transmitter and the receiver as neural networks (NNs), and JSCC is used to cope with source and channel distortions. In addition, DeepSC-S uses a squeeze-and-excitation network to learn and extract essential speech semantic information, and its attention mechanism improves signal recovery [8]. In contrast to the well-known problem of reliable transmission, there is a new paradigm shift in which the goal is not only to model the semantics but also to communicate them as accurately as possible. To address this, Lu et al. [35] proposed a practical way to bridge the semantic gap by learning from semantic similarity, even though that measure is not differentiable. They offered a new perspective on traditional SC schemes and introduced an RL-based optimization paradigm so that optimization can be carried out in a stable way. In particular, a self-critic policy gradient approach is introduced for large-scale and complex semantic transmission; it gives precise and efficient gradient estimates without adding extra parameters. Lu et al. [35] propose an RL agent that evaluates the semantic similarity between two messages [36]. Natural language processing metrics such as the BLEU and CIDEr scores are used. In addition, the RL agent makes successive action decisions with its recurrent network. By applying self-critic training to the decoupled transceiver at the same time, SemanticRL-JSCC provides a complete large-scale wireless SC paradigm. This version handles both the non-differentiable channel problem and the objective problem with learnable policies in both the SE and the SD, and it is trained end to end with self-critic stochastic iterative updating (SemanticRL-SCSIU) [35].
Zhang et al. [37] consider the integration of JSCC and image reconstruction. The encoder of [37] consists of a semantic feature extractor, a segmentation feature extractor, and a low-level feature module. The image is processed by these modules, and the resulting features are compressed for transmission over the physical channel. The signal, corrupted by AWGN, is conveyed to the decoder, which fuses the multi-level features and then reconstructs the image at high resolution. Image quality metrics such as the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) are used for evaluation. In emergent SC for MARL systems, all agents exchange their attention scores through the graph attention exchange network (GAXNet) [11] in order to satisfy ultra-reliable low-latency communication (URLLC) requirements.
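Of the metrics mentioned above, PSNR has a simple closed form, 10 log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The following minimal sketch assumes 8-bit pixels (MAX = 255) stored as flat lists; `psnr` is an illustrative helper and is not taken from [37].

```python
import math

def psnr(reference, reconstruction, max_value=255.0):
    """PSNR in dB between two equally sized lists of pixel values."""
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

if __name__ == "__main__":
    original = [10, 50, 200, 255]
    print(psnr(original, [12, 49, 198, 250]))
```

A higher PSNR means a smaller pixel-level error. SSIM, in contrast, compares local structure and needs windowed statistics, so it is not reproduced here.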
3.2 AI-Aided SC
The prototype of AI-aided SC is based on semantic filtering, which selects valid vocabulary and phrases for an abstract syntax element [38]. In beyond-5G (B5G) and 6G communications, task- and goal-oriented communication emerges as a more developed form of semantic filtering. The age of information (AoI) and the value of information (VoI) are the main metrics; they are modeled together with the system dynamics [39], so that the metrics take these dynamics into account. Goal-oriented communication of this kind has been studied for power control [40, 41] and energy harvesting [42–45]. AoI and VoI have also been combined with deep RL [46], fault detection [47], and scheduling [48]. According to [49], all layers of the network hierarchy (e.g., the application, transport, network, link/MAC, and physical layers) are linked to semantic-effectiveness (SE) planes (here SE denotes semantic-effectiveness rather than the semantic encoder). Popovski et al. [49] suggest that SE planes act as filters in the MAC frame, radar sensing, PHY computing, retransmission, and congestion control. According to [47, 50], the metrics of these applications are called semantics of information (SoI). Agheli et al. [51] proposed semantics-aware source coding in status update systems, a setting that is closely related to AoI and VoI. Their system model consists of the source, semantic filtering, packet encoding, and monitoring. The semantic filter dynamically controls the steady-state time-average codeword length. As an application to Internet of Things (IoT) settings, Stamatakis et al. [52] propose semantics-aware active fault detection in an IoT status updating system. To maximize the VoI over a finite horizon, the authors have an RL agent generate a probability distribution that represents the belief about the IoT objective. In addition, the authors take into account the AoI, i.e., the time elapsed since the most recent status update. In goal-oriented communications, semantic filtering and RL-based control are two possible solutions. In an application to scene classification in unmanned aerial vehicle systems, an RL agent observes the channel rate and the time delay [53]; it then filters the regions of interest (RoIs) of the input images and transmits them to the server. This significantly reduces the communication cost while maximizing classification accuracy.
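The AoI metric used throughout this subsection can be made concrete with a small slotted-time sketch. It assumes that each update is described by a hypothetical `(generation_time, delivery_time)` pair measured in slots and that the monitor starts with a fresh sample generated at t = 0. These modeling choices are illustrative and are not taken from [39] or [51, 52].

```python
def age_trace(updates, horizon):
    """Age of information at t = 0, 1, ..., horizon - 1.

    updates: list of (generation_time, delivery_time) pairs in slots.
    The monitor is assumed to hold a sample generated at t = 0.
    """
    freshest = 0  # generation time of the newest update received so far
    pending = sorted(updates, key=lambda u: u[1])  # order by delivery time
    trace, i = [], 0
    for t in range(horizon):
        # Absorb every update that has been delivered by slot t.
        while i < len(pending) and pending[i][1] <= t:
            freshest = max(freshest, pending[i][0])
            i += 1
        trace.append(t - freshest)
    return trace

def average_age(updates, horizon):
    """Time-average AoI over the horizon."""
    trace = age_trace(updates, horizon)
    return sum(trace) / len(trace)

if __name__ == "__main__":
    updates = [(2, 3), (5, 7)]
    print(age_trace(updates, 10))
    print(average_age(updates, 10))
```

The age grows by one per slot and drops when a fresher update is delivered, which gives the familiar sawtooth. A semantic filter that suppresses uninformative updates trades a higher age for fewer transmissions.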
3.3 Applications
3.3.1 SC for Speech Recognition
The authors of [5] consider a scenario in which a voice signal is transmitted and must be restored as closely as possible to the original signal. The structure shown in Fig. 5 is used, and the SE and SD are trained with an attention mechanism. The context vectors obtained through the RNN-based attention mechanism serve as the background knowledge of the SE and the SD, respectively. The work introduces a technique that restores the original data by extracting the essential voice information through SC and delivering it to the channel encoder [5]. The approach is also shown to be robust across environments with varying communication quality.
3.3.2 SC for Natural Language Processing
When the sender and the receiver speak different languages, successful communication is better served by conveying the original meaning than by transmitting the raw data. Xie and Qin [54] consider a scenario in which the original meaning is restored as faithfully as possible when natural language datasets in different languages are communicated. In [54], the SE and SD are built on a transformer with self-attention. The premise is that translation fails if the data sent through the channel is the raw original data, and succeeds if what is sent is the meaning of the original data. Xie and Qin [54] demonstrated the validity of SC by achieving high translation performance while remaining robust in various communication environments.
Fig. 5 Framework of SCs
3.3.3 SC for Internet of Things
Many IoT devices must quantize their networks in order to communicate with the edge and the cloud. Network quantization makes it possible to serve multiple devices, but it lowers the accuracy of information transmission. SC has been proposed to increase communication efficiency in this setting. In [55], when the data are quantized to 4 or 8 bits and are coded and transmitted without SC, performance is very low with 4-bit coding and poor with 8-bit coding. With SC, however, the 4-bit, 8-bit, and original versions perform almost the same. In short, information is conveyed more efficiently.
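The accuracy loss caused by low-bit quantization can be seen in a generic example. The text does not specify the quantization scheme used in [55], so the sketch below assumes plain uniform quantization of values in [-1, 1]; `quantize` and `max_abs_error` are hypothetical helpers.

```python
def quantize(values, bits, lo=-1.0, hi=1.0):
    """Uniformly quantize values in [lo, hi] to 2**bits levels and map them back."""
    intervals = 2 ** bits - 1          # number of steps between lo and hi
    step = (hi - lo) / intervals
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)  # saturate out-of-range values
        index = round((clipped - lo) / step)
        out.append(lo + index * step)
    return out

def max_abs_error(values, bits):
    """Largest absolute error introduced by quantizing to the given bit width."""
    return max(abs(v - q) for v, q in zip(values, quantize(values, bits)))

if __name__ == "__main__":
    weights = [i / 500.0 - 1.0 for i in range(1001)]  # hypothetical weights in [-1, 1]
    for bits in (4, 8):
        print(bits, "bits -> max error", max_abs_error(weights, bits))
```

With 4 bits there are only 16 representable values, so the worst-case error is about half of a step of 2/15; with 8 bits the step shrinks to 2/255.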
3.3.4 SC for Multi-Agent Reinforcement Learning
In [11], multiple UAVs provide communication services in hazardous areas. The aim is to optimize the UAV paths so that ultra-low latency is guaranteed and collisions are avoided in the hazardous areas where communication must be provided. Each agent creates semantic information with an attention-based SE and shares it with the other agents, which allows the UAVs to meet the ultra-low-latency requirement without colliding.
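How an agent might weight the messages it receives from other agents can be sketched with generic scaled dot-product attention in the style of [18]. This is only an illustration under stated assumptions: it is not the GAXNet architecture of [11], and the vectors and the `attend` helper are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention of one agent's query over other agents' messages.

    Returns the attention weights and the weighted sum of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, context

if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    print(attend(query, keys, values))
```

The message whose key is most aligned with the query receives the largest weight, so the resulting context vector is dominated by the most relevant neighbor.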
4 Semantic Networking
In the 1990s, semantic networks were studied as a form of knowledge representation modeled as a graph [56]. A semantic network is expressed with nodes and edges that capture characteristic information and semantic relations. It can also be regarded as a database that machines can interpret. Spurred by recent advances in SC and AI, semantic networking is attracting renewed attention. According to [7], a semantic network consists of the SC components, a semantic knowledge base, and a local knowledge base. When end-to-end communication takes place, the knowledge of the local devices is reorganized by the server-side semantic knowledge base; knowledge graphs and federated learning are used to coordinate the server-side knowledge base. Toward semantic networking, Xie et al. [5] suggested a multimodal framework for multiuser SC systems that uses an SR encoder and an SR decoder [55]. The SE encodes each modality into semantic information, and the JSCC encoder then encodes the SE output into a digital signal. The signal is transmitted to the semantic receiver, which supports both single-modal and multimodal operation. Jiang et al. [30] elaborate IR-HARQ for videoconferencing in semantic networks. In this work, the authors specify the framework by combining HARQ with channel state information (CSI) feedback. To this end, they distinguish an effectiveness layer, a semantic layer, and a technical layer. In the effectiveness layer, the frame is passed either to the semantic layer or to the effectiveness layer itself. The semantic layer runs the keypoint detector and the semantic generator, and the technical layer performs JSCC with quantization and dequantization. To support this, the authors introduce a semantic detector that returns an acknowledgment (ACK) of 0 or 1. In addition, the CSI of the subchannels is taken into account in the technical layer.
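The retransmission loop driven by a one-bit ACK can be sketched generically. The code below is an assumption-heavy illustration rather than the scheme of [30]: the receiver simply averages repeated noisy copies (which resembles Chase combining rather than incremental redundancy), and the detector is a stand-in that compares against the true message, whereas a real semantic detector would have to judge the estimate without access to it. All names are hypothetical.

```python
def harq_send(message, channel, detector, combine, max_rounds=4):
    """Retransmit until the detector returns ACK = 1 or max_rounds is reached.

    channel(message)   -> one noisy copy of the message
    combine(copies)    -> receiver's current estimate from all copies so far
    detector(estimate) -> 1 (ACK) if the estimate is acceptable, else 0 (NACK)
    Returns (estimate, rounds_used, ack).
    """
    copies = []
    estimate, ack = None, 0
    for round_index in range(1, max_rounds + 1):
        copies.append(channel(message))
        estimate = combine(copies)
        ack = detector(estimate)
        if ack == 1:
            return estimate, round_index, ack
    return estimate, max_rounds, ack

def average(copies):
    """Element-wise mean of all received copies."""
    return [sum(column) / len(copies) for column in zip(*copies)]

if __name__ == "__main__":
    import random
    rng = random.Random(1)
    message = [0.5, -0.5, 1.0]
    noisy = lambda m: [x + rng.gauss(0.0, 0.3) for x in m]
    close_enough = lambda est: int(max(abs(e - x) for e, x in zip(est, message)) < 0.2)
    print(harq_send(message, noisy, close_enough, average, max_rounds=8))
```

Each additional round costs channel uses but improves the combined estimate, so the ACK bit lets the transmitter stop as soon as the receiver is satisfied.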
5 Summary
This chapter has introduced trends in SC at the service stage, covering voice signals, natural language, IoT devices, and multi-agent RL. A common thread is that AI technology is used for the SEs and SDs, drawing on deep learning techniques such as VAEs and transformers. The surveyed works report that SC performs better than traditional Shannon-style communication and achieves a better data compression rate than existing methods. They also report that SC is robust in various communication environments such as AWGN, Rayleigh fading, and Rician fading channels. Although SC has not yet been studied extensively, the literature introduced above indicates that it is more efficient than existing communication methods. Research cases are still few and are largely limited to natural language processing, so further studies on SC are needed across a variety of communication network applications.
References 1. Y. C. Eldar, A. Goldsmith, D. Gündüz, and H. V. Poor, Machine Learning and Wireless Communications. Cambridge University Press, 2022. 2. Q. Lan, D. Wen, Z. Zhang, Q. Zeng, X. Chen, P. Popovski, and K. Huang, “What is semantic communication? a view on conveying meaning in the era of machine intelligence,” J. Commun. Inf. Networks, vol. 6, pp. 336–371, 2021. 3. C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948. 4. J. Park, S. Samarakoon, M. Bennis, and M. Debbah, “Wireless network intelligence at the edge,” Proceedings of the IEEE, vol. 107, no. 11, pp. 2204–2239, October 2019. 5. H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,” IEEE Transactions on Signal Processing, vol. 69, pp. 2663–2675, 2021. 6. Q. Lan, D. Wen, Z. Zhang, Q. Zeng, X. Chen, P. Popovski, and K. Huang, “What is semantic communication? a view on conveying meaning in the era of machine intelligence,” Journal of Communications and Information Networks, vol. 6, no. 4, pp. 336–371, 2021.
7. G. Shi, Y. Xiao, Y. Li, and X. Xie, “From semantic communication to semantic-aware networking: Model, architecture, and open problems,” IEEE Communications Magazine, vol. 59, no. 8, pp. 44–50, 2021. 8. Z. Weng and Z. Qin, “Semantic communication systems for speech transmission,” IEEE Journal on Selected Areas in Communications, vol. 39, pp. 2434–2444, 2021. 9. Q. Zhou, R. Li, Z. Zhao, C. Peng, and H. Zhang, “Semantic communication with adaptive universal transformer,” IEEE Wireless Communications Letters, vol. 11, no. 3, pp. 453–457, 2022. 10. J. Foerster, I. A. Assael, N. de Freitas, and S. Whiteson, “Learning to communicate with deep multi-agent reinforcement learning,” in Proc. of the Advances in Neural Information Processing Systems (NeurIPS), vol. 29, NY, USA, December 2016, pp. 2145–2153. 11. W. J. Yun, B. Lim, S. Jung, Y.-C. Ko, J. Park, J. Kim, and M. Bennis, “Attention-based reinforcement learning for real-time UAV semantic communication,” in Proc. IEEE International Symposium on Wireless Communication Systems (ISWCS). IEEE, 2021, pp. 1–6. 12. H. Seo, J. Park, M. Bennis, and M. Debbah, “Semantics-native communication with contextual reasoning,” CoRR, vol. abs/2108.05681, 2021. 13. J. Choi, S. W. Loke, and J. Park, “A unified view on semantic information and communication: A probabilistic logic approach,” CoRR, vol. abs/2201.05936, 2022. 14. K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural networks, vol. 2, no. 5, pp. 359–366, 1989. 15. D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” CoRR, vol. abs/1312.6114, 2013. 16. A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey, “Adversarial autoencoders,” CoRR, vol. abs/1511.05644, 2015. 17. Y. M. Saidutta, A. Abdi, and F. Fekri, “Joint source-channel coding over additive noise analog channels using mixture of variational autoencoders,” IEEE Journal on Selected Areas in Communications, vol. 39, pp. 2000–2013, 2021. 18. A. 
Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017. 19. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” CoRR, vol. abs/2010.11929, 2020. 20. S. Iqbal and F. Sha, “Actor-attention-critic for multi-agent reinforcement learning,” in International conference on machine learning. PMLR, 2019, pp. 2961–2970. 21. L. Chen, K. Lu, A. Rajeswaran, K. Lee, A. Grover, M. Laskin, P. Abbeel, A. Srinivas, and I. Mordatch, “Decision transformer: Reinforcement learning via sequence modeling,” CoRR, vol. abs/2106.01345, 2021. 22. A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen, “Hierarchical text-conditional image generation with clip latents,” 2022. 23. H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,” IEEE Transactions on Signal Processing, vol. 69, pp. 2663–2675, 2021. 24. M. Fresia, F. Perez-Cruz, H. V. Poor, and S. Verdu, “Joint source and channel coding,” IEEE Signal Processing Magazine, vol. 27, no. 6, pp. 104–113, 2010. 25. J. L. Massey, “Joint source and channel coding,” MASSACHUSETTS INST OF TECH CAMBRIDGE ELECTRONIC SYSTEMS LAB, Tech. Rep., 1977. 26. T. Richardson and R. Urbanke, Modern coding theory. Cambridge University Press, 2008. 27. N. Farsad, M. Rao, and A. J. Goldsmith, “Deep learning for joint source-channel coding of text,” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2326–2330, 2018. 28. K. Choi, K. Tatwawadi, A. Grover, T. Weissman, and S. Ermon, “Neural joint source-channel coding,” in Proc. International Conference on Machine Learning (ICML), 2019.
29. Y. Jiang, H. Kim, H. Asnani, S. Kannan, S. Oh, and P. Viswanath, “Turbo autoencoder: Deep learning based channel codes for point-to-point communication channels,” CoRR, vol. abs/1911.03038, 2019. 30. P. Jiang, C.-K. Wen, S. Jin, and G. Y. Li, “Deep source-channel coding for sentence semantic transmission with HARQ,” CoRR, vol. abs/2106.03009, 2022. 31. Q. Zhou, R. Li, Z. Zhao, Y. Xiao, and H. Zhang, “Adaptive bit rate control in semantic communication with incremental knowledge-based HARQ,” CoRR, vol. abs/2203.06634, 2022. 32. H. Ye, L. Liang, and G. Y. Li, “Circular convolutional auto-encoder for channel coding,” Proc. International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 1–5, 2019. 33. J. Xu, B. Ai, W. Chen, A. Yang, P. Sun, and M. Rodrigues, “Wireless image transmission using deep source channel coding with attention modules,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 4, pp. 2315–2328, 2022. 34. H. Kim, S. Oh, and P. Viswanath, “Physical layer communication via deep learning,” IEEE Journal on Selected Areas in Information Theory, vol. 1, pp. 5–18, 2020. 35. K. Lu, R. Li, X. Chen, Z. Zhao, and H. Zhang, “Reinforcement learning-powered semantic communication via semantic similarity,” CoRR, vol. abs/2108.12121, 2021. 36. C. Liu, C. Guo, Y. Yang, and N. Jiang, “Adaptable semantic compression and resource allocation for task-oriented communications,” CoRR, vol. abs/2204.08910, 2022. 37. Z. Zhang, Q. Yang, S. He, M. Sun, and J. Chen, “Wireless transmission of images with the assistance of multi-level semantic information,” CoRR, vol. abs/2202.04754, 2022. 38. “Semantic filtering.” [Online]. Available: https://www.ibm.com/docs/en/odm/8.9.2?topic=tree-semantic-filtering 39. R. Yates, Y. Sun, D. Brown, S. Kaul, E. Modiano, and S. Ulukus, “Age of information: An introduction and survey,” IEEE Journal on Selected Areas in Communications, vol. 39, pp. 1183–1210, 2020. 40. P. Mankar, Z. 
Chen, M. A. Abd-Elmagid, N. Pappas, and H. S. Dhillon, “Throughput and age of information in a cellular-based IoT network,” IEEE Transactions on Wireless Communications, vol. 20, pp. 8248–8263, 2020. 41. O. Ozel, A. Yener, and S. Ulukus, “State amplification and masking while timely updating,” CoRR, vol. abs/2202.11682, 2022. 42. M. A. Abd-Elmagid and H. S. Dhillon, “Closed-form characterization of the MGF of AoI in energy harvesting status update systems,” IEEE Transactions on Information Theory, vol. 68, no. 6, pp. 3896–3919, 2022. 43. A. Arafa, J. Yang, S. Ulukus, and H. Poor, “Timely status updating over erasure channels using an energy harvesting sensor: Single and multiple sources,” IEEE Transactions on Green Communications and Networking, vol. 6, pp. 6–19, 2021. 44. M. Hatami, M. Leinonen, Z. Chen, N. Pappas, and M. Codreanu, “On-demand AoI minimization in resource-constrained cache-enabled IoT networks with energy harvesting sensors,” CoRR, vol. abs/2201.12277, 2022. 45. A. Jaiswal, A. Chattopadhyay, and A. Varma, “Age-of-information minimization via opportunistic sampling by an energy harvesting source,” CoRR, vol. abs/2201.02787, 2022. 46. M. Hatami, M. Leinonen, and M. Codreanu, “AoI minimization in status update control with energy harvesting sensors,” IEEE Transactions on Communications, vol. 69, pp. 8335–8351, 2020. 47. N. Pappas and M. Kountouris, “Goal-oriented communication for real-time tracking in autonomous systems,” in 2021 IEEE International Conference on Autonomous Systems (ICAS), 2021, pp. 1–5. 48. E. Fountoulakis, M. Codreanu, A. Ephremides, and N. Pappas, “Joint sampling and transmission policies for minimizing cost under AoI constraints,” CoRR, vol. abs/2103.15450, 2021. 49. P. Popovski, O. Simeone, F. Boccardi, D. Gündüz, and O. Sahin, “Semantic-effectiveness filtering and control for post-5G wireless connectivity,” Journal of the Indian Institute of Science, vol. 100, pp. 435–443, 2020.
50. M. Kountouris and N. Pappas, “Semantics-empowered communication for networked intelligent systems,” IEEE Communications Magazine, vol. 59, no. 6, pp. 96–102, 2021. 51. P. Agheli, N. Pappas, and M. Kountouris, “Semantics-aware source coding in status update systems,” CoRR, vol. abs/2203.08508, 2022. 52. G. Stamatakis, N. Pappas, A. Fragkiadakis, and A. Traganitis, “Semantics-aware active fault detection in status updating systems,” CoRR, vol. abs/2202.00923, 2022. 53. X. Kang, B. Song, J. Guo, Z. Qin, and F. R. Yu, “Task-oriented image transmission for scene classification in unmanned aerial systems,” IEEE Transactions on Communications, vol. 70, no. 8, pp. 5181–5192, 2022. 54. H. Xie and Z. Qin, “A lite distributed semantic communication system for internet of things,” IEEE Journal on Selected Areas in Communications, vol. 39, pp. 142–153, 2020. 55. H. Xie, Z. Qin, X. Tao, and K. Letaief, “Task-oriented multi-user semantic communications,” CoRR, vol. abs/2112.10255, 2021. 56. K. M. Fisher, “Semantic networking: The new kid on the block,” Journal of Research in Science Teaching, vol. 27, no. 10, pp. 1001–1018, 1990.
Network Security and Trustworthiness Soyi Jung, Soohyun Park, Seok Bin Son, Haemin Lee, and Joongheon Kim
S. Jung (✉) Ajou University, Suwon, South Korea e-mail: [email protected]
S. Park · S. B. Son · H. Lee · J. Kim Korea University, Seoul, South Korea e-mail: [email protected]; [email protected]; [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8_30
1 Introduction & Motivation
6G is expected to revolutionize the digital world with intelligent, hyperconnected, and automated networks [1]. In particular, 6G networks are expected to leverage artificial intelligence technology to configure and manage themselves automatically [2, 3]. According to [4], this innovation will offer convenience and expand the possibilities open to people in diverse fields. However, the introduction of these 6G technologies could also lead to various security threats. Because the use of open networks is expected to increase rapidly with the advent of 6G, a variety of attacks are likely to arise. Open networks, which anyone can access and which lack security mechanisms such as authentication and encryption, can be easily breached by attackers [5]. Even encrypted networks may no longer be secure, because quantum computers, which are expected to become more prevalent in the 6G era, can break current cryptographic algorithms [5]. Furthermore, greater connectivity between networks increases the risk of personal information leakage. Ensuring the security and trustworthiness of the network is therefore essential to benefiting from 6G technology. It is likewise essential to consider trustworthiness in auction mechanisms, which are widely used to address resource allocation problems in distributed systems. When an auction is designed, malicious users may threaten the trustworthiness and reliability of the system. To address this issue, studying auction research trends can help in designing an auction model that meets various trustworthiness requirements.
To better address security threats in 6G technology, this chapter is organized around four main topics (openness, post-quantum cryptography, privacy preservation, and auctions) and introduces security technologies that can respond to the threats in each. We also introduce open issues that have not yet been resolved. The chapter first focuses on security threats in both 5G and 6G technology, then presents new security technologies intended for use in 6G, and finally addresses current issues that may arise in the 6G era.
2 Trends and Related Work of Technology
This section focuses on the latest trends and related work concerning security and trustworthiness threats in 6G. While numerous security threats can arise in 6G technology, we have divided and organized them into four primary categories.
2.1 Openness Security Threats
Network openness increases with the transition from 5G to 6G, resulting in the widespread use of open source software (OSS) and open networks. OSS is software that is publicly distributed together with its source code [6], which allows users to easily develop and use it for various tasks. However, this accessibility also makes OSS an attractive target for attackers. Vulnerability information for open source software is catalogued as Common Vulnerabilities and Exposures (CVE) entries, and the National Vulnerability Database (NVD) provides detailed analysis of this information, which attackers can use to develop new attack methods [5]. For instance, when open source software lacks security mechanisms such as access control and authentication, attackers can exploit vulnerability information to perform privilege escalation attacks and steal sensitive information. In other words, attacks on vulnerabilities in open source software are likely to increase. In addition, security strategies in existing networks have relied on authentication at the network boundary to admit outside access to internal information [5]. Endpoint entities such as Internet Protocol Security (IPsec), intrusion detection systems (IDS), Extensible Authentication Protocol-Transport Layer Security (EAP-TLS), and network firewalls have been used to block untrusted external access [7]. However, as networks become more open with the introduction of 6G, new networks can emerge and the attack surface available to attackers can expand further [5]. In such a 6G network, there is no guarantee that existing boundary-based security strategies will remain effective. Therefore, 6G requires closer attention to security, because openness may give rise to new vulnerabilities.
2.2 Post Quantum Crypto Security Threats
One of the most critical parts of security is the cryptographic algorithm. A cryptographic algorithm encodes private data (such as passwords and bank account numbers) into values that are hard to decipher, so that only the authorized user can read them. Current cryptosystems include one-way methods based on hashing algorithms such as the Secure Hash Algorithm (SHA) and Message Digest algorithm 5 (MD5) [8]. They also include asymmetric-key methods such as Rivest-Shamir-Adleman (RSA) and Elliptic-Curve Cryptography (ECC), as well as symmetric-key methods such as the Advanced Encryption Standard (AES) [8]. These encryption algorithms can be used to control access to data by authenticating external accessors, thus maintaining the confidentiality and integrity of the data. Existing cryptographic techniques are mostly based on mathematically hard problems such as the discrete logarithm problem and the factorization of large integers into primes [5]. With classical computing, solving these problems takes an impractically long time, and existing encryption algorithms have relied on this hardness to protect sensitive information. However, existing encryption algorithms are no longer secure in the face of quantum computing, which is expected to mature in the 6G era. Quantum computing can solve the hard mathematical problems employed in conventional cryptography quickly and efficiently. As a result, many cryptographic techniques (such as blockchain technology and digital signatures) and security protocols (such as SSH and TLS) that rely on the hardness of these problems are vulnerable to quantum attacks [1, 7]. It is therefore necessary to prepare for the security threats that may arise from quantum computing.
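Why efficient factoring breaks RSA can be shown with a textbook-sized example. The sketch below uses tiny primes that offer no security at all, and its trial-division `factor` merely stands in for the efficient factoring capability that a large quantum computer would provide. The function names and the chosen numbers are illustrative assumptions.

```python
def toy_rsa_keys(p, q, e):
    """Build a textbook RSA key pair from two (tiny) primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; requires Python 3.8+
    return (n, e), (n, d)

def apply_key(key, value):
    """Encrypt or decrypt by modular exponentiation."""
    n, exponent = key
    return pow(value, exponent, n)

def factor(n):
    """Trial division: feasible only because n is tiny."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n is prime")

if __name__ == "__main__":
    public, private = toy_rsa_keys(61, 53, 17)
    cipher = apply_key(public, 65)
    print("decrypted:", apply_key(private, cipher))
    # An attacker who can factor n rebuilds the private key from public data.
    p, q = factor(public[0])
    _, recovered = toy_rsa_keys(p, q, public[1])
    print("attacker decrypts:", apply_key(recovered, cipher))
```

The private exponent depends only on the public exponent and the prime factors of the modulus, so the security of the scheme rests entirely on factoring being hard. Post-quantum schemes are built on problems for which no efficient quantum algorithm is known.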
2.3 Privacy Preserving Technologies Threats Information protection is a critical issue in security, especially when it comes to safeguarding personal data. The General Data Protection Regulation (GDPR) proposed by the European Union defines personal information as any data that can identify an individual, such as their name, location, economic, cultural, and social identity [9]. This includes sensitive information like passwords and registration numbers, as well as economic and cultural data such as personal purchase history and viewing habits of cultural content. Furthermore, the concept of digital privacy has emerged recently, referring to the protection of personal information in the digital world [4]. Security attempted to protect personal information while complying with the three major elements of information security: confidentiality, integrity, and availability. Additionally, the importance of traceability, identity, and linkability has been noted to comply with privacy [10]. In addition, each country recognized the importance of privacy protection and enacted it into law, such as the Consumer
Online Privacy Rights Act (COPRA) in the United States and the three data laws for the protection of personal information in Korea [5]. Recently, service providers such as mobile communication operators have been collecting and analyzing personal information and offering various services based on it [5]. For example, YouTube collects personal activity information, such as viewing records and favorite lists, analyzes it, and recommends personalized videos [11]. Although users benefit from services built on personal information, that information can also be abused, which can seriously damage individual privacy. In particular, with the introduction of 6G, the hyperconnected Internet of Things (IoT) will be activated, and the Internet of Everything (IoE) will be realized [12]. As a result, the collection of personal information will increase, and managing it securely becomes an increasingly important issue in 6G. For example, because IoT devices are resource-constrained, they cannot use complex passwords and can therefore become a major target for attackers, leading to leakage of private information such as IP addresses, data, and location [13]. Therefore, privacy must be managed safely in 6G.
2.4 Auction Security Threats
Auction theory has contributed enormously to solving various engineering problems, most notably resource allocation in various domains. The auction approach offers a useful and intuitive method for solving scheduling and resource allocation problems in a distributed and truthful manner. However, there is inherent uncertainty regarding valuations for both auctioneers (sellers) and bidders (buyers). For instance, the auctioneer is unsure about the values that bidders attach to the object being sold, i.e., the maximum amount each bidder is willing to pay. If the auctioneer knew the values precisely, it could simply offer the object to the bidder with the highest value at or just below what this bidder is willing to pay. Likewise, no bidder knows with certainty the values attached by other bidders, and knowledge of other bidders' values would not affect how much the object is worth to a particular bidder [14]. In addition, with a massive volume of economic transactions conducted through auctions, numerous studies on limited resource allocation and scheduling problems have been conducted through auction-based computation processes [15]. The First-Price Auction (FPA) is a well-known, common type of auction in which the bidder who submits the highest bid to the auctioneer (seller) wins and pays its own bid. However, FPA is not incentive compatible, and in the presence of untruthful bidders it is not efficient, as shown in Fig. 1 [16]. The other common type of auction mechanism is the Second-Price Auction (SPA). In SPA, the winner is selected in the same way as in FPA, but the winner pays the second-highest bid rather than its own bid. In the literature, SPA is well known for its truthfulness [17, 18]. Therefore, SPA is widely used for truthful resource allocation in various distributed computing applications. However, SPA is not revenue-optimal, i.e., the auctioneer cannot obtain the maximum benefit because it receives the second-highest bid rather than the highest. Therefore, it is important to design the auction mechanism properly for the application field. A mechanism can be designed with desired properties such as solution equilibrium, incentive compatibility, allocative efficiency, and individual rationality. However, several impossibility theorems in auction theory state that these properties cannot all be realized simultaneously. Research has therefore focused on finding the maximal subset of these properties that can be achieved together. Section 3.4 describes the auction studies in detail.
Fig. 1 The advent of a malicious user in the auction system threatens the system's trustworthiness
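The difference between the two payment rules can be illustrated with a minimal sketch of a single-item sealed-bid auction. The function names, bidder labels, and numeric values are assumptions made for this example only; it shows that shading a bid does not help under SPA but does raise the bidder's utility under FPA.

```python
def run_sealed_bid_auction(bids: dict, rule: str = "second"):
    """Return (winner, payment) for a single-item sealed-bid auction.

    bids maps a bidder name to its bid; rule is "first" or "second".
    Ties are broken by insertion order (an arbitrary choice for this sketch).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, highest = ranked[0]
    second_highest = ranked[1][1]
    payment = highest if rule == "first" else second_highest
    return winner, payment

def utility(value: float, bidder: str, bids: dict, rule: str) -> float:
    """Utility = true value minus payment if the bidder wins, else 0."""
    winner, payment = run_sealed_bid_auction(bids, rule)
    return value - payment if winner == bidder else 0.0

TRUE_VALUE_A = 10.0
truthful = {"A": 10.0, "B": 7.0, "C": 4.0}   # A reports its true value
shaded = {"A": 8.0, "B": 7.0, "C": 4.0}      # A under-reports

spa_truthful = utility(TRUE_VALUE_A, "A", truthful, "second")
spa_shaded = utility(TRUE_VALUE_A, "A", shaded, "second")
fpa_truthful = utility(TRUE_VALUE_A, "A", truthful, "first")
fpa_shaded = utility(TRUE_VALUE_A, "A", shaded, "first")
```

Under the second-price rule both reports give bidder A the same utility, whereas under the first-price rule the shaded bid is strictly better, which is the incentive problem the text attributes to FPA.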
3 Technical Details
This section provides an in-depth overview of security technologies aimed at addressing potential security threats in 6G technology. We present the security technologies in the following order: openness security techniques, post-quantum cryptography techniques, privacy-preserving techniques, and auction techniques.
3.1 Openness Security Technologies in 6G
Since 6G will be a more open network environment, OSS is applied across a wide range of fields. However, attackers can exploit the fact that open source code is publicly available and easy to inspect. Furthermore, because 6G networks will be used in space-air-ground integration, security systems such as IDS and firewalls, which are deployed at the interior and exterior boundaries of existing networks, are of limited use [7]. As a
result, a new security strategy against the vulnerabilities of openness in 6G is required, and various strategies have been proposed. First, when developing with open source code, an automated management system should be introduced to handle security problems in open source [5, 19]. By managing open source vulnerabilities in an integrated way, such a system should quickly detect threats and potential vulnerabilities and apply appropriate patches [5]. Furthermore, secure Over The Air (OTA) technology has been suggested to address open source security issues [5, 19]. In the measurement literature, OTA refers to evaluating the performance of radio frequency transceivers over the air interface [20]; in this context, it refers to delivering software updates wirelessly, which allows software to be updated periodically [21]. In 6G, software patches should be applied in a timely manner using OTA [5]. As a new security architecture for open networks, Zero Trust Architecture (ZTA) has been presented [5, 22]. ZTA is based on the security principles of Zero Trust (ZT), which assumes that an attacker can gain access to the network from the outside and may even be present inside the network [5]. In other words, ZT means never assuming trust and verifying continuously [23]. Applying ZT to the network architecture yields ZTA, which can control access rights dynamically in 6G, where the boundaries of the network become unclear [22]. ZTA reliably authenticates the different network elements, analyzes the behavior of entities such as network functions and user equipment, and checks for security policy violations [5]. That is, to regulate access to the internal network, ZTA assesses the trustworthiness of each network entity using a trust level [5]. In short, 6G needs to introduce ZTA to strengthen the security of networks that have so far relied on boundary-based defenses [24].
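The trust-level idea behind ZTA can be sketched as a per-request access decision. Everything in the block below is a hypothetical illustration: the signal names, the weights in the score, and the per-resource thresholds are assumptions and do not come from any ZTA specification or from the cited works.

```python
from dataclasses import dataclass

@dataclass
class AccessRequest:
    entity_id: str
    authenticated: bool      # did the entity authenticate for this request?
    anomaly_score: float     # 0.0 (normal behavior) to 1.0 (highly anomalous)
    policy_violations: int   # violations observed in the current session
    resource: str

# Hypothetical minimum trust level required per resource.
REQUIRED_TRUST = {"telemetry": 0.3, "user_data": 0.6, "core_config": 0.9}

def trust_level(req: AccessRequest) -> float:
    """Illustrative trust score in [0, 1]; the weights are assumptions."""
    if not req.authenticated:
        return 0.0
    score = 1.0 - 0.7 * req.anomaly_score - 0.2 * req.policy_violations
    return max(0.0, min(1.0, score))

def authorize(req: AccessRequest) -> bool:
    """Evaluate every request; unknown resources are denied by default."""
    required = REQUIRED_TRUST.get(req.resource)
    if required is None:
        return False
    return trust_level(req) >= required

well_behaved = AccessRequest("ue-1", True, 0.1, 0, "user_data")
suspicious = AccessRequest("ue-2", True, 0.5, 1, "user_data")
unauthenticated = AccessRequest("nf-3", False, 0.0, 0, "telemetry")
```

The point of the sketch is that no request is trusted because of where it originates: each one is scored and compared against the requirement of the resource it targets, so the same entity can be admitted to one resource and refused another.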
3.2 Post Quantum Crypto Security Technologies in 6G
The standard measures commonly used in pre-quantum cryptography are undermined by the emergence of quantum computing. Therefore, it is necessary to prepare countermeasures against these new forms of attack by replacing classical cryptographic algorithms with quantum-resistant counterparts, as has been proposed, for example, for reinforcing the security of Distributed Energy Resource (DER) systems [25]. The methods proposed in response to this demand are known as post-quantum cryptography algorithms. Classical algorithms such as the RSA cryptosystem are powerless against attacks by quantum computers, which can factorize large numbers in an extremely short time using Shor's algorithm [26]. Hence, Post Quantum Cryptography (PQC) algorithms must be able to defend against such attacks. One of the most well-known quantum-safe methods is Quantum Key Distribution (QKD) [7, 27, 28]. QKD leverages the characteristics of photons to ensure the security of the information being transmitted [27, 28]. Since an unknown photon state cannot be copied and is disturbed when it is measured, photons can be used to establish a quantum secret key.
When an unauthorized party tries to access the information, the sender and the receiver are alerted, and the attacker gains no meaningful information [27, 28]. Another method is Quantum Secure Direct Communication (QSDC) [29, 30]. In this method, no secret key is used, and information is transmitted directly [29]. The framework consists of two parts: the checking round and the message coding round. During the checking round, the receiver verifies that the channel is secure for the transmission of information. After that, the receiver directly receives the encoded message from the sender. Consequently, information is transmitted safely without a quantum key, and because no resources are spent on managing a key, the communication capacity is used more efficiently than in QKD [30]. Finally, blind quantum computation is another effective method of ensuring quantum privacy [31, 32]. In this protocol, there is a client and a server. The server has all the computational power required to fulfill the purpose of the client. To perform the required computations, the client generates single-qubit instructions, which are sent to the server via a secure two-way communication channel. The server carries out all the instructions and transmits the results back to the client. Consequently, the client's data itself is never transmitted to the server during the computation, and privacy is preserved. However, when multiple clients use the server, an attacker can still acquire parts of the information from the two-way channel. This can be used for a gradient attack, in which the attacker measures the distance between the current gradient and the target gradient to recover the original information. Although this can be mitigated by implementing differential privacy and adding noise to the information, it remains a weakness that can be exploited.
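The way QKD exposes an eavesdropper, described earlier in this section, can be illustrated with a purely classical simulation of a BB84-style exchange. BB84 is used here as an assumed representative protocol; the chapter does not name one. The function names, the intercept-and-resend attacker, and the seed are likewise assumptions. Measuring in the wrong basis is modeled as producing a random bit, which is what disturbs the key and reveals the attack.

```python
import random

def bb84_sift(n_bits: int, eavesdrop: bool, rng: random.Random):
    """Simulate BB84 basis sifting and return (alice_key, bob_key)."""
    alice_key, bob_key = [], []
    for _ in range(n_bits):
        bit = rng.randint(0, 1)
        basis_a = rng.randint(0, 1)        # 0 = rectilinear, 1 = diagonal
        state_bit, state_basis = bit, basis_a

        if eavesdrop:
            basis_e = rng.randint(0, 1)
            if basis_e != state_basis:     # wrong basis: random outcome
                state_bit = rng.randint(0, 1)
            state_basis = basis_e          # photon is re-sent in Eve's basis

        basis_b = rng.randint(0, 1)
        if basis_b == state_basis:
            measured = state_bit
        else:
            measured = rng.randint(0, 1)

        if basis_b == basis_a:             # keep only matching-basis rounds
            alice_key.append(bit)
            bob_key.append(measured)
    return alice_key, bob_key

def error_rate(key_a, key_b) -> float:
    """Fraction of sifted positions where the two keys disagree."""
    mismatches = sum(a != b for a, b in zip(key_a, key_b))
    return mismatches / len(key_a)

rng = random.Random(7)                     # fixed seed for repeatability
clean_a, clean_b = bb84_sift(4000, eavesdrop=False, rng=rng)
tapped_a, tapped_b = bb84_sift(4000, eavesdrop=True, rng=rng)
qber_clean = error_rate(clean_a, clean_b)
qber_tapped = error_rate(tapped_a, tapped_b)
```

Without an eavesdropper the sifted keys agree exactly, while the intercept-and-resend attack pushes the error rate toward one quarter, so the sender and receiver can detect the intrusion by comparing a sample of their keys.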
3.3 Privacy Preserving Technologies in 6G
In 6G, simultaneous connectivity will be greater than in 5G. With the integration of AI and data analysis technology and the increasing use of smart devices, personal information is analyzed and various services are provided to users [12]. However, this hyperconnectivity in 6G can cause serious privacy problems, such as privacy leaks. Accordingly, various methods of protecting personal information in 6G are being introduced. Representative examples include Homomorphic Encryption (HE) [33, 34], federated learning [35], and split learning [36]. HE is a cryptographic technique that encrypts data to protect personal information while allowing third parties, such as service providers, to perform specific mathematical computations on the data without decrypting it [33, 34]. An HE system consists of an encryption method $E$, the set of all possible messages $M$, and homomorphic operations $\oplus$ and $\otimes$ on ciphertexts, which satisfy [34]:
$$E(m_1) \oplus E(m_2) = E(m_1 + m_2), \quad \forall m_1, m_2 \in M,$$
$$E(m_1) \otimes E(m_2) = E(m_1 \times m_2), \quad \forall m_1, m_2 \in M.$$
Since, with existing encryption algorithms, data had to be decrypted first in order to
operate on encrypted data, users gave up privacy in order to use services built on those algorithms [33]. With HE, however, data can be used in encrypted form, so user information can be processed while personal information stays protected [5]. HE can therefore be used to build machine learning models on large amounts of sensitive data while safely protecting personal information [1]. Federated learning is another promising solution for protecting personal information [37]. It protects personal information by exchanging model parameters between devices rather than sharing original data between users [35]. In other words, since personal data is kept locally and only model parameters are transferred, data privacy can be maintained [38]. To protect personal information in 6G, edge-based federated learning has been introduced [1, 12]. This method protects personal information by introducing physical control that keeps processing close to the user [39]. In addition, the one-shot XorMixFL framework has been proposed, which combines the XorMixup data augmentation technique, based on the XOR operation, with federated learning. It protects personal information by applying the XOR operation to the original data to obscure it, and it protects the details of the original data by training on the augmented samples instead [37]. Split learning divides the deep neural network between clients (end-systems) and a centralized server [36]. The local end-system computes the model only up to a specific layer, typically the first hidden layer, after which the output of that layer is transferred to the centralized server, where the rest of the computation is performed. Kim et al. 
[40] devised spatio-temporal split learning, which divides the deep neural network (temporal) among multiple clients located in geographically different places (spatio) and one centralized server. Similarly, Ha et al. [41] applied spatio-temporal split learning to medical data to protect the privacy of patients' personal information, which is among the most sensitive private information. It assumes that each client has a privacy-preserving layer and begins the learning process with the patient's medical record. In the spatio-temporal split learning scheme, only what comes out of the privacy-preserving layer is transferred to the central server to finish the training, as shown in Fig. 2. The risk of a privacy attack is effectively reduced since no raw data is shared among the clients.
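Returning to the homomorphic property defined earlier in this section, the additive case can be demonstrated with a toy version of the Paillier cryptosystem, which is one well-known additively homomorphic scheme and is chosen here as an assumption (the chapter does not prescribe a scheme). In Paillier, the ciphertext operation $\oplus$ is multiplication modulo $n^2$; the scheme does not provide $\otimes$. The primes below are far too small for real use and the function names are illustrative.

```python
import random
from math import gcd

def keygen(p: int, q: int):
    """Toy Paillier key generation from two small primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p - 1, q - 1)
    g = n + 1
    # mu is the inverse of L(g^lam mod n^2) mod n, with L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)  # Python 3.8+
    return (n, g), (lam, mu)

def encrypt(pub, m: int, rng: random.Random) -> int:
    """Encrypt m with fresh randomness r coprime to n."""
    n, g = pub
    while True:
        r = rng.randrange(1, n)
        if gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def add_encrypted(pub, c1: int, c2: int) -> int:
    """Homomorphic addition: multiply the ciphertexts modulo n^2."""
    n, _ = pub
    return (c1 * c2) % (n * n)

rng = random.Random(0)
pub, priv = keygen(1009, 1013)
c1 = encrypt(pub, 1200, rng)
c2 = encrypt(pub, 345, rng)
c_sum = add_encrypted(pub, c1, c2)       # computed without decrypting
decrypted_sum = decrypt(pub, priv, c_sum)
```

A service provider holding only `pub`, `c1`, and `c2` can produce `c_sum`, and only the key owner can decrypt it to the sum of the two plaintexts, which is the behavior the first equation above describes.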
3.4 Trustworthy Auction Technologies in 6G
The auction approach is a useful and intuitive method for solving scheduling and resource allocation problems in a distributed and truthful way. With a massive volume of economic transactions conducted through auctions, numerous studies on limited resource allocation and scheduling problems have been conducted through auction-based computation processes [15]. For example, Bandyopadhyay et
Fig. 2 Overall framework of a multi-site split learning algorithm [40]. The end-systems refer to the multiple hospitals that possess original medical data to be trained in the deep learning model. The server is where the actual learning of the deep neural network occurs. All the hospitals connected to the server collaborate to train the deep neural network in the server without exposing their raw data to an external network.
al. [42] proposed a combinatorial auction-based fog service allocation mechanism to enhance the allocation efficiency and improve the profit of fog providers. The algorithm in [43] sketches a self-organizing architecture for very large compute clouds and provides a relatively simple, scalable, and tractable solution to cloud resource allocation through a combinatorial auction. The algorithm in [44] is an auction-based scheduling algorithm that plans item transfers between robots so that deliveries are carried out more efficiently. Additionally, an auction-based incentive mechanism that achieves near-optimal long-term social welfare in collaborative computation offloading is proposed in [45]. Among the various auction-based scheduling and resource allocation algorithms, the Myerson auction is one of the most efficient revenue-optimal single-item auctions [46]. A single-item mechanism $M = (g(b), p(b))$ with a set $N$ of $n$ bidders consists of an allocation rule and a payment rule. The allocation rule chooses a feasible allocation $g(b) \in X \subseteq \mathbb{R}^n$ as a function of the bids $b$, where $g_i(b) \le 1$. The payment rule chooses payments $p(b) \in \mathbb{R}^n$ as a function of the bids. Bidder $i$, with valuation $v_i$, has utility $U_i(b) = v_i \cdot g_i(b) - p_i(b)$. The allocation and payment rules follow a standard SPA, with only the concept of Myerson's virtual valuation added. The mechanism must deter malicious bidders to make the system trustworthy. A truthful mechanism should hold the following desirable properties.
Definition (Individual Rationality (IR)) A truthful mechanism $M = (g(b), p(b))$ is individually rational if the utility of every bidder is non-negative:
$$U_i(b) \ge 0, \quad \forall i \in N. \tag{1}$$
Fig. 3 A drone charging auction scenario where multiple drones compete to access the services provided by a mobile charging station [17]
Definition (Incentive Compatibility (IC)) A truthful mechanism $M = (g(b), p(b))$ is incentive compatible if no bidder can improve its utility by misreporting its bid:
$$U_i(b_i, b_{-i}) \ge U_i(\hat{b}_i, b_{-i}), \quad \forall \hat{b}_i \in \eta(i), \ \forall i \in N, \tag{2}$$
where $b_{-i}$ denotes the bids of all bidders other than $i$ and $\hat{b}_i$ ranges over the alternative bids $\eta(i)$ available to bidder $i$.
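The two definitions can be checked numerically on a small example. The sketch below implements a second-price auction with a reserve price and verifies IR and IC for one bidder by brute force over a grid of bids. The reserve of 0.5 assumes valuations uniform on [0, 1], for which the virtual valuation is $\phi(v) = 2v - 1$ and vanishes at $v = 0.5$; this distribution, the grid, and all function names are assumptions for illustration and are not taken from the cited works.

```python
from itertools import product

def second_price_with_reserve(bids, reserve=0.0):
    """Return (allocation g, payments p) for a single item."""
    n = len(bids)
    g, p = [0.0] * n, [0.0] * n
    eligible = [i for i in range(n) if bids[i] >= reserve]
    if not eligible:
        return g, p                               # item stays unsold
    winner = max(eligible, key=lambda i: bids[i])  # first index wins ties
    others = [bids[i] for i in eligible if i != winner]
    g[winner] = 1.0
    p[winner] = max(others + [reserve])           # second price or reserve
    return g, p

def utility(i, value, bids, reserve):
    """U_i(b) = v_i * g_i(b) - p_i(b), as defined in the text."""
    g, p = second_price_with_reserve(bids, reserve)
    return value * g[i] - p[i]

GRID = [k / 10 for k in range(11)]
RESERVE = 0.5   # assumed: phi(v) = 2v - 1 = 0 for uniform [0, 1] valuations

def check_ir_and_ic() -> bool:
    """Brute-force check of conditions (1) and (2) for bidder 0."""
    for v, b1, b2 in product(GRID, repeat=3):
        truthful = utility(0, v, [v, b1, b2], RESERVE)
        if truthful < -1e-12:
            return False                          # IR violated
        for lie in GRID:
            if utility(0, v, [lie, b1, b2], RESERVE) > truthful + 1e-12:
                return False                      # IC violated
    return True

properties_hold = check_ir_and_ic()
```

A grid check of this kind is not a proof, but it is a quick way to catch a payment rule that breaks truthfulness before investing in a learned approximation of the mechanism.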
To numerically approximate the Myerson auction, DNN-based architectures are attracting interest. Learning-based Myerson auction algorithms for charging scheduling in Wireless Power Transfer (WPT)-based multi-drone networks are proposed in [17]. In addition, the algorithms in [47] and [48] solve resource allocation problems using DNN-based auctions in mobile edge computing and wireless virtualization, respectively. Shin et al. [17] designed an auction-based mechanism to control the charging schedule in a multi-drone setting, motivated by the flight time limitations and weight constraints of drones, as shown in Fig. 3. In that work, charging time slots are auctioned, and the winner and payment are determined through the bidding process. The objective is to maximize revenue while guaranteeing truthful auction properties using a deep learning-based auction structure. The main challenge in developing this framework is the lack of prior knowledge of the distribution of the number of drones participating in the auction. The authors first model the system as a second-price auction and then rely on deep learning algorithms to learn this distribution online. The proposed deep learning structure consists of an allocation network, a payment network, and a monotonic network for maximizing the revenue, as shown in Fig. 4. Numerical results from extensive simulations show that the proposed deep-learning-based approach provides effective battery charging control in multi-drone scenarios. Auction approaches have also been widely used to address resource allocation in wireless networks. The key idea of wireless virtualization is to abstract the physical network infrastructure and resources into virtual slices. Resource allocation is therefore a main challenge in wireless virtualization. However, for most existing auction-based allocation schemes, the objective is to maximize social
Fig. 4 Deep learning framework for revenue optimal auction computation in a trustworthy manner [17].
welfare due to its simplicity. In reality, however, Mobile Virtual Network Operators (MVNOs) are more interested in maximizing their own revenues. The work in [47] aims to design a revenue-optimal auction mechanism for resource allocation in wireless virtualization that satisfies trustworthiness conditions such as individual rationality, incentive compatibility, and budget constraints. Deep learning techniques are applied to construct a multi-layer feed-forward neural network based on the analysis of optimal auction design, an approach similar to [17]. The neural network takes users' bids as the input and produces the allocation rule and conditional payment rule for the users as the output. Besides the deep learning auction framework, auction mechanisms can be used for resource allocation in future networks such as heterogeneous vehicular networks or cybertwin-based 6G networks to secure users' privacy while maximizing utilities. 6G networks are expected to support Heterogeneous Vehicular Network (HetVNET) scenarios. HetVNETs consist of satellites, drones, BSs, Roadside Units (RSUs), etc., which manage and assist various applications for Moving Vehicles (MVs). A collaborative computing resource allocation algorithm that satisfies the Quality-of-Experience (QoE) of MVs is designed in [49]. To support the various personalized requirements of vehicles for computation-intensive applications, Parked Vehicles (PVs) can be integrated as Edge Computing Devices (ECDs) in 6G HetVNETs in a secure manner. The work presents a novel secure scheme to provide personalized edge computing services for MVs in 6G HetVNETs. In this scenario, the computing task requested by an MV can be divided into subtasks and executed collaboratively by an ECD and different PVs. The threat model assumes three cases, i.e., malicious ECDs, malicious MVs, and malicious PVs. 
A smart-contract-based secure edge computing architecture is designed by jointly considering the attack models and the characteristics of the 6G network infrastructures, where each network infrastructure manages a number of PVs to complete computing services
Fig. 5 Cybertwin-driven 6G resource trading framework [50]. If multiple cybertwins request network resources at the same time, they will compete with each other for these limited network resources
collaboratively. In this context, the collaborative computing resource allocation algorithm is designed to help each network node decide on a Customized Service Strategy (CSS) that satisfies the QoE of MVs. After the CSSs with subtasks are decided, a model based on the sealed-bid second-price auction is formulated among the network nodes to obtain their Optimal Bidding Strategies (OBS) for winning the chance to complete the services. In this auction game, each ECD submits its bid for the computing service task to the MV. The MV then selects the bidder asking for the lowest reward as the winner and pays it the second-lowest reward. The utility of an ECD depends on its bidding strategy, so the ECD needs to decide on the OBS to become the winner. Cybertwin is a promising technique in the 6G network to cope with an exponentially increasing demand for mobile data traffic and a growing number of required network service appliances, as shown in Fig. 5 [50]. Multiple resources are required to fulfill various personalized services for mobile users while protecting their privacy. The work in [50] proposes a progressive adaptive user selection environment (PAUSE)-based combinatorial auction resource trading mechanism to allocate the resources efficiently and securely. Cybertwins can request various resources as a bid on behalf of their corresponding users and proceed with the auction in a fully distributed manner. Because of the transparency of this mechanism, the user's privacy is well protected and costs are reduced. Simulation results validate its effectiveness.
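The winner selection used in the HetVNET scheme described in this section, where the lowest requested reward wins and is paid the second-lowest, is a reverse (procurement) form of the second-price auction. A minimal sketch follows; the function names, the ECD labels, the cost figures, and the utility definition (payment minus the provider's own cost) are assumptions for illustration rather than the formulation of the cited work.

```python
def reverse_second_price(asks: dict):
    """Pick the provider asking the lowest reward; pay the second-lowest ask.

    asks maps a provider name to the reward it requests for the task.
    Ties are broken by insertion order (arbitrary for this sketch).
    """
    if len(asks) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(asks.items(), key=lambda kv: kv[1])
    winner = ranked[0][0]
    payment = ranked[1][1]
    return winner, payment

def provider_utility(cost: float, provider: str, asks: dict) -> float:
    """Assumed utility: payment received minus own cost if selected, else 0."""
    winner, payment = reverse_second_price(asks)
    return payment - cost if winner == provider else 0.0

COST_A = 3.0   # hypothetical true cost of ecd_a for the task
truthful_asks = {"ecd_a": 3.0, "ecd_b": 5.0, "ecd_c": 8.0}
inflated_asks = {"ecd_a": 6.0, "ecd_b": 5.0, "ecd_c": 8.0}
padded_asks = {"ecd_a": 4.0, "ecd_b": 5.0, "ecd_c": 8.0}

u_truthful = provider_utility(COST_A, "ecd_a", truthful_asks)
u_inflated = provider_utility(COST_A, "ecd_a", inflated_asks)
u_padded = provider_utility(COST_A, "ecd_a", padded_asks)
```

Because the payment is set by a competitor's ask, padding the ask either leaves the provider's utility unchanged or costs it the task, which mirrors the truthfulness argument for the ordinary second-price auction.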
4 Open Issues and Concluding Remarks
In this chapter, we highlighted the open issues and potential limitations of current security technologies. As network openness increases, vulnerabilities may arise, and security technologies such as OTA and ZTA have been introduced to address these issues. However, existing ZTA technology is limited in its ability to address the security concerns arising in 6G networks. The majority of ZTAs currently in use are designed for a single network domain with a logically centralized controller [22] that regulates access and partitions resources to protect computer services and resources [24]. Although effective in smaller networks, these ZTAs may prove insufficient for the distributed management architectures of 6G, where fine-grained control over larger networks can be time-consuming and resource-intensive. Thus, further advances in ZTA are necessary to ensure security for the very large networks expected in 6G. In addition, once quantum computers become available in the 6G era, the asymmetric algorithms used so far will no longer be secure. In response, quantum-safe techniques such as QKD and QSDC have been proposed. However, these techniques still have limitations: the quantum-safe encryption algorithms proposed thus far have not yet reached the standardization discussion stage [7]. To standardize and use quantum-safe encryption algorithms, factors such as key size, key generation time, verification time, encryption time, and decryption time must be considered. The transition from the encryption algorithms still in use to quantum-safe ones is itself a time-consuming task. It is also difficult to integrate quantum-safe encryption algorithms into the IoT devices that will be widely used in 6G [4]. Therefore, it is also important to develop device-independent quantum cryptography. 
Moreover, in 6G, the use of AI-based smart devices will increase, so more personal information will be collected and utilized, making security incidents such as personal information leakage more likely. In response, HE and federated learning have been proposed to protect personal information. However, existing federated learning is effective on Independent and Identically Distributed (IID) data, but its performance degrades on non-IID data [51]. In addition, protecting all personal information with new technologies such as HE is still expensive [7]. These technologies also remain exposed to privacy issues that can arise if there are attackers inside the network. Therefore, privacy solutions should be further strengthened to respond to possible privacy incidents in 6G. The main point of the auction-based studies is mechanism design that considers the system's desirable properties. A mechanism can be designed with desired properties such as solution equilibrium, incentive compatibility, allocative efficiency, and individual rationality. It is therefore important to configure a mechanism that suits the system and satisfies the maximal subset of those properties, so that a trustworthy and reliable system can be designed. In addition, auction
models can be applied in both centralized and distributed systems, with full, partial, or no knowledge about other users, which allows them to model and solve real-world problems. For these reasons, the auction method has been widely employed in various network resource allocation scenarios and is applicable to 6G networks such as heterogeneous vehicular networks, cybertwin-based networks, and space-air-ground-integrated networks. This chapter has explored various solutions to the security threats that can arise in 6G, but many problems remain to be solved, such as the open issues above. Since security and trust become more important in 6G, more research on various security solutions should be conducted.
References 1. V. Ziegler, P. Schneider, H. Viswanathan, M. Montag, S. Kanugovi, and A. Rezaki, “Security and Trust in the 6G Era,” IEEE Access, vol. 9, pp. 142 314–142 327, 2021. 2. C. De Alwis, A. Kalla, Q. V. Pham, P. Kumar, K. Dev, W. J. Hwang, and M. Liyanage, “Survey on 6G Frontiers: Trends, Applications, Requirements, Technologies and Future Research,” IEEE Open Journal of the Communications Society, vol. 2, pp. 836–886, 2021. 3. X. You, C.-X. Wang, J. Huang, X. Gao, Z. Zhang, M. Wang, Y. Huang, C. Zhang, Y. Jiang, J. Wang et al., “Towards 6G Wireless Communication Networks: Vision, Enabling Technologies, and New Paradigm Shifts,” Science China Information Sciences, vol. 64, no. 1, pp. 1–74, 2021. 4. P. Porambage, G. Gür, D. P. M. Osorio, M. Liyanage, A. Gurtov, and M. Ylianttila, “The Roadmap to 6G Security and Privacy,” IEEE Open Journal of the Communications Society, vol. 2, pp. 1094–1122, 2021. 5. D. H. Je, J. Jung, and S. Choi, “Toward 6G Security: Technology Trends, Threats, and Solutions,” IEEE Communications Standards Magazine, vol. 5, no. 3, pp. 64–71, 2021. 6. V. Lenarduzzi, D. Taibi, D. Tosi, L. Lavazza, and S. Morasca, “Open Source Software Evaluation, Selection, and Adoption: a Systematic Literature Review,” in Proc. of Euromicro Conference on Software Engineering and Advanced Applications (SEAA). Portoroz, Slovenia: IEEE, August 2020, pp. 437–444. 7. V.-L. Nguyen, P.-C. Lin, B.-C. Cheng, R.-H. Hwang, and Y.-D. Lin, “Security and Privacy for 6G: A Survey on Prospective Technologies and Challenges,” IEEE Communications Surveys & Tutorials, vol. 23, no. 4, pp. 2384–2428, 2021. 8. A. V. Mota, S. Azam, B. Shanmugam, K. C. Yeo, and K. Kannoorpatti, “Comparative Analysis of Different Techniques of Encryption for Secured Data Transmission,” in Proc. of IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI). Chennai, India: IEEE, September 2017, pp. 231–237. 9. P. 
Regulation, “General Data Protection Regulation,” Proc. of Intouch, vol. 25, pp. 1–8, 2018. 10. P. Pleva, “A Revised Classification of Anonymity,” CoRR, vol. abs/1211.5613, 2012. 11. J. Davidson, B. Liebald, J. Liu, P. Nandy, T. Van Vleet, U. Gargi, S. Gupta, Y. He, M. Lambert, B. Livingston et al., “The Youtube Video Recommendation System,” in Proc. of the ACM conference on Recommender systems, Barcelona, Spain, September 2010, pp. 293–296. 12. Y. Siriwardhana, P. Porambage, M. Liyanage, and M. Ylianttila, “AI and 6G Security: Opportunities and Challenges,” in Proc. of Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit). Porto, Portugal: IEEE, June 2021, pp. 616–621. 13. Y. Yang, L. Wu, G. Yin, L. Li, and H. Zhao, “A Survey on Security and Privacy Issues in Internet-of-Things,” IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1250–1258, 2017.
Network Security and Trustworthiness
761
14. V. Krishna, Auction Theory. Academic Press, September 2009.
15. P. Klemperer, “Auction Theory: A Guide to the Literature,” Journal of Economic Surveys, vol. 13, no. 3, pp. 227–286, December 1999.
16. E. Maskin et al., Auctions and Efficiency. School of Social Science, Institute for Advanced Study, 2001.
17. M. Shin, J. Kim, and M. Levorato, “Auction-Based Charging Scheduling with Deep Learning Framework for Multi-Drone Networks,” IEEE Transactions on Vehicular Technology, vol. 68, no. 5, pp. 4235–4248, May 2019.
18. X. X. Wang and W. Wu, “Towards Truthful Auction Mechanisms for Task Assignment in Mobile Device Clouds,” in IEEE Conference on Computer Communications (INFOCOM), Atlanta, GA, USA, May 2017, pp. 1–9.
19. S. A. Abdel Hakeem, H. H. Hussein, and H. Kim, “Security Requirements and Challenges of 6G Technologies and Applications,” Sensors, vol. 22, no. 5, p. 1969, 2022.
20. Y. Qi, G. Yang, L. Liu, J. Fan, A. Orlandi, H. Kong, W. Yu, and Z. Yang, “5G Over-the-Air Measurement Challenges: Overview,” IEEE Transactions on Electromagnetic Compatibility, vol. 59, no. 6, pp. 1661–1670, 2017.
21. M. M. Villegas, C. Orellana, and H. Astudillo, “A Study of Over-the-Air (OTA) Update Systems for CPS and IoT Operating Systems,” in Proc. of European Conference on Software Architecture ECSA, vol. 2, Paris, France, September 2019, pp. 269–272.
22. X. Chen, W. Feng, N. Ge, and Y. Zhang, “Zero Trust Architecture for 6G Security,” CoRR, vol. abs/2203.07716, 2022.
23. M. Campbell, “Beyond Zero Trust: Trust Is a Vulnerability,” Computer, vol. 53, no. 10, pp. 110–113, 2020.
24. S. Rose, O. Borchert, S. Mitchell, and S. Connelly, “Zero Trust Architecture,” National Institute of Standards and Technology, Tech. Rep., 2020.
25. J. Ahn, H.-Y. Kwon, B. Ahn, K. Park, T. Kim, M.-K. Lee, J. Kim, and J. Chung, “Toward Quantum Secured Distributed Energy Resources: Adoption of Post-Quantum Cryptography (PQC) and Quantum Key Distribution (QKD),” Energies, vol. 15, no. 3, p. 714, 2022.
26. P. W. Shor, “Algorithms for Quantum Computation: Discrete Logarithms and Factoring,” in Proc. of Annual Symposium on Foundations of Computer Science. New Mexico, USA: IEEE, November 1994, pp. 124–134.
27. V. Scarani, H. Bechmann-Pasquinucci, N. J. Cerf, M. Dušek, N. Lütkenhaus, and M. Peev, “The Security of Practical Quantum Key Distribution,” Reviews of Modern Physics, vol. 81, no. 3, p. 1301, 2009.
28. Y. Kwak, W. J. Yun, J. P. Kim, H. Cho, M. Choi, S. Jung, and J. Kim, “Quantum Heterogeneous Distributed Deep Learning Architectures: Models, Discussions, and Applications,” CoRR, vol. abs/2202.11200, 2022.
29. W. Zhang, D.-S. Ding, Y.-B. Sheng, L. Zhou, B.-S. Shi, and G.-C. Guo, “Quantum Secure Direct Communication with Quantum Memory,” Physical Review Letters, vol. 118, no. 22, p. 220501, 2017.
30. G.-L. Long and X.-S. Liu, “Theoretically Efficient High-Capacity Quantum-Key-Distribution Scheme,” Physical Review A, vol. 65, no. 3, p. 032302, 2002.
31. W. Li, S. Lu, and D.-L. Deng, “Quantum Federated Learning Through Blind Quantum Computing,” Science China Physics, Mechanics & Astronomy, vol. 64, no. 10, pp. 1–8, 2021.
32. S. Barz, E. Kashefi, A. Broadbent, J. F. Fitzsimons, A. Zeilinger, and P. Walther, “Demonstration of Blind Quantum Computing,” Science, vol. 335, no. 6066, pp. 303–308, 2012.
33. A. Acar, H. Aksu, A. S. Uluagac, and M. Conti, “A Survey on Homomorphic Encryption Schemes: Theory and Implementation,” ACM Computing Surveys (CSUR), vol. 51, no. 4, pp. 1–35, 2018.
34. J. Li, X. Kuang, S. Lin, X. Ma, and Y. Tang, “Privacy Preservation for Machine Learning Training and Classification Based on Homomorphic Encryption Schemes,” Information Sciences, vol. 526, pp. 166–179, 2020.
35. J. Konečný, H. B. McMahan, D. Ramage, and P. Richtárik, “Federated Optimization: Distributed Machine Learning for On-Device Intelligence,” CoRR, vol. abs/1610.02527, 2016.
36. Y. Matsubara, D. Callegaro, S. Baidya, M. Levorato, and S. Singh, “Head network distillation: Splitting distilled deep neural networks for resource-constrained edge computing systems,” IEEE Access, vol. 8, pp. 212177–212193, 2020.
37. M. Shin, C. Hwang, J. Kim, J. Park, M. Bennis, and S.-L. Kim, “XOR Mixup: Privacy-Preserving Data Augmentation for One-Shot Federated Learning,” CoRR, vol. abs/2006.05148, 2020.
38. N. Truong, K. Sun, S. Wang, F. Guitton, and Y. Guo, “Privacy Preservation in Federated Learning: An Insightful Survey from the GDPR Perspective,” Computers & Security, vol. 110, p. 102402, 2021.
39. Y. Sun, J. Liu, J. Wang, Y. Cao, and N. Kato, “When Machine Learning Meets Privacy in 6G: A Survey,” IEEE Communications Surveys & Tutorials, vol. 22, no. 4, pp. 2694–2724, 2020.
40. S. J. J. Kim, S. Park and S. Yoo, “Spatio-temporal split learning,” in 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks-Supplemental Volume (DSN-S). IEEE, 2021, pp. 11–12.
41. Y. Ha, G. Lee, M. Yoo, S. Jung, S. Yoo, and J. Kim, “Feasibility study of multi-site split learning for privacy-preserving medical systems under data imbalance constraints in COVID-19, X-ray, and cholesterol dataset,” Scientific Reports, vol. 12, p. 1534, January 2022.
42. V. S. A. Bandyopadhyay, T. S. Roy and S. Mallik, “Combinatorial auction-based fog service allocation mechanism for IoT applications,” in 2020 10th International Conference on Cloud Computing, Data Science & Engineering. IEEE, 2020, pp. 518–524.
43. D. C. Marinescu, A. Paya, J. P. Morrison, and P. Healy, “An Auction-Driven Self-Organizing Cloud Delivery Model,” CoRR, vol. abs/1312.2998, 2013.
44. B. Coltin and M. Veloso, “Online Pickup and Delivery Planning with Transfers for Mobile Robots,” in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, May 2014, pp. 5786–5791.
45. J. He, D. Zhang, Y. Zhou, and Y. Zhang, “A Truthful Online Mechanism for Collaborative Computation Offloading in Mobile Edge Computing,” IEEE Transactions on Industrial Informatics, vol. 16, no. 7, pp. 4832–4841, July 2020.
46. R. B. Myerson, “Optimal Auction Design,” Mathematics of Operations Research, vol. 6, no. 1, pp. 58–73, February 1981.
47. K. Zhu, Y. Xu, J. Qian, and D. Niyato, “Revenue-Optimal Auction For Resource Allocation in Wireless Virtualization: A Deep Learning Approach,” IEEE Transactions on Mobile Computing (Early Access), pp. 1–1, September 2020.
48. N. C. Luong, Z. Xiong, P. Wang, and D. Niyato, “Optimal Auction for Edge Computing Resource Management in Mobile Blockchain Networks: A Deep Learning Approach,” in Proc. of the IEEE International Conference on Communications (ICC), Missouri, USA, May 2018, pp. 1–6.
49. Y. Hui, Z. S. N. Cheng, Y. Huang, P. Zhao, T. H. Luan, and C. Li, “Secure and Personalized Edge Computing Services in 6G Heterogeneous Vehicular Networks,” IEEE Internet of Things Journal, vol. 9, no. 8, pp. 5920–5931, 2022.
50. H. L. H. Liang and W. Zhang, “A Combinatorial Auction Resource Trading Mechanism for Cybertwin-Based 6G Network,” IEEE Internet of Things Journal, vol. 8, no. 22, pp. 16349–16358, 2021.
51. Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra, “Federated Learning with Non-IID Data,” CoRR, vol. abs/1806.00582, 2018.
Index
A Aerial communications, 690 AI-aided networks, 566–567, 576–577 AI and machine learning (AI/ML), 34–36, 105, 585 AI-native communications, 563–570 AI-native network, 98, 110, 111, 565–569, 573–583 Air interface, 27, 32, 37–38, 61, 90, 143–160, 535, 545, 587, 611 Artificial intelligence (AI), 2, 6–8, 16–20, 34–49, 53, 54, 58–61, 75, 98, 104–106, 143–160, 292, 563–570, 573–583, 614, 724, 735, 742, 743, 753 Auction, 750–751, 754–758 Augmented reality/virtual reality/extended reality (AR/VR/XR), 5, 22, 105, 291, 573, 611, 719, 723 Autoencoder (AE), 37, 47, 145–147, 149, 150, 156, 567–569, 738 Automation, 5, 15, 17, 37, 41, 60, 525, 539, 542, 543–544, 611, 614–618, 633
B Beamforming, 7, 16, 144, 166, 241, 311, 317–347, 384, 401, 440, 506, 535, 566, 588, 616, 706, 720 Beyond Massive MIMO, 527
C Carrier aggregation, 16, 39, 724 Cell-free massive MIMO, 501–528 Cellular networks, 57, 59, 119, 129, 132, 143, 379, 390, 400, 439, 448, 456, 503, 504, 511, 512, 514, 516, 534, 585, 690, 708, 719, 722, 724, 730 Channel estimation, 7–9, 35, 37, 38, 53, 148, 170, 174, 175, 187–223, 240, 300, 305, 313, 343, 383, 443, 447, 456, 487, 506, 509, 510, 516, 519, 565, 574–576 Charging, 50, 488, 495, 669–670, 679–680, 756 Connectivity divide, 113–115 Convergence, 16, 17, 43–45, 193, 202, 281, 309, 565, 575, 589, 595, 599, 600, 603–605, 621, 633, 637, 656, 657, 659, 719–730 Coordinated multipoint (CoMP), 302, 303, 507, 509, 527, 724 CRC-polar codes, 264, 265, 274, 275 CSI feedback, 144, 147–149, 151–160, 743 Customization, 87 Cyber-physical, 5, 94, 106
D Deep learning, 5, 8, 35, 47, 48, 59, 227, 236–241, 275–288, 308, 565, 566, 568, 576, 614, 625, 620, 622–623, 627, 628, 680, 733, 735, 739, 743, 755–757 Deterministic networks (DetNet), 16, 633–662 Digital Twin Consortium (DTC), 95, 106 Digital twins, 5, 18, 19, 23–25, 27, 37, 42, 61, 76–84, 87, 90, 91, 94, 95, 105, 106, 544, 573, 618 Distributed MIMO, 390, 505
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 X. Lin et al. (eds.), Fundamentals of 6G Communications and Networking, Signals and Communication Technology, https://doi.org/10.1007/978-3-031-37920-8
E End-to-end communications, 144–147, 742 End-to-end training, 565, 575 Energy efficiency, 6, 17, 18, 42, 45, 49, 51, 52, 54, 59, 60, 89, 94, 105, 177, 222, 227, 228, 241, 292, 464, 465, 471, 526, 670, 671 Environmental, societal, and governance (ESG), 110 Extremely Large Aperture Array (ELAA), 317–347, 507, 508, 713
F Finite-depth beamforming, 336–339 Fractional Doppler shifts, 193, 199, 202–205, 207, 208, 219, 220 Frames-per-second (fps), 101 Full-duplex (FD), 33, 237, 303, 389, 390, 400, 425–456, 464
G 5G, 1, 15, 71, 93, 115, 143, 165, 188, 227, 259, 291, 318, 378, 399, 439, 523, 533, 547, 563, 573, 585, 611, 667, 691, 720, 730 6G, 2, 17, 73, 93, 129, 143, 165, 187, 227, 259, 291, 351, 377, 400, 463, 533, 547, 563, 573, 585, 611, 662, 668, 687, 723, 740, 747 6G scenarios, 292, 301, 524 6G vision, 9, 15–61, 71–91, 463, 687
H Haptic sensors and somatic sensory networks, 94 Hardware design, 343, 406–407, 409, 412 High altitude platforms (HAPs), 20, 121, 122, 132, 303 Hyper-connectivity, 576
I Imaging, 19, 26, 27, 29, 30, 377, 379, 380, 382, 390, 471, 474, 492 Immersive digital world experience (DWE), 94, 98, 105, 106, 109, 110 IMT-2020, 1, 89 IMT-2030, 2, 20, 101, 104 Independent hardware, 551 Index modulation (IM), 9, 32, 227–254, 302, 384 Integrated sensing and communications (ISaC), 10, 28, 29, 181, 221, 222, 293, 301, 377–393, 711 Integrated space-air-ground networks, 127, 128, 291, 303, 304, 697 Intelligent transportation systems, 386, 693 International Telecommunication Union (ITU), 1–3, 89, 113, 154, 697
K Key performance indicators (KPIs), 9, 36, 38, 42, 46, 59, 89–90, 463, 465, 479, 502, 611, 612, 614, 616–617, 623
L Latency, 4, 15, 72, 98, 125, 154, 169, 193, 232, 259, 292, 379, 401, 432, 463, 508, 543, 547, 565, 576, 612, 633, 673, 705, 723, 742 LDPC codes, 260, 261, 273–275, 281–283, 737 LEO mega-constellations, 681 Light-fidelity (LiFi), 463–465, 493–495 Listen Before Talk (LBT), 725, 726 Load balancing, 588, 592, 604, 606 Localization, 26, 28, 31–33, 35, 39, 40, 59, 233, 379, 380, 386, 388, 390, 392, 421, 616, 660, 713
M Machine learning (ML), 16, 60, 105, 110, 144, 237, 261, 277, 300, 306–308, 426, 437, 455, 538, 540, 544, 545, 565, 578, 581, 676, 729, 735, 739, 754 MAC layer applications, 580–582 Massive MIMO, 4, 32, 129, 147, 205, 351, 378, 399, 439, 503, 585, 674, 719 Message passing, 175, 192, 193, 205, 209, 210, 212, 216–218, 232, 252, 269, 576 Millimeter-wave, 4, 74, 115, 165, 187, 222, 238, 253, 351–378, 426, 439, 440, 509, 565, 612, 708, 713, 723 Modulation, 3, 30, 90, 145, 167, 187, 228, 277, 297, 377, 465, 552, 565, 575, 623, 712, 726 Motion-to-photon latency, 23, 101 Multi-objective optimization, 587 Multiplicative parabolic antenna, 372
N Natural language processing (NLP), 35, 47, 144, 741, 743 Near-field multiplexing, 339–342 Network disaggregation, 10, 547–561 Network sensing, 39, 95, 98, 105, 110, 111 Neural network decoding, 282–284 Next G Alliance (NGA), 94, 98, 101–104 Next generation multiple access (NGMA), 9, 292, 295, 302, 306, 310–313 Non-orthogonal multiple access (NOMA), 9, 33, 90, 292, 421, 495, 526, 576, 706 Non-terrestrial networks (NTN), 10, 17, 102, 110, 124, 667, 687–713
O OAM-embedded-MIMO (OEM), 353–373 OEM-water-filling power allocation, 352, 353, 356, 362–369, 372, 373 Online learning, 40, 114, 594, 596 Open RAN (O-RAN), 10, 533, 538–545 Optical wireless communication (OWC), 29, 34, 129, 463–495 Orbital angular momentum (OAM), 7, 9, 32, 34, 61, 351–373, 560, 660 Orthogonal frequency division multiplexing (OFDM), 4, 32, 168, 188, 227, 298, 381, 477, 534, 617, 720 Orthogonal time frequency space (OTFS) modulation, 9, 32, 173, 188, 234, 293, 297, 382, 617
P PHY layer applications, 565–568, 729 Physical layer techniques, 292, 297–303, 313 Pixel-per-degree (ppd), 101 Polar codes, 37, 260, 261, 263–276, 278, 284–288 Predictive beam management, 387 Proof-of-concept (PoC), 111, 491–492
Q Quality-of-Experience (QoE), 757, 758 Quality-of-service (QoS), 4, 41, 43, 114, 128–130, 291, 292, 294–296, 306, 307, 313, 502–504, 510, 511, 512, 524, 527, 535–537, 576, 577, 612, 620, 626, 628, 633, 634, 662
R RAN intelligent controller (RIC), 539–541, 544 Real-time systems, 545 Reconfigurable/Reflective intelligent surfaces (RIS), 7, 10, 61, 73, 90, 143, 227–254, 293, 304, 308, 399, 399–422, 455, 465, 493–495, 712 Reinforcement learning (RL), 36, 38, 39, 45, 48, 53, 59, 147, 307–308, 310, 569, 576–578, 581–583, 586–589, 592–595, 597–600, 602–606, 619, 671, 679, 708, 711, 735, 736, 740, 741, 743 Reliability, 4, 8, 15, 16, 18, 22, 24, 26, 27, 30, 36, 38–40, 42, 46–48, 72, 86, 98, 99, 101, 115, 123, 124, 126, 129, 169, 227, 253, 259–262, 265–267, 273, 275, 292, 294–296, 465, 495, 510, 523, 525, 526, 538, 611, 614, 616, 617, 619–621, 625, 626, 628, 634, 637, 651, 673, 680, 681, 687, 705, 724, 730, 738, 747 Resource allocation, 36, 44, 54, 177, 180, 292, 307, 308–310, 381–386, 393, 421, 503, 508, 510, 517, 521, 526, 528, 573, 575–578, 585, 612, 616, 619, 621, 628, 657, 659, 673, 674, 677–679, 703, 707, 709, 712, 736, 747, 750, 751, 754–757, 759 Resource scheduling, 221, 602 Return on investment (RoI), 110
S Second price auction (SPA), 750, 751, 755, 756 Security, 6, 18, 22, 26, 29, 44, 54–61, 76, 87–89, 96, 97, 101, 102, 111, 126, 401, 405, 421, 463, 465, 547, 560–561, 582, 583, 657, 660–661, 706, 722, 723, 730, 747–760 Self-interference, 237, 303, 389, 390, 391, 427, 428, 432, 433, 428, 434–440, 442, 447, 448, 450 Semantic communications, 46–48, 60, 733–743 Sensing, 9, 21, 79, 98, 181, 377, 456, 465, 616, 669, 711, 724, 740 Sensing-assisted communication, 39, 386–389 Signal co-processing, 512 Signal processing, 3, 6, 7, 19, 34, 35, 55, 59, 145, 170, 223, 277, 336, 343, 380, 383, 392, 393, 436, 437, 439, 441, 465, 471, 488, 503, 506, 508–510, 512, 521–524, 534, 542, 565, 574, 626
Simultaneous transmission and reflection (STAR), 10, 399–422 Smart radio environment, 404, 411, 421 Spectral efficiency, 3, 6, 32, 33, 38, 52, 56, 89, 154, 157–159, 169–175, 181, 230, 295, 296, 299, 300, 302, 305, 326, 347, 390, 426–428, 430, 431, 433, 441, 443–445, 449, 465, 483, 502–504, 506, 508, 509, 511–515, 517, 519, 520–525, 528, 535, 705 Spectrum sharing, 3, 7, 17, 38, 49, 52–54, 114, 115, 428, 431–432, 534, 577, 729 Spherical waves, 32, 169, 322, 323, 331, 335, 343, 451 Split learning (SL), 45, 46, 60, 550, 753–755 Successive cancellation list (SCL) decoding, 268, 271–272 Surveillance, 24, 389, 390, 581, 680–681 Sustainability, 2, 6, 9, 18, 21, 30, 49, 54, 59, 60, 85, 105, 110, 131
T Terabit optical wireless backhaul, 479–486 Terahertz communications, 9, 129, 143, 165–181, 379 3rd Generation Partnership Project (3GPP), 1–4, 8, 15, 21, 40, 43, 52, 143, 144, 156, 158, 160, 273, 275, 390, 456, 525, 534–536, 587, 600, 682, 687, 691–693, 697, 698, 700, 709, 723–730 3D coverage, 30 Time sensitive networks (TSN), 16, 634, 635, 638, 641–643, 660, 662 Trajectory optimization, 581, 670, 671 Transformer, 40, 41, 47, 60, 146, 239–241, 273, 475, 734–738, 741 Trustworthiness, 8, 11, 18, 21, 110, 747–760 TV white space, 131
U UAV communications, 10, 580, 623, 667–682, 689, 692, 696, 698 Ubiquitous intelligence, 76–77, 91 Ultra-reliable and low-latency communications (URLLC), 4, 5, 10, 15–17, 19, 24, 71, 98, 105, 110, 260, 305, 399, 524–527, 536, 611–629, 693, 729, 740 Unmanned aerial vehicle (UAV), 10, 20, 119, 187, 301, 580, 623, 667, 689, 742 User-centric networks, 504 User-machine-interface (UMI), 94
V Vehicular networks, 221, 613, 622, 757, 760
W Waveform design, 9, 32, 37, 167, 168, 170–177, 180, 227–254, 381–386, 393 Wi-Fi, 10, 27, 39, 52, 114, 116, 129, 131, 228, 392, 393, 432, 523, 534, 535, 719–730