IET MATERIALS, CIRCUITS AND DEVICES SERIES 40
Digitally Enhanced Mixed Signal Systems
Other volumes in this series:

Volume 2  Analogue IC Design: The current-mode approach C. Toumazou, F.J. Lidgey and D.G. Haigh (Editors)
Volume 3  Analogue–Digital ASICs: Circuit techniques, design tools and applications R.S. Soin, F. Maloberti and J. France (Editors)
Volume 4  Algorithmic and Knowledge-based CAD for VLSI G.E. Taylor and G. Russell (Editors)
Volume 5  Switched Currents: An analogue technique for digital technology C. Toumazou, J.B.C. Hughes and N.C. Battersby (Editors)
Volume 6  High-Frequency Circuit Engineering F. Nibler et al.
Volume 8  Low-Power High-Frequency Microelectronics: A unified approach G. Machado (Editor)
Volume 9  VLSI Testing: Digital and mixed analogue/digital techniques S.L. Hurst
Volume 10 Distributed Feedback Semiconductor Lasers J.E. Carroll, J.E.A. Whiteaway and R.G.S. Plumb
Volume 11 Selected Topics in Advanced Solid State and Fibre Optic Sensors S.M. Vaezi-Nejad (Editor)
Volume 12 Strained Silicon Heterostructures: Materials and devices C.K. Maiti, N.B. Chakrabarti and S.K. Ray
Volume 13 RFIC and MMIC Design and Technology I.D. Robertson and S. Lucyzyn (Editors)
Volume 14 Design of High Frequency Integrated Analogue Filters Y. Sun (Editor)
Volume 15 Foundations of Digital Signal Processing: Theory, algorithms and hardware design P. Gaydecki
Volume 16 Wireless Communications Circuits and Systems Y. Sun (Editor)
Volume 17 The Switching Function: Analysis of power electronic circuits C. Marouchos
Volume 18 System on Chip: Next generation electronics B. Al-Hashimi (Editor)
Volume 19 Test and Diagnosis of Analogue, Mixed-Signal and RF Integrated Circuits: The system on chip approach Y. Sun (Editor)
Volume 20 Low Power and Low Voltage Circuit Design with the FGMOS Transistor E. Rodriguez-Villegas
Volume 21 Technology Computer Aided Design for Si, SiGe and GaAs Integrated Circuits C.K. Maiti and G.A. Armstrong
Volume 22 Nanotechnologies M. Wautelet et al.
Volume 23 Understandable Electric Circuits M. Wang
Volume 24 Fundamentals of Electromagnetic Levitation: Engineering sustainability through efficiency A.J. Sangster
Volume 25 Optical MEMS for Chemical Analysis and Biomedicine H. Jiang (Editor)
Volume 26 High Speed Data Converters Ahmed M.A. Ali
Volume 27 Nano-Scaled Semiconductor Devices E.A. Gutiérrez-D (Editor)
Volume 29 Nano-CMOS and Post-CMOS Electronics: Devices and modelling Saraju P. Mohanty and Ashok Srivastava
Volume 30 Nano-CMOS and Post-CMOS Electronics: Circuits and design Saraju P. Mohanty and Ashok Srivastava
Volume 32 Oscillator Circuits: Frontiers in design, analysis and applications Y. Nishio (Editor)
Volume 33 High Frequency MOSFET Gate Drivers Z. Zhang and Y. Liu
Volume 38 System Design with Memristor Technologies L. Guckert and E.E. Swartzlander Jr.
Volume 39 Functionality-Enhanced Devices: An alternative to Moore’s law P.-E. Gaillardon (Editor)
Volume 43 Negative Group Delay Devices: From concepts to applications B. Ravelo (Editor)
Volume 47 Understandable Electric Circuits: Key concepts, 2nd Edition M. Wang
Volume 60 IP Core Protection and Hardware-Assisted Security for Consumer Electronics A. Sengupta and S. Mohanty
Volume 68 High Quality Liquid Crystal Displays and Smart Devices, vol. 1 and vol. 2 S. Ishihara, S. Kobayashi and Y. Ukai (Editors)
Digitally Enhanced Mixed Signal Systems Edited by Chadi Jabbour, Patricia Desgreys and Dominique Dallet
The Institution of Engineering and Technology
Published by The Institution of Engineering and Technology, London, United Kingdom

The Institution of Engineering and Technology is registered as a Charity in England & Wales (no. 211014) and Scotland (no. SC038698).

© The Institution of Engineering and Technology 2019
First published 2019

This publication is copyright under the Berne Convention and the Universal Copyright Convention. All rights reserved. Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may be reproduced, stored or transmitted, in any form or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publisher at the undermentioned address:

The Institution of Engineering and Technology
Michael Faraday House
Six Hills Way, Stevenage
Herts, SG1 2AY, United Kingdom
www.theiet.org

While the authors and publisher believe that the information and guidance given in this work are correct, all parties must rely upon their own skill and judgement when making use of them. Neither the authors nor publisher assumes any liability to anyone for any loss or damage caused by any error or omission in the work, whether such an error or omission is the result of negligence or any other cause. Any and all such liability is disclaimed.

The moral rights of the authors to be identified as authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
British Library Cataloguing in Publication Data A catalogue record for this product is available from the British Library
ISBN 978-1-78561-609-9 (hardback) ISBN 978-1-78561-610-5 (PDF)
Typeset in India by MPS Limited Printed in the UK by CPI Group (UK) Ltd, Croydon
Contents

Preface

1 Digitally enhanced mixed signal systems—the big picture
  Christian Vogel, Harald Enzinger, and Karl Freiberger
  1.1 Motivation
  1.2 Methodology
    1.2.1 A system-oriented perspective
    1.2.2 An extended view on data converters
    1.2.3 The design process
  1.3 Examples
    1.3.1 Enhancing power amplifiers
    1.3.2 Enhancing data converters
    1.3.3 Enhancing clock generation
  1.4 Conclusion
  References

2 Nonlinear modeling
  Raphael Vansebrouck, Dang-Kièn Germain Pham, Chadi Jabbour, and Patricia Desgreys
  2.1 Introduction
  2.2 Nonlinear models
    2.2.1 Parametric models
    2.2.2 Nonparametric models
  2.3 Suited models for each RF bloc
    2.3.1 Extension to complex models
    2.3.2 Models for power amplifiers
    2.3.3 Models for low-noise amplifiers
    2.3.4 Models for baseband blocks
  2.4 Digital compensation of nonlinear distortions
    2.4.1 Direct learning architecture
    2.4.2 Indirect learning architecture
  2.5 Summary
  References

3 Digital predistortion
  Geneviève Baudoin, Olivier Venard, and Dang-Kièn Germain Pham
  3.1 Why do we need predistortion?
    3.1.1 Waveform features
    3.1.2 System level considerations
  3.2 Principles of predistortion
  3.3 Analog vs digital predistortion
  3.4 Mathematical aspects
    3.4.1 Baseband formulation
    3.4.2 pth-Order inverse of linear system
  3.5 Models for DPD structures
    3.5.1 Parametric models
    3.5.2 Nonparametric models
  3.6 Identification
    3.6.1 Indirect learning architecture
    3.6.2 Direct learning architecture
    3.6.3 DPD with iterative learning control (ILC)
  3.7 Wideband and subband processing
  3.8 Multidimensional predistortion
    3.8.1 Linearization of noncontiguous carrier aggregation
    3.8.2 Multiple input multiple output
  3.9 Model sizing
    3.9.1 Model sizing by hill climbing heuristic
    3.9.2 Model sizing by integer genetic algorithm
    3.9.3 Model sizing using orthogonal matching pursuit (OMP) algorithm
  3.10 Joint mitigation of various impairments
    3.10.1 Cooperation with crest factor reduction (CFR)
    3.10.2 Processing of imperfections
  3.11 Overview of companion signal processing
    3.11.1 Synchronization
    3.11.2 Sampling frequency
    3.11.3 Monitoring
  3.12 Implementation
  3.13 Conclusion
  References

4 Digital post-distortion of radio receivers and analog-to-digital converters
  Bryce Minger, Raphaël Vansebrouck, Chadi Jabbour, Loïc Fuché, Guillaume Ferré, Dominique Dallet, Patricia Desgreys, and Olivier Jamin
  4.1 Motivations for post-distortion of radio receivers and ADCs
    4.1.1 Ideal vs. practical radio receiver
    4.1.2 Dynamic range issues of modern radio receivers
    4.1.3 Principle of post-distortion
    4.1.4 Figures of merit
  4.2 Review of distortions issues met in radio receivers
    4.2.1 Distortion issue of IF-digitising superheterodyne receivers
    4.2.2 Distortion issue of low-IF receivers
    4.2.3 Distortion issue of full-digital receivers
  4.3 Model-based post-distortion: modelling
    4.3.1 The passband Volterra model
    4.3.2 The baseband Volterra model
    4.3.3 Physical interpretation and dimensioning of Volterra model non-linearity order and memory depth
    4.3.4 Derivatives of Volterra model
    4.3.5 Modelling ADCs
    4.3.6 Modelling IF-digitising superheterodyne and full-digital receivers
    4.3.7 Modelling low-IF receivers
    4.3.8 On the usage of baseband Volterra model for reducing computational burden of passband Volterra model
    4.3.9 On sampling frequency required for non-linear system modelling
  4.4 Model-based post-distortion: identification and inversion
    4.4.1 Statements of the model identification problem
    4.4.2 Dealing with the need of both distorted and undistorted signal samples
    4.4.3 Solution of direct Wiener filter problem
    4.4.4 Inversion of a model determined by a direct identification scheme
    4.4.5 On the numerical instability issue
    4.4.6 Effects of numerical instability on the least square solution
    4.4.7 Effects of numerical instability on stochastic least mean square and recursive least square solutions
  4.5 Study of an example of ADC and receiver model-based post-distortion solution
    4.5.1 Targeted system features
    4.5.2 Block diagram of the post-distortion solution
    4.5.3 Modelling features
    4.5.4 Identification features
    4.5.5 Inversion features
    4.5.6 Results of post-distortion operated on a simulated full-digital receiver
    4.5.7 Results of post-distortion operated on the targeted system
  4.6 Look-up-table-based post-distortion of ADCs
    4.6.1 LUT-based post-distortion strategies
    4.6.2 Determination of LUT values
    4.6.3 INL sequence modelling
  4.7 Conclusion
  References

5 Time or frequency interleaved analog-to-digital converters
  Antoine Bonnetat, Han Le Duc, Ali Beydoun, Dominique Dallet, Guillaume Ferré, Jean-Michel Hodé, Patricia Desgreys, and Chadi Jabbour
  5.1 Introduction
  5.2 Principle of time-interleaved ADCs and impacts of mismatches
    5.2.1 TIADC principle
    5.2.2 The impact of channel mismatches
  5.3 State of the art of interleaved channel mismatches compensation
    5.3.1 Analog compensation techniques
    5.3.2 Mixed signal compensation techniques
    5.3.3 Digital compensation
  5.4 Feedforward background calibration technique of clock skews
    5.4.1 Digital estimation
    5.4.2 Digital correction
    5.4.3 Calibration for input at any Nyquist band
  5.5 Feedback calibration technique of bandwidth mismatches
    5.5.1 Frequency–response mismatch model
    5.5.2 Theoretical channel mismatches estimation
    5.5.3 Channel mismatches compensation
    5.5.4 Simulation results
  5.6 Extended frequency band decomposition A/D converter
    5.6.1 EFBD architecture
    5.6.2 Digital reconstruction system (DS)
    5.6.3 Adaptation algorithms
  5.7 Conclusion
  References

6 Digitally enhanced digital-to-analogue converters
  Torsten Lehmann, Pasindu Aluthwala, and Sridevan Parameswaran
  6.1 Overview
  6.2 Digital-to-analogue converters
    6.2.1 Converter implementation
    6.2.2 Converter errors
  6.3 DAC linearisation
    6.3.1 Calibration
    6.3.2 Dynamic element matching
  6.4 Harmonic-cancelling DAC with partial DEM
    6.4.1 Harmonic-cancelling sine wave generation
    6.4.2 Dynamic element matching in HC-DACs
    6.4.3 Experimental results
  6.5 Summary
  References

7 Clock generation
  Naser Pourmousavian, Teerachot Siriburanon, Feng-Wei Kuo, Masoud Babaie, and Robert Bogdan Staszewski
  7.1 Development of advanced PLLs
  7.2 ADPLL-based transmitter
  7.3 Ultra-low-voltage, ultra-low-power ADPLL for IoT applications
  7.4 Switched-capacitor DC–DC converter
  7.5 Low-voltage ADPLL architecture with PVT-tolerant TDC
  7.6 Switching current-source oscillator
  7.7 Calibration for PVT-insensitive time-to-digital converter (TDC)
  7.8 Design of high-efficiency switched-capacitor doubler/regulator for event-based load
  7.9 Implementation and experimental results
  7.10 Conclusion
  References

8 Fixed-point refinement of digital signal processing systems
  Daniel Ménard, Gabriel Caffarena, Juan Antonio Lopez, David Novo, and Olivier Sentieys
  8.1 Introduction
  8.2 Fixed-point arithmetic
    8.2.1 Fixed-point representation
    8.2.2 Format propagation
    8.2.3 Quantisation process and rounding modes
    8.2.4 Overflow modes
  8.3 Architecture support for fixed point
    8.3.1 Fine-grained word-length operators
    8.3.2 Mid and coarse-grained word-length operators
  8.4 Fixed-point conversion process
  8.5 Integer-part word-length selection
    8.5.1 Dynamic range evaluation
    8.5.2 IWL determination and insertion of scaling operations
  8.6 Fractional-part word-length determination
    8.6.1 Word-length optimisation
    8.6.2 Accuracy evaluation
  8.7 Conclusion
  References

9 Adaptive filtering
  Romuald Rocher, Pascal Scalart, and Robin Gerzaguet
  9.1 Introduction
  9.2 Algorithm presentations
    9.2.1 Least mean square algorithm
    9.2.2 Affine projection algorithms
    9.2.3 Recursive least square
    9.2.4 Nonlinear algorithms
  9.3 Algorithm comparison
    9.3.1 Complexity comparison
    9.3.2 Implementation and cost
    9.3.3 Discussion
  9.4 Application
    9.4.1 Context and model
    9.4.2 Floating-point and fixed-point model
  9.5 Conclusion
  References

Index
Preface
Our book “Digitally Enhanced Mixed Signal Systems” is devoted to digital enhancement techniques addressing key challenges in analog, RF and mixed-signal components. This topic has emerged in the recent past in the context of steadily shrinking CMOS technology and increasing user demand for higher flexibility and higher data traffic in communication networks. We, the three editors of this book, have been working in the field of digitally enhanced mixed-signal systems for the last 10 years. We first contributed to the dissemination activity by organizing two workshops in France: “Digital Correction of Analog Electronics Imperfections” in May 2012 and “Digitally Enhanced Mixed Signal Systems” in December 2015. The organization was led by the CNRS GDR SOC2, a national research group in charge of studying and proposing new approaches for the design and validation of embedded systems for connected objects (http://www.gdr-soc.cnrs.fr/). On the strength of the success of these workshops and the interest of several European and international participants, we decided to organize a Special Session at the IEEE International Conference on Electronics, Circuits and Systems, which took place in December 2016 in Monaco. This session, with top-level international contributors, contributed to the emergence of the topic at the international level. Encouraged by the interest of the scientific community, we decided in June 2017 to invite the main contributors encountered during the workshops and the special session, as well as other world-class researchers, to participate in the writing of a book on this topic.

This book provides an overview of how to design, size and implement digital assistance to compensate for a given non-ideality. All the main steps are covered, from the modeling approach to the fixed-point implementation. The major aspects of each topic are discussed in order to bring out the advantages, drawbacks and limits of the presented methods and models. The book is organized around nine chapters, some of which can be approached independently. The first chapter gives an overview of the theme proposed in this book. Chapters 2, 3 and 4 focus respectively on modeling, pre-distortion and post-distortion of RF and mixed-signal systems. These three chapters are complementary and provide a full understanding of the nonlinearity issues in cyber-physical interfaces, taking into account the memory effects of systems. The following chapters are dedicated to specific components: analogue-to-digital converters interleaved in the time or frequency domain in Chapter 5, digital-to-analog converters in Chapter 6 and, finally, clock generation in Chapter 7. The last two chapters of the book focus on the practical and
physical implementation of the digital enhancement algorithms on FPGA and/or ASIC. Chapter 8 covers the main steps of fixed-point refinement, and Chapter 9 covers the main adaptive filtering techniques and the related instability problems.

This book is the result of a truly international cooperation of experts. We feel very much indebted to the authors for their scientific contributions and for giving us the opportunity to edit this valuable work. We really enjoyed interacting with them. We also express our deep appreciation to the IET staff, whose expert help made this book possible.
Chapter 1
Digitally enhanced mixed signal systems—the big picture
Christian Vogel¹, Harald Enzinger², and Karl Freiberger³

¹ Department of Engineering, FH JOANNEUM – University of Applied Sciences, Austria
² Intel, Connected Home Division, Austria
³ Infineon Technologies Austria, Austria
A mixed-signal processing system consists of analog and digital signal processing systems that are connected by interfaces. Analog systems process analog signals, which are continuous in amplitude and in time. By contrast, digital systems process digital signals, which are discrete in amplitude and in time. The interfaces between these domains are data converters that either convert the analog signal to a digital signal by using an analog-to-digital converter (ADC) or convert the digital signal to an analog signal by employing a digital-to-analog converter (DAC). Strictly speaking, all signals are analog signals, and all systems are analog systems realized by analog circuits. However, by implementing very well-defined analog-signal-processing blocks, where discrete bands of analog levels represent the digital states, and by synchronizing the processing of these blocks through a clock signal, we can use the abstraction of digital signals and systems realized by digital circuits. In digitally enhanced mixed-signal processing systems, these differences sometimes get blurred, as, for example, digital circuits are used to realize analog functionalities like in an all-digital transmitter [1], or analog circuits are driven by discrete-amplitude signals like in a switch-mode power amplifier (PA) [2].
1.1 Motivation

Compared to analog circuits, digital circuits have advantages. The foremost ones are predictability and flexibility, which are difficult to achieve with analog circuits. The behavior of analog circuits depends much more on manufacturing processes, aging effects, temperature changes, and supply voltage changes. Fortunately, the flexibility of digital circuits can be exploited to overcome the shortcomings of analog circuits when they are connected by data converters. Since Moore’s law has been valid for more than three decades [3], the density of logic circuits and the energy per logic
transition have reached a level where digital enhancement may not only provide more flexibility but can also be a real competitor in energy-efficient hardware design, using energy-efficient but—without enhancement—inadequate analog circuitry [4]. Still, as long as we have not achieved a major breakthrough in data conversion techniques, analog circuits are required at some point for certain signal-processing tasks like filtering, modulation, or amplification.
1.2 Methodology

In the following, we give an overview of the high-level design of digitally enhanced mixed-signal processing systems. We start with considerations from a system-oriented perspective to highlight central consequences of moving functionalities from the analog to the digital domain. After that, we present an extended view on data converters, which are central building blocks of every mixed-signal system. Finally, we discuss the design process of mixed-signal systems, which can be interpreted as an iteration loop involving the steps of modeling, analysis, and enhancement.
1.2.1 A system-oriented perspective

The basic approach for digitally enhanced mixed-signal processing systems is illustrated in Figure 1.1, which represents the following system design trend [5]: Move functionalities from the analog/radio frequency (RF) domain to the digital domain in order to improve, complement, and mimic analog circuits. Moving analog functionalities to the digital domain increases the flexibility for improving performance figures and for avoiding limitations of analog circuits. Furthermore, functionalities in the digital domain simplify technology scaling. Digital enhancement is supported by the growing performance gap between analog and digital circuits [6], where digitally enhanced mixed-signal processing systems are an attractive design option for a fixed energy budget and high signal fidelity [4].
Figure 1.1 Digitally enhanced mixed-signal processing system for increased flexibility and system performance
Digitally enhanced mixed-signal processing systems require a codesign of analog and digital blocks, which also introduces new design challenges [5]. Such systems exploit inter-block information along the signal processing chain and cross-layer information between the abstraction layers, which leads to a mutual dependence of the blocks and layers that increases the system and design complexity. The interacting functional blocks are getting more application dependent, and the reusability of a single block for different applications is limited. Furthermore, the increased system complexity and the need to simulate analog and digital blocks simultaneously lead to longer simulation times during the system development and require new test concepts. Although we notice an increasing performance gap between analog and digital circuits [6], most of the time, we cannot neglect energy, area, and speed constraints of digital circuits and their impact on the entire mixed-signal processing system. Specifically in real-time systems, the possible complexity of digital circuits is often constrained by speed and energy limits. The complexity of digital enhancement structures depends on the design complexity and the implementation complexity. The implementation complexity is the operational complexity of a digital structure, for example, the number of multiplications and additions per instant, to produce an output from an input. By contrast, the design complexity is the complexity to compute the required parameters of the structure, e.g., to compute the filter coefficients. For a filter design with fixed coefficients, only the implementation complexity is important and the design complexity can be neglected; however, if an online computation of the coefficients is required, the design complexity becomes important or can even be the limiting factor for implementing a structure. Consequently, an important design goal is to reach a good trade-off between the design complexity and the implementation complexity to achieve the best total complexity of the digital enhancement structure, e.g., [7–9].
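As a rough, self-contained illustration of this distinction (not taken from the cited works), the sketch below contrasts the recurring per-sample cost of an N-tap FIR correction filter with the cost of computing its coefficients; the filter length, sample rate, and training signal are arbitrary assumptions.

```python
import numpy as np

# Minimal sketch: implementation vs. design complexity of an N-tap FIR correction filter.
N = 32                      # filter taps (assumption)
fs = 1e9                    # assumed sample rate, 1 GS/s, illustrative only

# Implementation complexity: work that must be done for *every* output sample.
mults_per_sample, adds_per_sample = N, N - 1
print(f"per-sample cost: {mults_per_sample} mult, {adds_per_sample} add "
      f"-> {mults_per_sample * fs:.2e} mult/s at fs = {fs:.0e} Hz")

# Design complexity: work needed to obtain the coefficients themselves,
# here a least-squares fit of the filter to a toy desired response.
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)                            # training input
d = np.roll(x, 3) + 0.01 * rng.standard_normal(4096)     # toy desired output
X = np.column_stack([np.roll(x, k) for k in range(N)])   # data (convolution) matrix
h, *_ = np.linalg.lstsq(X, d, rcond=None)                # paid once for fixed coefficients

# With fixed coefficients the design cost can be neglected; if h must be
# re-estimated online to track drift, the design complexity can dominate.
y = np.convolve(x, h, mode="same")                       # the recurring implementation cost
```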
1.2.2 An extended view on data converters

On the one hand, the research on digitally enhanced mixed-signal processing systems is often focused on data converters, i.e., ADCs and DACs, because data converters are the bottleneck in many mixed-signal processing systems. On the other hand, however, to overcome the bottleneck, the often limited view on data converters needs to be extended. They neither merely convert amplitude information from one domain to the other, nor do they consist only of a sampler and a quantizer [10].

In Figure 1.2, a digitally enhanced mixed-signal processing system representing the analog-to-digital conversion case is shown. It consists of an analog preprocessing block, a continuous-time signal to discrete-time signal converter (C/D), and a digital postprocessing block. Furthermore, it contains feedback loops and feedforward loops providing auxiliary information. The source of the auxiliary information can be either inter block, i.e., information shared among the blocks of the same layer, or it can be cross layer, i.e., information shared among the blocks of different abstraction layers. These two sources of information can be a priori knowledge, i.e., information about the blocks that has already been exploited in the realization of the mixed-signal processing system, or a posteriori knowledge, i.e., knowledge obtained by additional hardware observing auxiliary parameters. Therefore, dashed lines for the auxiliary information have been used in Figure 1.2 to emphasize that the information flow can be either a posteriori measured physical data or a priori system knowledge. In Table 1.1, some examples of possible auxiliary information are listed.

Figure 1.2 Digitally enhanced mixed-signal processing system with analog preprocessing, continuous-time signal to discrete-time signal conversion (C/D), and digital postprocessing. The entire system can be seen as an extended ADC

Table 1.1 Examples of auxiliary information

               Inter block                        Cross layer
A priori       Circuit mismatch characteristics   Input signal distribution
A posteriori   Measured temperature               Decoded symbols

In the simplest case, the analog preprocessing block is an analog filter limiting the bandwidth of the input signal and the C/D block is an ADC. A delta–sigma converter, for example, is a digitally enhanced mixed-signal processing system that needs more blocks of this general concept. The preprocessing does not only consist of the analog filter but also includes a feedback of the previous conversion result. Furthermore, digital filters and rate conversion blocks, i.e., digital postprocessing, are required to reduce the quantization noise of the sampled signal.

In Figure 1.3, the digital-to-analog conversion case is shown. It consists of a digital preprocessing block, a discrete-time signal to continuous-time signal converter (D/C), an analog postprocessing block, feedback loops, and feedforward loops.
Figure 1.3 Digitally enhanced mixed-signal processing system with digital preprocessing, discrete-time signal to continuous-time signal conversion (D/C), and analog postprocessing. The entire system can be seen as an extended DAC
The simplest realization of a DAC consists of a sample-and-hold represented by the D/C block and an analog reconstruction filter represented by the analog postprocessing block. With the general concepts shown in Figures 1.2 and 1.3, different categories of digitally enhanced mixed-signal processing systems can be identified [6]:

● Digital systems improving analog systems: The correction block is not a required part of the system functionality, but improves the system performance significantly, for example, in terms of the signal-to-noise and distortion ratio [11]. Postcorrection of ADCs and precorrection of DACs are the most well-known applications, where analog impairments are corrected by digitally pre- and postcorrecting the digital signal [12].
● Digital systems complementing analog systems: The postprocessing block complements the preprocessing block to realize a necessary functionality of the mixed-signal processing system. An example of such a concept, illustrated in Figure 1.3, is the burst-mode transmitter [13], which is used for its high energy efficiency. The PA in this architecture is driven by a pulsed analog signal. Such signals only convey phase information; therefore, a digital preprocessing block is required to encode the amplitude information into pulses of different widths and positions. After the PA, a filter is necessary to transform the pulsed signal back into an amplified amplitude signal. Hence, according to Figure 1.3, the entire system can be seen as an extended DAC, which is optimized for the energy-efficient generation of larger output powers.
● Digital systems mimicking analog systems: The digital post- or preprocessing block fully adopts the analog functionality. An example is a digital phase-locked loop that implements the entire control loop digitally, i.e., by digital postprocessing, while a digitally controlled oscillator represents the D/C block. According to the general concept in Figure 1.3, a digital phase-locked loop is an extended DAC generating different output frequencies instead of amplitude signals.
Mixed-signal processing systems often employ a mixture of these enhancement techniques. In a burst-mode transmitter, for example, the digital system complements the analog one, but in addition, there are also digital enhancement systems like a predistorter [14–17] mitigating the nonlinear impairments of the PA. The popular concept of analog-to-information converters [18] can also be seen as an extended data converter with analog and digital functionalities complementing each other. The basic idea is to sample signals not at the Nyquist rate given by the sampling theorem [19] but at a much lower rate, exploiting the sparseness and the structure of signals in the time, frequency, or code domain [20]. Analog-to-information converters utilize the preprocessing block to transform the input signal into a domain where the necessary information can be sampled at a lower rate than the Nyquist rate. The gathered samples need extensive digital postprocessing to reconstruct the equivalent signal sampled at the Nyquist rate. The required blocks exactly match the blocks shown in Figure 1.2.
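The following toy sketch, written purely for illustration and not taken from the cited schemes [18–20], mimics this idea on synthetic data: a frequency-sparse signal is observed through a small set of random time samples, and orthogonal matching pursuit plays the role of the extensive digital postprocessing. The signal length, sparsity, and number of measurements are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, M = 256, 3, 48                      # Nyquist-grid length, sparsity, measurements

spectrum = np.zeros(N, dtype=complex)     # K active tones at random bins
spectrum[rng.choice(N, K, replace=False)] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
x = np.fft.ifft(spectrum) * N             # time-domain signal on the Nyquist grid

idx = np.sort(rng.choice(N, M, replace=False))        # random sub-Nyquist sample times
A = (np.fft.ifft(np.eye(N), axis=0) * N)[idx, :]      # measurement matrix (partial inverse DFT)
y = x[idx]                                            # the only samples we keep

support, residual = [], y.copy()          # orthogonal matching pursuit
for _ in range(K):
    k = int(np.argmax(np.abs(A.conj().T @ residual)))   # most correlated atom
    support.append(k)
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coef

s_hat = np.zeros(N, dtype=complex)
s_hat[support] = coef
x_hat = np.fft.ifft(s_hat) * N            # reconstruction on the full Nyquist grid
print("relative reconstruction error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```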
1.2.3 The design process

The design process is used to develop digital circuits that improve, complement, or mimic analog systems, resulting in a digitally enhanced mixed-signal processing system. To this end, as shown in Figure 1.4, three iterative steps are performed [21–23]:

● Modeling: In the modeling step, a model is created that reflects the system’s behavior to a desired accuracy. This model should provide a consistent representation of the mixed-signal processing system and is devised by employing data from the literature, measurement results, external expert knowledge, and researcher experience. It is an abstraction of the system on a certain level of granularity, which gets refined over time. Mathematical models represented by equations and simulation models represented by an executable description are used. Different modeling languages such as MATLAB®/Simulink®, VHDL-AMS, and Spice exist to develop models on system, register transfer, and transistor level. The consistency of the created models is verified by measured data and data from the literature.
● Analysis: In the analysis step, the model is investigated and assessed according to important figures of merit for the given application, i.e., the signal fidelity. The goal is to reach an understanding of the major concepts and challenges of the mixed-signal processing system. This is done by analytical derivations, extensive simulations, and analysis of measurement and test results from circuit implementations and lab prototypes. The analysis should provide guidelines for enhancing the mixed-signal processing system.
● Enhancement: The final step is the development of enhancement schemes according to Figures 1.2 and 1.3. Accuracy requirements, speed limits, and energy budgets define the level of enhancement and the possible techniques [24]. From a research perspective, however, only fundamental limits, not current technology limits, should constrain the enhancement approach. After inventing a new scheme, the process starts again with the modeling step, now including the enhancement. This is done from the system level, i.e., algorithms in a high-level language, to the block level, i.e., signal-flow graphs in Simulink/VHDL, and, when feasible, with current technology on the transistor level.

Figure 1.4 The design process for digitally enhanced mixed-signal processing systems
Depending on the level of required research, this process is not a clear step-by-step procedure. As depicted in Figure 1.5, parts of a system model are developed at different accuracy levels at the same time. The modeling and analysis steps are often not well separated and are developed together. Different approaches are iterated on different levels. Therefore, the research on mixed-signal processing systems is a creative process rather than a step-by-step procedure.

Figure 1.5 The typical design and research process for digitally enhanced mixed-signal processing systems

Whenever a physical realization of the investigated system is available, measurements should be made in addition to simulations. In the following, we highlight the importance of measurements throughout the design process, i.e., in modeling, analysis, and enhancement. We illustrate the discussion with the digitally enhanced
mixed-signal measurement methods described in [25]. In their development, we also followed the design process outlined above: modeling of the impairments, analysis of their behavior, and design of the enhanced measurement procedure. Modeling usually involves fitting model parameter values from measurement data to obtain a realistic model behavior. Comparing the model behavior to the measured device under test (DUT) behavior is important to verify that the model is appropriate. The model fidelity is typically quantified by means of a distance (or similarity) metric between the model output and the measured DUT signal. An example of such a metric is the normalized mean squared error, common in behavioral modeling and predistortion of PAs [26]. Using different signals for model evaluation and parameter estimation allows for detecting a potential overfitting of the model. This verification of the model can already be counted as part of the analysis step—the boundary to modeling is blurred. The analysis step usually involves the assessment of application-specific figures of merit. To exemplify related measurement challenges together with remedies in the form of digitally enhanced measurement methods, we briefly discuss the error vector magnitude (EVM) of a digitally modulated signal in the following. The EVM is an inverse signal-to-noise and distortion metric, able to quantify the combined effect of analog and mixed-signal non-idealities such as nonlinearity, IQ mixer imbalance, and local oscillator phase noise of a transmitter. In the case of the
EVM, the measurement is difficult because it requires careful synchronization and standard-dependent demodulation of the data signal. In [27], we present an alternative method to measure the EVM without data demodulation. To estimate the EVM, we remove linearly correlated signal components in the digital domain and analyze the remaining noise and distortion signal component. In essence, we enhance an analog measurement method [28] by shifting parts of it (removing the correlated parts and spectrum analysis) to the digital domain. This approach is also pursued with the error power ratio measurement method proposed in [29], where we use digital filtering to obtain a test signal with steep and deep stopbands from a standard-conformant communication signal. The measurement itself is possible in the analog domain with a spectrum analyzer, which is advantageous for achieving a very low noise floor at high bandwidths. While extensive simulations are often verified by selected measurements, we went the other way round and verified our measurement method with extensive Monte Carlo simulations by randomly varying impairment model parameters [30]. The swept error power ratio presented in [25] allows for bias-free measurement of frequency-dependent EVM characteristics. We sweep the stopband location of an error power ratio signal with a time-variant digital filter and synchronize the swept spectrum analyzer measurements to the stopband-filter sweep.

Measurements during the enhancement step occur in closed-loop enhancement systems with hardware in the loop. Many enhancement algorithms profit from a closed-loop adaptation to adjust the enhancement system parameters, because the steady adaptation allows the system to react to drift and variations of the system to be enhanced. A good example is digital predistortion (DPD) of a transmitter, where the analog PA output is captured to adapt the digital predistorter coefficient values.
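As a minimal illustration of the metrics mentioned above, the sketch below computes the NMSE between a behavioral model output and a "measured" DUT output, and an EVM-like ratio obtained without demodulation by removing the linearly correlated part of the output. The cubic toy nonlinearity and all signal parameters are assumptions, and the snippet simplifies rather than reproduces the actual method of [27].

```python
import numpy as np

rng = np.random.default_rng(2)
x = (rng.standard_normal(10000) + 1j * rng.standard_normal(10000)) / np.sqrt(2)

dut = x + 0.05 * x * np.abs(x) ** 2            # toy "measured" device output
model = x + 0.045 * x * np.abs(x) ** 2         # toy behavioral model output

def nmse_db(y_model, y_meas):
    """Normalized mean squared error between model and measurement, in dB."""
    return 10 * np.log10(np.sum(np.abs(y_meas - y_model) ** 2)
                         / np.sum(np.abs(y_meas) ** 2))

# EVM-like metric without demodulation: remove the best complex-gain fit of the
# reference onto the output and measure the remaining error power.
g = np.vdot(x, dut) / np.vdot(x, x)            # linearly correlated component
error = dut - g * x
evm_rms = np.sqrt(np.mean(np.abs(error) ** 2) / np.mean(np.abs(g * x) ** 2))

print(f"NMSE  = {nmse_db(model, dut):.1f} dB")
print(f"EVM  ~= {100 * evm_rms:.2f} %")
```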
1.3 Examples

In the following, we give an introduction to three application areas of digital enhancement: the enhancement of PAs, data converters, and clock generation.
1.3.1 Enhancing power amplifiers

Linearity and efficiency are central design goals for most of today’s RF-PAs [31]. High linearity is important for high data rate communication based on nonconstant envelope modulation like quadrature amplitude modulation or orthogonal frequency division multiplexing (OFDM). The respective standards impose strict requirements on the in-band modulation accuracy, and regulatory agencies impose strict out-of-band emission limits. Both can only be achieved with a highly linear RF-PA. Efficiency, however, is in most cases also an important design goal. In the context of cellular base stations, high efficiency is desired to keep the operational costs low, and for mobile devices, high efficiency is desired to enable a long operational time of battery-powered devices. Building RF-PAs that are jointly linear and efficient is very challenging. The reason for this challenge can be traced back to the fundamental working principle
of RF-PAs, which results in a trade-off between their linearity and efficiency [32]. The linearity-efficiency trade-off can be parameterized either in terms of the RF-PA output power or in terms of the RF-PA architecture. If we focus on the output power, we notice that the highest efficiency is typically reached if the amplifier is operated close to its maximum output power [33]. However, at high output power, the signal peaks are compressed, which leads to nonlinear distortion. Therefore, a certain back off from the maximum output power is necessary to fulfill the linearity requirements at the cost of reduced efficiency. To maintain high efficiency, also at back off, several high-efficiency RF-PA architectures have been invented. For cellular base stations, the Doherty architecture [34] dominates, whereas for mobile devices, envelope tracking [35] is widely used. In general, these architectures improve the low efficiency under back off at the cost of additional nonlinearity. This is the second interpretation of the linearity-efficiency trade-off: highly linear RF-PAs are typically not very efficient, but highly efficient ones tend to be rather nonlinear. Digital enhancement can be used to push the linearity-efficiency trade-off of RF-PAs to regions that are hard or even impossible to reach with purely analog solutions. In the following subsections, we give a short overview of three selected methods for the digital enhancement of RF-PAs.
1.3.1.1 Crest factor reduction

Crest factor reduction (CFR) is used to limit or to reduce the crest factor of the complex baseband signal in the digital baseband processor before it is passed on to subsequent blocks in the transmit chain. Reducing the crest factor, or equivalently the peak-to-average power ratio, reduces the sensitivity of the signal to the nonlinearity of the RF-PA such that the back off can be reduced. Furthermore, it simplifies the task of a subsequent digital predistorter, which is commonly used after the CFR to linearize the RF-PA. For CFR, there are many methods [36], which can be divided into two categories: (1) methods that modify certain aspects in the generation of the communication signal and (2) methods that are agnostic regarding the type of the communication signal but introduce a certain amount of nonlinear distortion. Most CFR methods of the second category are based on some form of clipping and filtering. Clipping is a very simple method to reduce the crest factor, but it introduces broadband distortion that may violate the out-of-band transmission limits. To reduce the out-of-band distortion at the cost of more, but typically less critical, in-band distortion, lowpass filtering is used after the clipping. The lowpass filtering can be implemented either per OFDM symbol in the frequency domain [37] or as an error-shaping filter in the time domain [35,38].
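A minimal clipping-and-filtering sketch is given below. It is a generic illustration rather than the specific schemes of [37] or [38], and the oversampling ratio, target PAPR, and toy baseband signal are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, osr = 2048, 4
x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)   # toy baseband samples

X = np.fft.fft(x)                                   # oversample by zero-padding the spectrum
X_os = np.concatenate([X[: n // 2], np.zeros((osr - 1) * n, complex), X[n // 2 :]])
x_os = np.fft.ifft(X_os) * osr

def papr_db(sig):
    return 10 * np.log10(np.max(np.abs(sig) ** 2) / np.mean(np.abs(sig) ** 2))

target_papr_db = 7.0                                # assumed target crest factor
a = np.sqrt(np.mean(np.abs(x_os) ** 2) * 10 ** (target_papr_db / 10))     # clipping level

mag = np.abs(x_os)
clipped = np.where(mag > a, a * x_os / np.maximum(mag, 1e-12), x_os)      # clip magnitude, keep phase

E = np.fft.fft(clipped - x_os)                      # clipping error
E[n // 2 : -(n // 2)] = 0                           # discard its out-of-band content
y = x_os + np.fft.ifft(E)                           # crest-factor-reduced signal

print(f"PAPR before: {papr_db(x_os):.2f} dB, after: {papr_db(y):.2f} dB")
```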
1.3.1.2 Digital predistortion

In DPD, a nonlinear processing block is added to the digital baseband processor with the aim of linearizing the overall characteristic of the wireless transmitter [26]. To achieve this goal, the nonlinear processing block must be able to model the inverse behavior of the RF-PA nonlinearity in the equivalent baseband, and it must be adaptive to track the changes of the RF-PA behavior due to changing operating conditions. For narrowband signals, a memoryless DPD may be used [39], but for wideband signals,
nonlinear memory effects must be taken into account [40]. Most DPD structures with memory, like the generalized memory polynomial [41], are based on the baseband Volterra series [42], which is a combination of a polynomial and a finite impulse response filter. Originally, polynomial baseband models were often limited to odd-order terms [43], but experiments showed that even-order terms can also improve their accuracy [44]. A theoretical justification for even-order terms in polynomial baseband models was later provided in [26,45,46]. For the adaptation of the DPD model coefficients, there are two architectures [47]: indirect learning and direct learning. With indirect learning, a post-inverse model of the RF-PA is identified and then used as the DPD. An advantage of indirect learning is that the post-inverse can be identified very easily from the synchronized input and output signals of the RF-PA [48]. However, since the optimal coefficients of a post-inverse model are typically not identical to the optimal coefficients of a pre-inverse model, indirect learning does not give the best results [49]. To improve the performance, several iterations of direct learning can be used [49]. With direct learning, the DPD model coefficients are optimized directly based on some form of error measure like the mean square error [50] or standard-specific performance metrics [51].
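The sketch below illustrates indirect learning with a memory-polynomial predistorter on a toy PA model. The PA coefficients, the model orders, and the unity-gain normalization are assumptions; a real design would iterate and track as discussed above.

```python
import numpy as np

rng = np.random.default_rng(4)
N, K, M = 20000, 5, 3                      # samples, max (odd) order, memory depth

def pa(u):
    """Toy PA with mild compression and a one-tap memory effect (not a real device)."""
    u1 = np.roll(u, 1)
    return u - 0.08 * u * np.abs(u) ** 2 + 0.02 * u1 * np.abs(u1) ** 2

def mp_matrix(u, K, M):
    """Memory-polynomial regressors u[n-m] * |u[n-m]|^(k-1) for odd k."""
    cols = []
    for m in range(M):
        um = np.roll(u, m)
        for k in range(1, K + 1, 2):
            cols.append(um * np.abs(um) ** (k - 1))
    return np.column_stack(cols)

x = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) * 0.4

# Indirect learning: fit a post-inverse from the PA output back to its input.
y = pa(x)
coeffs, *_ = np.linalg.lstsq(mp_matrix(y, K, M), x, rcond=None)

# Use the identified post-inverse as the predistorter and re-run the chain.
x_dpd = mp_matrix(x, K, M) @ coeffs
y_lin = pa(x_dpd)

nmse = 10 * np.log10(np.sum(np.abs(y_lin - x) ** 2) / np.sum(np.abs(x) ** 2))
print(f"residual NMSE after one indirect-learning iteration: {nmse:.1f} dB")
```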
1.3.1.3 Burst-mode transmitter

In the cases of CFR and DPD, digital processing is used to improve the performance of a wireless transmitter that would be operational also without the digital enhancement. In a burst-mode transmitter [52], however, the digital processing is an essential part of the transmitter architecture. In a burst-mode transmitter, an arbitrary nonconstant envelope signal is preprocessed by a pulse width modulator (PWM) [53] to convert it into a sequence of constant-envelope phase-modulated pulses. These pulses can be amplified efficiently, because the PA is either off or operating at its maximum output power. After amplification, the original signal can be regenerated by an analog bandpass filter. A critical component in such a burst-mode transmitter is the PWM. With a conventional digital PWM, a very high sampling rate is required to reach a sufficient dynamic range. To avoid this problem, the digital bandlimited PWM proposed in [52] mimics the analog natural-sampling PWM and only generates a few harmonics to avoid aliasing. This way, a high dynamic range can be reached without using an exceedingly high sampling rate.
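The following toy sketch encodes an envelope into constant-amplitude pulses with a conventional, uniformly sampled digital PWM and recovers it by lowpass filtering. The frequencies are arbitrary assumptions, and the coarse ratio of sampling rate to PWM carrier illustrates why a naive digital PWM needs a very high sampling rate, which is exactly the limitation the bandlimited PWM of [52] addresses.

```python
import numpy as np

fs = 1.0e6                      # assumed sampling rate of the digital PWM
f_env, f_pwm = 1.0e3, 50.0e3    # envelope and PWM carrier frequencies (illustrative)
t = np.arange(200000) / fs

env = 0.5 + 0.4 * np.sin(2 * np.pi * f_env * t)        # envelope in 0..1
saw = (t * f_pwm) % 1.0                                 # sawtooth carrier in 0..1
pulses = (env > saw).astype(float)                      # constant-amplitude pulse train

# Recover the envelope: keep only spectral content well below the PWM carrier.
P = np.fft.rfft(pulses)
freqs = np.fft.rfftfreq(len(pulses), 1 / fs)
P[freqs > 5 * f_env] = 0
recovered = np.fft.irfft(P, len(pulses))

err = np.sqrt(np.mean((recovered - env) ** 2))
print(f"RMS error between filtered pulse train and envelope: {err:.3e}")
# With only fs / f_pwm = 20 samples per PWM period, the pulse widths are coarsely
# quantized, which limits the achievable dynamic range of this naive digital PWM.
```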
1.3.2 Enhancing data converters

Today’s communication systems are limited by the performance of the employed ADCs [54,55]. As functionalities of communication systems are moved into the digital domain to obtain more flexibility [5,6,10], the requirements on data converters increase in terms of higher accuracy and larger bandwidth for a given energy budget [5,6,10,12,56,57]. One effective approach to overcome the processing limits of sampling systems is to exploit parallelism, which was theoretically introduced by Papoulis’ generalized sampling expansion [58]. In practice, however, only a few parallel multichannel sampling structures [59] have been further analyzed and successfully
applied [60–63], where the time-interleaved structure is the most successful one [55,64–66]. As depicted in Figure 1.6, a time-interleaved ADC is an array of M parallel channel ADCs operating in a time-interleaved manner [67]. Therefore, the sampling rate at the output of an ideal time-interleaved ADC with M channels is M times higher than the sampling rate of the channel ADCs. As for a single ADC, a time-interleaved ADC requires a bandlimited signal, which is realized in Figure 1.6 by a lowpass filter serving as the analog preprocessing block. Depending on the application and the requirements, the channel ADCs themselves can be realized in different converter technologies [64,66,68]. In practice, mismatches among these channel ADCs limit the performance of time-interleaved ADCs [69–75] and need to be digitally corrected. Therefore, a time-interleaved ADC is a primary example of a mixed-signal processing system that needs extensive digital postprocessing to achieve a reasonable performance.

Figure 1.6 A time-interleaved ADC is a typical digital postprocessing problem

In order to reduce the impact of mismatches in time-interleaved ADCs, the mismatches need to be modeled, identified, and corrected. Errors due to gain and offset mismatches are rather simple to mitigate, e.g., [21,76–78]. By contrast, the modeling, identification, and correction of frequency-dependent mismatches among the channels are much more involved and have been an active research area [79–85,85–112]. The identification of mismatches can be done off-line by interrupting the normal operation and using an identification mode that often requires special calibration signals, e.g., [21,85,113,114], or online during normal operation by applying blind and semi-blind identification methods, e.g., [76,85–88,91,93,98,112,115–117]. The off-line calibration of mismatches is solved to a large extent; however, blind methods for online calibration are often favorable. The blind calibration of a time-interleaved ADC is illustrated on the right side of Figure 1.6. In most cases, the blocks for blind identification and correction are not separated, as the identification
relies on the corrected output and the correction structure is adjusted according to the identified mismatch parameters. Therefore, a blind calibration method combines correction structures and identification algorithms. So far, blind calibration methods for time-interleaved ADCs have either exploited a priori inter-block knowledge about periodically time-varying systems [101,118], a posteriori cross-layer information about communication systems, or a posteriori inter-block knowledge by utilizing additional hardware to derive reference signals. For example, in [111], an a posteriori cross-layer approach is applied to correct gain, timing, and bandwidth mismatches by an M-periodically time-varying feedforward equalizer. The reference signal for the calibration is derived from the decoded symbols and resampled from symbol rate to sampling rate. A least mean square algorithm uses the reference signal to find the coefficients of a time-varying correction filter. By contrast, in [119], an additional low-resolution ADC is applied to obtain a posteriori auxiliary inter-block information for finding the coefficients of a time-varying correction filter in the least mean squares sense. Accordingly, the methods in [119] and in [111] are similar but use either a cross-layer method with the disadvantage of increased application dependency or an inter-block method with the disadvantage of additional hardware. A time-interleaved ADC is a primary example of the necessity of postprocessing in mixed-signal processing systems. Without mismatch correction, the applicability of time-interleaved ADCs is limited. Inter-block and cross-layer information can help to improve the blind calibration of mismatches significantly.
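To make the simplest part of this concrete, the sketch below builds a two-channel time-interleaved ADC with gain and offset mismatch and removes it blindly by equalizing the per-channel mean and power. This toy correction only addresses the "rather simple" gain/offset case and all mismatch values are assumptions; frequency-dependent mismatches require the far more elaborate methods cited above.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1 << 16
x = np.sin(2 * np.pi * 0.0371 * np.arange(n)) + 0.05 * rng.standard_normal(n)

gains, offsets = np.array([1.00, 0.97]), np.array([0.00, 0.02])   # assumed channel mismatches
y = x.copy()
for ch in range(2):                        # channel ch converts every second sample
    y[ch::2] = gains[ch] * x[ch::2] + offsets[ch]

# Blind correction: force every channel to the empirical mean/power of channel 0.
y_cal = y.copy()
ref_mean, ref_std = y[0::2].mean(), y[0::2].std()
for ch in range(2):
    seg = y[ch::2]
    y_cal[ch::2] = (seg - seg.mean()) / seg.std() * ref_std + ref_mean

def worst_spur_dbc(sig):
    """Largest spur relative to the signal tone (Hann-windowed periodogram)."""
    S = np.abs(np.fft.rfft(sig * np.hanning(len(sig)))) ** 2
    k = np.argmax(S[1:]) + 1                     # signal bin (skip DC)
    spur = np.delete(S, range(max(k - 3, 0), k + 4))[1:].max()
    return 10 * np.log10(spur / S[k])

print(f"worst spur before: {worst_spur_dbc(y):.1f} dBc, after: {worst_spur_dbc(y_cal):.1f} dBc")
```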
1.3.3 Enhancing clock generation

Phase-domain all-digital phase-locked loops replace analog components by equivalent digital circuits for digitally implementing a phase-domain control loop [1,120–141]. Therefore, the digital circuit realization of a control loop mimics the analog control loop. The basic principle is illustrated in Figure 1.7.

Figure 1.7 All-digital phase-locked loop with a digital control loop

The digital control signal sets the output frequencies of the digitally controlled oscillator. Ideally, no further processing would be needed. However, due to imperfections of the digitally controlled oscillator, the frequency and the phase at its output differ from the desired ones. By measuring the phase of the oscillator’s output signal and comparing it to the desired phase, the digital controller aims to minimize the difference between the desired and
the actual output at the oscillator. All-digital phase-locked loops allow for flexible implementations, which are difficult or even impossible to realize with analog phase-locked loops. The digital loop filters in all-digital phase-locked loops can be easily adapted [131,138,141,142] or can even be replaced by more complex control methods, leading to reduced lock times and better phase noise suppression [143,144]. Hence, all-digital phase-locked loops match the basic idea of digitally enhanced mixed-signal processing systems. Analog functionality is shifted to the digital domain to gain flexibility in the enhancement of mixed-signal circuits, i.e., the digitally controlled oscillator, which suffers from process, voltage, and temperature variations. By deriving suitable models and performing careful analysis, new enhancement mechanisms can be devised. To model and analyze digital phase-locked loops, it would be intuitive to use discrete-time signal processing and the z-transform from the beginning [145,146]. Nevertheless, the first discrete-time models for phase-domain, all-digital phase-locked loops were based on linear approximations of the Laplace transform [127,130]. In [147], the first z-domain model was introduced that directly represents the behavior of all-digital phase-locked loops in the digital domain and therefore overcomes the limitations of [130]. With this model, the steady state and the transient behavior can be evaluated sufficiently accurately and, for example, the noise suppression capability and the locking time can be investigated. To further model and analyze characteristics of phase-domain all-digital phase-locked loops, a nonuniform z-transform model was addressed in [148], which also takes the retiming mechanism of the clock signal, i.e., a change of the sampling period, into account. This retiming of the clock signals is required for synchronizing the output frequency of the all-digital phase-locked loop and the frequency of the reference clock and results in the retimed reference clock. The analysis with the new model presented in [148] reveals that in steady state, the retimed reference clock undergoes a periodic pattern. This periodic pattern causes a periodic time-varying behavior and generates, in combination with the effect of injection pulling, a significant amount of additional spurs, i.e., modulation products, in the output spectrum [149]. To mitigate most of these spurs and to simplify the implementation complexity, a new synchronous reference architecture [149,150] was proposed, which does not require an explicit retiming of the reference clock. On the one hand, all-digital phase-locked loops take advantage of the digital implementation of analog circuits by allowing for more complex control methods and more possibilities in the compensation of process, voltage, and temperature variations. On the other hand, the design complexity of such systems substantially increases, since, for example, nonlinear effects due to quantization in digital circuits and the much stronger interaction between the analog and the digital domain have to be taken into account.
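As a minimal behavioral sketch of the phase-domain view, the loop below models a generic type-II ADPLL with a proportional-integral loop filter and DCO gain normalization. All parameter values are illustrative assumptions, and the TDC, DCO, and retiming non-idealities discussed above are ignored.

```python
# Minimal phase-domain ADPLL behavioral model (generic type-II loop, PI filter).
fref = 40e6                        # reference frequency [Hz] (assumption)
fcw = 60.25                        # frequency command word -> target fcw * fref
f0, kdco = 2.40e9, 20e3            # free-running DCO frequency and gain [Hz/LSB] (assumptions)
alpha, rho = 2 ** -7, 2 ** -14     # proportional and integral loop-filter gains

ref_phase = var_phase = tune_int = 0.0
f_dco = f0

for _ in range(20000):             # one loop update per reference cycle
    ref_phase += fcw               # ideal reference phase ramp (in DCO cycles)
    var_phase += f_dco / fref      # DCO phase accumulated over one reference period
    phe = ref_phase - var_phase    # phase error (ideal counter/TDC assumed)

    tune_int += rho * phe          # integral path
    ntw = alpha * phe + tune_int   # normalized tuning word from the PI loop filter
    otw = ntw * fref / kdco        # DCO gain normalization (perfect KDCO estimate assumed)
    f_dco = f0 + kdco * otw        # DCO responds to the oscillator tuning word

print(f"settled DCO frequency: {f_dco / 1e9:.6f} GHz "
      f"(target {fcw * fref / 1e9:.6f} GHz)")
```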
1.4 Conclusion

We have seen that digitally enhanced mixed-signal processing systems are required for future systems to gain flexibility and to overcome performance limitations. By moving
functionality to the digital domain and exploiting digital algorithms and structures for improving, complementing, and mimicking analog and mixed-signal circuits, the overall system performance can be optimized. Digitally enhanced mixed-signal processing systems pose many research questions. The design complexity increases, as the mutual dependency of analog and digital circuits on different system abstraction layers significantly increases. The algorithms and structures for digital enhancement need to be optimized for real-time processing. They need to combine high flexibility with low-energy consumption, which are typically contradictory requirements. To model, analyze, and finally enhance mixed-signal processing systems, we need a background from rather diverse fields to optimize the overall system.
References
[1] Staszewski RB, Wallberg JL, Rezeq S, et al. All-digital PLL and transmitter for mobile phones. IEEE Journal of Solid-State Circuits. 2005;40(12):2469–2482.
[2] Grebennikov A, Sokal NO, and Franco MJ. Switchmode RF and Microwave Power Amplifiers. 2nd ed. Academic Press; 2012.
[3] Moore GE. No exponential is forever: But "Forever" can be delayed! In: IEEE International Solid-State Circuits Conference (ISSCC); 2003. p. 20–23.
[4] Murmann B. Digitally assisted analog circuits. IEEE Micro. 2006;26(2):38–47.
[5] Dielacher F, Vogel C, Singerl P, et al. A holistic design approach for systems on chip. In: IEEE International SOC Conference (SOCC); 2009. p. 301–306.
[6] Murmann B, Vogel C, and Koeppl H. Digitally enhanced analog circuits: System aspects. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2008. p. 560–563.
[7] Vogel C and Mendel S. A flexible and scalable structure to compensate frequency response mismatches in time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(11):2463–2475.
[8] Vogel C and Krall C. Digital compensation of in-band image signals caused by M-periodic nonuniform zero-order hold signals. Ubiquitous Computing and Communication Journal (UBICC). 2009. Special Issue CSNDSP 2008. p. 1–10.
[9] Soudan M and Vogel C. Correction structures for linear weakly time-varying systems. IEEE Transactions on Circuits and Systems I: Regular Papers. 2012;59(9):2075–2084.
[10] Vogel C, Mendel S, Singerl P, et al. Digital signal processing for data converters in mixed-signal systems. e & i Elektrotechnik und Informationstechnik. 2009;126(11):390–395.
[11] IEEE Std 1241-2000. IEEE Standard for Terminology and Test Methods for Analog-to-Digital Converters; 2001.
[12] van Roermund A, Hegt H, Harpe P, et al. Smart AD and DA converters. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2005. p. 4062–4065.
[13] Eron M, Kim B, Raab F, et al. The head of the class. IEEE Microwave Magazine. 2011;12(7):S16–S33.
[14] Kenington PB. High-Linearity RF Amplifier Design. Artech House; 2000.
[15] Cripps SC. RF Power Amplifiers for Wireless Communications. 2nd ed. Artech House; 2006.
[16] Bösch W and Gatti G. Measurement and simulation of memory effects in predistortion linearizers. IEEE Transactions on Microwave Theory and Techniques. 1989;37(12):1885–1890.
[17] Singerl P and Kubin G. Chebyshev approximation of baseband Volterra series for wideband RF power amplifiers. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2005. p. 2655–2658.
[18] Candes EJ and Wakin MB. An introduction to compressive sampling. IEEE Signal Processing Magazine. 2008;25(2):21–30.
[19] Unser M. Sampling – 50 years after Shannon. Proceedings of the IEEE. 2000;88(4):569–587.
[20] Blu T, Dragotti PL, Vetterli M, et al. Sparse sampling of signal innovations. IEEE Signal Processing Magazine. 2008;25(2):31–40.
[21] Vogel C. Modeling, Identification, and Compensation of Channel Mismatch Errors in Time-Interleaved Analog-to-Digital Converters [Dissertation]. Graz University of Technology; 2005.
[22] Mendel S. Signal Processing in Phase-Domain All-Digital Phase-Locked Loops [Dissertation]. Graz University of Technology; 2009.
[23] Saleem S. Adaptive Calibration of Frequency Response Mismatches in Time-Interleaved Analog-to-Digital Converters [Dissertation]. Graz University of Technology; 2010.
[24] Henzler S. Digitalization of mixed-signal functionality in nanometer technologies. In: IEEE/ACM International Conference on Computer-Aided Design (ICCAD); 2010. p. 252–255.
[25] Freiberger K. Measurement Methods for Estimating the Error Vector Magnitude of OFDM Transceivers [Dissertation]. Graz University of Technology; 2017.
[26] Enzinger H. Behavioral Modeling and Digital Predistortion of Radio Frequency Power Amplifiers [Dissertation]. Graz University of Technology; 2018.
[27] Freiberger K, Enzinger H, and Vogel C. SLIC EVM – Error vector magnitude without demodulation. In: 2017 89th ARFTG Microwave Measurement Conference (ARFTG); 2017. p. 1–4.
[28] Pedro JC and de Carvalho NB. Characterizing nonlinear RF circuits for their in-band signal distortion. IEEE Transactions on Instrumentation and Measurement. 2002;51(3):420–426.
[29] Freiberger K, Enzinger H, and Vogel C. A noise power ratio measurement method for accurate estimation of the error vector magnitude. IEEE Transactions on Microwave Theory and Techniques. 2017;65(5):1632–1645.
[30] Freiberger K, Enzinger H, and Vogel C. The error power ratio estimates EVM for a wide class of impairments: Monte Carlo simulations. In: 2017 Integrated Nonlinear Microwave and Millimetre-wave Circuits Workshop (INMMiC); 2017. p. 1–3.
[31] Raab FH, Asbeck P, Cripps S, et al. Power amplifiers and transmitters for RF and microwave. IEEE Transactions on Microwave Theory and Techniques. 2002;50(3):814–826.
[32] McCune E. A technical foundation for RF CMOS power amplifiers: Parts 1 to 6. IEEE Solid-State Circuits Magazine. 2015–2016.
[33] Enzinger H, Freiberger K, and Vogel C. A joint linearity-efficiency model of radio frequency power amplifiers. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2016. p. 281–284.
[34] Camarchia V, Pirola M, Quaglia R, et al. The Doherty power amplifier: Review of recent solutions and trends. IEEE Transactions on Microwave Theory and Techniques. 2015;63(2):559–571.
[35] Enzinger H, Freiberger K, and Vogel C. Competitive linearity for envelope tracking: Dual-band crest factor reduction and 2D-vector-switched digital predistortion. IEEE Microwave Magazine. 2018;19(1):69–77.
[36] Han SH and Lee JH. An overview of peak-to-average power ratio reduction techniques for multicarrier transmission. IEEE Wireless Communications. 2005;12(2):56–65.
[37] Armstrong J. Peak-to-average power reduction for OFDM by repeated clipping and frequency domain filtering. Electronics Letters. 2002;38(5):246–247.
[38] Kim WJ, Cho KJ, Stapleton SP, et al. An efficient crest factor reduction technique for wideband applications. Analog Integrated Circuits and Signal Processing. 2007;51(1):19–26.
[39] Cavers JK. Amplifier linearization using a digital predistorter with fast adaptation and low memory requirements. IEEE Transactions on Vehicular Technology. 1990;39(4):374–382.
[40] Raich R and Zhou GT. On the modeling of memory nonlinear effects of power amplifiers for communication applications. In: IEEE 10th Digital Signal Processing Workshop and 2nd Signal Processing Education Workshop; 2002. p. 7–10.
[41] Morgan DR, Ma Z, Kim J, et al. A generalized memory polynomial model for digital predistortion of RF power amplifiers. IEEE Transactions on Signal Processing. 2006;54(10):3852–3860.
[42] Benedetto S, Biglieri E, and Daffara R. Modeling and performance evaluation of nonlinear satellite links – A Volterra series approach. IEEE Transactions on Aerospace and Electronic Systems. 1979;AES-15(4):494–507.
[43] Zhou GT, Qian H, Ding L, et al. On the baseband representation of a bandpass nonlinearity. IEEE Transactions on Signal Processing. 2005;53(8):2953–2957.
[44] Ding L and Zhou GT. Effects of even-order nonlinear terms on power amplifier modeling and predistortion linearization. IEEE Transactions on Vehicular Technology. 2004;53(1):156–162.
[45] Enzinger H, Freiberger K, and Vogel C. Analysis of even-order terms in memoryless and quasi-memoryless polynomial baseband models. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2015. p. 1714–1717.
[46] Enzinger H, Freiberger K, Kubin G, et al. Baseband Volterra filters with even-order terms: Theoretical foundation and practical implications. In: 50th Asilomar Conference on Signals, Systems, and Computers; 2016. p. 220–224.
[47] Psaltis D, Sideris A, and Yamamura AA. A multilayered neural network controller. IEEE Control Systems Magazine. 1988;8(2):17–21.
[48] Enzinger H, Freiberger K, Kubin G, et al. A survey of delay and gain correction methods for the indirect learning of digital predistorters. In: IEEE International Conference on Electronics, Circuits and Systems (ICECS); 2016. p. 285–288.
[49] Guan L and Zhu A. Dual-loop model extraction for digital predistortion of wideband RF power amplifiers. IEEE Microwave and Wireless Components Letters. 2011;21(9):501–503.
[50] Harmon J and Wilson SG. Complex nonlinear adaptive predistortion. In: 46th Annual Conference on Information Sciences and Systems (CISS); 2012. p. 1–6.
[51] Freiberger K, Wolkerstorfer M, Enzinger H, et al. Digital predistorter identification based on constrained multi-objective optimization of WLAN standard performance metrics. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2015. p. 862–865.
[52] Hausmair K, Chi S, Singerl P, et al. Aliasing-free digital pulse-width modulation for burst-mode RF transmitters. IEEE Transactions on Circuits and Systems I: Regular Papers. 2013;60(2):415–427.
[53] Enzinger H and Vogel C. Analytical description of multilevel carrier-based PWM of arbitrary bounded input signals. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2014. p. 1030–1033.
[54] Maloberti F. High-speed data converters for communication systems. IEEE Circuits and Systems Magazine. 2001;1(1):26–36.
[55] Vogel C and Johansson H. Time-interleaved analog-to-digital converters: status and future directions. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2006. p. 3386–3389.
[56] Poulton K, Corcoran JJ, and Hornak T. A 1-GHz 6-bit ADC system. IEEE Journal of Solid-State Circuits. 1987;22(6):962–970.
[57] Poulton K, Neff R, Setterberg B, et al. A 20 GS/s 8 b ADC with a 1 MB memory in 0.18 μm CMOS. In: IEEE International Solid-State Circuits Conference (ISSCC); 2003. p. 318–496.
[58] Papoulis A. Generalized sampling expansion. IEEE Transactions on Circuits and Systems. 1977;24(11):652–654.
[59] Brown J. Multi-channel sampling of low-pass signals. IEEE Transactions on Circuits and Systems. 1981;28(2):101–106.
[60] Petraglia A and Mitra SK. High-speed A/D conversion incorporating a QMF bank. IEEE Transactions on Instrumentation and Measurement. 1992;41(3):427–431.
[61] Velazquez SR, Nguyen TQ, and Broadstone SR. Design of hybrid filter banks for analog/digital conversion. IEEE Transactions on Signal Processing. 1998;46(4):956–967.
[62] Löwenborg P. Asymmetric Filter Banks for Mitigation of Mismatch Errors in High-Speed Analog-to-Digital Converters [Dissertation]. Linköping University; 2002.
[63] Löwenborg P, Johansson H, and Wanhammar L. Two-channel digital and hybrid analog/digital multirate filter banks with very low-complexity analysis or synthesis filters. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing. 2003;50(7):355–367.
[64] Draxelmayr D. A 6b 600 MHz 10 mW ADC array in digital 90 nm CMOS. In: IEEE International Solid-State Circuits Conference (ISSCC); 2004. p. 264–527.
[65] Ginsburg BP and Chandrakasan AP. Dual time-interleaved successive approximation register ADCs for an ultra-wideband receiver. IEEE Journal of Solid-State Circuits. 2007;42(2):247–257.
[66] Doris K, Janssen E, Nani C, et al. A 480 mW 2.6 GS/s 10b time-interleaved ADC with 48.5 dB SNDR up to Nyquist in 65 nm CMOS. IEEE Journal of Solid-State Circuits. 2011;46(12):2821–2833.
[67] Black WC and Hodges DA. Time interleaved converter arrays. IEEE Journal of Solid-State Circuits. 1980;15(6):1022–1029.
[68] Kozak M and Kale I. Oversampled Delta-Sigma Modulators: Analysis, Applications and Novel Topologies. Kluwer; 2003.
[69] Kurosawa N, Kobayashi H, Maruyama K, et al. Explicit analysis of channel mismatch effects in time-interleaved ADC systems. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications. 2001;48(3):261–271.
[70] Leger G, Peralias EJ, Rueda A, et al. Impact of random channel mismatch on the SNR and SFDR of time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2004;51(1):140–150.
[71] Vogel C. The impact of combined channel mismatch effects in time-interleaved ADCs. IEEE Transactions on Instrumentation and Measurement. 2005;54(1):415–427.
[72] Vogel C. Comprehensive error analysis of combined channel mismatch effects in time-interleaved ADCs. In: 20th IEEE Instrumentation Technology Conference; 2003. p. 733–738.
[73] Vogel C and Kubin G. Modeling of time-interleaved ADCs with nonlinear hybrid filter banks. AEU – International Journal of Electronics and Communications. 2005;59(5):288–296.
[74] Vogel C and Kubin G. Analysis and compensation of nonlinearity mismatches in time-interleaved ADC arrays. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2004. p. 593–596.
[75] El-Chammas M and Murmann B. General analysis on the impact of phase-skew in time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(5):902–910.
[76] Fu D, Dyer KC, Lewis SH, et al. A digital background calibration technique for time-interleaved analog-to-digital converters. IEEE Journal of Solid-State Circuits. 1998;33(12):1904–1911.
[77] Jamal SM, Fu D, Chang NCJ, et al. A 10-b 120-Msample/s time-interleaved analog-to-digital converter with digital background calibration. IEEE Journal of Solid-State Circuits. 2002;37(12):1618–1627.
[78] Ferragina V, Fornasari A, Gatti U, et al. Gain and offset mismatch calibration in time-interleaved multipath A/D sigma-delta modulators. IEEE Transactions on Circuits and Systems I: Regular Papers. 2004;51(12):2365–2373.
[79] Johansson H and Löwenborg P. Reconstruction of nonuniformly sampled bandlimited signals by means of digital fractional delay filters. IEEE Transactions on Signal Processing. 2002;50(11):2757–2767.
[80] Divi V and Wornell G. Signal recovery in time-interleaved analog-to-digital converters. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP); 2004. p. 593–596.
[81] Jamal SM, Fu D, Singh MP, et al. Calibration of sample-time error in a two-channel time-interleaved analog-to-digital converter. IEEE Transactions on Circuits and Systems I: Regular Papers. 2004;51(1):130–139.
[82] Vogel C and Kubin G. Time-interleaved ADCs in the context of hybrid filter banks. In: URSI International Symposium on Signals, Systems, and Electronics; 2004. p. 214–217.
[83] Vogel C, Draxelmayr D, and Kubin G. Spectral shaping of timing mismatches in time-interleaved analog-to-digital converters. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2005. p. 1394–1397.
[84] Vogel C, Pammer V, and Kubin G. A novel channel randomization method for time-interleaved ADCs. In: IEEE Instrumentation and Measurement Technology Conference; 2005. p. 150–155.
[85] Seo M, Rodwell MJW, and Madhow U. Comprehensive digital correction of mismatch errors for a 400-msamples/s 80-dB SFDR time-interleaved analog-to-digital converter. IEEE Transactions on Microwave Theory and Techniques. 2005;53(3):1072–1082.
[86] Seo M, Rodwell MJW, and Madhow U. Blind correction of gain and timing mismatches for a two-channel time-interleaved analog-to-digital converter: experimental verification. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2006. p. 3394–3397.
[87] Divi V and Wornell G. Scalable blind calibration of timing skew in high-resolution time-interleaved ADCs. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2006. p. 3390–3393.
[88] Huang S and Levy BC. Adaptive blind calibration of timing offset and gain mismatch for two-channel time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2006;53(6):1278–1288.
[89] Wang CY and Wu JT. A background timing-skew calibration technique for time-interleaved analog-to-digital converters. IEEE Transactions on Circuits and Systems II: Express Briefs. 2006;53(4):299–303.
[90] Oh Y and Murmann B. System embedded ADC calibration for OFDM receivers. IEEE Transactions on Circuits and Systems I: Regular Papers. 2006;53(8):1693–1703.
[91] Huang S and Levy BC. Blind calibration of timing offsets for four-channel time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2007;54(4):863–876.
[92] Divi V and Wornell G. Bandlimited signal reconstruction from noisy periodic nonuniform samples in time-interleaved ADCs. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2008. p. 3721–3724.
[93] Divi V and Wornell GW. Blind calibration of timing skew in time-interleaved analog-to-digital converters. IEEE Journal of Selected Topics in Signal Processing. 2009;3(3):509–522.
[94] Johansson H and Löwenborg P. Reconstruction of nonuniformly sampled bandlimited signals by means of time-varying discrete-time FIR filters. EURASIP Journal on Applied Signal Processing. 2006;2006:1–18.
[95] Prendergast RS, Levy BC, and Hurst PJ. Reconstruction of bandlimited periodic nonuniformly sampled signals through multirate filter banks. IEEE Transactions on Circuits and Systems I: Regular Papers. 2004;51(8):1612–1622.
[96] Vogel C. A frequency domain method for blind identification of timing mismatches in time-interleaved ADCs. In: NORCHIP; 2006. p. 45–48.
[97] Johansson H, Löwenborg P, and Vengattaramane K. Least-squares and minimax design of polynomial impulse response FIR filters for reconstruction of two-periodic nonuniformly sampled signals. IEEE Transactions on Circuits and Systems I: Regular Papers. 2007;54(4):877–888.
[98] Elbornsson J, Gustafsson F, and Eklund JE. Blind equalization of time errors in a time-interleaved ADC system. IEEE Transactions on Signal Processing. 2005;53(4):1413–1424.
[99] Haftbaradaran A and Martin KW. A background sample-time error calibration technique using random data for wide-band high-resolution time-interleaved ADCs. IEEE Transactions on Circuits and Systems II: Express Briefs. 2008;55(3):234–238.
[100] Saleem S and Vogel C. LMS-based identification and compensation of timing mismatches in a two-channel time-interleaved analog-to-digital converter. In: NORCHIP 2007; 2007. p. 1–4.
[101] Vogel C, Saleem S, and Mendel S. Adaptive blind compensation of gain and timing mismatches in M-channel time-interleaved ADCs. In: 15th IEEE International Conference on Electronics, Circuits and Systems (ICECS); 2008. p. 49–52.
[102] Iroaga E, Murmann B, and Nathawad L. A background correction technique for timing errors in time-interleaved analog-to-digital converters. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2005. p. 5557–5560.
[103] Marelli D, Mahata K, and Fu M. Linear LMS compensation for timing mismatch in time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(11):2476–2486.
[104] Vogel C, Hotz M, Saleem S, et al. A review on low-complexity structures and algorithms for the correction of mismatch errors in time-interleaved ADCs. In: 10th IEEE International NEWCAS Conference; 2012. p. 349–352.
[105] Tsai TH, Hurst PJ, and Lewis SH. Bandwidth mismatch and its correction in time-interleaved analog-to-digital converters. IEEE Transactions on Circuits and Systems II: Express Briefs. 2006;53(10):1133–1137.
[106] Satarzadeh P, Levy BC, and Hurst PJ. Bandwidth mismatch correction for a two-channel time-interleaved A/D converter. In: IEEE International Symposium on Circuits and Systems; 2007. p. 1705–1708.
[107] Mendel S and Vogel C. A compensation method for magnitude response mismatches in two-channel time-interleaved analog-to-digital converters. In: IEEE International Conference on Electronics, Circuits and Systems (ICECS); 2006. p. 712–715.
[108] Mendel S and Vogel C. On the compensation of magnitude response mismatches in M-channel time-interleaved ADCs. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2007. p. 3375–3378.
[109] Johansson H and Löwenborg P. A least-squares filter design technique for the compensation of frequency response mismatch errors in time-interleaved A/D converters. IEEE Transactions on Circuits and Systems II: Express Briefs. 2008;55(11):1154–1158.
[110] Lim YC, Zou YX, Lee JW, et al. Time-interleaved analog-to-digital-converter compensation using multichannel filters. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(10):2234–2247.
[111] Tsai TH, Hurst PJ, and Lewis SH. Correction of mismatches in a time-interleaved analog-to-digital converter in an adaptively equalized digital communication receiver. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(2):307–319.
[112] Satarzadeh P, Levy BC, and Hurst PJ. Adaptive semiblind calibration of bandwidth mismatch for two-channel time-interleaved ADCs. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(9):2075–2088.
[113] Jenq YC. Digital spectra of nonuniformly sampled signals: A robust sampling time offset estimation algorithm for ultra high-speed waveform digitizers using interleaving. IEEE Transactions on Instrumentation and Measurement. 1990;39(1):71–75.
[114] Pereira JMD, Girao PMBS, and Serra AMC. An FFT-based method to evaluate and compensate gain and offset errors of interleaved ADC systems. IEEE Transactions on Instrumentation and Measurement. 2004;53(2):423–430.
[115] Elbornsson J, Gustafsson F, and Eklund JE. Blind adaptive equalization of mismatch errors in a time-interleaved A/D converter system. IEEE Transactions on Circuits and Systems I: Regular Papers. 2004;51(1):151–158.
[116] Seo M, Rodwell M, and Madhow U. Generalized blind mismatch correction for two-channel time-interleaved A-to-D converters. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2007. p. 1505–1508.
[117] Seo M and Rodwell M. Generalized blind mismatch correction for a two-channel time-interleaved ADC: Analytic approach. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2007. p. 109–112.
[118] Saleem S and Vogel C. On blind identification of gain and timing mismatches in time-interleaved analog-to-digital converters. In: 33rd International Conference on Telecommunications and Signal Processing; 2010. p. 151–155.
[119] Saleem S and Vogel C. Adaptive compensation of frequency response mismatches in high-resolution time-interleaved ADCs using a low-resolution ADC and a time-varying filter. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2010. p. 561–564.
[120] Alen DJV and Somani AK. An all digital phase locked loop fault tolerant clock. In: IEEE International Symposium on Circuits and Systems (ISCAS); 1991. p. 3170–3173.
[121] Kajiwara A and Nakagawa M. A new PLL frequency synthesizer with high switching speed. IEEE Transactions on Vehicular Technology. 1992;41(4):407–413.
[122] Best RE. Phase-Locked Loops: Design, Simulation, and Applications. 5th ed. McGraw-Hill; 2003.
[123] Waheed K and Staszewski RB. Time-domain behavioral modeling of a multigigahertz digital RF oscillator using VHDL. In: 48th Midwest Symposium on Circuits and Systems; 2005. p. 1669–1672.
[124] Waheed K and Staszewski RB. Characterization of deep-submicron varactor mismatches in a digitally controlled oscillator. In: IEEE Custom Integrated Circuits Conference; 2005. p. 605–608.
[125] Bashir I, Staszewski RB, and Eliezer O. Tuning word retiming of a digitally-controlled oscillator using RF built-in self test. In: IEEE Dallas/CAS Workshop on Design, Applications, Integration and Software; 2006. p. 103–106.
[126] Waheed K, Staszewski RB, and Wallberg J. Injection spurs due to reference frequency retiming by a channel dependent clock at the ADPLL RF output and its mitigation. In: IEEE International Symposium on Circuits and Systems (ISCAS); 2007. p. 3291–3294.
[127] Staszewski RB, Leipold D, Muhammad K, et al. Digitally controlled oscillator (DCO)-based architecture for RF frequency synthesis in a deep-submicrometer CMOS process. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing. 2003;50(11):815–828.
[128] Staszewski RB, Hung CM, Maggio K, et al. All-digital phase-domain TX frequency synthesizer for Bluetooth radios in 0.13 μm CMOS. In: IEEE International Solid-State Circuits Conference (ISSCC); 2004. p. 272–527.
[129] Staszewski RB, Muhammad K, Leipold D, et al. All-digital TX frequency synthesizer and discrete-time receiver for Bluetooth radio in 130-nm CMOS. IEEE Journal of Solid-State Circuits. 2004;39(12):2278–2291.
[130] Staszewski RB and Balsara PT. Phase-domain all-digital phase-locked loop. IEEE Transactions on Circuits and Systems II: Express Briefs. 2005;52(3):159–163.
[131] Staszewski RB, Shriki G, and Balsara PT. All-digital PLL with ultra fast acquisition. In: IEEE Asian Solid-State Circuits Conference; 2005. p. 289–292.
[132] Staszewski RB, Fernando C, and Balsara PT. Event-driven simulation and modeling of phase noise of an RF oscillator. IEEE Transactions on Circuits and Systems I: Regular Papers. 2005;52(4):723–733.
[133] Staszewski RB, Hung CM, Barton N, et al. A digitally controlled oscillator in a 90 nm digital CMOS process for mobile phones. IEEE Journal of Solid-State Circuits. 2005;40(11):2203–2211.
[134] Staszewski RB, Leipold D, and Balsara PT. Direct frequency modulation of an ADPLL for Bluetooth/GSM with injection pulling elimination. IEEE Transactions on Circuits and Systems II: Express Briefs. 2005;52(6):339–343.
[135] Staszewski RB, Staszewski R, Wallberg JL, et al. SoC with an integrated DSP and a 2.4-GHz RF transmitter. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2005;13(11):1253–1265.
[136] Staszewski RB, Wallberg J, Rezeq S, et al. All-digital PLL and GSM/EDGE transmitter in 90 nm CMOS. In: IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers; 2005. p. 316–600, Vol. 1.
[137] Staszewski RB and Balsara PT. All-Digital Frequency Synthesizer in Deep-Submicron CMOS. Wiley; 2006.
[138] Staszewski RB, Wallberg J, and Balsara PT. All-digital PLL with variable loop type characteristics. In: IEEE Dallas/CAS Workshop on Design, Applications, Integration and Software; 2006. p. 115–118.
[139] Staszewski RB, Vemulapalli S, Vallur P, et al. 1.3 V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS. IEEE Transactions on Circuits and Systems II: Express Briefs. 2006;53(3):220–224.
[140] Staszewski RB, Wallberg J, Hung CM, et al. LMS-based calibration of an RF digitally controlled oscillator for mobile phones. IEEE Transactions on Circuits and Systems II: Express Briefs. 2006;53(3):225–229.
[141] Staszewski RB and Balsara PT. All-digital PLL with ultra fast settling. IEEE Transactions on Circuits and Systems II: Express Briefs. 2007;54(2):181–185.
[142] Xiu L, Li W, Meiners J, et al. A novel all-digital PLL with software adaptive filter. IEEE Journal of Solid-State Circuits. 2004;39(3):476–483.
[143] Mendel S and Vogel C. Improved lock-time in all-digital phase-locked loops due to binary search acquisition. In: 15th IEEE International Conference on Electronics, Circuits and Systems (ICECS); 2008. p. 384–387.
[144] Namgoong W. Observer-controller digital PLL. IEEE Transactions on Circuits and Systems I: Regular Papers. 2010;57(3):631–641.
[145] Shayan YR and Le-Ngoc T. All digital phase-locked loop: Concepts, design and applications. IEE Proceedings F – Radar and Signal Processing. 1989;136(1):53–56.
[146] Almeida TM and Piedade MS. High performance analog and digital PLL design. In: IEEE International Symposium on Circuits and Systems (ISCAS); 1999. p. 394–397, Vol. 4.
[147] Mendel S and Vogel C. A z-domain model and analysis of phase-domain all-digital phase-locked loops. In: NORCHIP 2007; 2007. p. 1–6.
[148] Mendel S, Vogel C, and Dalt ND. Signal and timing analysis of a phase-domain all-digital phase-locked loop with reference retiming mechanism. In: MIXDES – 16th International Conference Mixed Design of Integrated Circuits & Systems; 2009. p. 681–687.
[149] Mendel S, Vogel C, and Dalt ND. A phase-domain all-digital phase-locked loop architecture without reference clock retiming. IEEE Transactions on Circuits and Systems II: Express Briefs. 2009;56(11):860–864.
[150] Mendel S and Vogel C. Frequency to phase converter with uniform sampling for all digital phase locked loops. Patent application US 2010074387 A1, March 25, 2010.
Chapter 2

Nonlinear modeling

Raphael Vansebrouck¹, Dang-Kièn Germain Pham², Chadi Jabbour² and Patricia Desgreys²

¹ CEA LETI, DRT/DACLE/SCCI/LAIR, France
² ComElec Department, LTCI, Telecom Paristech, France

2.1 Introduction
In a transceiver, as shown in Figure 2.1, the blocks can at first be considered with linear models as long as only the wanted signal is collected at the receiver input. However, this assumption quickly falls apart when unwanted signals (interferers) are considered at the receiver input. Indeed, when interferers are collected by the receiver along with the desired signal, harmonic distortion and intermodulation terms can jam the desired signal. These terms are generated by the nonlinear behavior of several blocks in the transceiver. It is then mandatory to use nonlinear models for these blocks in order to predict the behavior of the transceiver in every case and to suppress these nonlinear distortions. Regarding the transmitter, the power amplifier is the largest contributor to the nonlinear distortions. On the contrary, in the receiver, several blocks can be nonlinear and the main contributor can change depending on the receiver design and its context of use. To begin this chapter, in Section 2.2, we introduce some popular nonlinear models used in the literature, which will be split into two parts: parametric and nonparametric models. After this short introduction on nonlinear models, in Section 2.3, we pursue our study with the nonlinear models best suited to the main blocks of a transceiver: power amplifiers, low noise amplifiers (LNAs) and baseband blocks. Finally, in Section 2.4, an introduction to the digital compensation of nonlinear distortions is given, showing the main correction architectures.

2.2 Nonlinear models
In this chapter, we distinguish two kinds of nonlinear models: parametric models and nonparametric ones. Parametric models are characterized by a fixed number of coefficients which is determined in advance. Nonparametric models, on the contrary,
do not have a fixed number of coefficients defined at the beginning. In nonparametric model estimation, a coefficient is added to the model until the system estimation is good enough.

Figure 2.1 Transceiver block diagram
2.2.1 Parametric models
In system estimation, parametric models are often preferred to nonparametric ones because they are related to physical models. We distinguish two types of models: memory and memoryless models. In memoryless models, the output signal depends on a nonlinear combination of the input value at the current instant only. In memory models, the output signal depends in a nonlinear way not only on the current input value but also on past values of the input.
2.2.1.1 Memoryless models
Nonlinear memoryless models have been used intensively due to their simplicity and compactness. In radio frequency (RF) systems, especially for receiver front-ends, the considered nonlinear system is usually a cascade of nonlinear blocks. As the polynomial orders sum up for cascaded systems, the theoretical polynomial order of a cascaded system can quickly become significant. However, in practice, one or two blocks dominate the nonlinear behavior of the system, and high-order distortions can then be neglected. This means that the model polynomial order in front-end receivers can often be limited to five. At the transmitter side, higher order distortions (greater than five) may have to be considered for the power amplifier.
Polynomials
The polynomial model is surely the best known model. For a polynomial function of degree N, the nonlinear output is given by

y(t) = \sum_{n=0}^{N} \alpha_n x^n(t) = \alpha_0 + \alpha_1 x(t) + \alpha_2 x^2(t) + \cdots + \alpha_N x^N(t)    (2.1)

where \alpha_0 is the offset, \alpha_1 is the static gain, \alpha_2 is the second-order coefficient, \alpha_3 is the third-order coefficient and \alpha_N is the Nth-order coefficient. Therefore, the model is
fully described by its coefficients α0 to αN . The offset coefficient α0 is often neglected in studies for the sake of simplicity. Although the model is quite simple compared to a linear system, a much higher computation cost is needed due to the number of bits and multiplications required. Furthermore, let us assume that the model inverse is also a polynomial model; then, in most cases, a much higher order than the estimated system is needed to perform the inverse.
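To make the memoryless polynomial model of (2.1) concrete, a minimal Python/NumPy sketch is given below; the coefficient values are purely illustrative and are not taken from any measured device.

```python
import numpy as np

def polynomial_model(x, alpha):
    """Memoryless polynomial model of (2.1): y = sum_n alpha[n] * x**n."""
    y = np.zeros_like(x, dtype=float)
    for n, a in enumerate(alpha):
        y += a * x**n
    return y

# Example: no offset, unity gain and a third-order compression term (illustrative values only)
alpha = [0.0, 1.0, 0.0, -0.1]          # alpha_0 ... alpha_3
x = np.linspace(-1.0, 1.0, 5)
print(polynomial_model(x, alpha))      # y = x - 0.1*x**3
```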
Orthogonal polynomials
Orthogonal polynomial models [1] offer a good alternative to polynomial models and can bring significant advantages in some cases. Indeed, the orthogonality property avoids the interference between the different nonlinear orders of the same parity. For example, with a one-tone signal of frequency f_1, the third-order term of a classical polynomial model generates a component at 3f_1 and a component at f_1 which interferes with the first-order term. However, with the Chebyshev polynomial model, the third-order term generates only a component at 3f_1 and no term at f_1. Orthogonal models are also less prone to local minima in the estimation process. Furthermore, techniques like the pth-order inverse perform better than with a polynomial model. However, these models only work well with one type of signal, which limits their usage to specific modulations. In the following, we will see two orthogonal polynomial models: the Hermite polynomials, limited to Gaussian inputs, and the Chebyshev polynomials, limited to sinusoidal inputs.

Hermite polynomials: The nonlinear system output expressed with Hermite polynomials [2] is given by

y_n = x_n + \sum_{j=1}^{N} k_j H_j(x_n)    (2.2)

where H_j is the Hermite polynomial of degree j. The first H_j are given by

H_0 = 1, \quad H_1 = x, \quad H_2 = x^2 - 1, \quad H_3 = x^3 - 3x

The orthogonality property holds only when the input x is Gaussian, which limits its use to orthogonal frequency-division multiplexing (OFDM) and phase-shift keying (PSK) modulations. For example, Hermite polynomials are used for the identification of nonlinear systems in [3] as well as for the inverse determination of a nonlinear system in [4].

Chebyshev polynomials: A nonlinear system with Chebyshev polynomials can be written with the following relation:

y_n = x_n + \sum_{j=1}^{N} k_j T_j(x_n)    (2.3)
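As an illustration of the orthogonal-polynomial models (2.2) and (2.3), the following sketch evaluates them with NumPy; note that NumPy's `hermite_e` series are the probabilists' Hermite polynomials listed above, and the coefficients k_j are arbitrary placeholders.

```python
import numpy as np
from numpy.polynomial import chebyshev as C, hermite_e as He

x = np.linspace(-1.0, 1.0, 5)
k = [0.0, 0.0, 0.05, -0.02]        # k_0 unused, k_1..k_3 illustrative

# Chebyshev model of (2.3): y = x + sum_j k_j * T_j(x)
y_cheb = x + C.chebval(x, k)

# Hermite model of (2.2): y = x + sum_j k_j * H_j(x)
# (numpy's "hermite_e" series match the probabilists' polynomials H_j listed above)
y_herm = x + He.hermeval(x, k)
```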
2.2.1.2 Memory models
In narrowband systems, the nonlinear model can be considered memoryless since the gain and phase shift due to the memory effect are almost constant in the band of interest. As a matter of fact, the gain can be included in the polynomial coefficients and the phase shift can be expressed by a simple delay, or by complex coefficients when the models are expressed in baseband. However, in wideband systems, the memory effect must be taken into account to keep good model accuracy. A significant number of nonlinear models with memory exist in the literature. Two main families can be distinguished: nonlinear models with linear memory and nonlinear models with nonlinear memory. The first family is made of block models, using nonlinear memoryless blocks and linear filters. The second family can model most nonlinear systems; however, its complexity can be significantly higher.
Block models
In nonlinear models with linear memory, linear filters are used to model the gain and phase shift over the entire system bandwidth. Several structures exist with varying levels of sophistication. The simplest model which can be made with a nonlinear memoryless system (also called a nonlinear static system) and filters is a two-block model. Two configurations exist for two-block models: the Wiener model, which consists of a linear filter followed by the nonlinear memoryless system (Figure 2.2), and the Hammerstein model, in which the nonlinear model is placed before the filter (Figure 2.3).

Figure 2.2 Wiener model

Figure 2.3 Hammerstein model
Wiener: Let us denote the impulse response of the linear filter by f(n) and the nonlinear function by g(.). Furthermore, g(.) is defined here as a polynomial function and f(n) as a finite impulse response (FIR) filter. Therefore, the nonlinear system output for a Wiener model is

y(n) = g\big( f(n) \ast x(n) \big) = g\left( \sum_{l} f(n-l)\, x(l) \right)

Hammerstein: For the Hammerstein block model, the output signal can be expressed as follows:

y(n) = f(n) \ast g(x(n)) = \sum_{l} f(n-l)\, g(x(l))

Let us assume that the nonlinear system g(.) has an inverse denoted g^{-1} and that the filter f(n) has an inverse denoted f^{-1}. Therefore, a Wiener model can be used to linearize a Hammerstein model and vice versa.
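A minimal NumPy sketch of these two-block models is given below; the FIR taps and polynomial coefficients are illustrative only, and the static nonlinearity is assumed to be a third-order polynomial.

```python
import numpy as np

def static_nl(v, alpha=(1.0, 0.0, -0.2)):
    """Static polynomial nonlinearity g(v) = a1*v + a2*v^2 + a3*v^3 (illustrative)."""
    a1, a2, a3 = alpha
    return a1*v + a2*v**2 + a3*v**3

def wiener(x, f):
    """Wiener model: FIR filter f followed by the static nonlinearity."""
    return static_nl(np.convolve(x, f)[:len(x)])

def hammerstein(x, f):
    """Hammerstein model: static nonlinearity followed by the FIR filter f."""
    return np.convolve(static_nl(x), f)[:len(x)]

f = np.array([1.0, 0.4, 0.1])          # illustrative FIR impulse response
x = np.random.randn(1000) * 0.3
y_w, y_h = wiener(x, f), hammerstein(x, f)
```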
Let us assume that the nonlinear system g(.) has an inverse denoted g −1 and that the filter f (n) has an inverse denoted f −1 . Therefore, a Wiener model can be used to linearize a Hammerstein model and vice versa. Wiener–Hammerstein: If Wiener and Hammerstein models do not fit well the nonlinear system, a three-block model can be used to increase the model accuracy. A common three-block model is the Wiener–Hammerstein model whose name refers to the contraction of the two previous models, Wiener and Hammerstein. In Figure 2.4, the Wiener–Hammerstein model is depicted. It consists of a linear filter, connected to a static or memoryless nonlinear system and followed by another linear filter. Using FIR filters and polynomial model as a static nonlinear system, the Wiener– Hammerstein output can be expressed by y(n) = fW (n)×g( fH (n)×x(n)) = α1 fW (n − lW ) fh (lW − lH )x(lH )] lW
+ α2
lH
⎡
fW (n − lW ) ⎣
lW
+ α3
fH (lW − lH )x(lH )⎦
lH
⎡
fW (n − lW ) ⎣
lW
⎤2
⎤3 fH (lW − lH )x(lH )⎦ + · · ·
lH
x(t) Filter
Static nonlinear system
y(t) Filter
Figure 2.4 Wiener–Hammerstein model
Wiener, Hammerstein and Wiener–Hammerstein models can be parallelized in order to set different memory effects for each nonlinear order.
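Following the same conventions (assumed FIR taps and polynomial coefficients), a Wiener–Hammerstein cascade can be sketched as:

```python
import numpy as np

def wiener_hammerstein(x, f_H, f_W, alpha=(1.0, 0.1, -0.2)):
    """Wiener-Hammerstein model: FIR f_H, static polynomial, then FIR f_W."""
    v = np.convolve(x, f_H)[:len(x)]                      # input filter
    w = sum(a * v**(k + 1) for k, a in enumerate(alpha))  # static nonlinearity
    return np.convolve(w, f_W)[:len(x)]                   # output filter

x = np.random.randn(1000) * 0.3
y = wiener_hammerstein(x, f_H=[1.0, 0.3], f_W=[1.0, -0.2, 0.05])
```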
Continuous-time Volterra series
Nonlinear models with nonlinear memory are suitable for a very large number of nonlinear systems. In return, their complexity may be significantly higher than that of the aforementioned models, especially for high polynomial orders and high memory depths. A usual nonlinear model with nonlinear memory is the Volterra series, which is able to model all mildly and weakly nonlinear systems. As a matter of fact, Volterra series present a limited radius of convergence, and strongly nonlinear systems exceed this radius. Strongly nonlinear systems are those which show strong discontinuities, such as bifurcations or chaotic behavior. The nonlinear system output expressed with Volterra series is given by

y(t) = \sum_{d=1}^{+\infty} \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} h_d(\tau_1, \tau_2, \ldots, \tau_d) \prod_{i=1}^{d} x(t - \tau_i)\, d\tau_i    (2.4)

where h_d(\tau_1, \tau_2, \ldots, \tau_d) is the d-order kernel and d is the polynomial degree. As can be seen, the d-order kernel is a d-dimensional function, which makes it very difficult to study. The first sum of (2.4) can be decomposed to show the different polynomial degrees:

y(t) = \int_{-\infty}^{+\infty} h_1(\tau_1)\, x(t - \tau_1)\, d\tau_1 + \iint_{-\infty}^{+\infty} h_2(\tau_1, \tau_2)\, x(t - \tau_1)\, x(t - \tau_2)\, d\tau_1 d\tau_2 + \iiint_{-\infty}^{+\infty} h_3(\tau_1, \tau_2, \tau_3)\, x(t - \tau_1)\, x(t - \tau_2)\, x(t - \tau_3)\, d\tau_1 d\tau_2 d\tau_3 + \cdots    (2.5)

As can be seen, the first term of the Volterra series in (2.5) is the linear part of the system, modeled by a simple convolution. The second term, which is the second-order term, is modeled by a two-dimensional (2-D) convolution between x(t - \tau_1) x(t - \tau_2) and h_2(\tau_1, \tau_2).
Kernels of block models
At the beginning of Section 2.2.1.2, block models have been introduced. As a matter of fact, these models are actually sub-models of Volterra series. Therefore, each aforementioned model can be expressed in the form of a Volterra series. For a Wiener model, the d-order kernel is

h_d(\tau_1, \tau_2, \ldots, \tau_d) = \alpha_d\, f(\tau_1) f(\tau_2) \cdots f(\tau_d)    (2.6)

This kernel has the interesting property of being separable:

h_1(\tau) = \alpha_1 f(\tau), \qquad h_d(\tau_1, \tau_2, \ldots, \tau_d) = \frac{\alpha_d}{(\alpha_1)^d}\, h_1(\tau_1) h_1(\tau_2) \cdots h_1(\tau_d)

On the other hand, the d-order kernel of a Hammerstein model is

h_d(\tau_1, \tau_2, \ldots, \tau_d) = \alpha_d\, f(\tau_1)\, \delta(\tau_1 - \tau_2)\, \delta(\tau_1 - \tau_3) \cdots \delta(\tau_1 - \tau_d)    (2.7)

As can be seen, the Hammerstein model has coefficients only on the kernel diagonal and therefore has no cross terms. Finally, the d-order kernel of the Wiener–Hammerstein model is given by

h_d(\tau_1, \tau_2, \ldots, \tau_d) = \alpha_d \int_{-\infty}^{+\infty} f_W(\tau)\, f_H(\tau - \tau_1)\, f_H(\tau - \tau_2) \cdots f_H(\tau - \tau_d)\, d\tau    (2.8)

Symmetry property
The kernel of a nonlinear system is not unique; however, only one symmetric kernel exists. The symmetry property of a kernel can be written as follows:

h_d(\tau_1, \tau_2, \ldots, \tau_d) = h_d(\tau_{\pi(1)}, \tau_{\pi(2)}, \ldots, \tau_{\pi(d)})    (2.9)

where \pi(\cdot) denotes any permutation of the integers between 1 and d. The symmetric kernel can be obtained from any kernel using the formula

h_d^{sym}(\tau_1, \tau_2, \ldots, \tau_d) = \frac{1}{d!} \sum_{p} h_d(\tau_{p(1)}, \tau_{p(2)}, \ldots, \tau_{p(d)})    (2.10)

where the sum is performed over all the permutations p.
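The symmetrization formula (2.10) translates directly into a few lines of NumPy; the kernel below is random and only serves to illustrate the operation.

```python
import numpy as np
from itertools import permutations
from math import factorial

def symmetrize(h):
    """Symmetric version of a d-dimensional discrete kernel, cf. (2.10):
    average of the kernel over all permutations of its arguments."""
    d = h.ndim
    return sum(np.transpose(h, p) for p in permutations(range(d))) / factorial(d)

h3 = np.random.randn(4, 4, 4)            # arbitrary third-order kernel
h3_sym = symmetrize(h3)
assert np.allclose(h3_sym, np.transpose(h3_sym, (1, 0, 2)))   # invariant under argument swaps
```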
Discrete Volterra series
Nowadays, most of the processing is done in the digital domain; therefore, discrete Volterra series have to be used. In the digital domain, the Volterra series becomes

y[n] = \sum_{d=1}^{+\infty} \sum_{l_1=0}^{+\infty} \sum_{l_2=0}^{+\infty} \cdots \sum_{l_d=0}^{+\infty} h_d(l_1, l_2, \ldots, l_d) \prod_{i=1}^{d} x[n - l_i]    (2.11)

However, for a physical implementation, the polynomial order and the memory depth have to be truncated, which gives

y[n] = \sum_{d=1}^{D} \sum_{l_1=0}^{L} \sum_{l_2=0}^{L} \cdots \sum_{l_d=0}^{L} h_d(l_1, l_2, \ldots, l_d) \prod_{i=1}^{d} x[n - l_i]    (2.12)

where D is the maximal polynomial order and L is the system memory depth. The number of coefficients grows exponentially with the polynomial order and memory depth.
In fact, the number of coefficients of the Volterra series of (2.12) with symmetric kernels is given by

N = \sum_{d=2}^{D} \frac{(L+d)!}{L!\, d!}    (2.13)

Therefore, discrete Volterra series have to be severely truncated in order to limit the model complexity. As a consequence, the polynomial order truncation limits discrete Volterra series to weakly or mildly nonlinear systems, whereas the memory depth truncation limits the model to short-term memory effects. Actually, these truncated Volterra series are also called nonlinear eXogenous systems. Since Volterra series can be represented by a linear combination of the kernel coefficients in a nonlinear basis, a matrix form can be used:

y[n] = \sum_{d=1}^{D} h_d^T u_d[n]    (2.14)
     = h^T u[n]    (2.15)

where h_d, h, u_d[n] and u[n] are column vectors:

h_d = \big[\, h(l_1=0, l_2=0, \ldots, l_d=0), \; h(l_1=1, l_2=0, \ldots, l_d=0), \; \ldots, \; h(l_1=L, l_2=L, \ldots, l_d=L) \,\big]^T, \qquad u_d[n] = \big[\, x^d[n], \; x[n-1]\, x^{d-1}[n], \; \ldots, \; x^d[n-L] \,\big]^T, \qquad d \in [1; D]    (2.16)

h = \big[\, h_1^T, h_2^T, \ldots, h_D^T \,\big]^T, \qquad u[n] = \big[\, u_1[n]^T, u_2[n]^T, \ldots, u_D[n]^T \,\big]^T    (2.17)
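The following sketch evaluates the truncated series (2.12) by direct summation and computes the coefficient count of (2.13); the kernel values and the input signal are random placeholders, and no kernel symmetry is exploited, so it is meant as an illustration rather than an efficient implementation.

```python
import numpy as np
from math import factorial
from itertools import product

def volterra_output(x, kernels, L):
    """Evaluate the truncated Volterra series (2.12) by direct summation."""
    y = np.zeros(len(x))
    xp = np.concatenate([np.zeros(L), x])            # zero-padded past samples
    for n in range(len(x)):
        for d, h in kernels.items():                 # d = polynomial order
            for lags in product(range(L + 1), repeat=d):
                y[n] += h[lags] * np.prod([xp[L + n - l] for l in lags])
    return y

def n_coefficients(D, L):
    """Number of coefficients with symmetric kernels, cf. (2.13)."""
    return sum(factorial(L + d) // (factorial(L) * factorial(d)) for d in range(2, D + 1))

L, D = 2, 3
kernels = {d: 0.01 * np.random.randn(*([L + 1] * d)) for d in range(1, D + 1)}
x = np.random.randn(50)
y = volterra_output(x, kernels, L)
print(n_coefficients(D, L))              # 16 for D=3, L=2
```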
NARMAX model
In order to model long-term memory effects with a reasonable number of coefficients, nonlinear auto-regressive moving-average with eXogenous inputs (NARMAX) models have been introduced. As a reminder, "auto-regressive" means that the current output depends on past values of the system output, "moving average" represents the noise model whose average is time dependent and "eXogenous" indicates that the system relies on past values of the system input. NARMAX models can represent systems such as the one depicted in Figure 2.5. The NARMAX model is described by a nonlinear function f(.) in addition to a noise term:

y[n] = f(x[n], x[n-1], \ldots, x[n-n_x], y[n-1], y[n-2], \ldots, y[n-n_y], e[n-1], e[n-2], \ldots, e[n-n_e]) + e[n]    (2.18)

Figure 2.5 Block diagram of a nonlinear dynamic system

where x[n], y[n] and e[n] are, respectively, the system input, the system output and the noise term. In (2.18), n_x, n_y and n_e are, respectively, the maximum input, output and noise lags. As can be observed, Volterra series are included in the NARMAX model. For example, a NARMAX model can be

y[n] = -a_0 + a_1 y[n-1] + a_2 x[n] + a_3 x[n-1] + a_4 y^2[n-1] + a_5 y[n-1] x[n] - a_6 y[n-1] x[n-1] - a_7 x^2[n] - a_8 x^2[n-1]    (2.19)

A clear advantage of NARMAX models over Volterra series is that few coefficients are needed to model long-term memory effects, since past values of the output are used. However, as for infinite impulse response (IIR) filters, they can face instability issues.
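As an illustration, the example model (2.19) can be simulated recursively as sketched below; the coefficients are arbitrary small values chosen only to keep the recursion stable, and the noise term is omitted.

```python
import numpy as np

def narmax_example(x, a):
    """Recursive evaluation of the example NARMAX model (2.19), noise term omitted."""
    y = np.zeros(len(x))
    for n in range(1, len(x)):
        y[n] = (-a[0] + a[1]*y[n-1] + a[2]*x[n] + a[3]*x[n-1]
                + a[4]*y[n-1]**2 + a[5]*y[n-1]*x[n] - a[6]*y[n-1]*x[n-1]
                - a[7]*x[n]**2 - a[8]*x[n-1]**2)
    return y

a = [0.0, 0.5, 1.0, 0.2, 0.05, 0.01, 0.01, 0.02, 0.02]   # illustrative, kept small for stability
y = narmax_example(np.random.randn(200) * 0.1, a)
```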
2.2.2 Nonparametric models
Among the nonparametric models, neural networks, splines and statistical models are the most common. In this chapter, only the neural network will be presented; spline and statistical models will be mentioned in the next chapters.
2.2.2.1 Neural network
Artificial neural networks (ANNs) can also be used to model nonlinear systems with memory effects. The inspiration for ANNs came from the way the brain works. The multilayer perceptron (MLP) is one of the most used types of ANN. In Figure 2.6, an ANN with an MLP topology is depicted. An MLP is made of several neurons organized in layers. It has one input layer to connect the inputs, optionally one or several hidden layers, and one output layer. As can be seen in Figure 2.6, a neuron (except in the input layer) is connected to several other neurons from the previous layer. This can lead to highly interconnected systems. In Figure 2.7, one neuron is pictured with its synapses linking it to neurons located in the previous layer. Each synapse is weighted by a coefficient which is obtained through a learning process. Synapses are connected to a summation block whose result goes through a function σ(.), called the activation function, which gives the neuron output.
Figure 2.6 Artificial neural network with two hidden layers

Figure 2.7 Neuron in an ANN

The activation function is not limited to a single type of function. Nevertheless, commonly used types are the sigmoid functions. Sigmoid functions have an S shape, a finite value at +∞ and −∞, and a smooth transition in between. A fundamental theorem, named the universal approximation theorem or Cybenko theorem, has proven that an MLP with one hidden layer using the same sigmoid function for each neuron can approximate any continuous function. For example, two sigmoid functions, depicted in Figure 2.8, are given by

\sigma(\gamma) = \frac{1}{1 + e^{-\gamma}}, \qquad \sigma(\gamma) = \tanh(\gamma)    (2.20)

In order to model nonlinear memory systems, a particular form of the MLP is used, the time delay neural network (TDNN). It consists of an MLP with N delayed versions of the input signal as inputs, one or several hidden layers and one output. In Figure 2.9, a TDNN with one hidden layer and a maximum lag of L is shown. In this TDNN, the activation function used for each neuron in the hidden layer is a nonlinear function, which can be a sigmoid or a polynomial function. In the output layer, there is only one neuron, with a linear activation function. The kth neuron output in the hidden layer is

y_k^1[n] = g\left( \sum_{l=0}^{L} w_{lk}\, x[n-l] \right)    (2.21)

where g(.) is a nonlinear memoryless system, k represents the kth neuron in the hidden layer, and w_{lk} is the synapse weight linking the lth neuron in the input layer and the kth neuron in the hidden layer. As can be seen, this TDNN is actually a bank of filters followed by a nonlinear memoryless system, which corresponds to a Wiener model.
1
tanh (γ)
1
σ (γ)
σ (γ)
0.5 0.5
0 –0.5
0 –10
0 γ
10
–1 –10
0 γ
10
Figure 2.8 Example of sigmoid functions
Figure 2.9 TDNN with one hidden layer
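A forward pass of such a one-hidden-layer TDNN, cf. (2.21), can be sketched as follows; the weights are random placeholders and tanh is assumed as the hidden-layer activation function.

```python
import numpy as np

def tdnn_output(x, W, c, L):
    """One-hidden-layer TDNN, cf. (2.21): each hidden neuron k computes
    tanh(sum_l W[l, k] * x[n - l]); the output neuron is linear with weights c."""
    N = len(x)
    xp = np.concatenate([np.zeros(L), x])
    # Matrix of delayed inputs: X[n, l] = x[n - l]
    X = np.stack([xp[L - l: L - l + N] for l in range(L + 1)], axis=1)
    hidden = np.tanh(X @ W)          # shape (N, K)
    return hidden @ c                # linear output layer

L, K = 4, 8
W = 0.5 * np.random.randn(L + 1, K)  # input-to-hidden weights (illustrative)
c = 0.3 * np.random.randn(K)         # hidden-to-output weights
y = tdnn_output(np.random.randn(300), W, c, L)
```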
2.3 Suited models for each RF block

2.3.1 Extension to complex models
Observation and modeling of nonlinearities require adapted mathematical tools and representations. So far, the nature of the signals has not been specified, but the type of signals handled by circuits, or systems, has an impact on the way they can be modeled. The construction and extraction of behavioral models is motivated by maximizing the accuracy and the computational efficiency. Indeed, physical simulations or equivalent circuits can be used to evaluate the behavior of a system in a precise way, but these simulations are time-consuming. Often, a behavioral model of a higher level, in terms of abstraction, can be extracted and used to predict the response of the system to other input signals with less computing effort. For many analog circuits, the performance with respect to their nonlinear behavior is often expressed in terms of parameters that are measured in the frequency domain. This explains the classical approach of using harmonic signals, which ease the visualization of nonlinear effects because these effects manifest themselves as the production of a number of harmonics at frequencies that depend on the mixing properties of the nonlinear characteristics and on the input frequencies. From a computational complexity point of view, the physical simulation of circuits by a harmonic or multitone approach requires fewer resources than the use of real communication signals. In addition, the harmonic representation of the signals allows direct connections between harmonic balance simulations and nonlinear microwave circuit measurements. However, the analysis of systems with more realistic signals, close to the signals to be truly transmitted, is more and more desired in order to better characterize and optimize circuits and systems. This can be problematic in the case of RF circuits, because the very high carrier frequency of RF systems makes computer simulations of communication signals inefficient: to satisfy the Nyquist sampling theorem, the simulation sampling frequency of RF modulated signals must be proportional to the carrier frequency. The general form of a modulated signal around the carrier f_0 is written as

x(t) = I(t) \cos(2\pi f_0 t) - Q(t) \sin(2\pi f_0 t)    (2.22)
The two modulating signals I(t) and Q(t), which can be independent or not, are multiplexed onto the carrier, a priori without increasing the bandwidth (unlike a plain frequency multiplexing). This real-valued signal can also be written in the following forms:

x(t) = \mathrm{Re}\left\{ \tilde{x}(t)\, e^{j2\pi f_0 t} \right\}    (2.23)

x(t) = \frac{1}{2} \left[ \tilde{x}(t)\, e^{j2\pi f_0 t} + \tilde{x}^*(t)\, e^{-j2\pi f_0 t} \right]    (2.24)

with \tilde{x}(t) = I(t) + j \cdot Q(t), where \tilde{x}^* is the complex conjugate of the signal \tilde{x}. \tilde{x} is called the complex envelope, and the support of its spectrum defines the bandwidth of the signal x(t). The (complex-valued) signal \tilde{x} and the (real-valued) signals I(t) and Q(t) are so-called baseband or lowpass signals. In contrast, the (real) signal x(t) is said to be bandpass or also, with abuse of language, an RF signal. If this RF signal is distorted by a polynomial nonlinearity, the distorted signal is written as

y(t) = \sum_{k=1}^{P} x^k(t)    (2.25)

When modeling RF circuits, the carrier frequency f_0 is generally much greater than the bandwidth of the complex envelope B_{\tilde{x},max} of the input signal (and a few multiples N of the maximum envelope frequency, see note 1):

f_0 \gg N \times B_{\tilde{x},max}    (2.26)

The corresponding distorted signal will contain intermodulation and harmonic distortion, as shown in Figure 2.10. The part of the distorted signal which is situated in the same band as the initial signal x(t) occupies a band equal to N \times B_{\tilde{x},max} (see note 2). This band is often called the zonal band because the suppression of the harmonics of the signal is performed by a zonal filter. The operation of relating the zonal band of the signal x(t) to the zonal band of the signal y(t) forms the basis of baseband modeling. Indeed, by transposing these signals to DC by multiplying by a (hypothetical) complex exponential and filtering by a (hypothetical) low-pass filter, these signals are located in baseband, that is, they are of the low-pass type and are characterized by complex values. Regarding the initial signal x(t), these two operations (transposition in frequency then filtering) result in \tilde{x}(t); in the case of the distorted signal, these operations result in a signal which is denoted \tilde{y}(t) and which is called the (complex) baseband equivalent of y(t). It should be noted that the relationship between y(t) and \tilde{y}(t) is not identical to (2.23) but is given by

y(t) = \mathrm{Re}\left\{ \tilde{y}(t)\, e^{j2\pi f_0 t} \right\} + y_{harm}(t)    (2.27)

where y_{harm}(t) represents the components of the distorted signal that have been cut off by the zonal filter.
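A small numerical sketch of (2.22)–(2.24) is given below; the carrier frequency, sampling rate and modulating signals are arbitrary values chosen for illustration only.

```python
import numpy as np

fs, f0 = 1e6, 100e3                      # sampling rate and carrier (arbitrary values)
t = np.arange(0, 1e-3, 1/fs)
I = np.cos(2*np.pi*5e3*t)                # illustrative modulating signals
Q = np.sin(2*np.pi*3e3*t)

x_bb = I + 1j*Q                                          # complex envelope
x_rf = np.real(x_bb * np.exp(1j*2*np.pi*f0*t))           # bandpass signal, (2.23)

# Equivalent construction following (2.22)
x_rf_check = I*np.cos(2*np.pi*f0*t) - Q*np.sin(2*np.pi*f0*t)
assert np.allclose(x_rf, x_rf_check)
```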
Figure 2.10 Actual RF nonlinear system
Note 1: In nonlinear systems, N corresponds to the largest odd order of significant nonlinearities of the system. It can be considered here that N = P if P is odd and N = P − 1 if P is even.
Note 2: This can be demonstrated analytically by using the expressions (2.25) and (2.24) with the condition in (2.26).
The complex baseband analysis of nonlinear RF systems provides a tool for efficient computer simulations of such systems. In complex baseband simulations, only the information signal (the complex envelope of the modulated signal), and not the carrier, is processed by the simulator. Thus, in this simulation mode, the carrier frequency of the modulated signal is converted to zero, which allows for a much lower calculation sampling rate and, therefore, more efficient simulations (Figure 2.11). For polynomial-type models, one can write the relation between the nonlinear bandpass model and the complex low-pass equivalent model. This has been done for the general case of the Volterra series [5–7]. When x(t) and y(t) are related by a Volterra series:

y(t) = \sum_{\ell=1}^{L} \int \cdots \int h_\ell(\boldsymbol{\tau}_\ell) \prod_{i=1}^{\ell} x(t - \tau_i)\, d\boldsymbol{\tau}_\ell    (2.28)

where \boldsymbol{\tau}_\ell = [\tau_1, \ldots, \tau_\ell], h_\ell(\cdot) is the \ell th-order Volterra kernel and d\boldsymbol{\tau}_\ell = d\tau_1 d\tau_2 \cdots d\tau_\ell, then the baseband equivalent signal is given by

\tilde{y}(t) = \sum_{k=0}^{K} \int \cdots \int \tilde{h}_{2k+1}(\boldsymbol{\tau}_{2k+1}) \prod_{i=1}^{k+1} \tilde{x}(t - \tau_i) \prod_{i=k+2}^{2k+1} \tilde{x}^*(t - \tau_i)\, d\boldsymbol{\tau}_{2k+1}    (2.29)

where

\tilde{h}_{2k+1}(\boldsymbol{\tau}_{2k+1}) = \frac{1}{2^{2k}} \binom{2k+1}{k} h_{2k+1}(\boldsymbol{\tau}_{2k+1})\, e^{-j2\pi f_c \left( \sum_{i=1}^{k+1} \tau_i - \sum_{i=k+2}^{2k+1} \tau_i \right)}    (2.30)

where \binom{n}{m} is the binomial coefficient and

2K + 1 = \begin{cases} L & \text{if } L \text{ is odd} \\ L - 1 & \text{if } L \text{ is even} \end{cases}    (2.31)

Figure 2.11 Baseband modeling of nonlinear system
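For the memoryless special case, (2.29) and (2.30) reduce to a baseband polynomial in x̃(t)|x̃(t)|^{2k}; the sketch below builds this baseband equivalent from assumed bandpass coefficients α_1, α_3, α_5 (illustrative values only).

```python
import numpy as np
from math import comb

def baseband_poly(x_bb, alpha_odd):
    """Baseband equivalent of a memoryless bandpass polynomial, cf. (2.29)-(2.30):
    y_bb = sum_k alpha_{2k+1} * C(2k+1, k) / 2**(2k) * x_bb * |x_bb|**(2k)."""
    y = np.zeros_like(x_bb, dtype=complex)
    for k, a in enumerate(alpha_odd):                 # a = alpha_{2k+1}
        y += a * comb(2*k + 1, k) / 2**(2*k) * x_bb * np.abs(x_bb)**(2*k)
    return y

alpha_odd = [1.0, -0.15, 0.02]          # alpha_1, alpha_3, alpha_5 (illustrative)
x_bb = 0.5 * (np.random.randn(1000) + 1j*np.random.randn(1000))
y_bb = baseband_poly(x_bb, alpha_odd)
```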
2.3.2 Models for power amplifiers
The design of electronic systems requires the use of appropriate models to understand their limits and then optimize their performance. We distinguish two approaches that are mainly differentiated by the precision obtained on the output signals of the devices: physical models and behavioral models. Physical models require knowledge of the electronic elements (see note 3) that make up the electronic device and must describe their interactions with theoretical rules. They use nonlinear models of the active and passive components to form a set of nonlinear equations connecting the voltages and currents of each node of the circuit. Often, an equivalent circuit is extracted by simulation of the physical model, which allows simulation with high accuracy. However, this accuracy is paid for in terms of computational resources (time and/or memory), and the exact internal structure of the circuits is not always available. The second modeling approach is called behavioral modeling. It is used when the hardware description of the circuit is not available, or whenever a full simulation at the system level is desired. Since behavioral models are based solely on input–output observations, their precision is very sensitive to the structure of the adopted model and to the parameter-extraction procedure. In fact, although behavioral modeling can guarantee the exact reproduction of the data set used for its extraction, it is not certain that it will also produce accurate results for a different data set (see note 4), a fortiori for a different circuit or a circuit based on a completely different technology. Thus, the predictive ability of any behavioral model has to be carefully studied and verified. There is abundant literature on different PA modeling approaches and in particular behavioral modeling [8–13]. However, regarding the latter, only a few works are devoted to analyzing and comparing models. This section brings together elements from the books by Vuolevi and Rahkonen [10] and Schreurs et al. [5]. The latter provides a discussion of behavioral models based on the articles of Isaksson et al. [14] and Pedro and Maas [15], which are given as internal references of the book.

Note 3: Up to the transistor level.
Note 4: Especially for a different class of excitation signals.
2.3.2.1 Circuit-level PA models
Circuit-level PA models, a.k.a. device models, process the true RF modulated signal. They are designed to handle real excitations, taking into account the possible harmonic content and the distinct time scales of the RF carrier and of the envelope of the RF signal. Indeed, the high-frequency carrier requires a small time step in a transient analysis, while the low-frequency modulation necessitates a long simulation interval. These models can also account for mismatches between stages at the input and output ports of the PA. The most classical approach of circuit modeling is to extract an equivalent circuit from the physical model, mainly by simplifying it, in order to reduce the simulation time [10]. There is also a behavioral modeling approach at the circuit level in which the detailed constitution of the PA is ignored. Although intended to be simpler, these models retain the characteristic of integrally representing in the time domain the relationships between the incident and reflected voltage and current waves. An example of Volterra-type behavioral modeling is the Volterra input–output map (VIOMAP). It is conceptually a nonlinear extension of normal S-parameters, including harmonic responses, and it has been successfully used in single-tone load-pull simulations [10]. As reported by Schreurs et al. [5], this model is a simplification of the complete dual Volterra model. Moreover, general polynomials and ANNs have been proposed as multidimensional dynamic functions to overcome the limitations of weak nonlinearity in the dual Volterra model.
2.3.2.2 System-level PA models System-level behavioral PA modeling is intended for system-level simulators. These models use a complex-valued representation of the modulating signal envelope and do not represent the RF carrier. They are equivalent single-input single-output low-pass models, a description of which has already been given in the previous Section 2.3.1.
2.3.2.3 Memoryless models
Behavioral models without memory, also called static models, are those in which the output envelope reacts instantly to changes in the input envelope. In the static case, the output of the system y(t) can be uniquely defined as a function of the instantaneous input x(t), and the model reduces to

y(t) = f(x(t))    (2.32)

where f(·) represents a nonlinear function. Two commonly used examples of low-pass equivalent memoryless models are

● a polynomial with complex coefficients a_{2n+1}:

  y(t) = Σ_n a_{2n+1} r_x(t)^{2n+1}    (2.33)

  where r_x(t) is the instantaneous input envelope's amplitude;
● the Saleh model:

  r_y(r_x(t)) = α_r r_x(t) / (1 + β_r r_x(t)^2)    (2.34)

  Φ_y(r_x(t)) = α_Φ r_x(t) / (1 + β_Φ r_x(t)^2)    (2.35)

where α_r, β_r, α_Φ and β_Φ are fitting parameters for the measured PA's amplitude-to-amplitude modulation (AM–AM) characteristic r_y(r_x(t)) and amplitude-to-phase modulation (AM–PM) characteristic Φ_y(r_x(t)). These models are suitable for amplifiers where the input and output filters have a much higher bandwidth than the excitation bandwidth, and where the active device does not generate the dynamic distortions called memory effects.
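As a concrete illustration, the short Python sketch below applies a low-pass equivalent memoryless Saleh model of the form (2.34)–(2.35) to a complex baseband envelope. It is only a sketch: the parameter values, the two-tone test signal and the function name are illustrative placeholders, not fitted to any real PA.

import numpy as np

def saleh_memoryless(x, alpha_r=2.0, beta_r=1.0, alpha_phi=np.pi/3, beta_phi=1.0):
    """Memoryless Saleh AM-AM/AM-PM model applied to a complex baseband signal x.

    The output envelope depends only on the instantaneous input envelope r_x(t),
    as in (2.32)-(2.35).  All parameter values are illustrative, not fitted.
    """
    r = np.abs(x)                                    # instantaneous input amplitude r_x(t)
    gain = alpha_r / (1.0 + beta_r * r**2)           # AM-AM: r_y / r_x
    phase = alpha_phi * r / (1.0 + beta_phi * r**2)  # AM-PM: added phase (rad)
    return x * gain * np.exp(1j * phase)

# Example: distort a two-tone complex envelope
t = np.arange(0, 1e-3, 1e-6)
x = 0.5 * (np.exp(2j * np.pi * 1e3 * t) + np.exp(2j * np.pi * 3e3 * t))
y = saleh_memoryless(x)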
2.3.2.4 Memory models
When the PA has memory effects on the modulated RF signal or on the modulating envelope, it is said to be dynamic. The output can no longer be uniquely determined from
the instantaneous input and also depends on the past values of the input and/or system state. This is the case when considering broadband signals, because the narrowband approximation supported by the equivalent AM–AM and AM–PM low-pass models is no longer fully valid. The input–output signals of the PA are then connected by a forced nonlinear differential equation:

f( y(t), dy(t)/dt, ..., d^p y(t)/dt^p, x(t), dx(t)/dt, ..., d^r x(t)/dt^r ) = 0    (2.36)

Since PA behavioral models are evaluated on digital computers, a discrete-time environment should be adopted:

y(n) = f_R( y(n−1), ..., y(n−Q_1), x(n), x(n−1), ..., x(n−Q_2) )    (2.37)

where y(n), the present output at time instant nT_S, depends, in a nonlinear way given by the function f_R(·), on the past outputs and on the present and past inputs. This expression can be seen as a nonlinear extension of IIR digital filters (nonlinear IIR), which is considered to be the general form for recursive PA behavioral models [5]. Several system identification results have shown that, under a wide range of conditions (essentially causality, stability, continuity and fading memory of the operator), such a system can also be represented with an acceptable error by a non-recursive form where the corresponding past is limited to q ∈ {0, 1, 2, ..., Q}, the so-called system memory range [5]:

y(n) = f_D( x(n), x(n−1), ..., x(n−Q) )    (2.38)
in which f_D(·) is again a multidimensional nonlinear function. This nonlinear extension of FIR digital filters (nonlinear FIR) is also considered as the general form for a direct, or feedforward, behavioral model [5].

Various forms have been proposed for the multidimensional functions f_R(·) and f_D(·), although two of them have received special attention: polynomial filters in the broad sense, of the Volterra series type, and ANNs. There is essentially no distinction between a feedforward time-delay ANN and a non-recursive polynomial filter: they are simply two alternative ways of approximating the multidimensional function f_D(·). There are, however, slight differences between these two approaches: the polynomial filters can be extracted directly, whereas the ANN parameters can only be obtained from a nonlinear optimization scheme. Moreover, despite the universal approximation properties of ANNs, there is no way to know a priori how many hidden neurons are needed to represent a specific system, nor to predict the improvement in modeling obtained when this number increases. We cannot even ensure that the extracted ANN is unique or optimal for a given number of neurons. However, unlike the intrinsically local approximation properties of polynomials, ANNs behave as global approximators, an important advantage when modeling strongly nonlinear systems. In addition, since the sigmoidal activation functions used in ANNs are limited in output amplitude, ANNs are fundamentally better suited than polynomials for extrapolating beyond the region where the system was operated during parameter extraction.

Among the PA memory models, there are linear memory models and nonlinear memory models. Linear memory effects are memory behaviors uncorrelated with the
nonlinear response of the power amplifier. The most famous linear memory model is the two-box Wiener model, which is a concatenation of a linear FIR filter and a memoryless nonlinear function. This model belongs to a broader class of models called box (or block)-oriented models (which are not linear memory in general), which aim at reducing the complexity and enhancing the numerical stability of the system [13]. Another famous model of this class is the two-box Hammerstein model, which is composed of a memoryless nonlinear function followed by a linear FIR filter. Nonlinear memory effects are memory behaviors mixed with the nonlinearity of the transistor. It has been shown that such effects can be represented by a memoryless nonlinearity and a filter in a feedback path [5]. However, for computational reasons, direct (or feedforward) models, such as Volterra series-based models, are preferred. The memory effects can be caused by the active device's temperature modulation; this category of memory effects is called electrothermal or thermal memory effects. Since it is a function of the temperature change in the transistor junction, this category of memory effect acts over long time scales and affects narrow bandwidths of the signal spectrum. Memory effects may also be produced by the external terminations, including parasitic elements and matching networks of the power amplifier. This type of memory effect is called electrical memory effects. The properties of these terminations across the fundamental frequency, baseband frequency, and all the harmonic frequencies shape the power amplifier response around the carrier frequency [13].
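To make the two-box structures concrete, the following Python sketch implements the Wiener (FIR filter then static nonlinearity) and Hammerstein (static nonlinearity then FIR filter) models described above. The filter taps and polynomial coefficients are arbitrary placeholders chosen only for illustration.

import numpy as np

def static_nl(v, coeffs=(1.0, 0.0, -0.1)):
    """Memoryless polynomial nonlinearity c1*v + c2*v^2 + c3*v^3 (illustrative coefficients)."""
    c1, c2, c3 = coeffs
    return c1 * v + c2 * v**2 + c3 * v**3

def wiener(x, h=(1.0, 0.3, 0.1)):
    """Two-box Wiener model: linear FIR filter followed by a memoryless nonlinearity."""
    v = np.convolve(x, h)[:len(x)]
    return static_nl(v)

def hammerstein(x, h=(1.0, 0.3, 0.1)):
    """Two-box Hammerstein model: memoryless nonlinearity followed by a linear FIR filter."""
    v = static_nl(x)
    return np.convolve(v, h)[:len(x)]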
2.3.3 Models for low-noise amplifiers
In narrowband systems, designers optimize the LNA with little concern for linearity [16]. However, for full-duplex or wideband systems, the LNA linearity can be critical, especially when a large number of strong interferers are encountered. In a full-duplex system, the transceiver must transmit and receive at the same time. If the transmitted signal is not attenuated enough by the duplexer (around 50 dB of attenuation at most), the LNA input sees a large input signal along with small desired signals, as shown in Figure 2.12. The required dynamic range and linearity are then critical. The same thing happens with wideband systems.
Figure 2.12 Full-duplex receiver
Regarding the receiver design, the LNA can first be modeled by a polynomial model:

y(x) = α_0 + α_1 x + α_2 x^2 + α_3 x^3.    (2.39)
This model is accurate enough to give the critical constraints on linearity needed to size each block of a receiver. The polynomial order of the model can generally be limited to the third order: the low input level of the LNA allows higher nonlinear orders to be neglected. This behavior is often verified in practice through simulations and measurement results. In a second step, when the LNA linearity must be optimized by design, continuous third-order Volterra series are usually used [17,18]. Volterra series appear naturally when capacitors and inductors are used in the LNA design, as done in the harmonic termination technique. In order to determine the Volterra kernels, the method of nonlinear currents is often used. The method starts with the drain current of the input-stage transistor expressed by means of a Taylor series:

i_d = g_m1 v_gs + g_m2 v_gs^2 + g_m3 v_gs^3.    (2.40)
As can be seen, the drain current is expressed here by replacing the transconductance value g_m of the small-signal model by a third-order Taylor series with coefficients g_m1, g_m2 and g_m3. In the literature, we can find different architectures of LNA modeled with Volterra series. Common source LNAs (CS-LNAs) are modeled in [19–23]: [19] shows a cascode-stage CS-LNA, [20,22] CS-LNAs with a harmonic termination technique and [21] a CS-LNA using a derivative superposition technique. A common emitter LNA is modeled in [24]. Finally, common gate LNAs are modeled in [25,26].

When the model is intended to be used in a digital nonlinear-distortion correction scheme, Volterra series are also usually used, and the polynomial order is still typically limited to 3. In some papers, the LNA is simply modeled by a third-order polynomial system, as in [27,28]. Without adding much complexity, a memory polynomial can be used for baseband corrections [29]. Most of the time, however, the digital correction takes the memory effect into account with the use of Wiener [30] or Hammerstein models [31]. For even more accurate models, a generalized Hammerstein model can be used [32]. A third-order generalized Hammerstein model is shown in Figure 2.13 as an example. Each branch corresponds to a specific order of the nonlinear model: for each order n, there is a power block (.)^n followed by the polynomial coefficient α_n and by a specific filter G_n(ω). Compared to the classic Hammerstein model, the generalized Hammerstein model gives more degrees of freedom to filter each order differently.

Figure 2.13 Third-order generalized Hammerstein model
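The structure of Figure 2.13 can be sketched in a few lines of Python. In this sketch, branch n raises the input to the power n, scales it by α_n and filters it with its own FIR filter G_n before the branches are summed; the coefficient and filter values are made-up placeholders, not extracted from a real LNA.

import numpy as np

def generalized_hammerstein(x, alphas, filters):
    """Generalized Hammerstein model (Figure 2.13): per-order branches, each with its own filter."""
    y = np.zeros(len(x), dtype=complex)
    for n, (a_n, g_n) in enumerate(zip(alphas, filters), start=1):
        branch = a_n * x**n                      # power block (.)^n and coefficient alpha_n
        y += np.convolve(branch, g_n)[:len(x)]   # branch-specific filter G_n(w)
    return y

# Example with three branches (orders 1 to 3, illustrative values)
x = np.exp(2j * np.pi * 0.01 * np.arange(256))
alphas = [1.0, 0.05, -0.02]
filters = [np.array([1.0, 0.2]), np.array([1.0]), np.array([0.6, 0.3])]
y = generalized_hammerstein(x, alphas, filters)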
As shown here, the model highly depends on the purpose. For system considerations, a simple polynomial model can be considered. For the design itself, Volterra series are generally used. Finally, for digital compensation, the model complexity lies between the system model and the design model: complex enough to be accurate, but not so complex that the area and power consumption of the signal processing become excessive.
2.3.4 Models for baseband blocks
Similarly to RF blocks, baseband analog blocks such as analog-to-digital converter (ADC) drivers, anti-alias or reconstruction filters and variable-gain amplifiers also contribute to the nonlinearities of wireless and wired transceivers. However, the causes and consequences of the nonlinearity are somewhat different compared with their RF counterparts such as the LNA and the PA. First, in baseband blocks, since the signal has been down-converted or has not yet been up-converted around a central frequency, the ratio between the highest and lowest frequencies processed by the block is orders of magnitude higher than for an RF block. This significantly increases the impact of memory effects. This phenomenon can easily be noticed by consulting the datasheets of the aforementioned blocks: the linearity degrades drastically as the input frequency is increased. References [33,34] show two examples of this behavior. The second specificity of baseband blocks is that even-order nonlinearities, since they fall inside the useful bandwidth, have a similar impact to odd-order nonlinearities. The third difference, which is specific to receivers, is that the main power seen by baseband blocks is the useful power, as most of the interferers have already been filtered by the RF blocks. This difference does not impact the modeling of the block, but it makes it possible to employ digital post-distortion techniques to cancel the nonlinearities of baseband blocks, whereas for RF blocks this operation is significantly more difficult and sometimes impossible because the main sources of nonlinearities for these blocks, i.e., interferers, are filtered before the analog-to-digital conversion. Based on the first two observations, it can be concluded that modeling baseband blocks requires models that take even-order components into account and that can handle significant memory effects. The Volterra model is a very good candidate to achieve this goal; however, its extraction is not always a simple task and can often diverge. To overcome this limitation, simplified Volterra models such as Wiener, Hammerstein and memory polynomial models can be used instead, allowing a more robust convergence at the cost of a lower accuracy.
2.4 Digital compensation of nonlinear distortions
Pushed by the digital needs for more computational power and functionalities, transistors shrink every one or two years. The length and width reduction of the complementary metal–oxide–semiconductor (CMOS) gate allows today the integration of millions of transistors into a single chip, which gives more computational power for the same cost. The dynamic power of digital circuits is also reduced thanks to smaller supply voltages and shrunken capacitances.
While the transistor downsizing brings numerous advantages for digital system design, the situation is different for analog systems. The reduction of supply voltages decreases the system dynamic range, the ratio gm/gds decreases with scaling, and the IIP3 of the drain current decreases along with the transistor size. As a consequence, it becomes harder and harder to design analog ICs in new CMOS nodes. Furthermore, telecommunication systems tend to have wider and wider bandwidths, which increases the dynamic range issue: a wider bandwidth can contain more signals, thus more input power, which in turn demands more dynamic range. The increased input signal power and the reduced IIP3 make transceivers much more prone to generating strong nonlinear distortion terms, which can pollute the spectrum for the transmitter or jam the desired input channel for the receiver. To cope with this issue, bigger transistors with multiple fingers can be used, as well as analog design techniques such as harmonic termination or derivative superposition. However, these solutions are either limited to narrowband applications or increase the power consumption significantly. An alternative to these techniques is to use digitally assisted techniques and benefit from the large integration capability and power consumption reduction brought by transistor downsizing.

In transceivers, to maximize PA efficiency, the PA works in a very nonlinear region, generating distortions which may pollute nearby channels [35]. To tackle this issue, a digital correction can be applied. A common architecture for the compensation of nonlinear distortions in a PA is digital pre-distortion [35,36]. In this solution, the signal is pre-distorted with the inverse of the PA nonlinear model, thereby linearizing the system. The inverse model estimation is performed thanks to a feedback path which recovers the PA output. At the receiver side, the linearity is critical especially when strong interferers are close to the band of interest; intermodulation terms due to strong interferers can severely impact the performance. To linearize the system, a technique called digital post-distortion is used. There are two types of post-distortion techniques: direct and indirect correction. In the former, the nonlinear system is estimated and then its inverse is calculated; in the latter, the inverse of the nonlinear system is estimated directly. Unlike the pre-distortion techniques employed for linearizing the PA, in the post-distortion case the system input is unknown; therefore, blind estimation techniques are often mandatory. In this section, both correction methods are described.
2.4.1 Direct learning architecture
In the direct learning architecture (DLA), two approaches are possible according to the context. The first approach is the aided approach depicted in Figure 2.14(a) and (b). It is used when the input of the nonlinear system is known, which is the case at the transmitter side with the power amplifier, where the input is known, or at the receiver side for the ADC, where calibration signals can be used. As can be seen, the aided DLA is a feedforward architecture: using the input and output of the nonlinear system, the coefficients of a nonlinear model are estimated. Once the coefficients of the nonlinear model are estimated, an inverse can be calculated. The nonlinear model inverse is then
applied to the input (Figure 2.14(a)) or to the output (Figure 2.14(b)) of the nonlinear system. In the unfortunate case where the input signal is not available or no calibration signal can be used, the blind approach depicted in Figure 2.15(a) and (b) is adopted instead. The blind estimation can be performed using prior hypotheses on the input signal, such as its statistical nature or some band-limited hypothesis. Blind approaches are generally more complex and less effective than their
Figure 2.14 Aided direct correction architectures: (a) with pre-inverse and (b) with post-inverse
Figure 2.15 Blind direct correction architectures: (a) with pre-inverse and (b) with post-inverse
aided counterpart; therefore, they are used only when mandatory. In the literature, the aided DLA with pre-distortion in Figure 2.14(a) is the most used among the DLAs for power amplifier nonlinear distortion compensation. For the receiver, the blind DLA with post-distortion in Figure 2.15(b) is commonly used. We will first detail the aided and blind identification blocks, and then the nonlinear inverse calculation block.
2.4.1.1 Identification
The nonlinear system-identification process includes two aspects. The first one is the choice of the nonlinear model used to fit the nonlinear physical system. The model is always an approximation of the physical system and is valid only for a specific input level and frequency range: the higher the input signal dynamic range and the bandwidth, the more complex the required model. The second aspect of the nonlinear system identification is the algorithm used to actually estimate the coefficients of the chosen nonlinear model. The choice of the estimation algorithm, along with the nonlinear model, determines the complexity of the identification block.
Aided identification
To perform an aided identification, the input and the output of the nonlinear system are used. Once the nonlinear model is chosen, for example a simple third-order polynomial model, an estimation algorithm should be picked. The estimation is usually performed in the way depicted in Figure 2.16. In this figure, we can observe the nonlinear system, possibly preceded by an ideal digital-to-analog converter (DAC) and/or followed by an ideal ADC. This allows discrete signals x(n) and y(n) to always be used for the input and output of the nonlinear system, respectively. For the power amplifier case, an ADC is placed at its output; for an ADC, a DAC is placed at its input; and for an LNA, both a DAC and an ADC are used.

The identification process starts with an initialization of the vector w(n), which contains the coefficients of the estimated nonlinear model. The initialized vector is
Figure 2.16 Aided identification
then used to calculate the estimated nonlinear output d(n). The estimated output d(n) and the actual nonlinear system output y(n) are compared to give the error signal e(n). Using the nonlinear system input x(n) and the error signal e(n), the estimation algorithm updates the coefficients of the w(n) vector. The identification process keeps running until the algorithm has converged.

The estimation algorithms are generally based on a least squares criterion. The optimal estimator in the least squares sense is given by the Wiener–Hopf equation, based on the minimization of the cost function J = E[|e(n)|^2]. It is given by

w_opt = R_xx^{-1} R_yx,    (2.41)
where w_opt is the optimal estimate of the vector w in the least squares sense, R_xx is the autocorrelation matrix of the input and R_yx is the cross-correlation vector between the system output y and the system input x. The Wiener–Hopf equation gives an optimal solution; however, it requires a matrix inversion, which can be problematic on a physical system with a limited amount of processing power and memory. Furthermore, the behavior of the nonlinear system can deviate over time, as the power amplifier does when the temperature increases. To cope with these issues, two main algorithms exist: the least mean squares (LMS) algorithm and the recursive least squares (RLS) algorithm. These two algorithms do not use matrix inversion, which saves computational power and memory. Besides, they are adaptive and can update the model estimate over time.

The LMS algorithm is based on the minimization of the cost function J(n) = |e(n)|^2. As can be seen, to limit the amount of calculation and memory, the expectation found in the cost function of the Wiener–Hopf equation is approximated by the current error. The recursive equation of the LMS is given by

w(n) = w(n − 1) + μ x(n − 1) e*(n − 1),    (2.42)
where w(n) is the estimated coefficient vector at sample n, x(n) is the system input at sample n, e(n) = y(n) − d(n) is the error between the system output and the estimated output, and μ is the algorithm step size. The convergence speed of the algorithm increases with the step size; it can be sped up only until a certain step-size value beyond which the algorithm diverges. The limit value for convergence is given by

μ_max = 2 / λ_max,    (2.43)
where λ_max is the highest eigenvalue of the input autocorrelation matrix R_xx = E[x(n)* x(n)]. The LMS algorithm is very light in terms of computation but, in return, it can take a lot of samples to reach convergence. To deal with this issue, RLS-type algorithms can be used to speed up the convergence, at the cost of extra complexity. The RLS can be derived in two ways. The first one is based on the Wiener–Hopf equation: as discussed earlier, the Wiener–Hopf equation gives an optimal solution but at the cost of a matrix inversion. To bypass the matrix inversion, the matrix inversion lemma can be used to express in a recursive way the inversion
of the autocorrelation matrix R_xx. The second method to derive the RLS algorithm is to express its cost function and take the partial derivatives of this cost function with respect to the coefficient vector w(n). The RLS cost function is given by

J(n) = Σ_{i=0}^{n} λ^{(n−i)} e^2(i)    (2.44)
where λ is a coefficient between 0 and 1 used to forget past values of the signal. When λ = 1, the algorithm does not forget past samples and therefore is not adaptive. On the contrary, when λ → 0, the algorithm uses weights near 0 for past samples. This makes the algorithm very adaptive but also less effective. The LMS and RLS algorithms are of course not the only estimation algorithms used. However, most of them are based on one of these two.
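The sketch below illustrates the aided identification loop of Figure 2.16 with an LMS update in the spirit of (2.42). Here the scalar input of (2.42) is replaced by the full third-order polynomial regressor, the signals are real (so the conjugate is trivial), and the "true" system, data and step size are made up for the demonstration only.

import numpy as np

rng = np.random.default_rng(0)

def basis(x_n):
    """Third-order polynomial basis evaluated on the current input sample."""
    return np.array([x_n, x_n**2, x_n**3])

# Hypothetical "true" nonlinear system and data (stands in for the DAC/device/ADC chain)
true_coeffs = np.array([1.0, 0.1, -0.05])
x = rng.uniform(-1, 1, 20000)                          # known input x(n)
y = np.array([true_coeffs @ basis(xn) for xn in x])    # measured output y(n)

# LMS identification, cf. (2.42), with the polynomial basis as regressor
w = np.zeros(3)
mu = 0.05
for n in range(len(x)):
    phi = basis(x[n])
    d = w @ phi                 # estimated model output d(n)
    e = y[n] - d                # error e(n) = y(n) - d(n)
    w = w + mu * e * phi        # coefficient update

print(w)                        # should approach true_coeffs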
Blind identification
For various reasons, the nonlinear system input may not be available, be it for practical reasons or for a matter of cost. To perform the nonlinear system estimation anyway, blind identification techniques must be used. These techniques exploit input signal characteristics to perform the estimation; a commonly used piece of information is the input signal's statistics. Blind estimation techniques often use higher order statistics (HOS) methods. Contrary to second-order statistics (SOS), which use only the mean and variance to perform the estimation, HOS methods use higher order moments. HOS are usually calculated with cumulants. The main reason to use cumulants instead of correlations is their simpler calculation and expression; however, the number of samples needed for the estimation is higher for cumulants than for correlations. In this chapter, we will not go into the details of these techniques, but the pros and cons of several techniques found in the literature will be given.

In [37], a cumulant-based technique is introduced with a generalized Hammerstein model. Based on the knowledge of the input statistics and a normalization assumption on the estimated coefficient vector, the coefficients of the generalized Hammerstein model can be determined. If the input statistics are not known, a Gaussian distribution is assumed, which is suitable for OFDM modulations. Furthermore, this technique is robust against noise if its distribution is Gaussian. However, a major drawback of this technique is the normalization assumption, which can be restrictive since simple polynomial models cannot be modeled. Moreover, to be completely blind, the input distribution has to be Gaussian, which limits the number of use cases.

An alternative technique proposed in [38] is based on the maximum-likelihood criterion for Wiener systems. In this method, the conditional negative log-likelihood is determined and used by a Gauss–Newton algorithm to determine the best estimate iteratively. In this solution, a white Gaussian input is assumed, therefore dedicating this method to communication standards using OFDM modulations [39]. Moreover, a major drawback is the sensitivity to output noise; therefore, the authors suggest using this method for the identification of an RF amplification chain when the input is an OFDM communication signal. Based on [38], [40] proposes the blind identification
in two different cases: QAM and PSK modulations, and OFDM modulations. Compared to [38], the normalization constraints have been relaxed and the models have been extended to complex values. In [41], a blind system identification technique is proposed based on polyspectra with a Wiener–Hammerstein model. Similar to the power spectral density, which is defined as the Fourier transform of the autocorrelation, the Nth-order polyspectrum is defined as the N-dimensional Fourier transform of the N-dimensional correlation function. In this solution, polyspectra of different orders are calculated, and 2-D slices are then extracted to make the estimation. The major limitations of this solution are that the input has to be complex Gaussian and the minimum-phase assumption for the first filter of the model (Table 2.1).

As a conclusion, contrary to blind estimation techniques for linear systems [42,43], the knowledge on blind identification techniques for nonlinear systems is rather limited. Table 2.1 lists the main approaches with the corresponding assumptions and properties of each method. Blind identification techniques for nonlinear systems are dedicated to signals with a particular probability density; usually two cases can be distinguished, Gaussian and non-Gaussian. Furthermore, in blind identification methods, HOS methods are shown to be more robust against noise than SOS methods, mainly due to the denoising property of cumulants with additive white Gaussian noise. Finally, these solutions require strongly nonlinear systems (polynomial coefficients greater than 10^−1) to be accurate [44].

Table 2.1 Blind digital estimation techniques

Generalized Hammerstein: cumulant based [37]
• Input: Gaussian and white
• Robust to low-order moving average output noise
• Consistent
• Parametric
• Stringent normalization assumption

Wiener: maximum likelihood [38]
• Input: Gaussian and white
• Invertible nonlinearity and filtering
• Not robust to output noise
• Consistent and efficient
• Iterative Gauss–Newton-based methods
• Parametric

Generalized Hammerstein: cumulant based [40]
• QAM, PSK and OFDM
• Robust to low-order moving average output noise
• Consistent
• Parametric
• Relaxed normalization assumption

Wiener–Hammerstein: polyspectra slices [41]
• Input: circularly symmetric Gaussian
• Polynomial nonlinearity
• First linear system has minimum phase
• Robust to circular symmetric output noise
• Nonlinearity is not identified
• Nonparametric
2.4.1.2 Inversion
pth-Order inverse
In order to perform the linearization, the inverse of the estimated nonlinear system has to be determined. As seen above in Figures 2.14(a), 2.14(b), 2.15(a) and 2.15(b), the inverse can be located either at the front of the nonlinear system (pre-distortion) or at the back (post-distortion). A common technique to invert a nonlinear system is the pth-order inverse proposed by Schetzen [45]. In this technique, distortions up to the order p are canceled. In Figure 2.17, the nonlinear system, described by its Volterra operator H[.], is followed by the pth-order inverse described by the Volterra operator K[.]. The combination of the nonlinear system and its inverse forms a global system described by the Volterra operator Q[.], which should be more linear than H[.]. The nonlinear system input is denoted x(t) and its output y(t), whereas the pth-order inverse output is denoted z(t). Let us write the nonlinear system output

y(t) = H[x(t)] = Σ_{n=1}^{∞} H_n[x(t)]    (2.45)

where H_n[.] is the nth-order Volterra operator:

H_n[x(t)] = ∫_{τ_1=0}^{∞} ··· ∫_{τ_n=0}^{∞} h_n(τ_1, ..., τ_n) Π_{i=1}^{n} x(t − τ_i) dτ_i    (2.46)
Let us now express the output of the system made up of the nonlinear system and its pth-order inverse K[.]:

z(t) = K[y(t)]    (2.47)
     = Σ_{n=1}^{p} K_n[y(t)]    (2.48)

Furthermore, using the operator of the cascaded system Q[.]:

z(t) = Q[x(t)]    (2.49)
     = K[H[x(t)]]    (2.50)
     = x(t) + Σ_{n=p+1}^{∞} Q_n[x(t)]    (2.51)
Figure 2.17 Cascade of the nonlinear system followed by its pth-order inverse
In the pth-order inverse technique, K_n[.] is determined such that

Σ_{n=1}^{p} Q_n[x(t)] = x(t).    (2.52)

The remaining term Σ_{n=p+1}^{+∞} Q_n[x(t)] will be non-zero for models using a non-orthogonal basis such as Volterra series or polynomial models. In order to show the non-orthogonality of the polynomial model, let us take a simple example, illustrated in Figure 2.18. In this example, the nonlinear system is a fifth-order polynomial model and the input is a single tone at the radial frequency ω_0. The nonlinear system output and input can be expressed as

y = α_1 x + α_2 x^2 + α_3 x^3 + α_4 x^4 + α_5 x^5,    (2.53)
x = A sin(ω_0 t).    (2.54)
In an orthogonal basis, the x^2 term would contribute only to the second harmonic at 2ω_0, the x^3 term only to the third harmonic at 3ω_0, and so on. However, since the polynomial model does not have an orthogonal basis, the x^2 term contributes to the second harmonic and to DC, the x^3 term contributes to the third harmonic and to the fundamental, the x^4 term contributes to the fourth harmonic, the second harmonic and DC, and the x^5 term contributes to the fifth, third and first harmonics. For the pth-order inverse technique, this means that with Taylor or Volterra series the term Σ_{n=p+1}^{∞} Q_n[x(t)] will always contain distortion terms within the band of the input signal x(t). Therefore, in practice, this technique is used only for weakly nonlinear systems in a limited range of input power. Moreover, the inverse complexity increases exponentially with p, which in practice limits the inverse to low values of p. To tackle this issue, [46] proposed to reduce the complexity by relaxing the maximum distortion order (p in Schetzen's technique) of the inverse; the simplified inverse algorithm is then expressed by a recursive equation. Another approach, in [4], proposes to perform the pth-order inverse of Schetzen with orthogonal bases for memoryless models. As a matter of fact, distortions of different orders are uncorrelated with orthogonal
Figure 2.18 Harmonic distortion: high-order distortion
polynomial models. For example, given the third-order Chebyshev polynomial and a normalized tone x(t) = cos(ωt):

T_3(x) = 4x^3 − 3x
       = 4 [ (3/4) cos(ωt) + (1/4) cos(3ωt) ] − 3 cos(ωt)
       = cos(3ωt).    (2.55)

As can be seen, only a third-order distortion term exists in a third-order Chebyshev polynomial, contrary to power series, which also contain a term at the radial frequency ω. Therefore, a full cancellation of distortion terms up to the order p can be achieved. Thereby, using a Chebyshev model with a sinusoidal input, or a Hermite model with a Gaussian input, allows an exact inverse to be computed.
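This orthogonality property is easy to verify numerically. The small Python check below (purely illustrative) compares the spectrum of T_3 of a unit-amplitude sinusoid with the spectrum of the plain cube x^3: the Chebyshev polynomial produces only the third harmonic, while x^3 also leaks onto the fundamental, as in (2.55).

import numpy as np

N = 1024
n = np.arange(N)
x = np.cos(2 * np.pi * 8 * n / N)       # normalized tone, 8 cycles per record

t3 = 4 * x**3 - 3 * x                   # third-order Chebyshev polynomial T3(x)
cube = x**3                             # plain third-order power term

X3 = np.fft.rfft(t3) / (N / 2)
Xc = np.fft.rfft(cube) / (N / 2)

print(abs(X3[8]), abs(X3[24]))          # T3: ~0 at the fundamental, ~1 at the 3rd harmonic
print(abs(Xc[8]), abs(Xc[24]))          # x^3: ~0.75 at the fundamental, ~0.25 at the 3rd harmonic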
DLA analytical method
This method, first demonstrated in [47] and shown in Figure 2.19, aims to find a pre-inverse (cf. Figure 2.20) for a memory polynomial model with an analytical method. First, the output of the nonlinear system is expressed using the memory polynomial model:

y_n = Σ_{l=0}^{L} B_l(b_l, x_{n−l})    (2.56)

where L is the maximum delay and

B_l(b_l, x_{n−l}) = x_{n−l} Σ_{d=1}^{D_l} b_{ld} |x_{n−l}|^{d−1} = x_{n−l} β_l(|x_{n−l}|).    (2.57)

Figure 2.19 Digital pre-distortion scheme
Figure 2.20 Pre-inverse: DLA AM algorithm scheme
β_l is therefore a polynomial function which takes the input x with a specific delay. In a perfect pre-distortion system, the nonlinear system output y(n) should be equal to the pre-distorter input u(n). Therefore, we can write

u_n = Σ_{l=0}^{L} x_{n−l} β_l(|x_{n−l}|).    (2.58)

Finally, let us isolate the term x_n, which is the pre-distorter output:

x_n = (1 / β_0(|x_n|)) ( u_n − Σ_{l=1}^{L} x_{n−l} β_l(|x_{n−l}|) ).    (2.59)

However, the right-hand side also depends on x_n. To cope with that issue, x_n is approximated iteratively:

x̃_n(i + 1) = (1 / β_0(|x̃_n(i)|)) ( u_n − Σ_{l=1}^{L} x_{n−l} β_l(|x_{n−l}|) )    (2.60)

where x̃_n is an estimate of x_n and x̃_n(0) is set to u_n. The algorithm typically converges after five to ten iterations (Figures 2.19 and 2.20).
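A minimal Python sketch of the fixed-point iteration (2.60) is given below, assuming the memory-polynomial coefficients b_ld of the PA model have already been estimated. The coefficient values, signal and iteration count are illustrative placeholders only.

import numpy as np

def beta(r, b_l):
    """Polynomial beta_l(|x|) = sum_d b_ld * |x|^(d-1) for one delay tap l, cf. (2.57)."""
    return sum(b * r**(d - 1) for d, b in enumerate(b_l, start=1))

def dla_am_predistort(u, B, n_iter=8):
    """Compute the pre-distorter output x from the input u by iterating (2.60).

    B[l] holds the (assumed already-estimated) coefficients b_ld of the PA memory
    polynomial for delay l = 0..L; the values used below are made up.
    """
    L = len(B) - 1
    x = np.zeros(len(u), dtype=complex)
    for n in range(len(u)):
        # contribution of already-computed past pre-distorter outputs x(n-l), l >= 1
        past = sum(x[n - l] * beta(abs(x[n - l]), B[l])
                   for l in range(1, L + 1) if n - l >= 0)
        x_hat = u[n]                                   # initialization x~_n(0) = u_n
        for _ in range(n_iter):                        # fixed-point iteration (2.60)
            x_hat = (u[n] - past) / beta(abs(x_hat), B[0])
        x[n] = x_hat
    return x

# Example with a short memory (L = 1) and a mild nonlinearity (made-up coefficients)
B = [np.array([1.0, 0.0, -0.08]), np.array([0.05, 0.0, 0.0])]
u = 0.7 * np.exp(2j * np.pi * 0.02 * np.arange(200))
x = dla_am_predistort(u, B)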
DLA based on nonlinear filter
As for the DLA-AM, the DLA based on nonlinear filters (DLA-NF) is used to compute the coefficients of a pre-distorter. This method and its calculations are shown in [48], as for the DLA-AM architecture. In the DLA-NF, we assume that the PA has been modeled with a memory polynomial model and that its coefficients have been properly estimated. Using an adaptive algorithm, the coefficients of the pre-inverse are calculated such that the error between the PA output and the pre-inverse input tends to zero. Usually, adaptive filters are used for the estimation, but they face numerical instabilities due to local minima, which appear mainly because this is a nonlinear optimization problem. In the digital pre-distortion scheme (Figure 2.19), where a PA is considered as the nonlinear system, we can express the output of the pre-distorter x as a function of the input u using a memory polynomial model:

x(n) = Σ_{p=0}^{P−1} Σ_{m=0}^{M} w_{mp} φ_{mp}[u(n)]    (2.61)

As can be seen in this expression, x(n) depends linearly on the weights w_{mp} in a nonlinear basis φ[·] of the input u(n). To get a more compact expression of x, a matrix form can be used:

x(n) = w^T φ[u(n)]    (2.62)
The output of the pre-distorter, x(n), is the PA input, whose model coefficients a_kl have been estimated. In the ideal case where the w_{mp} are perfect, the signal x(n) at the PA input produces the output d(n) = g u(n), where g is the PA gain. The error function is therefore given by e(n) = d(n) − y(n). A cost function could be the mean square error E[|e(n)|^2]. However, in practice we do not have this expected value; therefore, we approximate it by the current squared error |e(n)|^2. This gives a well-known adaptive filter, the LMS algorithm. The coefficient vector w is updated at every iteration such that, when w converges, it minimizes the error |e(n)|^2:

w(n + 1) = w(n) + μ e*(n) ψ(n)    (2.63)

where μ is the step-size parameter, which sets the convergence speed as well as the variance of the estimator, and ψ(n) = ∂y(n)/∂x(n). In order to determine ψ(n), the partial derivative is rewritten using the estimated coefficient vector:

∂y(n)/∂x(n) = (∂y(n)/∂w(n)) (∂w(n)/∂x(n))    (2.64)

Assuming that the vector w(n) changes with very small steps, we can write w(n) ≈ w(n − l), where l ∈ {1, ..., L} and L is the memory depth of the PA memory polynomial model, and determine the final expression of the LMS:

w(n + 1) = w(n) + μ e*(n) Σ_{l=0}^{L} Σ_{k=0}^{K} ((k + 2)/2) a_kl |x(n − l)|^k u(n − l)    (2.65)

As can be seen in the above expression, each new iteration depends on the error committed on the previous sample, on the pre-distorter input and output, and finally on the model coefficients a_kl of the PA.
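To make (2.65) concrete, the sketch below evaluates, for one sample, the quantity μ e*(n) Σ_l Σ_k ((k+2)/2) a_kl |x(n−l)|^k u(n−l) that drives the weight update. All numerical values (the PA coefficients a_kl, the gain g, the step size and the signals) are made up for illustration; this is not an implementation of the full adaptive loop of [48].

import numpy as np

# Hypothetical, already-estimated PA memory-polynomial coefficients a_kl,
# indexed as A[k, l] with k = 0..K and l = 0..L (values are illustrative).
K, L = 2, 1
A = np.array([[1.0, 0.05],
              [0.0, 0.00],
              [-0.08, 0.00]])

def lms_update_term(x, u, n):
    """Double sum of (2.65): sum_l u(n-l) * sum_k (k+2)/2 * a_kl * |x(n-l)|^k."""
    s = 0j
    for l in range(L + 1):
        if n - l >= 0:
            s += u[n - l] * sum((k + 2) / 2 * A[k, l] * abs(x[n - l])**k
                                for k in range(K + 1))
    return s

mu, g = 0.01, 1.0                                    # placeholder step size and PA gain
u = 0.7 * np.exp(2j * np.pi * 0.01 * np.arange(64))  # placeholder pre-distorter input
x = u.copy()                                         # pre-distorter output (pass-through start)
y_n = sum(A[k, l] * abs(x[10 - l])**k * x[10 - l]    # PA output sample y(10) for this toy model
          for l in range(L + 1) for k in range(K + 1))
e = g * u[10] - y_n                                  # error e(n) = d(n) - y(n)
drive = mu * np.conj(e) * lms_update_term(x, u, 10)  # quantity driving the update in (2.65)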
2.4.2 Indirect learning architecture
2.4.2.1 Introduction
In the direct correction architecture, we have seen that two steps are required to estimate the inverse of the nonlinear system: first, the nonlinear system is estimated with a predetermined nonlinear model; then, an inverse is calculated from the estimated model. An alternative way is to bypass the calculation of the inverse from the nonlinear system model and directly estimate the nonlinear system inverse. This is called the indirect learning architecture. As for the DLA, four different cases can be drawn: aided estimation with pre-inverse in Figure 2.21(a), aided estimation with post-inverse in Figure 2.21(b), blind estimation with pre-inverse in Figure 2.22(a), and blind estimation with post-inverse in Figure 2.22(b). Regarding the compensation of nonlinear distortions at the PA output, the aided architecture with pre-inverse is preferred. In some cases, especially for mobile applications, the blind architecture with pre-inverse can be used: to save power consumption, area and cost, the estimation of the inverse of the nonlinear model of a mobile phone can be performed at the base station. This scenario assumes a feedback path to send the estimated model back to the mobile. To compensate nonlinear distortions from the
Figure 2.21 Aided indirect correction architectures: (a) with pre-inverse and (b) with post-inverse
Figure 2.22 Blind indirect correction architectures: (a) with pre-inverse and (b) with post-inverse
receiver, the blind architecture with post-inverse is often the only alternative, unless pilot signals can be used to perform an aided estimation.
2.4.2.2 Nonlinear system inverse estimation
Aided estimation
The aided estimation of the nonlinear system inverse is described in Figure 2.23. The nonlinear system input x(n) is first distorted by the nonlinear system, whose output is y(n). The distorted signal y(n) is then processed by the estimated nonlinear inverse with the initialization vector w_0(n). The resulting signal d(n) is then compared to the nonlinear system input x(n) to give the error signal e(n). An estimation algorithm can then be used to estimate the next coefficient vector w(n) used by the estimated nonlinear inverse. To perform its estimation, the algorithm uses the error signal e(n) as well as the nonlinear system output y(n). As in the DLA, the estimation algorithm is usually a least squares-based algorithm. Once estimated, the inverse model can be applied as a pre- or a post-inverse.
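A minimal sketch of this aided indirect-learning idea is shown below: a memoryless third-order post-inverse is fitted by least squares on pairs (y(n), x(n)), and the resulting coefficients can then be copied to a pre-inverse. The "true" nonlinear system and the data are made up for illustration.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical memoryless third-order nonlinear system and training data
x = rng.uniform(-1, 1, 5000)          # known system input x(n)
y = x + 0.1 * x**2 - 0.08 * x**3      # measured system output y(n)

# Indirect learning: fit the inverse directly, so that d(n) = f_inv(y(n)) matches x(n)
Phi = np.column_stack([y, y**2, y**3])       # polynomial basis built on the OUTPUT y(n)
w, *_ = np.linalg.lstsq(Phi, x, rcond=None)  # least-squares estimate of the inverse coefficients

d = Phi @ w                                  # post-inverse output
print(np.max(np.abs(d - x)))                 # residual error of the estimated inverse
# Once estimated, w can also be applied as a pre-inverse before the nonlinear system.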
Blind estimation In the blind scenario, we assume that the input signal x(n) is unknown. Therefore, it is impossible to compare the output of the nonlinear system inverse d(n) with the
Figure 2.23 ILA-aided identification schematic
Figure 2.24 ILA blind identification schematic with post-distortion
nonlinear system input x(n) to feed the estimation algorithm. To cope with this issue, the property of nonlinear systems to spread the signal bandwidth can be used. Assuming that nonlinear distortions fall into a free frequency band, the signal contained in this free frequency band can be used as an error signal to feed the estimation algorithm of the nonlinear inverse. Indeed, suppressing nonlinear distortions in a specific band can be considered as linearizing the system over the whole system bandwidth. This is particularly true if the nonlinear system considered has no memory or a short memory depth. This scenario is depicted in Figure 2.24.
2.5 Summary
Nowadays, with increasingly wide system bandwidths, nonlinearity constraints become more and more critical. Therefore, being able to accurately model the
nonlinear behavior of the RF blocks becomes essential. Models are necessary not just to characterize but also to correct and compensate for the nonlinear behavior of systems, with increasingly attractive digital solutions. The choice of the model depends on the application: complex models are suited for characterization and for the correction of wired power devices, while low-complexity models are suited for mobile applications. In this chapter, we have first presented classical nonlinear models from the literature, from the simplest to the most complex. Then we discussed which models are best suited for typical RF blocks. Finally, we introduced how these models can be used for the digital compensation of nonlinear distortions in transmitters and in receivers. This chapter is also an introduction to the following chapters, which deal more specifically with the compensation of nonlinear systems, such as the pre-distortion and post-distortion techniques.
References
[1] Batruni R. Curing nonlinear distortion. Embedded Systems Design. 2006;19(8):20–36.
[2] Schetzen M. The Volterra and Wiener Theories of Nonlinear Systems. A Wiley–Interscience Publication. Wiley; 1980. Available from: https://books.google.fr/books?id=S0XvAAAAMAAJ.
[3] Mileounis G, Koukoulas P, and Kalouptsidis N. Input–output identification of nonlinear channels using PSK, QAM and OFDM inputs. In: IEEE International Conference on Acoustics, Speech and Signal Processing, 2008. ICASSP 2008; 2008. p. 3589–3592.
[4] Tsimbinos J and Lever KV. Nonlinear system compensation based on orthogonal polynomial inverses. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications. 2001;48(4):406–417.
[5] Schreurs D, O'Droma M, Goacher AA, et al. RF Power Amplifier Behavioral Modeling. 1st ed. New York, NY, USA: Cambridge University Press; 2008.
[6] Gharaibeh KM. Nonlinear Distortion in Wireless Systems: Modeling and Simulation with MATLAB®. Piscataway, NJ: IEEE Press; Chichester, West Sussex, UK: Wiley; 2011.
[7] Zhou GT, Qian H, Ding L, et al. On the baseband representation of a bandpass nonlinearity. IEEE Transactions on Signal Processing. 2005;53(8):2953–2957.
[8] Albulet M. RF Power Amplifiers. Electromagnetic Waves. Institution of Engineering and Technology; 2001.
[9] Cripps SC. Advanced Techniques in RF Power Amplifier Design. Artech House Microwave Library. Norwood, MA: Artech House; 2002.
[10] Vuolevi J and Rahkonen T. Distortion in RF Power Amplifiers. Artech House Microwave Library. Norwood, MA: Artech House; 2003.
[11] Raghavan A, Srirattana N, and Laskar J. Modeling and Design Techniques for RF Power Amplifiers. Wiley–IEEE. Chichester, UK: Wiley; 2008.
[12] Eroglu A. Introduction to RF Power Amplifier Design and Simulation. Oakville, Canada: CRC Press; 2015.
[13] Ghannouchi FM, Hammi O, and Helaoui M. Behavioral Modeling and Predistortion of Wideband Wireless Transmitters. New York: Wiley; 2015.
[14] Isaksson M, Wisell D, and Ronnow D. A comparative analysis of behavioral models for RF power amplifiers. IEEE Transactions on Microwave Theory and Techniques. 2006;54(1):348–359.
[15] Pedro JC and Maas SA. A comparative overview of microwave and wireless power-amplifier behavioral modeling approaches. IEEE Transactions on Microwave Theory and Techniques. 2005;53(4):1150–1163.
[16] Razavi B and Behzad R. RF Microelectronics. vol. 1. Upper Saddle River, NJ: Prentice Hall; 1998.
[17] Ganesan S, Sanchez-Sinencio E, and Silva-Martinez J. A highly linear low-noise amplifier. IEEE Transactions on Microwave Theory and Techniques. 2006;54(12):4079–4085.
[18] Chen WH, Liu G, Zdravko B, et al. A highly linear broadband CMOS LNA employing noise and distortion cancellation. IEEE Journal of Solid-State Circuits. 2008;43(5):1164–1176.
[19] Fan X, Zhang H, and Sanchez-Sinencio E. A noise reduction and linearity improvement technique for a differential cascode LNA. IEEE Journal of Solid-State Circuits. 2008;43(3):588–599.
[20] Zhang H and Sanchez-Sinencio E. Linearization techniques for CMOS low noise amplifiers: A tutorial. IEEE Transactions on Circuits and Systems I: Regular Papers. 2011;58(1):22–36.
[21] Aparin V and Larson LE. Modified derivative superposition method for linearizing FET low-noise amplifiers. IEEE Transactions on Microwave Theory and Techniques. 2005;53(2):571–581.
[22] Fairbanks JS and Larson LE. Analysis of optimized input and output harmonic termination on the linearity of 5 GHz CMOS radio frequency amplifiers. In: Radio and Wireless Conference, 2003. RAWCON '03. Proceedings; 2003. p. 293–296.
[23] Baki RA, Tsang TKK, and El-Gamal MN. Distortion in RF CMOS short-channel low-noise amplifiers. IEEE Transactions on Microwave Theory and Techniques. 2006;54(1):46–56.
[24] Aparin V and Persico C. Effect of out-of-band terminations on intermodulation distortion in common-emitter circuits. In: 1999 IEEE MTT-S International Microwave Symposium Digest (Cat. No.99CH36282). vol. 3; 1999. p. 977–980.
[25] Zhang H, Fan X, and Sanchez-Sinencio E. A low-power, linearized, ultra-wideband LNA design technique. IEEE Journal of Solid-State Circuits. 2009;44(2):320–330.
[26] Kim TW. A common-gate amplifier with transconductance nonlinearity cancellation and its high-frequency analysis using the Volterra series. IEEE Transactions on Microwave Theory and Techniques. 2009;57(6):1461–1469.
[27] Allén M, Marttila J, Valkama M, et al. Digital linearization of direct-conversion spectrum sensing receiver. In: 2013 IEEE Global Conference on Signal and Information Processing; 2013. p. 1158–1161.
[28] Habibi H, Janssen EJG, Yan W, et al. Digital compensation of cross-modulation distortion in multimode transceivers. IET Communications. 2012;6(12):1724–1733.
[29] Allén M, Marttila J, Valkama M, et al. Digital full-band linearization of wideband direct-conversion receiver for radar and communications applications. In: 2015 49th Asilomar Conference on Signals, Systems and Computers; 2015. p. 1361–1368.
[30] Umoh I and Ogunfunmi T. Digital post-linearization of a wideband low noise amplifier for ultra-wideband wireless receivers. In: 2011 IEEE International Symposium of Circuits and Systems (ISCAS); 2011. p. 1275–1278.
[31] Umoh I and Ogunfunmi T. Digital post-distortion linearization of wideband wireless receiver nonlinearity. In: 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS); 2014. p. 431–434.
[32] Grimm M, Allén M, Marttila J, et al. Joint mitigation of nonlinear RF and baseband distortions in wideband direct-conversion receivers. IEEE Transactions on Microwave Theory and Techniques. 2014;62(1):166–182.
[33] Linear Technology. Very Low Noise, High Frequency Active RC, Filter Building Block. LT1568 datasheet; 2007.
[34] Linear Technology. Precision, Low Power, Differential Amplifier/ADC Driver Family. LTC6363 Family; 2018.
[35] Ghannouchi FM and Hammi O. Behavioral modeling and predistortion. IEEE Microwave Magazine. 2009;10(7):52–64.
[36] Yu C, Guan L, Zhu E, et al. Band-limited Volterra series-based digital predistortion for wideband RF power amplifiers. IEEE Transactions on Microwave Theory and Techniques. 2012;60(12):4198–4208.
[37] Kalouptsidis N and Koukoulas P. Blind identification of Volterra–Hammerstein systems. IEEE Transactions on Signal Processing. 2005;53(8):2777–2787.
[38] Vanbeylen L, Pintelon R, and Schoukens J. Blind maximum-likelihood identification of Wiener systems. IEEE Transactions on Signal Processing. 2009;57(8):3017–3029.
[39] Brillinger DR. Time Series: Data Analysis and Theory. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics; 2001.
[40] Mileounis G, Kalouptsidis N, and Koukoulas P. Blind identification of Hammerstein channels using QAM, PSK, and OFDM inputs. IEEE Transactions on Communications. 2009;57(12):3653–3661.
[41] Prakriya S and Hatzinakos D. Blind identification of LTI-ZMNL-LTI nonlinear channel models. IEEE Transactions on Signal Processing. 1995;43(12):3007–3013.
[42] Abed-Meraim K, Qiu W, and Hua Y. Blind system identification. Proceedings of the IEEE. 1997;85(8):1310–1322.
[43] Tong L and Perreau S. Multichannel blind identification: from subspace to maximum likelihood methods. Proceedings of the IEEE. 1998;86(10):1951–1968.
[44] Peng L and Ma H. Design and implementation of software-defined radio receiver based on blind nonlinear system identification and compensation. IEEE Transactions on Circuits and Systems I: Regular Papers. 2011;58(11):2776–2789.
[45] Schetzen M. Theory of pth-order inverses of nonlinear systems. IEEE Transactions on Circuits and Systems. 1976;23(5):285–291.
[46] Sarti A and Pupolin S. Recursive techniques for the synthesis of a pth-order inverse of a Volterra system. European Transactions on Telecommunications. 1992;3(4):315–322.
[47] Kim J and Konstantinou K. Digital predistortion of wideband signals based on power amplifier model with memory. Electronics Letters. 2001;37(23):1417–1418.
[48] Hussein MA, Bohara VA, and Venard O. On the system level convergence of ILA and DLA for digital predistortion. In: 2012 International Symposium on Wireless Communication Systems (ISWCS); 2012. p. 870–874.
Chapter 3
Digital predistortion
Geneviève Baudoin1, Olivier Venard1 and Dang-Kièn Germain Pham2
In [1], Katz et al. provide an overview of the history of power amplifier (PA) linearization and lay out its motivation. The concern with linearizing power amplifiers dates from the beginning of broadcasting and the expansion of telecommunications in the 1920s [1]. In these early years, the feedforward approach was introduced by the Bell Labs to mitigate the cross modulation of voice-modulated carriers transmitted through cables and repeaters. A few years later, the Bell Labs introduced the feedback linearizer architecture, which has the advantage over the feedforward architecture of being self-adaptive to drifts, but has the drawback of being narrowband with respect to today's needs. From the 1980s, the concern was not only the linearization but also the efficiency of the transmitter, and (analog) predistorters started to be used for satellite links. As transmissions shifted to waveforms with higher spectral efficiency, such as quadrature amplitude modulation with M symbols (M-QAM), the requirement for linearity became more stringent and opened the way for the development of digital predistortion (DPD). Even if the main goal of DPD is to linearize the power amplifier, it also contributes, in many cases, to improving the power efficiency, which is important since power amplifiers are responsible for a great part of the power consumption in base stations of wireless cellular networks [2]. It is worth noting that linearization and predistortion were also widely used in the field of audio, and especially high-fidelity audio, where loudspeakers are highly nonlinear components [3]. This domain also gave rise to a lot of papers about Volterra series-based modeling.
3.1 Why do we need predistortion?
Modulated waveforms used for modern wireless communications (see [4], for instance, for 5G candidates) have a high spectral efficiency (i.e., (bit/s)/Hz); the counterpart of this high efficiency is an increase of the dynamic range between the average transmitted power and the peak transmitted power (peak-to-average power ratio, PAPR).
1 ESIEE-Paris, System Engineering Department – ESYCOM CNRS FRE 2028, France
2 ComElec Department, LTCI, Télécom ParisTech, France
On the other hand, the wider the PAPR, the harder it is for a PA to transmit the waveform with low distortion (i.e., high linearity) in a power-efficient manner.
3.1.1 Waveform features
3.1.1.1 Complementary cumulative distribution function
A useful statistical characterization of the waveform to be transmitted is the complementary cumulative distribution function (CCDF) of its instantaneous power, an example of which for a multi-carrier signal is plotted in Figure 3.1(a). The CCDF, G(β), is defined as

G(β) = 1 − ∫_{−∞}^{β} p(x) dx,    (3.1)
where p(x) is the probability density function (PDF) of the waveform's instantaneous power; this PDF for the same multi-carrier signal is plotted as a dotted line in Figure 3.1(b). The average power P_avg, which is ∫_{−∞}^{+∞} x p(x) dx statistically speaking, is highlighted in Figure 3.1(b) with the vertical line around 1 dBm. To its right, another light vertical line marks the considered peak power P_peak, around 9 dBm in this example. As stated in (3.1), the instantaneous power ranges from −∞ to +∞, so the considered peak power corresponds to a value α for which

p(x < α) = ∫_{−∞}^{α} p(x) dx < 1.    (3.2)
The probability that the instantaneous power of the waveform exceeds the considered peak power is then given by

p(x > α) = ∫_{α}^{+∞} p(x) dx.    (3.3)
Figure 3.1 CCDF and PAPR: (a) CCDF of instantaneous power (dBm) of the transmitted signal (3.1) and (b) PDF of instantaneous power (dBm) of the transmitted signal, p(x) (dotted curve). AM–AM curve of the PA (top curve), instantaneous power added efficiency of the PA, ηPAE (bottom curve)
67
Obviously, α should be chosen so as p (x > α) is small enough to be considered neglectable, typically 1% or 1‰.
3.1.1.2 Peak-to-average power ratio From the above discussion, the linear PAPR is defined as PAPRlin =
Ppeak , Pavg
(3.4)
or in dB: Ppeak . = 10 log Pavg
PAPRdB
(3.5)
Another equivalent figure, sometimes considered, is the crest factor (CF) which is the square root of the PAPR: xmax CFdB = 10 log , (3.6) xrms where xmax is the maximum value of the magnitude of the waveform and xrms is the RMS value of this magnitude.
3.1.1.3 Frequency bandwidth Another feature of the transmitted waveform that will have a strong impact is its frequency bandwidth, as for wideband signals, the PA could exhibit nonlinear memory effects which will generally be lower or even neglectable for narrowband signals [5].
3.1.1.4 Stationarity In the simplest scenario, the above-discussed features of the transmitted waveform will remain constant over time, but in some applications or for some waveforms, they will vary with time giving rise to a dynamic nonlinear behavior of the PA. Such non-stationarity will happen for instance when dealing with frequency hopping waveform or 4G base station PA where the bandwidth of the transmitted signal will vary with the number of communications.
3.1.2 System level considerations Many authors emphasize that the power consumption of a transmission system and its power efficiency result from the operating points of its “components” [6] and that an optimal choice results from trade-off between conflicting constraints [7]: the waveform considered and its spectral efficiency vs PAPR; the CF reduction (CFR) and the digital predistortion components algorithms and parameterization, the selected backoff for the PA and finally the budget of allowed degradation of the transmitted signal (in-band and out-of-band distortion).
68
Digitally enhanced mixed signal systems
3.1.2.1 Linearity–efficiency trade-off Roughly speaking, the operating point of a PA will correspond to the targeted average transmitted power, Pavg , Figure 3.1(b). The average power added efficiency is then given by Ppeak PAEavg = ηPAE (y)p(y)dy, (3.7) 0
where p(y) is the PDF of the instantaneous power and ηPAE(y) is the related PAE. To reach the best efficiency (around 11 dBm in Figure 3.1(b)), the operating point must be as close as possible to the saturation level of the PA, but the PA becomes highly nonlinear in this region. As the transmitted signal experiences these nonlinearities, its figures of merit (error vector magnitude (EVM) and adjacent channel power ratio (ACPR), defined in Section 3.1.2.3 hereafter) are degraded. Its dynamic range must therefore lie in a linear, or at least mildly nonlinear, behavior area to accommodate the allowable distortion budget (EVM and ACPR) at the transmitter output. But by doing so, the resulting PAE_avg, (3.7), is degraded.
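A minimal numerical illustration of (3.7) is sketched below; the class-B-like efficiency curve and the exponential power PDF are assumptions used only to make the integral concrete, not data from the chapter.

```python
import numpy as np

# Assumed instantaneous PAE curve: grows like the square root of the normalised
# power (class-B-like), reaching 60% efficiency at peak power.
p = np.linspace(1e-6, 1.0, 2000)            # instantaneous power, normalised to P_peak
eta_pae = 0.6 * np.sqrt(p)

# Assumed envelope statistics: complex Gaussian signal -> exponential power PDF
p_avg = 0.15                                 # average power, i.e. roughly 8 dB backoff
pdf = np.exp(-p / p_avg) / p_avg

# Average PAE per (3.7): integral of eta_PAE(y) * p(y) dy over [0, P_peak]
pae_avg = np.trapz(eta_pae * pdf, p)
print(f"Average PAE: {100 * pae_avg:.1f} %")
```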
3.1.2.2 System level trade-off
The ultimate goal of a transmission link is to guarantee a given maximum bit error rate (BER). The BER is a function of the SNR at the receiver, which in turn depends on the attenuation of the transmission channel and the EVM budget of the transmitter. In [6], the authors take into account the power consumption of the DPD and CFR functions and state, logically, that DPD and CFR are fruitful when the sum of their power consumption plus the PA power consumption is below the power consumption of the PA alone for a given BER. When there is no linearization processing, the linearity of the transmitter relies only on the output backoff (OBO), which in turn depends on the PAPR of the transmitted signal and the PA transfer function. Conversely, when linearization processing is considered, the OBO can be decreased because of the CFR and because the DPD is able to mitigate the nonlinear behavior of the PA. The efficiency improvement obtained provides the power budget to implement the linearization processing. As the OBO is decreased, the PA operates in a more strongly nonlinear region; this may have two consequences: the complexity of the CFR/DPD, and thus their power consumption, is increased, and the DPD may not be able to compensate the nonlinearities completely. In order to preserve the BER, we would then need to increase the output power to get a better SNR at the receiver. And this, obviously, increases the power consumption, and so on.
3.1.2.3 Figures of merit Nonlinearities have two impacts on the transmitted waveform, namely, the in-band distortion which is a degradation of the information transmitted by the transmitter under consideration and the out-of-band distortion which is a perturbation caused to the other transmissions, i.e., a degradation of their transmission channel. The measures
of quality for these distortions are, respectively, the EVM and the ACPR or adjacent channel leakage ratio (ACLR). The definitions of these figures of merit are given hereafter.
Error vector magnitude
The EVM is a distortion measure between a reference constellation mapping and the actual symbol constellation, which can be taken at the output of the transmitter or at the receiver just in front of the decision device. To compute the EVM, one has first to normalize the power of both constellations under consideration and to compensate for the rotation and the DC offset of the actual symbol constellation (Figure 3.2). Let us consider z(k), the actual complex symbol; it can be related to s(k), the corresponding reference constellation point, by the following relation:

z(k) = \left( c_0 + c_1\,(s(k) + e(k)) \right) e^{a+jr}, \qquad (3.8)
where c_0 is the DC offset of the origin of the constellation, c_1 is the complex normalization gain, a comes from an amplitude variation and r is a phase offset coming from a frequency offset. The residual error which allows one to compute the EVM is then given by

e(k) = \frac{z(k)\,e^{-(a+jr)} - c_0}{c_1} - s(k). \qquad (3.9)

There are three different ways to normalize the average power of the error signal: it can be normalized to the average power of the actual symbols, s(k), to the nominal average power of the constellation or to the nominal maximum power of the constellation. Unless otherwise noted, we will consider in the sequel the first choice, given by

\mathrm{EVM}_{\mathrm{RMS}} = \sqrt{\frac{\sum_k |e(k)|^2}{\sum_k |s(k)|^2}}. \qquad (3.10)
The EVM may be expressed in dB but more often as a percentage (%).
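A minimal sketch of (3.9)–(3.10) in numpy is given below; it assumes the gain, offset and rotation terms have already been estimated, and the QPSK test symbols are an assumption used only for the example.

```python
import numpy as np

def evm_rms(z, s, c0=0.0, c1=1.0, a=0.0, r=0.0):
    """EVM normalised to the average power of the reference symbols, per (3.9)-(3.10).
    z: received symbols, s: reference symbols, (c0, c1, a, r): correction terms."""
    e = (z * np.exp(-(a + 1j * r)) - c0) / c1 - s        # residual error, (3.9)
    return np.sqrt(np.sum(np.abs(e) ** 2) / np.sum(np.abs(s) ** 2))

# Example: QPSK reference symbols distorted by additive noise
rng = np.random.default_rng(1)
s = (rng.choice([-1, 1], 1000) + 1j * rng.choice([-1, 1], 1000)) / np.sqrt(2)
z = s + 0.03 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
print(f"EVM = {100 * evm_rms(z, s):.2f} %")
```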
Figure 3.2 Illustration of the computation of the EVM
Figure 3.3 ACPR/ACLR measurement
Adjacent channel power ratio/Adjacent channel leakage ratio
The ACPR, or equivalently the ACLR, is the ratio of the mean power in the channel used for transmission to the mean power in an adjacent frequency channel. Adjacent channels generally refer to the IM3 zone and alternate channels to the IM5 zone. From a system point of view, the ACPR is measured as follows: let us denote the channel offset (Figure 3.3) by F_off, the measurement bandwidth by B, and let X(f) be the power spectral density of the transmitted signal. For the complex baseband representation of the transmitted signal, the adjacent lower and upper ACPR are given, respectively, by

\mathrm{ACPR}_{\mathrm{adj\,low}}\,(\mathrm{dB}) = 10\log_{10}\frac{\int_{-B/2}^{B/2} X(f)\,df}{\int_{-F_{\mathrm{off}}-B/2}^{-F_{\mathrm{off}}+B/2} X(f)\,df} \qquad (3.11)

\mathrm{ACPR}_{\mathrm{adj\,up}}\,(\mathrm{dB}) = 10\log_{10}\frac{\int_{-B/2}^{B/2} X(f)\,df}{\int_{F_{\mathrm{off}}-B/2}^{F_{\mathrm{off}}+B/2} X(f)\,df} \qquad (3.12)
The alternate lower and upper ACPR are obtained by substituting, respectively, −Foff with −2Foff in (3.11) and Foff with 2Foff in (3.12).
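A possible numerical implementation of (3.11)–(3.12) is sketched below using scipy's Welch PSD estimate; the test signal, sampling rate, channel plan and the toy third-order nonlinearity (added only to create spectral regrowth) are assumptions.

```python
import numpy as np
from scipy.signal import welch

def acpr_db(x, fs, b, f_off):
    """Adjacent-channel power ratios (3.11)-(3.12) of a complex baseband signal x."""
    f, pxx = welch(x, fs=fs, nperseg=4096, return_onesided=False)
    def band_power(fc):
        mask = (f >= fc - b / 2) & (f <= fc + b / 2)
        order = np.argsort(f[mask])                     # welch returns FFT ordering
        return np.trapz(pxx[mask][order], f[mask][order])
    p_ref = band_power(0.0)
    return (10 * np.log10(p_ref / band_power(-f_off)),  # adjacent lower
            10 * np.log10(p_ref / band_power(+f_off)))  # adjacent upper

# Example: 20 MHz noise-like channel at 200 MS/s, mildly distorted
rng = np.random.default_rng(2)
fs, b, f_off, n = 200e6, 20e6, 20e6, 1 << 18
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
X = np.fft.fft(x)
X[np.abs(np.fft.fftfreq(n, 1 / fs)) > b / 2] = 0        # crude band limitation
x = np.fft.ifft(X)
x = x + 0.2 * x * np.abs(x) ** 2                        # toy nonlinearity -> regrowth
adj_low, adj_up = acpr_db(x, fs, b, f_off)
print(f"ACPR adj low/up: {adj_low:.1f} / {adj_up:.1f} dB")
```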
3.2 Principles of predistortion The principle of predistortion is to transform the signal sent to the amplifier so as to compensate for the distortions that will be introduced by the amplifier. The transformation applied by the predistortion system should ideally be the opposite of the distortion generated by the amplifier. The predistortion device can be inserted at
different levels of the amplification chain: in baseband, at intermediate frequency (IF) or at radio frequency (RF). Figure 3.4 shows the basic principle of predistortion. Two questions arise for a predistortion system: calculating the predistortion operator and implementing it. The second question is especially delicate when the realization is made in analog RF. We limit ourselves here mainly to digital baseband predistortion, which presents the best performance. We will denote the digital predistorter by DPD. Predistortion can be made adaptive to avoid having to characterize the amplifiers a priori or to compensate for the variations of the amplifier as a function of temperature, aging, etc. The first DPD systems for wireless communications were proposed by Nagata [8] in 1989, Cavers [9,10] in 1990 and Wright [11] in 1992. Figure 3.5 shows the basic diagram of digital baseband predistortion in the case where an adaptive control is set up. In order to obtain a reference signal to adapt the DPD, a return path is needed in which a part of the PA output signal is taken by means of a coupler. In this basic diagram, the frequency transposition is performed by an analog IQ modulator, which has the drawback of possible imbalances between the I and Q channels. One can of course imagine different variants of this scheme with IF or direct conversion. The signal of the return path used to control the predistortion should be of the highest quality since it serves as a reference signal for correction. For this purpose, it is advisable to use the mixers at the lowest power level possible. The power consumption of digital-to-analog converters (DAC) and analog-to-digital converters (ADC) is not
Figure 3.4 Principle of predistortion of power amplifiers
Figure 3.5 Transmitter architecture with DPD
negligible (especially for broadband signals) and degrades the efficiency, especially if the output power of the amplifier is not very high. The sampling frequency on the direct path must be greater than the Shannon frequency of the non-predistorted signal. Indeed, the predistortion is a nonlinear operation which widens the signal bandwidth: the higher the order of the intermodulation products to be corrected, the higher the necessary sampling frequency. The optimal DPD must lead to a linear transmitter (DPD + amplifier) with an overall gain G_0 up to a maximum value of the input power P_in,max (smaller than the saturation amplitude). Peak backoff (PBO) is the difference between the desired maximum power and the saturation power. Several choices are possible for the reference gain (Figure 3.6). A current choice for G_0 is the PA gain at input power P_in,max. It can be quite significantly lower than the gain of the amplifier in its linear operating area (which can also be chosen as a reference gain). The chosen maximum power point determines the reference gain G_0. We then have the following relationship between the mean and maximum output powers, the saturation power of the amplifier, the PAPR of the input signal and the PBO:

\bar{P}_{\mathrm{out,dB}} = P_{\mathrm{out,max,dB}} - \mathrm{PAPR}_{\mathrm{in,dB}} = P_{\mathrm{out,sat,dB}} - \mathrm{PBO}_{\mathrm{dB}} - \mathrm{PAPR}_{\mathrm{in,dB}}.

The choice of the reference gain at input power P_in,max is interesting because it facilitates power control, since the original input signal and the predistorted signal can be normalized by the same scaling value [12]. Figure 3.7 shows the linearization results for a very nonlinear base-station PA. This PA is a three-way Doherty PA with a peak output power of 57 dBm. It is built with three LDMOS BLF7G22LS-130 from Ampleon. Tests are done with an LTE signal
Figure 3.6 Different possible choices for the reference gain
Figure 3.7 Normalized AM/AM (left), AM/PM (middle), power spectral densities (right) with and without DPD for a Doherty PA
of 20 MHz bandwidth and a PAPR of approximately 8 dB. The carrier frequency is 2.14 GHz and the sampling frequency is 200 MHz. Figure 3.7 shows from left to right: the normalized amplitude-to-amplitude conversion (AM/AM) characteristics, the normalized amplitude-to-phase conversion (AM/PM) characteristics and the normalized power spectral densities, without and with DPD.
3.3 Analog vs digital predistortion
Digital predistortion usually works in baseband. For signal regeneration applications such as repeaters, downconverting the signal just to be able to predistort it before amplification would not be very efficient. In [13], the authors propose an approach where the modulated signal is predistorted; furthermore, they also propose an approach to make this analog predistortion adaptive. The architecture proposed by [13] is depicted in Figure 3.8, where we can see that the predistorter is made of a linear term, obtained with the delay line, and a cubic term, obtained with an analog cuber, the contribution of which is tuned by a complex coefficient implemented by the attenuator and the phase shifter. It is worth noting that this predistorter is memoryless. This predistortion may be made adaptive by making the attenuator and the phase shifter digitally controlled. They are then driven by an adaptive algorithm which seeks to maximize the ACLR based on spectrum measurements. When speaking of analog predistortion, one has to mention the analog truncated Volterra series predistorter which was designed by Scintera, now a subsidiary of Maxim Integrated. This product implements an analog memory polynomial (MP) predistorter with odd-order nonlinearities up to the ninth order and a memory depth of 300 ns split in four taps [14]. The analog processing for the predistortion may be less power hungry than its digital counterpart (even if this is not a general assertion, as there are many ways to optimize a digital hardware implementation, for instance lookup table (LUT)-based architectures), but it remains restricted to quite simple predistorter functions such as MP and does not offer the same processing versatility as digital processing.
Figure 3.8 Analog RF predistorter [13]

In 4G and 5G wireless communication systems, very high data rates are achieved thanks to carrier aggregation (CA), generating signals with very high bandwidths for which digital predistortion may be difficult to implement (it would require overly expensive DACs and ADCs and a high computation load). Similarly, achieving 1 Tbps in very-high-throughput Ka-band satellite communication systems requires an available bandwidth wider than 1 GHz. In [15], a wide-bandwidth analog predistorter is presented that is used to linearize traveling wave tube amplifiers. It is based on a nonlinear tunable microwave circuit that can realize a gain expansion and a phase shift to compensate the PA behavior. It allows multi-carrier signals to be used simultaneously over the 2.9 GHz bandwidth of the Ka band, from 17.3 to 20.2 GHz, with state-of-the-art linearization performance.
3.4 Mathematical aspects
3.4.1 Baseband formulation
When the DPD is done on the baseband signal using either the indirect learning architecture (ILA) or the direct learning architecture (DLA) (see Section 3.6), the goal of the predistorter is to compensate for the RF bandpass behavior of the PA. As is emphasized in [16], the model used for the bandpass nonlinear PA behavior relates the output of the PA, y(t), to the input x(t), while z(t) is the output of the harmonic filter. It is worth noting that y(t) could have energy at DC, 2fc, 3fc, . . . depending on the nonlinear behavior of the PA. If the feedback path is taken from z(t), then the “PA” model in baseband does not only consider the real PA but also encompasses the harmonic filter. In fact, including the harmonic filter (real or equivalent) in the PA baseband model is unavoidable from a signal-processing point of view: as the RF output signal needs to be downconverted and sampled, it is then required to be bandlimited.
Figure 3.9 Comparison of the baseband and bandpass RF representation [16]: (a) bandpass RF representation and (b) baseband representation

When Volterra series are used to model the RF behavior of the PA solely [16], the input–output relationship corresponds to (Figure 3.9):

y(t) = \sum_{k=0}^{K} \int \cdots \int h_k(\tau_0, \ldots, \tau_k) \prod_{i=0}^{k} x(t-\tau_i)\, d\tau_0 \cdots d\tau_k, \qquad (3.13)
where y(t), x(t) and h_k(t) are all real valued. But in baseband we consider the relationship between the complex baseband input and the complex “bandlimited” equivalent PA (PA + harmonic filter). This yields the following (discrete) baseband counterpart formulation [17]:

\tilde{z}(n) = \sum_{k=0}^{K} \sum_{l_1=0}^{L_1-1} \cdots \sum_{l_{2k+1}=0}^{L_{2k+1}-1} \tilde{h}_{2k+1}(l_1, \ldots, l_{2k+1}) \prod_{i=1}^{k+1} \tilde{x}(n-l_i) \prod_{i=k+2}^{2k+1} \tilde{x}^{*}(n-l_i), \qquad (3.14)

where \tilde{h}(n) is complex valued.
3.4.2 pth-Order inverse of a nonlinear system
In his seminal paper [18], Schetzen provides a fundamental result which states that the pth-order pre-inverse of a nonlinear system is identical to its pth-order post-inverse. The theory of the pth-order inverse is based on the representation of nonlinear systems by Volterra series, and its definition [18] states that the kernels of the Volterra series representing the tandem cascade of a nonlinear system preceded by its pth-order inverse are zero from the second to the pth order. The pth-order inverse of a nonlinear system is determined recursively, starting from the inverse of the first-order kernel of the system, and the existence of this first-order inverse is the only condition for the existence of a causal and stable pth-order inverse. In [19], the authors pointed out some drawbacks of the pth-order inverse approach: a Volterra model of the system to be linearized is required and then the computation of the inverse has to be done; that is why they suggest using an indirect learning approach to compute a functional equivalent of the pth-order inverse. In their view, the proposed approach has two main advantages over the computation of the pth-order inverse: it does not require a Volterra model of the nonlinear system and, because of the optimization procedure involved, it minimizes the nonlinearities of order above p for the complete system, which is not necessarily the case for the pth-order inverse.
3.5 Models for DPD structures
In this section, we consider behavioral models of dynamic nonlinear systems usable in a context of linearization by digital predistortion. This means that we focus on discrete-time baseband models with complex input and output signals. We are mainly interested in black-box models, although it is possible in some cases to integrate circuit-type knowledge in the model. In the context of predistortion, the models can be used, according to the approaches, to model the inverse of the amplifier or the amplifier itself. Models can be classified in different manners, for example, parametric versus nonparametric models or memoryless models versus dynamic models. The first DPDs for wireless communication systems were simple corrective complex gains that tried to correct memoryless AM/AM and AM/PM characteristics. They were realized with LUTs containing complex numbers. The content of the LUT could be updated using the error between the system input signal u(n) and the PA output normalized by the reference gain. In Nagata’s approach [8], the corrective gain depended on the instantaneous baseband input u(n), and as u(n) is a complex number, the LUT had to be addressed using two components (Cartesian or polar coordinates of u(n)), so the size of the LUT had to be large (e.g., for components quantified on N bits the LUT size is 2^{2N}). Cavers [9,10] took advantage of the fact that, for memoryless systems, the PA can be represented by a complex gain (AM/AM and AM/PM characteristics) that only depends on the magnitude of the input signal. Therefore, he simplified the DPD as a corrective gain depending on the input magnitude only, and the LUT size could be reduced to 2^N instead of 2^{2N} (Figure 3.10). But with the increasing data rates and signal bandwidths, the memoryless (or quasi-memoryless) approach became insufficient and MP models [20] were proposed for DPD. They allowed for a good trade-off between linearization performance and complexity for signals with a bandwidth of a few MHz. Several other parametric models were proposed to cope with wider bandwidth signals and PAs with strong nonlinearities and memory effects. In current applications of DPD, parametric models are used most of the time.
Figure 3.10 Memoryless DPD implemented with an LUT [9,10]
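A minimal sketch of such a magnitude-indexed LUT predistorter is shown below. The LUT size, the simple stochastic update rule and the toy memoryless PA are assumptions for illustration; the actual implementations in [8–10] differ in detail.

```python
import numpy as np

class LutDpd:
    """Memoryless DPD: one complex corrective gain per quantised input magnitude."""
    def __init__(self, n_entries=256, max_mag=1.0, mu=0.1):
        self.gains = np.ones(n_entries, dtype=complex)    # start as a unity gain
        self.n, self.max_mag, self.mu = n_entries, max_mag, mu

    def _index(self, u):
        return np.minimum((np.abs(u) / self.max_mag * self.n).astype(int), self.n - 1)

    def predistort(self, u):
        return self.gains[self._index(u)] * u

    def adapt(self, u, y, g0=1.0):
        """Update the addressed entries from the error between y/g0 and the input u."""
        idx = self._index(u)
        err = u - y / g0
        safe_u = np.where(np.abs(u) > 1e-6, u, 1.0)
        np.add.at(self.gains, idx, self.mu * err / safe_u)   # assumed update rule
        return self

def toy_pa(x):
    """Assumed memoryless PA: mild AM/AM compression and AM/PM rotation."""
    a2 = np.abs(x) ** 2
    return x * (1 - 0.2 * a2) * np.exp(1j * 0.2 * a2)

# Usage: adapt over blocks of samples
rng = np.random.default_rng(3)
dpd = LutDpd()
for _ in range(200):
    u = (rng.standard_normal(512) + 1j * rng.standard_normal(512)) * 0.2
    y = toy_pa(dpd.predistort(u))
    dpd.adapt(u, y)
```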
3.5.1 Parametric models
Most parametric models used for DPD are derived from Volterra series [21] with different pruning approaches as well as truncation to finite memory length and nonlinearity orders. When the baseband DPD model is derived from an RF Volterra series model, the model of (3.14) is obtained, and this discrete low-pass equivalent baseband model contains only odd-order terms. The limitation to odd-order terms is due to the fact that the RF model is supposed to be a bandpass Volterra series, but several authors propose baseband Volterra models that are not subject to this limitation for the RF model [22,23] and can contain even-order terms. The argument of these authors relies on the fact that in the case of memoryless systems, functions with odd symmetry can produce outputs at odd harmonics [24]. From now on, we will consider only baseband models and, for ease of notation, we will not use the tilde to represent complex-envelope signals. Volterra series are interesting because of their generality and ability to represent nonlinear systems with memory. Another interesting point is their linearity with respect to their coefficients, which simplifies their identification in the indirect learning approach (see Section 3.6). But when the nonlinearity order K and memory length L increase, the number of coefficients h_k(l_1, . . .) increases dramatically: the number of coefficients is on the order of L^K for non-symmetric Volterra series. Also, the Volterra kernels are not orthogonal and their identification faces numerical problems. Therefore, different simplification strategies and models have been proposed to limit the number of coefficients. They can be classified [25] in four general approaches:
● block-oriented models,
● models obtained by direct pruning,
● modified or dynamic Volterra series and
● use of orthogonal bases.

Good models for DPD should
● be able to manage complex signals,
● be able to model nonlinearities and memory effects and
● be easy to identify. For example, in the indirect learning approach (see Section 3.6), an important feature is the linearity of the model with respect to the coefficients.
3.5.1.1 Block-oriented nonlinear model
Block-oriented nonlinear (BONL) models are based on associating several simple models in cascade or in parallel. Generally they separate nonlinearities and memory effects by associating linear time-invariant dynamic blocks and static nonlinear blocks. One can cite Wiener models, made of a cascade of a linear filter followed by a static nonlinearity, Hammerstein models, constituted of a static nonlinearity followed by a linear filter, and Wiener–Hammerstein models, made of a cascade of three
blocks: a filter, a static nonlinearity and a filter. Other combinations are possible, such as parallel Hammerstein models. Unfortunately, the identification of such BONL models is most of the time done using iterative nonlinear optimization techniques such as the Narendra and Gallman method [26], used in [27] for DPD. In [28], a fast identification of Wiener–Hammerstein systems is proposed using discrete optimization with a genetic algorithm (GA). A cascade of sparse MP models is proposed in [29] with a new method for sizing the models.
3.5.1.2 Pruning of Volterra series
Among the most popular models obtained by direct pruning of Volterra series, we can cite MP models and generalized memory polynomial (GMP) models. The MP models proposed by [20] only keep the terms on the diagonal of the Volterra series. They are defined by

y(n) = \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} a_{k,l}\, u(n-l)\, |u(n-l)|^{k},

where K is the nonlinearity order and L the memory length. The number of coefficients is equal to KL. This model was extended into GMP models by Morgan [17] to better cope with PAs with strong nonlinearities and wide-bandwidth signals. GMP models are defined by
y(n) = \sum_{k=0}^{K_a-1} \sum_{l=0}^{L_a-1} a_{k,l}\, u(n-l)\, |u(n-l)|^{k}
\;+\; \sum_{k=1}^{K_b} \sum_{l=0}^{L_b-1} \sum_{m=1}^{M_b} b_{k,l,m}\, u(n-l)\, |u(n-l-m)|^{k}
\;+\; \sum_{k=1}^{K_c} \sum_{l=0}^{L_c-1} \sum_{m=1}^{M_c} c_{k,l,m}\, u(n-l)\, |u(n-l+m)|^{k}.
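A small sketch evaluating an MP model is given below (the GMP above follows the same pattern with extra lag loops). The coefficient values and model orders are placeholders, not taken from the chapter.

```python
import numpy as np

def mp_output(u, a):
    """Memory polynomial y(n) = sum_k sum_l a[k, l] * u(n-l) * |u(n-l)|**k."""
    K, L = a.shape
    y = np.zeros_like(u, dtype=complex)
    for l in range(L):
        ul = np.concatenate([np.zeros(l, dtype=complex), u[:len(u) - l]])   # u(n-l)
        for k in range(K):
            y += a[k, l] * ul * np.abs(ul) ** k
    return y

# Example with placeholder coefficients: K = 5 nonlinearity orders, L = 3 taps
rng = np.random.default_rng(4)
a = 0.01 * (rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3)))
a[0, 0] = 1.0                                   # dominant linear term
u = (rng.standard_normal(4096) + 1j * rng.standard_normal(4096)) * 0.3
y = mp_output(u, a)
```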
3.5.1.3 Modified or dynamic Volterra series In order to improve the trade-off between complexity and modeling performance of Volterra series, several authors have proposed different modified or dynamic Volterra series. For PA modeling, the authors of [30,31] limit the number of coefficients by separating the static and dynamic part in the model. They build the model as a sum of a purely static polynomial term and a dynamic term expressed in function of the dynamic deviation e(n, i) defined as e(n, i) = u(n − i) − u(n).
The Volterra series is reformulated with rth-order dynamic kernels of the kth-order nonlinearity that are multiplied by products of r terms e(n, i_j): \prod_{j=1}^{r} e(n, i_j). Zhu proposed a new formulation of the dynamic Volterra series called dynamic deviation reduction (DDR)-based Volterra models [32,33], which has the advantage of being linear with respect to the coefficients, leading to a simpler identification. This new model applies a limitation (truncation) of the dynamic order. The general expression is

y(n) = \sum_{k=1}^{K} h_{k,0}(0,\ldots,0)\, x^{k}(n)
\;+\; \sum_{k=1}^{K} \left\{ \sum_{r=1}^{k} x^{k-r}(n) \sum_{i_1=1}^{M} \cdots \sum_{i_r=i_{r-1}}^{M} h_{k,r}(0,\ldots,0,i_1,\ldots,i_r) \prod_{j=1}^{r} x(n-i_j) \right\}
This model can be simplified by keeping only low dynamic orders (typically equal to 1 or 2). The simplified model of order 1 is given by [33]

y(n) = \sum_{k=0}^{(P-1)/2} \sum_{i=0}^{M} g_{2k+1,1}(i)\, |u(n)|^{2k}\, u(n-i)
\;+\; \sum_{k=1}^{(P-1)/2} \sum_{i=1}^{M} g_{2k+1,2}(i)\, |u(n)|^{2(k-1)}\, u^{2}(n)\, u^{*}(n-i).
3.5.1.4 Orthogonal Volterra series
To improve numerical behavior and to trade off modeling accuracy against the number of coefficients, some authors have suggested using orthogonal basis functions in the Volterra expansion, such as the G-functions proposed by Wiener [34], orthogonal MP [35], Laguerre functions [36] or Kautz functions [37]. Unfortunately, these approaches often make the hypothesis of a white Gaussian input signal, which is seldom exactly verified, or present some difficulties for the identification.
3.5.1.5 Models with segmentation Another simplification approach is to split the amplitude range in different regions characterized by different models such as piecewise-linear approaches, models using splines, vector-switched (VS) models. These approaches are able to represent strong nonlinearities and are less prone to numerical problems than global polynomial models. Also it helps improving the convergence speed, since it is based on segments of smaller orders of nonlinearity that can be identified with shorter blocks of data. There are different compact forms to model a real function by line segments such as simplicial canonical piecewise linear (SCPWL) or canonical piecewise linear (CPWL) approaches [38,39]. One question is how to segment the amplitude range. A suboptimal technique consists in using uniform segments.
SCPWL has been applied in the memoryless case to represent AM/AM and AM/PM characteristics (functions of a real positive variable). If we consider a positive input x and K segments defined by limits (thresholds) β_i with β = (β_1, β_2, . . . , β_K)^T, the approximation is given by

f_{\beta}(x) = c_0 + \sum_{k=1}^{K} c_k\, \lambda_k(x)

\lambda_0(x) = 1

\lambda_k(x) = \begin{cases} \tfrac{1}{2}\,(x - \beta_k + |x - \beta_k|), & x \le \beta_K \\ \tfrac{1}{2}\,(\beta_K - \beta_k + |\beta_K - \beta_k|), & x > \beta_K \end{cases}
The CPWL approach can be used for PA modeling by using the expression:

y(n) = \sum_{i=0}^{M} a_i\, x(n-i) + \sum_{k=1}^{K} c_k \left| \sum_{i=0}^{M} a_{k,i}\, x(n-i) - \beta_k \right|.
But the model is nonlinear with respect to its coefficients, and it is difficult to determine a good partition. The authors in [40,41] proposed an extension and a simplification of this approach that they called decomposed vector rotation (DVR). It can manage complex signals and nonlinear systems with memory and it is linear with respect to its coefficients. To cope with complex signals, the idea is to replace the terms |x(n − i) − β_k| by ||x(n − i)| − β_k| e^{jθ(n−i)}, where θ(n − i) is the argument of x(n − i). The simplest version of the model is expressed as
M i=0
ai x(n − i) +
K M
ck,i ||x(n − i)| − βk |ejθ (n−i) .
k=1 i=0
If the thresholds βk are fixed values, the model is linear with respect to its coefficients. To improve the modeling performance, the DVR model can be enriched by other basis functions of higher order (derived from DDR, for example). The values of thresholds βk can be evenly distributed or optimized. The first works on optimal segmentation for DPD applications were conducted for optimal spacing in memoryless LUT DPD [42]. In [43], the thresholds are optimized jointly with the coefficients for a general piecewise model. Another approach is based on using spline interpolation between extremities of different segments. The amplitude range is split into segments delimited by knots. On each segment, the spline function is a polynomial. The higher polynomial degree is the degree of the spline. Some continuity conditions at the segment boundaries are imposed on the function and its derivatives. Cubic spline interpolation done with polynomial of order three is the most common; it allows for continuity of the first two derivatives. For a given set of knots, the spline functions form a vector space. B-splines (basis splines) form a basis of that vector space. Any other spline function of a given degree can be expressed as a unique linear combination of these basis functions of the same degree, and a spline function expressed in that way is called a
B-spline function. Also, when uniform segmentation is used, the basis functions have the same shape and can be written as a simple translation of the first one. If the knots are not uniformly spaced, the basis functions are derived by recursive algorithms such as the de Boor–Cox algorithm. The approximated function \hat{f}(x) using B-splines can be written as

\hat{f}(x) = \sum_{i=K_0}^{K-1} a_i\, B_i(x),
where the B_i(x) are the different B-splines starting at node i. For cubic splines, K_0 = −3. An interesting point is that the expression is linear with respect to the coefficients a_i. The first authors applied splines to memoryless DPD. The principle is to interpolate static characteristics either of the power amplifier or of its inverse [44,45], the interpolation being applied to the real and imaginary parts, to the magnitude and phase of the function or directly to the complex function. In [46], the spline approach in the memoryless case is formulated as a parametric linear-approximation problem. The authors start with an estimation of the complex PA characteristics by splines and then invert it to obtain the DPD. The amplitude range is split into K intervals and the gain function of the PA is expressed with B-splines as basis functions. In [47], the same authors extended the method to the case with memory by splitting the amplitude range for each memory order into several segments and by using different nonlinearity orders for each memory tap. The model is derived from MP models with the form

y(n) = \sum_{l=0}^{L-1} u(n-l)\, S^{l}(|u(n-l)|),
where S^l is approximated with complex-valued cubic spline functions of a real variable over K segments. Other techniques were also used to take into account memory effects in spline methods, e.g., using Wiener or Wiener–Hammerstein models [48–50] where the static nonlinearity is represented by splines and identification is done by iterative algorithms. Spline interpolation was also applied to 2D predistortion [51] for concurrent dual-band systems. In [52], the authors applied cubic spline functions in a CPWL model to improve the accuracy in the case of a small number of segments for linearization of a ROF (radio over fiber) link. In [53], the authors proposed other nonlinear polynomial functions with high locality to use instead of B-splines.
3.5.1.6 Switched models
Both piecewise models and switched models [54,55] have been used to take into account the variation of PA behavior at different power levels. For example, switched models are well suited for Doherty PAs. Indeed, because Doherty PAs are based on two
PAs (a carrier and a peak PA), their dynamic behavior is difficult to correct with classical DPD models. As Doherty PAs present different memory effects at different power levels, in [55] the segmentation and switching approach is extended to the memory domain. The approach, called the VS model, is based on sub-models that are identified and adapted separately. The VS model is linearly identifiable and achieves a good trade-off between complexity and modeling performance. The VS model is made of a bank of different models. The applied model is selected by a switch function that is driven by the complex baseband input signal. The input space is segmented into K different regions, and the chosen model depends on the region of the input signal. The segmentation is done as a function of the complex input vector U^{(N)}(n) made of the N last input samples, U^{(N)}(n) = {u(n), u(n − 1), . . . , u(n − N + 1)}. Each input sample u(n) is classified thanks to a vector quantization approach based on U^{(N)}(n). The model corresponding to the class of u(n) is selected and applied. In practice, N = 2 and using only the amplitude of the input for the classification have been shown to be sufficient.
3.5.1.7 Neural networks models Other nonlinear models are based on neural nets, in particular, multilayer perceptrons. Artificial neural networks (NNETs) are differentiated according to the topology of the connection network between neurons, the nonlinear function of neuron activation (input–output relationship of each neuron, sigmoid type, hyperbolic tangent, for example) and the learning technique used (gradient back-propagation, simulated annealing, etc.). Multilayered perceptrons are non-recurring structures (no cycles in graphs). In contrast, recurrent NNETs are characterized by graphs containing cycles. Multilayered perceptrons are interesting for their generality (Cybenko’s theorem states that they can approach any type of function) and by the existence of efficient learning algorithms (gradient back-propagation). But it is not easy to determine their structure (number of layers and neurons). Another family is that of Gaussian radial basis functions. The network consists of two layers and can approach any function. The first layer comprises neurons performing a Gaussian function defined by two parameters: a prototype vector (mean) P and a standard deviation σ . The second layer performs a linear combination of the outputs of the first layer with the addition of a bias. The training of NNETs generally requires a rich database of signal types to minimize the risk of specialization of the network on a single type of signal. It can be quite long. Thus, gradient techniques require a presentation of the data set before modifying the weights, which is not very compatible with a real-time adaptation. Different baseband NNET models have been tested for modeling power amplifiers or for DPD. In addition to the topology, activation functions and the learning technique, they can differ by working with real signals (I and Q components, for example) or complex signals (complex envelope). The simplest idea is to use a perceptron with complex inputs and outputs. But the calculations are complex and the convergence slow. The use of real signals simplifies calculations and improves results.
Among the real-valued networks, it has been proposed to use two independent networks to model the AM/AM curve and the AM/PM curve. A more efficient way is to use a single network with I and Q inputs (and outputs). To improve the performance in the case of memory effects, dynamic NNETs have been proposed: recurrent networks or networks called “focused time-delay neural network,” which use at the input of a perceptron network several samples (present and past) of the input, thus avoiding the loopback and the associated difficulties of adaptation. In [56], a real valued tapped neural network delay is used as a predistortion model and adapted in real time. In [57], a “CasCor” network is used. The objective was to reduce the convergence time compared to multilayer perceptrons with back-propagation algorithm. The CasCor network is a multilayer network optimized in size during learning. It starts with a single-layer network that is optimized. If the final error exceeds a threshold, it adds a second layer and so on. In [58], authors propose a noncausal NNET based on a tapped delay line NNET for complex signals with one hidden layer and including both delayed and advanced samples at its input.
3.5.2 Nonparametric models
Parametric models are currently the most used models for DPD, but some studies have been conducted on nonparametric models for which no a priori model structure is applied. Of course, simple LUT approaches [9] are nonparametric models. They were first used for memoryless systems. Then they were extended to memory systems by associating an LUT with each delayed sample u(n − l), with a model given by

y(n) = \sum_{l=1}^{L} x(n-l)\, G_l(|x(n-l)|),
where G_l(|x(n − l)|) is the value of LUT number l for the index corresponding to the input signal amplitude |x(n − l)|. In [59], a low-complexity model is proposed associating an LUT and a finite impulse response filter (FIR), called a filtered LUT. The content of the LUT can be interpolated by different techniques such as splines. These interpolated LUT techniques are not subject to the ill-conditioning problems encountered by polynomial models. It is worth noting that the LUT methods can be derived by reformulating Volterra-derived models [60]. For example, in the case of a polynomial model, one can write:

y(n) = \sum_{k=0}^{K-1} \sum_{l=0}^{L-1} a_{k,l}\, u(n-l)\, |u(n-l)|^{k} = \sum_{l=0}^{L-1} G_l(|u(n-l)|)\, u(n-l), \qquad (3.15)
with G_l(|u(n-l)|) = \sum_{k=0}^{K-1} a_{k,l}\, |u(n-l)|^{k}, which is quantified on N_b bits and stored in an LUT addressed by the quantified value of u(n − l).
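A sketch of (3.15) is given below: turning memory-polynomial coefficients into per-tap gain LUTs and applying them. The LUT resolution and the coefficient values are placeholders.

```python
import numpy as np

def mp_to_luts(a, n_bits=8, max_mag=1.0):
    """Tabulate G_l(|u|) = sum_k a[k, l] |u|**k on 2**n_bits magnitude points, per (3.15)."""
    K, L = a.shape
    mags = np.linspace(0.0, max_mag, 2 ** n_bits)
    return np.stack([sum(a[k, l] * mags ** k for k in range(K)) for l in range(L)])

def filtered_lut_output(u, luts, max_mag=1.0):
    """y(n) = sum_l G_l(|u(n-l)|) u(n-l) using the tabulated gains."""
    L, n_entries = luts.shape
    y = np.zeros_like(u, dtype=complex)
    for l in range(L):
        ul = np.concatenate([np.zeros(l, dtype=complex), u[:len(u) - l]])
        idx = np.minimum((np.abs(ul) / max_mag * (n_entries - 1)).astype(int), n_entries - 1)
        y += luts[l, idx] * ul
    return y
```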
Unfortunately, because of the large number of LUT values to identify, traditional LUT approaches suffer from slow convergence compared to polynomial approaches that have fewer parameters. The authors of [61] compare polynomial and interpolated LUT approaches. They show that spline-interpolated LUT approaches can be seen as replacing the polynomial basis functions of Volterra-derived models by spline basis functions that have a more local support as a function of the input signal amplitude. In this way, they provide a unified view of spline-interpolated LUT and Volterra-derived polynomial models and state that the main difference between both approaches is in the implementation. But using a spline-interpolated LUT is, in some sense, a parametric model. Other nonparametric models, in the case of memoryless models, use cumulative distribution functions with histograms [62] or higher-order statistics [63]. In [64], another nonparametric technique is derived that can be applied to systems with memory effects. It is based on density estimation using the kernel method [65] and on the method given in [66], extended to the case of complex signals. The dynamic nonlinear characteristic is described as

y(n) = \sum_{m_1} f_{m_1}\big(u(n-m_1)\big) + \sum_{m_1}\sum_{m_2} f_{m_1,m_2}\big(u(n-m_1), u(n-m_2)\big) + \cdots + \sum_{m_1}\cdots\sum_{m_p} f_{m_1,m_2,\ldots,m_p}\big(u(n-m_1), \ldots, u(n-m_p)\big), \qquad (3.16)
where the f_{m_1,m_2,…,m_p} are nonlinear static functions. The kernel estimator can be applied for estimating a static nonlinear function g^{-1} for real-valued input signals. For a set of input samples {u(n)}_{n=0}^{N-1} and a set of output samples {y(n)}_{n=0}^{N-1}, the estimate at a point u_i is given by

\hat{g}(u_i) = \frac{\sum_{n=0}^{N-1} \varphi\big((u(n)-u_i)/\delta\big)\, y(n)}{\sum_{l=0}^{N-1} \varphi\big((u(l)-u_i)/\delta\big)},

where φ is the kernel and δ is its aperture. In [64], a triangular kernel is used. The technique is extended to the case of complex-valued signals and is improved by removing the correlation between successive input samples u(n) by a Gram–Schmidt method. The technique is applied to a predistortion system. Its interest is that it estimates the basis functions of the DPD during the identification process, which allows these basis functions and the structure of the DPD to be adjusted to the given system.
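A small sketch of this kernel estimator with a triangular kernel, as described above, is given below; the test nonlinearity and the aperture value are assumptions.

```python
import numpy as np

def kernel_estimate(u, y, u_query, delta=0.05):
    """Estimate g(u_query) from samples (u, y) with a triangular kernel of aperture delta."""
    d = np.abs(u[None, :] - u_query[:, None]) / delta
    w = np.maximum(1.0 - d, 0.0)                              # triangular kernel
    return (w * y[None, :]).sum(axis=1) / np.maximum(w.sum(axis=1), 1e-12)

# Example: recover a compressive AM/AM curve from noisy real-valued samples
rng = np.random.default_rng(7)
u = rng.uniform(0, 1, 5000)
y = np.tanh(2 * u) + 0.01 * rng.standard_normal(5000)         # assumed static nonlinearity
g_hat = kernel_estimate(u, y, np.linspace(0, 1, 50))
```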
3.6 Identification Several approaches are possible to identify the predistorter [67][68][69]. The two main techniques are called direct learning approach (DLA) and indirect learning approach (ILA). We will also present the method called iterative learning control (ILC).
Figure 3.11 Principle of direct learning architecture (DLA)
Figure 3.12 Principle of indirect learning architecture (ILA)

We can distinguish DLA and ILA by the type of error that is used in the optimization criterion. In the direct learning approach, Figure 3.11, the predistorter is determined by minimizing a criterion based on the difference e_direct between the observed output of the amplifier y(n) and the ideal output, which is equal to the original input signal u(n) multiplied by a reference gain G_0:

e_{\mathrm{direct}}(n) = y(n) - G_0\, u(n).

This error is quite natural. Therefore, DLA was the first technique implemented in baseband digital predistorters with adaptive LUTs [8–11]. Several algorithms were proposed afterwards to improve the method [70]. They will be presented in Section 3.6.2. The difficulty in DLA is that the optimal output signal of the predistorter is unknown. The principle of ILC is to solve this issue and to calculate the optimum output of the DPD before identifying the DPD. This calculation is done without modeling the PA. ILC is further presented in Section 3.6.3.
3.6.1 Indirect learning architecture The principle of the indirect learning approach, shown in Figure 3.12, is to calculate a fictitious postdistorter that will be used as a predistorter. The DPD is calculated indirectly through the identification of this fictitious postdistorter. The architecture was first proposed by [19]. One argument to justify that the model identified for the postdistorter can also be used for the predistorter is that the pth-order post-inverse of a Volterra series is equal to pth-order pre-inverse [18].
The identification of a postdistorter is easier than that of a predistorter since the input and the reference signals (the optimal output of the postdistorter) are both available. The input of the postdistorter is the PA output y(n) divided by a reference gain G0 and the reference signal is the PA input x(n). If we note z(n), the output of the postdistorter, the optimization criterion is based on the error eindirect : eindirect (n) = z(n) − x(n). In the case where the postdistorter model is linear with respect to its coefficients, the minimization of a least square (LS) criterion function of eindirect leads to the minimization of a convex quadratic function. In this case, the complexity of the model identification is generally much less complex than for DLA methods. One drawback of ILA is its sensitivity to measurement noise at the PA output that introduces a bias in the estimation of coefficients with an LS criterion. The coefficients are biased because the measurement noise is at the input of the postdistorter. Some techniques have been proposed to reduce the influence of the noise [71,72]. But in most cases encountered in broadcast and wireless communication systems, the signal-to-noise ratio is high enough so that this bias can be neglected. Another drawback is that the pre-inverse of a nonlinear system is not exactly equal to its post-inverse. In the case of strong nonlinearity, its performance is inferior to that of DLA. When the model is linear with respect to its coefficients ci (as for many models derived from Volterra series) and an LS criterion is used, the coefficients identification is obtained by solving a linear system of linear equations. Indeed, the output of the postdistorter z(n) is expressed as
z(n) = \sum_{i=0}^{N_{\mathrm{coef}}-1} c_i\, \varphi_i(n) = \boldsymbol{\varphi}(n)^{T}\mathbf{c},
where φ_i(n) is an element of the vector of regressors φ(n) and c is the coefficient vector. The number of coefficients is N_coef. For example, for an MP model, φ_i(n) has the form y(n − m)/G_0 |y(n − m)/G_0|^k, with i = Mk + m, k ∈ [0, K − 1], m ∈ [0, M − 1] and N_coef = KM, M being the memory depth and K the maximum order of nonlinearity. Considering buffers of N samples, the equation can be written for n ∈ [0, N − 1] in vector form: z = Φc, where Φ is an N × N_coef matrix whose rows are φ(n)^T for n ∈ [0, N − 1]. The instantaneous error is e(n) = z(n) − x(n) and the error vector is e = [e(0), . . . , e(N − 1)]^T = z − x, with x = [x(0), . . . , x(N − 1)]^T. The LS criterion is
\min_{\mathbf{c}} J = \sum_{n=0}^{N-1} |e(n)|^2 = \mathbf{e}^{H}\mathbf{e}.

The optimum coefficient vector c_opt is the solution of the system of linear equations

\Phi^{H}\Phi\, \mathbf{c}_{\mathrm{opt}} = \Phi^{H}\mathbf{x},

or equivalently, with the pseudo-inverse,

\mathbf{c}_{\mathrm{opt}} = \left( \Phi^{H}\Phi \right)^{-1} \Phi^{H}\mathbf{x}. \qquad (3.17)
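A compact sketch of the ILA least-squares identification (3.17) for an MP postdistorter follows; the toy PA, model orders and reference gain are assumptions. A solver such as lstsq is used instead of the explicit pseudo-inverse for numerical robustness.

```python
import numpy as np

def mp_regressors(v, K, M):
    """Regressor matrix with columns v(n-m) |v(n-m)|**k, ordered as i = M*k + m."""
    N = len(v)
    phi = np.zeros((N, K * M), dtype=complex)
    for m in range(M):
        vm = np.concatenate([np.zeros(m, dtype=complex), v[:N - m]])
        for k in range(K):
            phi[:, M * k + m] = vm * np.abs(vm) ** k
    return phi

def ila_identify(x, y, g0, K=5, M=3):
    """Fit the postdistorter z(n) = phi(n)^T c so that z is close to x, with phi built from y/g0."""
    phi = mp_regressors(y / g0, K, M)
    c, *_ = np.linalg.lstsq(phi, x, rcond=None)
    return c

# Toy memoryless PA (assumed) and one ILA pass
def toy_pa(v):
    return v * (1 - 0.15 * np.abs(v) ** 2) * np.exp(1j * 0.1 * np.abs(v) ** 2)

rng = np.random.default_rng(8)
u = (rng.standard_normal(8192) + 1j * rng.standard_normal(8192)) * 0.25
x = u.copy()                          # first pass: no predistortion yet
y = toy_pa(x)
c = ila_identify(x, y, g0=1.0)
x_dpd = mp_regressors(u, 5, 3) @ c    # copy the postdistorter in front of the PA
```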
3.6.2 Direct learning architecture
The DLA approach is presented in detail in Section 2.4.1.2. Here we only add some complements. Compared to ILA, DLA has several advantages [67,69]:
● It is a formally rigorous approach. Its objective is to minimize the direct error e_direct between the observed output of the amplifier y(n) and the ideal output, which is equal to the original input signal u(n) multiplied by a reference gain G_0.
● It does not suffer from biased estimation of the coefficients due to measurement noise at the PA output, as ILA does.
But it also suffers from some drawbacks:
● It is a non-convex optimization approach that most of the time necessitates complex iterative techniques with slow rates of convergence.
● DLA techniques often need a model of the PA, a model that may not be very precise.
Numerous techniques have been proposed for DLA. They can be classified [73] into model-based DLA techniques that necessitate a PA model and closed-loop estimator techniques that do not need a PA model. Among model-based DLA techniques, we can cite analytical approaches [74,75] and adaptive filtering techniques. In the analytical method proposed in [74], an MP model of the PA is first identified, then the DPD model is obtained by inverting the PA model. For the memoryless case, the inverse of the PA model is obtained by polynomial fitting. In the case of a memory length equal to one, an iterative technique [74] is proposed to obtain the PA inverse. Adaptive filtering techniques can be applied to calculate the DPD. But the adaptive compensation applied by the DPD is placed before the PA which is a nonlinear system. It is a problem similar to that of adaptive prefiltering. In the case of prefiltering, algorithms such as the well-known least mean square (LMS) may be unstable. Therefore, improved algorithms have been proposed such as FxLMS (filtered-x LMS). The general principle of FxLMS algorithm is presented in detail in [76]. It is derived from LMS (stochastic gradient algorithm) algorithm and was proposed to solve prefiltering cases where the output of the adaptive filter is filtered by another filter before being compared to the reference signal (see Figure 3.13). This filter H (z) may generate instabilities in the LMS convergence. The solution proposed by Morgan and other authors (see [76]) was to introduce an inverse filter in the cancellation path (called error-filtered approach) or a filter Hˆ (z) ≈ H (z) in the reference path (called filtered–reference approach or FxLMS since the reference signal is x(n)). In the FxLMS algorithm, the coefficients are updated by w(n + 1) = w(n) + μe(n)x (n),
Figure 3.13 Principle of FxLMS
where x′(n) = [x′(n), x′(n − 1), . . . , x′(n − L + 1)]^T, x′(n) = ĥ(n) ∗ x(n), ∗ represents the convolution and ĥ(n) is the impulse response corresponding to Ĥ(z) (Ĥ(z) is supposed to be very close to H(z)). Nonlinear FxLMS (NFxLMS) is the generalization of FxLMS to the nonlinear case and NFxRLS to the nonlinear recursive LS case. It can be seen that the equivalent of Ĥ(z) is a model of the PA. Different nonlinear adaptive filtering approaches using a PA model have been used for DPD, such as NFxLMS [77], NFxRLS [70], and nonlinear adjoint LMS or recursive least squares (RLS) [70], with a lower complexity than NFxLMS or NFxRLS. For example, if we consider MP models for the PA and for the DPD, with a DPD input u(n), a DPD output x(n) and a PA output y(n), we have the following relations:
x(n) = \sum_{m=0}^{M} \sum_{p=0}^{P-1} w_{pm}\, u(n-m)\, |u(n-m)|^{p},

y(n) = \sum_{l=0}^{L} \sum_{k=0}^{K-1} a_{kl}\, x(n-l)\, |x(n-l)|^{k},
where w_{pm} and a_{kl} are, respectively, the DPD and PA model coefficients. The NFxLMS algorithm updates the DPD coefficients using an approximation of the instantaneous gradient of the error, leading to the following expression for the update of the coefficient vector w(n):

w(n+1) = w(n) + \mu\, e^{*}(n)\, \psi(n)

with

\psi(n) = \sum_{l=0}^{L} \sum_{k=0}^{K-1} \frac{k+2}{2}\, a_{kl}\, |x(n-l)|^{k}\, \mathbf{u}(n-l)

and \mathbf{u}(n-l) = [u(n-l), u(n-l-1), \ldots, u(n-l-M)]^{T}.
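A simplified sketch of one coefficient update consistent with the expressions just given is shown below. The exact gradient bookkeeping differs between [70] and [77]; the step size, error sign convention, model orders and PA coefficient estimates are all placeholders, and the histories must hold at least L + M + 1 past samples.

```python
import numpy as np

def nfxlms_step(w, a, u_hist, x_hist, err, mu=1e-3):
    """One NFxLMS-style update: w[p, m] <- w[p, m] + mu * conj(e(n)) * psi[p, m](n).
    a[k, l]: PA model estimate, w[p, m]: DPD coefficients,
    u_hist[i] = u(n - i), x_hist[l] = x(n - l) (both newest first), err = e(n)."""
    P, M1 = w.shape                      # DPD nonlinearity orders and taps (M1 = M + 1)
    K, L1 = a.shape
    psi = np.zeros_like(w)
    for l in range(L1):
        # sensitivity of the PA output to its input sample x(n - l)
        s_l = sum((k + 2) / 2 * a[k, l] * np.abs(x_hist[l]) ** k for k in range(K))
        u_vec = u_hist[l:l + M1]         # [u(n-l), ..., u(n-l-M)]
        # gradient of x(n - l) with respect to w[p, m] is u(n-l-m) |u(n-l-m)|**p
        psi += s_l * (np.abs(u_vec[None, :]) ** np.arange(P)[:, None]) * u_vec[None, :]
    return w + mu * np.conj(err) * psi
```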
Some DLA techniques do not use any PA model, in particular the closed-loop estimator [78–80]. In the closed-loop estimator approach, the output of the DPD is expressed by

x(n) = u(n) + \boldsymbol{\chi}(n)\,\mathbf{a}(n), \qquad \boldsymbol{\chi}(n) = [B_1(n), B_2(n), \ldots, B_{N_c}(n)],

where a is the vector of coefficients of the DPD, of length N_c, and the B_k(n) are the basis waveforms of the DPD model. The criterion to minimize, J_CL, is defined as

J_{\mathrm{CL}} = E\left[\, |e(n) - \boldsymbol{\chi}(n)\,\Delta\mathbf{a}(n)|^2 \,\right], \qquad e(n) = y(n)/G_0 - u(n),

where G_0 is the reference gain and Δa(n) is the estimate of the DPD coefficient error. Defining the matrix

\mathbf{Q} = E\left[\, \boldsymbol{\chi}^{H}(n)\,\boldsymbol{\chi}(n) \,\right],

the coefficient error is equal to

\Delta\mathbf{a}(n) = \mathbf{Q}^{-1} E\left[\, \boldsymbol{\chi}^{H}(n)\, e(n) \,\right]

and the updating of the coefficients is expressed by

\mathbf{a}(i) = \mathbf{a}(i-1) - \beta\, \Delta\mathbf{a},

where β < 1 is a convergence constant. The process is iterated to reach a steady state. The conditioning of Q is important for the good convergence of the algorithm. Regularization techniques (such as singular value decomposition or ridge regression) can be used to improve the conditioning. In [79], the closed-loop estimator is compared to ILA; the comparison shows the interest of the closed-loop estimator for wide-bandwidth signals. The argument is that ILA compares signals at the output of the predistorter and of the postdistorter, and those signals have wider bandwidths than the original input signal. On the contrary, the closed-loop estimator compares the original input signal with the PA output, which at steady state has the same bandwidth as the input signal.
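A block-wise sketch of this closed-loop estimator follows; the basis-waveform builder, the ridge regularisation and the damping constant are assumptions.

```python
import numpy as np

def closed_loop_update(a, u, y, basis, g0=1.0, beta=0.5, ridge=1e-6):
    """One closed-loop DPD coefficient update over a block of samples.
    a: current DPD coefficients; basis(u): N x Nc matrix of basis waveforms B_k(n)."""
    chi = basis(u)
    e = y / g0 - u                                    # e(n) = y(n)/G0 - u(n)
    Q = chi.conj().T @ chi + ridge * np.eye(chi.shape[1])
    delta_a = np.linalg.solve(Q, chi.conj().T @ e)    # coefficient-error estimate
    return a - beta * delta_a

def dpd_output(a, u, basis):
    """x(n) = u(n) + chi(n) a."""
    return u + basis(u) @ a
```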
3.6.3 DPD with iterative learning control (ILC) In [81], ILC was introduced as a tool for linearization of power amplifier by digital predistortion. ILC is a well-known approach in control theory. It is usually applied to systems operating in a repetitive way. The goal is to track a desired reference signal at the output of the system. The principle is to iteratively determine the best input signal in order to track the desired reference signal at the output of the system. Unlike common identification approaches in which one identifies the parameters of a model, in ILC, one directly estimates the best input signal. This approach is possible in repetitive systems where the input signal can be improved at each iteration. But in the case of digital predistortion, the system is not repetitive. The authors of [81] propose to apply ILC to predistortion in two steps. In the first step, ILC is applied to a buffer of signal to estimate the best PA input signal. This best
PA input signal is the best DPD output signal. In the second step, a DPD model is chosen, e.g., a GMP model, and the parameters of that model are identified from the DPD input signal and the best DPD output signal by common identification methods. The results presented in [81] show that this ILC approach performs better than ILA in the presence of noise at the PA output and that it gives performance similar to that of DLA but with a smaller complexity. In ILC approaches, different iterative algorithms can be applied to estimate the best input signal. They process the signals in blocks of N samples. Figure 3.14 presents the principle of ILC and indicates the notations. Vectors are noted with bold characters. The output reference signal is r(n) and the corresponding vector of N samples [r(0), r(1), . . . , r(N − 1)]^T is noted r. At iteration number k, the system input and output signals are noted u_k and y_k. The error at iteration k is e_k = r − y_k. The different learning algorithms can be expressed as

\mathbf{u}_{k+1} = \mathbf{u}_k + \Gamma\, \mathbf{e}_k. \qquad (3.18)

Different learning matrices Γ are used depending on the algorithm. The matrix has to be chosen carefully for the algorithm to converge. In [81], a condition is established for the algorithm convergence. Noting F_S the nonlinear dynamic transfer function of the system, the output signal at iteration k is

y_k(n) = F_S\left[ u_k(n), u_k(n-1), \ldots, u_k(0) \right], \qquad (3.19)

\mathbf{y}_k = F_S(\mathbf{u}_k), \qquad (3.20)

\mathbf{y}_k = \left[ f_0(u_k^{0}), \ldots, f_n(u_k^{n}), \ldots, f_{N-1}(u_k^{N-1}) \right]^{T}. \qquad (3.21)

In the last equation, u_k^{n} = [u_k(0), u_k(1), \ldots, u_k(n)] and y_k(n) = f_n(u_k^{n}). The Jacobian matrix of F_S with respect to u_k is noted J_F(u_k) and is equal to

J_F(\mathbf{u}_k) = \begin{pmatrix} \dfrac{\partial f_0(u_k^{0})}{\partial u_k^{0}} & \cdots & \dfrac{\partial f_0(u_k^{0})}{\partial u_k^{N-1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_{N-1}(u_k^{N-1})}{\partial u_k^{0}} & \cdots & \dfrac{\partial f_{N-1}(u_k^{N-1})}{\partial u_k^{N-1}} \end{pmatrix}
Figure 3.14 Principle of iterative learning control (ILC)
With the hypothesis that F_S is continuous in the region of interest, and with a first-order approximation, the error vector can be written as

\mathbf{e}_{k+1} = \mathbf{e}_k - J_F(\mathbf{u}_k)\,\Gamma\, \mathbf{e}_k = \left[ I - J_F(\mathbf{u}_k)\,\Gamma \right] \mathbf{e}_k.

From that expression, using the L_2 norm of the error vector, the convergence of ILC algorithms is obtained if \lim_{k\to\infty} \|\mathbf{e}_k\|_2^2 = 0. This convergence condition is satisfied if \|I - J_F(\mathbf{u}_k)\,\Gamma\|_{i2}^{2} < 1, where \|\cdot\|_{i2} denotes the spectral norm of a matrix. In the case of DPD, the considered system is the power amplifier. Three ILC learning algorithms are compared in [81] for DPD:
● Newton-type ILC algorithm
● Instantaneous gain-based ILC
● Linear ILC algorithm.

The Newton-type ILC algorithm is defined by

\mathbf{u}_{k+1} = \mathbf{u}_k + J_F(\mathbf{u}_k)^{-1}\, \mathbf{e}_k.
This algorithm necessitates a good knowledge of the PA characteristic F_S, and the calculation of the inverse of the Jacobian matrix is computationally expensive. Because of these two drawbacks, it is only used as a reference algorithm to analyze the convergence speed of less complex algorithms. The instantaneous gain-based ILC uses the diagonal gain matrix G:

G(\mathbf{u}_k) = \mathrm{diag}\left( G[u_k(0)], \ldots, G[u_k(N-1)] \right), \qquad G[u_k(n)] = y_k(n)/u_k(n).

The learning matrix of the instantaneous gain-based ILC is defined as Γ = G(u_k)^{-1}, leading to

\mathbf{u}_{k+1} = \mathbf{u}_k + G(\mathbf{u}_k)^{-1}\, \mathbf{e}_k.

This algorithm is less computationally expensive than the Newton-type one, and it requires very little knowledge about the PA characteristic. The linear ILC algorithm (also called first-order linear-type) is less complex. It uses a scalar constant γ for Γ and is defined by

\mathbf{u}_{k+1} = \mathbf{u}_k + \gamma\, \mathbf{e}_k.

Its convergence follows from the same spectral-norm condition as above, with Γ = γI.

… > 4 MHz. This is due to the increasing dominance of the second-order distortion (|S_2|) at higher frequencies, and is in line with the simulations in Section 6.3. Figure 6.14(b) shows the measured power consumption of the DHSS as a function of output frequency with and without DEM enabled. At higher frequencies, switching losses start to dominate, as expected; it is clear that the DEM power overhead – especially at low frequencies – is very modest. Finally, Figure 6.14(c) shows
Figure 6.13 Measured DHSS spectrum at 1 MHz sine generation: with DEM (a) and without DEM (b)
the measured FOM as a function of DHSS output frequency with and without DEM enabled. The peak improvement from the proposed partial DEM is about 40%. Table 6.2 lists key metrics for different digital harmonic-cancelling sine-wave synthesisers (here the FOM is calculated using both SDR2k and SFDR2k to enable
Figure 6.14 Measured DHSS metrics with DEM (filled) and without DEM (open): SDR (a); power consumption (b); and FOM (c) Table 6.2 Metric comparison of different reported DHSS circuit
CMOS process [nm] fφ [MHz] f0 [MHz] SDR2k [dB] SFDR2k [dB] Power [mW] Area [mm2 ] Phase generatorb FOM (SDR2k ) [conv.steps/(pJ mm2 )] FOM (SFDR2k ) [conv.steps/(pJ mm2 )] a b
Output filter included. SL, synchronous logic; RO, ring oscillator.
This work
[27]a
[27]
[36]a
[28]a
130 20 2 66 69 0.94 0.066 SL 66 93
130 1,160 10 72 – 4.04 0.186 SL 55 –
130 1,160 10 59 – 4.04 0.066 SL 34 –
90 – 100 – 45 1.68 0.0455 RO – 237
180 – 750 63 – 57 0.08 RO 238 –
Digitally enhanced digital-to-analogue converters
251
comparison with [36]). Note that for most of the designs, an output filter was included which improves the SDR2k and SFDR2k . The high FOM obtained in [36] and [28] can be contributed to their use of ring oscillators for generating the HCDAC phase-shifted inputs, φi ; use of ring oscillators often leads to poor phase noise performance.
6.5 Summary In this chapter, we used digital techniques to enhance the performance of DACs. We first looked at the typical DAC encoding schemes and illustrated their effects on common static error measures – INL and DNL – by means of simulations on a 32 unit element DAC structure with element mismatch. We also illustrated the effect of dynamic error measures – SDR and SFDR – caused by element mismatch and unbalanced rise- and fall-transitions. We briefly looked at popular DAC calibration methods for reducing static errors in DACs; DAC calibration can be very effective in reducing DAC errors; however, it commonly require additional analogue hardware. We thus turned our attention to DEM which can also reduce DAC errors and be implemented entirely in the digital domain. In a DEM DAC, the DAC input bits do not control designated unit elements but a selection of unit elements randomly selected from all the unit elements; in effect, the element mismatch is turned into white or shaped noise. We illustrated the effects on the DAC output spectra from a number of different DEM methods – complete randomly scrambled DEM, a binary partial DEM, a differential DEM, and a high-pass shaped DEM – by means of simulations on a 32 unit element DAC structure. Further, we illustrated the effects on the spectra of unbalanced riseand fall-transitions. Finally, we looked at DEM in the context of DHSS. Sine wave synthesis plays an important role in many applications and DHSS can be particularly attractive since it can be implemented at relatively low hardware cost using digital logic and a purpose built HC-DAC. We analysed how unit element mismatch affects the HC-DAC while generating sine waves and how partial or complete DEM scrambling of the unit elements can improve the SDR and SFDR. We argued that complete element scrambling can be costly and proposed a simple, hardware-efficient partial DEM technique. Measurement from a current-steering HC-DAC implementation with this partial DEM was presented and compared with other DHSS circuits found in the literature. In conclusion, we have demonstrated that DAC performance can effectively be improved by the DEM digital enhancement method. We saw that – due to the increased element switching compared with a conventional DAC – DEM can be sensitive to dynamic switching asymmetry at high signal frequencies but that DEM algorithm can also be insensitive to this and shape the DEM noise to high frequencies. We also saw that even simple, partial DEM techniques can be used to significantly improve the SDR and SFDR of DACs at lower frequencies.
252
Digitally enhanced mixed signal systems
References [1] [2]
[3]
[4] [5]
[6]
[7] [8]
[9]
[10]
[11]
[12]
[13]
[14]
Baker RJ. CMOS: Circuit Design, Layout, and Simulation, 3rd ed. Piscataway, NJ: Wiley; 2010. Kaulberg T and Bogason G. Position detection with the use of MAGFETs. In: Instrumentation and Measurement Technology Conference, 1995. IMTC/95. Proceedings. Integrating Intelligent Instrumentation and Control., IEEE. IEEE; 1995. p. 158. Nicholson AP, Irfansyah AN, Jenkins J, et al. A statistical design approach using fixed and variable width transconductors for positive-feedback gainenhancement OTAs. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2017;25(6):1966–1977. Hastings RA, The Art of Analog Layout. Upper Saddle River, NJ: Prentice Hall; 2001. Lakshmikumar KR, Hadaway RA, and Copeland MA. Characterisation and modeling of mismatch in MOS transistors for precision analog design. IEEE Journal of Solid-State Circuits. 1986;21(6):1057–1066. Yuan X, Shimizu T, Mahalingam U, et al. Transistor mismatch properties in deep-submicrometer CMOS technologies. IEEE Transactions on Electron Devices. 2011;58(2):335–342. Pelgrom MJ, Duinmaijer AC, and Welbers AP. Matching properties of MOS transistors. IEEE Journal of Solid-State Circuits. 1989;24(5):1433–1439. Irfansyah AN, Lehmann T, Jenkins J, et al. Analysis and design considerations of systematic nonlinearity for sigma–delta current-steering DAC. In: TENCON Spring Conference, 2013 IEEE. IEEE; 2013. p. 108–111. Irfansyah AN, Lehmann T, Jenkins J, et al. A resistive DAC for a multistage sigma–delta modulator DAC with dynamic element matching. Analog Integrated Circuits and Signal Processing; 2019;98(1):109–123. Shen MH, Tsai JH, and Huang PC. Random swapping dynamic element matching technique for glitch energy minimization in current-steering DAC. IEEE Transactions on Circuits and Systems II: Express Briefs. 2010;57(5):369–373. Risbo L, Hezar R, Kelleci B, et al. Digital approaches to ISI-mitigation in highresolution oversampled multi-level D/A converters. IEEE Journal of SolidState Circuits. 2011;46(12):2892–2903. Tang Y, Briaire J, Doris K, et al. A 14 bit 200 MS/s DAC with SFDR 78 dBc, IM3 -83 dBc and NSD -163 dBm/Hz across the whole nyquist band enabled by dynamic-mismatch mapping. IEEE Journal of Solid State Circuits. 2011;46(6):1371. Rahman MT and Lehmann T. A self-calibrated cryogenic current cell for 4.2 K current steering D/A converters. IEEE Transactions on Circuits and Systems II: Express Briefs. 2017;64(10):1152–1156. Groeneveld W, Schouwenaars H, and Termeer H. A self calibration technique for monolithic high-resolution D/A converters. In: Solid-State Circuits Conference, 1989. Digest of Technical Papers. 36th ISSCC., 1989 IEEE International. IEEE; 1989. p. 22–23.
Digitally enhanced digital-to-analogue converters [15] [16]
[17]
[18]
[19]
[20] [21] [22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
253
Cong Y and Geiger RL. A 1.5-V 14-bit 100-MS/s self-calibrated DAC. IEEE Journal of Solid-State Circuits. 2003;38(12):2051–2060. Chen HH, Lee J, Weiner J, et al. A 14-b 150 MS/s CMOS DAC with digital background calibration. In: VLSI Circuits, 2006. Digest of Technical Papers. 2006 Symposium on. IEEE; 2006. p. 51–52. Radulov GI, Quinn PJ, Hegt H, et al. An on-chip self-calibration method for current mismatch in D/A converters. In: Solid-State Circuits Conference, 2005. ESSCIRC 2005. Proceedings of the 31st European. IEEE; 2005. p. 169–172. Chen T and Gielen GG. A 14-bit 200-MHz current-steering DAC with switching-sequence post-adjustment calibration. IEEE Journal of Solid-State Circuits. 2007;42(11):2386–2394. Galton I and Carbone P. A rigorous error analysis of D/A conversion with dynamic element matching. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing. 1995;42(12):763–772. Galton I. Why dynamic-element-matching DACs work. IEEE Transactions on Circuits and Systems II: Express Briefs. 2010;57(2):69–74. Van De Plassche RJ. Dynamic element matching for high-accuracy monolithic D/A converters. IEEE Journal of Solid-State Circuits. 1976;11(6):795–800. Jiang H, Olleta B, Chen D, et al. A segmented thermometer coded DAC with deterministic dynamic element matching for high resolution ADC test. In: Circuits and Systems, 2005. ISCAS 2005. IEEE International Symposium on. IEEE; 2005. p. 784–787. Jensen HT and Galton I. A low-complexity dynamic element matching DAC for direct digital synthesis. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing. 1998;45(1):13–27. Radke RE, Eshraghi A and Fiez TS. A 14-bit current-mode Sigma Delta DAC based upon rotated data weighted averaging. IEEE Journal of Solid-State Circuits. 2000;35(8):1074–1084. Wang P and Sun N. A random DEM technique with minimal element transition rate for high-speed DACs. In: Circuits and Systems (ISCAS), 2014 IEEE International Symposium on. IEEE; 2014. p. 1155–1158. Sanyal A, Chen L and Sun N. Dynamic element matching with signalindependent element transition rates for multibit modulators. IEEE Transactions on Circuits and Systems I: Regular Papers. 2015;62(5):1325–1334. Elsayed MM and Sanchez-Sinencio E. A low THD, low power, high outputswing time-mode-based tunable oscillator via digital harmonic-cancellation technique. IEEE Journal of Solid-State Circuits. 2010;45(5):1061–1071. Shi C and Sanchez-Sinencio E. 150–850 MHz high-linearity sine-wave synthesizer architecture based on FIR filter approach and SFDR optimization. IEEE Transactions on Circuits and Systems I: Regular Papers. 2015;62(9): 2227–2237. Song Y and Kim B. Quadrature direct digital frequency synthesizers using interpolation-based angle rotation. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2004;12(7):701–710.
254
Digitally enhanced mixed signal systems
[30]
Rairigh D, Liu X, Yang C, et al. Sinusoid signal generator for on-chip impedance spectroscopy. In: Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium on. IEEE; 2009. p. 1961–1964. Davies AC. Digital generation of low-frequency sine waves. IEEE Transactions on Instrumentation and Measurement. 1969;18(2):97–105. Aluthwala P, Weste N, Adams A, et al. The effect of amplitude resolution and mismatch on a digital-to-analog converter used for digital harmoniccancelling sine-wave synthesis. In: Circuits and Systems (ISCAS), 2016 IEEE International Symposium on. IEEE; 2016. p. 2018–2021. Aluthwala P, Weste N, Adams A, et al. Partial dynamic element matching technique for digital-to-analog converters used for digital harmoniccancelling sine-wave synthesis. IEEE Transactions on Circuits and Systems. 2017;64(2):296–309. Tsao S; IET. Generation of delayed replicas of maximal-length linear binary sequences. Proceedings of the Institution of Electrical Engineers. 1964;111(11):1803–1806. da Rocha JF, dos Santos MB, Costa JMD, et al. Level shifters and DCVSL for a low-voltage CMOS 4.2-V buck converter. IEEE Transactions on Industrial Electronics. 2008;55(9):3315–3323. Soda M, Bando Y, Takaya S, et al. On-chip sine-wave noise generator for analog IP noise tolerance measurements. In: Solid State Circuits Conference (A-SSCC), 2010 IEEE Asian. IEEE; 2010. p. 1–4.
[31] [32]
[33]
[34]
[35]
[36]
Chapter 7
Clock generation Naser Pourmousavian1, Teerachot Siriburanon1, Feng-Wei Kuo2, Masoud Babaie3, and Robert Bogdan Staszewski1
In modern transceivers, clock generation and planning is one of the key aspects. As a matter of fact, with the increasing occupation of the spectrum and with the increasing use of discrete front ends, non-idealities such as reciprocal mixing are getting more and more critical. This chapter presents the different techniques to enhance the performance of the clock generation especially for all-digital phase locked loops (PLLs) (ADPLLs).
7.1 Development of advanced PLLs To this day, the stringent requirements of an radio frequency (RF) synthesizer make its design one of the most challenging in the implementation of an RF transceiver. The synthesizer must meet the specifications of phase noise (PN) performance, spurious tones level, switching speed, frequency and tuning range while also meeting lowpower, low-voltage, low-cost and integrability. PLL is a synthesis technique where in its simplest form, a negative feedback loop consisting of an oscillator and a phase detector (PD) is formed. If the output phase of the oscillator drifts from the phase of a reference frequency (FREF) signal, an error is generated. Consequently, the loop produces correction commands to the oscillator until the phases are aligned and the loop can achieve ‘lock’. Figure 7.1(a) depicts a typical charge-pump PLL utilizing a voltage-controlled oscillator (VCO). The phase/frequency detector (PFD) estimates the phase difference between the divided-by-N VCO output clock and the FREF and generates either an UP or a DOWN pulse whose width is proportional to the measured time difference by the PFD. The pulses are then fed to a charge-pump circuit where sinks or sources current and produces a current pulse corresponding to the phase misalignment.
1
School of Electrical and Electronic Engineering, University College Dublin, Ireland Taiwan Semiconductor Manufacturing Company (TSMC), Taiwan 3 Department of Microelectronics, Delft University of Technology, The Netherlands 2
256
Digitally enhanced mixed-signal systems
Phase/ frequency detector UP FREF PFD (fR) DOWN
Charge pump Loop filter
∆f0 FREF
VCO VTune
(fV) FDIV
Freq. divider FDIV
÷N FCW
UP CKV
VTune t
Dither
(a) FREF (fR)
∆f0
DCO TDC
ferror
Digital loop filter
(fV) FREF ΣΔ FDIV
Frequency divider
FDIV
FCW
CKV
ferror
xxxx
xxxx
xxxx
xxxx
ΣΔ Modulation
t
(b)
Figure 7.1 PLL types: (a) conventional charge-pump PLL (drawn in the fractional-N configuration for the sake of comparison); (b) fractional frequency divider-based ADPLL This current pulse is then integrated on a capacitor and filtered by the first-order pole formed by the resistor and capacitor to generate a smooth control voltage for the VCO. The mismatches between the width of the output pulses of the PFD as well as charge injection and clock feedthrough mismatches between the metal-oxide-semiconductor (MOS) devices of the charge pump will give rise to periodic glitches which could potentially modulate the VCO output frequency and produce spurious integer and fractional reference spurs and degrade close-in PN. The advancements in deep-submicron complementary metal-oxide-semiconductor (CMOS) technology enable high levels of scaling and integration in digital circuitry but at the same time create new challenges in the implementation of traditional RF and analogue building blocks. This inevitably prompted new research paths on finding digital solutions to overcome the challenges and use the opportunities generated by the ever ongoing scaling. The charge-pump PLL is not an exception to this trend. The low supply voltage of the scaled CMOS nodes along with the increase in the small signal output conductance gds of a MOS transistor result in the design of far from ideal current sources. Moreover, the increase in MOS gate leakage makes the use of high-density MOS varactors lossy and as a result necessitates the use of on-chip metal–oxide–metal (MOM) capacitors which have lower capacitance density and occupy larger areas. This will also lead to very large loop filters.
Clock generation
257
In [1], a digitally controlled oscillator (DCO), which deliberately avoids any analogue tuning voltage controls, is presented. The DCO, which can be considered the counterpart of the VCO, is controlled by digital commands as opposed to analogue tuning voltage. This allows for its loop control circuitry to be implemented in a fully digital manner which has been introduced as ADPLL. ADPLLs can be classified into two main architectural types of divider-less (true phase-domain operation) [2] and divider-based ADPLLs [3]. ADPLL architectures not only address the implementation issues in the nano-scaled technologies but also introduce more flexibility and reconfigurability. Figure 7.1(b) illustrates a fractional frequency divider based ADPLL which mimics the topology of charge-pump PLL and makes it easier to compare. Needless to say, the research in analogue PLLs never slowed down and along with the developments in digital PLLs, divider-less ‘sub-sampling’ analogue PLLs have been widely explored [4]. In a sub-sampling analogue PLL, shown in Figure 7.2(a), the divider in the feedback path is removed and a sub-sampling PD block which is sampled at the reference frequency is introduced. The new architecture helps suppress the charge-pump noise contribution to the total PN by orders of magnitude compared to conventional analogue PLLs. Consequently, under low-power consumption the PN performance is shown to be decent. However, the loop filter is still implemented using low-density on-chip passive components consuming large area. Moreover, the loop dynamics is vastly impacted by the VCO swing and as a result extremely sensitive to process–voltage–temperature (PVT) variation and since implementing digital calibration on an analogue-intensive platform can be faced with difficulties, the calibration can be challenging.
VCO
FREF (fR) SSPD
Charge pump
(fV)
Loop filter
CKV (a) FREF (fR) ADC
Digital loop filter
×G
(b)
DCO (fV) ΣΔ
Sample and hold
CKV
Variable gain amplifier
Figure 7.2 Sub-sampling PLL types: (a) sub-sampling analogue PLL; (b) ADC-based sub-sampling ADPLL
258
Digitally enhanced mixed-signal systems FCW
Σ
FREF (fR) TDC
Reference phase Phase error
Variable phase
Digital loop filter
DCO (fV) ΣΔ CKV
Figure 7.3 Simplified architecture of counter-based ADPLL
The sub-sampling concept has also been introduced in digital PLLs in [5] as shown in Figure 7.2(b). Instead of adopting a traditional inverter-based time-to-digital converter (TDC), the time residue is converted into voltage domain and is digitized by employing an analogue-to-digital converter (ADC). By sampling the DCO waveforms using a sub-sampling PD, the time can be converted into a voltage quantity. Without the need for a high-resolution ADC, this approach can achieve high resolution in phase detection with low power consumption. However, when sampling an oscillator waveform, linear detection range is limited. Thus, it inherently only supports integerN operation. Directly operating this architecture in fractional-N operation is not straightforward and can cause degradation in fractional spurs of the ADPLL [6,7]. Figure 7.3 shows a divider-less ADPLL which operates in the true phase domain. The ADPLL operates by comparing the variable phase of the multi-GHz output of the DCO with the phase of the lower frequency (e.g. 15–40 MHz) reference clock FREF. The comparison results in a digital phase error (PHE) which after being filtered by the digital loop filter adjusts the DCO frequency. The role of the PFD and charge pump in a traditional PLL is played by a TDC in an ADPLL which is in charge of measuring the fractional delay difference between the reference clock FREF and variable clock (CKV). The FREF information is contained in the transition times (i.e. timestamps) of FREF. Likewise, the DCO CKV contains the timing information in its timestamps. The PHE φE [k] is obtained by finding the difference between the reference phase and the variable phase. The TDC system includes the TDC core which is in charge of calculating the fractional part of the variable phase and an accumulator which counts the CKV edges to arbitrarily increase the dynamic range, thus contributing to the integer part of the CKV. The digital fractional phase is determined by passing the DCO clock, CKV, through the chain of inverters of the TDC as shown in Figure 7.4. Each inverter produces a clock signal delayed from the previous inverter output. The staggered clock phases are then sampled through the chain of flip-flops (FFs) by the reference frequency FREF. The TDC output has an integer form and cannot be used in the loop by its raw form since time resolution is a varying physical parameter. In order to properly combine the TDC correction [k] with the fractional part of the reference phase, it should be normalized by the CKV period. Hence, the TDC acts like an interpolator, its gain cannot be too large or too small and it needs to be just right.
Clock generation ∆tres = ∆tinv CKV
D(1)
D(2)
D(3)
D(L)
FREF
CKV D(1) D(2) D(3)
e[k]
Normalization Q(1)
Q(2)
Q(3)
Q(L)
L
(a)
259
Q 0 0 0 1
D(L) FREF
1
(b)
Figure 7.4 (a) Simplified time-to-digital converter core, (b) timing diagram of TDC FCW
Σ
FREF (fR)
FCW
DTC
Reference phase Phase error
Digital loop filter
DCO (fV) ΣΔ
TDC Snapshot
CKV
FREFdelay (a)
1/2N TDC
(b)
Digital loop filter
Variable phase Snapshot
DCO ÷N
FREF (fR)
MUX
Σ
Reference phase Phase error
ΣΔ 2N
CKV0–(2N–1)
Predictor
Figure 7.5 ADPLL architectures with relaxed TDC dynamic range requirement, (a) DTC-assisted, (b) DCO phase selection To further reduce the power consumption and relax theTDC’s strict specifications, a new architecture employing a digital-to-time converter (DTC) was proposed in [8]. Figure 7.5(a) illustrates that ADPLL architecture with DTC-assisted snapshot TDC. Compared to the ADPLL in Figure 7.1(b) where the TDC needs to cover the full CKV period, the TDC snapshotting reduces the sampling rate from CKV to FREF. Furthermore, DTC delays the reference clock FREF according to a fractional part of frequency command word (FCW). Consequently, the rising edge of FREFdelay occurs very close to the rising edge of CKV, hence relaxing TDC’s detection range. The introduced techniques result in significant power reduction. The snapshot block is an
260
Digitally enhanced mixed-signal systems
asynchronous circuit which captures the first rising edge of CKV right after the rising edge of the trigger signal FREFdelay . Other techniques can also be implemented in order to reduce TDC dynamic range requirements. Figure 7.5(b) illustrates an ADPLL where multiple phases of the DCO output are generated. A phase selector chooses the CKV phase which is the closest to FREF edge and feeds it to the TDC. By using this technique the TDC can be shortened as it needs to cover only the period of the divided CKV phases. One drawback of this architecture is the fact that the DCO needs to run at higher frequencies. By the introduction of the aforementioned techniques in the ADPLL architecture where a high-resolution design can be achieved more power efficiently and with less hardware complexity, there is no need to rely entirely on the TDC. However, while the stringent requirements of TDC dynamic range are alleviated, the in-band PN is still limited by TDC resolution. Different types of TDC can be used in the ADPLL architecture with their own advantages and drawbacks. The most commonly used type is inverter-based TDC where the TDC resolution tres is equivalent to the loaded inverter delay. The inverterbased TDC gives the best performance in terms of resolution, power consumption and design complexity. However, to get a fast inverter, the transistors gate voltage (coming from VDD of the previous inverter) has to be at least several hundred mV higher than the threshold voltage Vt which imposes a lower limit for the supply voltage. The bang bang PD (BBPD) is another solution. The low complexity and power consumption of a BBPD, which can be implemented using a single FF, makes it a very attractive solution especially in DTC-assisted PLLs [9]. However, its non-linear behaviour can potentially result in instability of a BB-PLL loop. A time-amplifier (TA) TDC achieves a higher resolution by combining a TA and an inverter-based TDC [10]. The TA amplifies the input time difference by exploiting the metastability of a set–reset (SR) latch and the output is then digitized by the inverter-based TDC. Despite the resolution improvement which is proportional to TA gain, the added non-linearity will degrade the PN performance. Using a first-order modulation and taking advantage of noise shaping is another method to implement high-resolution TDCs. The gated ring-oscillator (RO) (GRO) based TDC and vernier-GRO TDC are such TDC which can achieve high resolutions.
7.2 ADPLL-based transmitter The frequency synthesizer of Figure 7.3 can be readily turned into a numerically controlled frequency modulator (FM) that could act as a stand-alone front-end constantenvelope transmitter, shown in Figure 7.6, for Bluetooth (GFSK) and GSM (GMSK). Alternatively, after adding a digital envelope modulator at its output, it can become a polar transmitter, as commercially demonstrated for enhanced data for global evolution (EDGE) and proposed for Wideband Code Division Multiple Access (WCDMA). The transmitter of Figure 7.6 can achieve a two-point wideband and precise frequency/phase modulation without any significant constraint on the loop bandwidth
Clock generation
261
Data FCW Channel FCW
FCW
FREF (fR)
Σ
Reference phase Phase error
TDC Σ Variable phase
DCO Digital loop filter
(fV)
TX Out
ΣΔ CKV
Figure 7.6 ADPLL with the frequency modulation capability of the ADPLL. The wideband characteristic is due to the nonexistence of any speedlimiting devices and circuits. A properly designed circuitry will only feature an fT -type technology limitation, which could be on the order to hundreds of GHz. Contrary to an apparent suspicion, the Q-factor of the DCO LC-tank does not limit the modulating bandwidth, since it is the resonant frequency of the LC-tank that is independently perturbed during the digital tuning, not its voltage or current. The oscillation frequency deviation is dynamically controlled by directly modulating the DCO frequency in a feed-forward manner with a closed loop compensation that effectively removes the loop dynamics from the modulating transmit path. The numerically controlled FM precision stems from the fact that all the blocks, except for the DCO and TDC, have an exact transfer function: The TDC is an equivalent to an ADC with time being the ‘analogue’ input, whereas the DCO is an equivalent to a DAC with frequency deviation being the ‘analogue’ output. The conversion gains of the TDC and the DCO are continuously calibrated in the background, such that their estimation errors are less than 1%, which satisfy requirements of virtually all modern transmitters. This should be contrasted with the much lower required precision of the basic unmodulated ADPLL. Moreover, a digitally controlled RF power amplifier (PA) (DPA) produces various levels of output power which are typically needed in wireless systems, such as Bluetooth transceivers.
7.3 Ultra-low-voltage, ultra-low-power ADPLL for IoT applications The development of radios for Internet-of-Things (IoT) node devices has spurred research in ultra-low-power (ULP) ADPLLs performing as local oscillators (LO) [8,11–13]. The IoT concept entails stringent conditions on the size and weight of battery or other energy storage used to supply the IoT circuitry. In spite of the recent advancements, the IoT system lifetime is still limited by the power consumption of its radio, and in particular the LO. Figure 7.7 plots a system lifetime for various battery choices as a function of current consumption. State-of-the-art Bluetooth low
262
Digitally enhanced mixed-signal systems ISSCC2017 BLE papers
104
Alkaline SR44 150 mAh
100
10–1
0
mm 12 m 12 m
14.5 mm 5.4 mm
IMEC I IMEC II Macau
Lifetime (h)
Alkaline AAA 1.25 Ah
101
Increase in size
Alkaline AA 3 Ah
102
5 cm
103
Alkaline SR63 10 mAh
5 10 15 Radio current consumption (mA)
20
Typical BLE module
11.6 mm
Figure 7.7 BLE system lifetime across radio current consumption for various battery types energy (BLE) radios consume ∼3 mW and thus can continuously operate no more than 200 h on a single SR44 battery, which has comparable dimensions to the radio module. This triggers inconvenient battery replacements, which limits their marketing attractiveness. The lifetime could be easily extended with larger batteries but that comes at a price of increased weight and size and it is clearly against the vision of IoT miniaturization. Energy harvesters can significantly extend the IoT lifetime up to the point of a perpetual operation, but they typically provide low voltages, often well below typical supply of CMOS circuits, i.e. within 0.25–0.8 V range [14]. This is likely to degrade performance of important ADPLL building blocks. An inverter-based TDC is such an example. An inverter could be considered a basic time-delay cell with regenerative properties which benefits from CMOS scaling and offers the shortest controllable delay at low power consumption, but its delay (i.e. resolution of the TDC) can vary by more than ±50% over PVT [2]. This unnecessarily increases the TDC size (i.e. range overhead) and power consumption and can deteriorate in-band PN and spurious tones. In [15], new system and circuit techniques were exploited to enhance efficiency of an ADPLL-based BLE transmitter. Although a DCO and an output stage of a PA are designed in such a way as to operate directly at the low voltage of harvesters (i.e. ∼0.5 V), the rest ofADPLL blocks still need ∼1 V to operate as shown in Figure 7.8(a). In the rest of this chapter, an ADPLL which is part of a BLE TX, powered directly from a 0.5 V supply source, is presented. While the DCO is directly connected to 0.5 V, an internal regulated switched-capacitor (SC) DC–DC ‘doubler’ boosts the low input voltage to ∼1 V internally and supplies the TDC and other digital circuitry, as shown in Figure 7.8(b). The doubler is an integral part of the TDC output normalization and uses a clock skipping technique to regulate the TDC supply in response to a background detection of its resolution, which is directly correlated with a delay/speed of digital logic. The doubler is specifically optimized for event-based loads, such as the TDC and FREF-based digital logic in the ADPLL, and uses an out-of-phase and multiphase approach, where phases generated from an internal RO are running at roughly ×4 of
Clock generation VDD, high (~1V)
VDD1
VDD, low (~0.5V)
VDD, low (~0.5V)
ADPLL
ADPLL VDD2
VDD1 DCO
(a)
Doubler VDD2 DCO
TDC Digital
263
TDC Buffers
Digital
Buffers
(b)
Figure 7.8 Supply voltage of ADPLL: (a) in prior-art design, e.g. [15] and (b) the presented single low supply voltage solution FREF clock rate. The designed doubler features low area overhead, low impact on the overall energy efficiency and does not introduce any significant spurious tones into the system. An architecture of the ultra-low-voltage (ULV) ADPLL is described in Section 7.5. Section 7.7 investigates the PVT-insensitive TDC and the detailed background calibration. In Section 7.8, the design of a highly efficient SC DC–DC doubler/regulator to achieve small area and power overhead is discussed in detail. To compensate for the absence of low-dropout regulators, the clock skipping technique uses multiphase path for spurious reduction, which is then described and theoretically verified. Finally, to show the effectiveness of the implemented system, Section 7.9 discloses experimental results.
7.4 Switched-capacitor DC–DC converter Conventionally, magnetic-based DC–DC converters are used to boost an input voltage to a desired output voltage [14,16]. However, an on-chip inductor is not yet suitable for integration due to its low quality (Q) factor and increased losses. To overcome the efficiency issue, the inductor needs to be realized off-chip, resulting in an increase in size and cost of the system which again may not fit in the IoT specification budget. On the other hand, the higher power density and Q-factor of an on-chip capacitor makes it a great choice for integration. An SC DC–DC converter relies on only switches and capacitors to transfer and store energy; changing the number and arrangement of the elements results in a specific voltage conversion ratio (VCR). Although choosing a monolithic approach is attractive in many aspects, there are limitations directly related to these basic components [17] which are shortly discussed here. There are quite a few ways to integrate capacitors in CMOS. The capacitors can be divided into two main groups: standard and non-standard. Metal–insulator–metal, deep trenches and ferroelectric capacitors are non-standard structures which basically have high capacitance density and low parasitic losses, but their limited availability and higher cost make them unattractive for integration. MOS gate oxide and MOM
264
Digitally enhanced mixed-signal systems
capacitors are two widely available structures in CMOS which can be considered for integration. Gate oxide capacitors utilize the capacitance between the gate of an MOS transistor and its channel. Depending on the oxide thickness (thick oxide in the case of I/O devices or thin oxide in the case of core devices in a given technology), gate oxide capacitors can have relatively high capacitance density, especially as the technology scales. Despite its high capacitance density, a gate oxide capacitor is not suitable as a flying capacitor since its bottom plate is embedded in the substrate resulting in a high substrate coupling and increased parasitic losses. Also, to reduce its series resistance, necessary layout considerations may result in area overhead, thus limiting the capacitance density. Moreover, as technology scales and the oxide gets thinner, tunnelling leakage can add to the losses. MOM capacitors, on the other hand, are built with the regular metal stack and, depending on the distance of the lowest metal layer to substrate, can achieve a relatively low substrate coupling. Compared to MOS capacitors, MOM capacitors have a lower capacitance density, especially when only top metal layers are used to minimize the bottom plate parasitic losses. In this design, MOM capacitors are used as the flying capacitors where minimizing the bottom plate parasitic losses is important. The maximum voltage in the designed doubler does not reach above 1 V; hence, there is no need for the switches to withstand a higher than the nominal voltage rating. This makes it possible to use single thin-oxide switches. Furthermore, as the voltage increases along the doubler path and gets closer to the maximum available voltage (∼1 V), VGS –Vth of the switches decreases, thus making it harder to turn them on. To decrease the on-resistance of the switches without resorting to bootstrapping, an appropriate selection of N-type metal-oxide-semiconductor (NMOS) and P-type metal-oxide-semiconductor (PMOS) transistors can maximize VGS –Vth . Due to typically low capacitance density of an integrated capacitor, they consume much bigger silicon area compared to switches. As a result, in an integrated context, the switch parasitics are less important than the parasitics of the capacitors. The impact of flying capacitors parasitics on SC DC–DC converters performance are discussed extensively in [17]. A metric which quantizes the combined loss impact of each flying capacitors parasitic is introduced and stated as Msw = kc,i V 2 sw,par,i (7.1) i 2
where V sw,par,i presents the voltage swing on the parasitic substrate coupling and is weighed by the utilization kc,i of that flying capacitor. Also, the losses due to flying capacitors parasitic can be stated as Esw,i = αpar,i Cfly,i V 2 sw,par,i
(7.2)
where αpar,i is the ratio of the parasitic substrate coupling capacitance to the flying capacitor. This metric can be used to compare different SC topologies and VCRs. Figure 7.9 illustrates the most common SC DC–DC in their step-up forms. Based on the comparison in [17], as the VCR increases, Dickson topology outperforms all other topologies in terms of parasitics losses. To minimize the losses due to substrate coupling in our design, MOM capacitors are used as the flying capacitor which have lower αpar,i compared to MOS capacitors.
Clock generation VDC-in 1
1
2
1
C1
C2
2 2
1 1
VDC-in
2 VDC-out
1
C3
2
C1 2
2
1
1
2 VDC-out C3
C2 1
265
2
2
1
(b)
(a) 2
2
1 VDC-in
C1 2
2
1
1 C2
2
2
1 C3 V DC-out
(c)
Figure 7.9 Most common switched-capacitor DC–DC converter topologies in their step-up form: (a) 1:4 Dickson, (b) 1:5 Fibonacci, (c) 1:4 series–parallel
Furthermore, Dickson topology has been utilized which shows the best performance among SC topologies (although for a 1:2 VCR, most of the topologies show similar performance). To maintain the ULV ADPLL performance on par with that corresponding to the optimal 1 V supply of nanoscale CMOS, the design of the SC booster should be considered for the optimum size of flying capacitor and power consumption overhead to achieve high efficiency. To transfer charge between the input and output ports of the converter, capacitors must be charged and discharged, resulting in a voltage drop across the converter. This voltage drop can be represented as an output impedance. An idealized model of the SC converter is shown in Figure 7.10(a). This model consists of an ideal transformer with the expected ‘turns’ ratio and the output impedance Rout that models the voltage drop across the output due to losses in the real converter. To optimize the design for maximum efficiency, the output impedance Rout must be designed to be equal to the impedance seen at the load. There are two limits determining the output impedance: the slow-switching limit (SSL) and the fast-switching limit (FSL) which are related to the switching frequency (Figure 7.10(b)) [18]. For the SSL analysis, the finite resistance of the switches and capacitors is neglected. For this purpose a set of charge multiplier vectors is defined [18]. These vectors correspond to charge flows that occur in each phase. The charge multiplier vectors, the capacitor values and the switching frequency are the parameters that are needed for the SSL analysis. Equation (7.3) presents the simplified version of the SSL output impedance for a two-phase SC converter. Parameters ac,i represent the charge multiplier vectors, as Ci and fsw are capacitor values and the switching frequency, respectively: (ac,i )2 (7.3) RSSL = Ci fsw i∈caps
266
Digitally enhanced mixed-signal systems Log Rout Rout
1:n Vin
FSL
Vout SSL
(a)
(b)
Frequency
Log
Figure 7.10 (a) Idealized model of a switched-capacitor converter and (b) output impedance when RSSL ≈ RFSL The charge multiplier vector in the implemented SC DC–DC converter is 1. As it can be seen in (7.3), RSSL is inversely proportional to the capacitor values and the switching frequency. The other limit is the FSL. FSL is characterized by constant current flows between the capacitors. The on-resistance of the switches is large enough, which prevents the capacitors from approaching equilibrium. In this analysis, the capacitor voltages are constant and the only parameter determining the FSL impedance is the conduction loss in switches. To calculate the FSL impedance, charge multipliers are defined which present the charge flow through switches. Equation (7.4) presents a simplified version of the FSL impedance for a two-phase SC converter. The ar,i correspond to the charge multipliers and the Ri are the value of on-state resistance of the switches: Ri (ar,i )2 (7.4) RFSL = 2 × i∈switches
In the topology used in this design, the charge multipliers have a value of 1. Based on the equation, the FSL impedance increases as the on-resistance of the switches increase. Calculating the exact value of the total output power is very complicated for different SC converter topologies. For a two-phase converter, it can be shown that the total output power is calculated using the following equation: Rout = βR2SSL + (1 − β)R2FSL (7.5) where β has a value between 0 and 1. Figure 7.10(b) shows the output impedance when RSSL ≈ RFSL . In the proposed doubler, the switches are designed to be large enough so the output impedance is dominated by the SSL. There are other losses in an SC converter which would deteriorate the efficiency and have not been considered in the defined model. These are the switching losses which occur due to bottom plate parasitics of the flying capacitors, parasitic capacitance of the switches and the dynamic losses of digital circuitry (mostly contributed by bulky buffers of the switches). Considering the rms power consumption of the TDC and ADPLL logic, the doubler is optimized for the maximum efficiency with the switching clock rate of 80 MHz and a total capacitor of 340 pF (divided between four separate modules).
Clock generation
267
7.5 Low-voltage ADPLL architecture with PVT-tolerant TDC To address the IoT’s lofty goal of perpetual battery-less operation, several power management considerations are taken into account in the implemented ADPLL. The architecture utilizes a single 0.5 V supply for the entire design, as shown in Figure 7.8(b). The supply voltage reduction results in significant power savings for the most power-hungry block, i.e. DCO [15,19], and enables it to be supplied directly from energy harvesters. On the other hand, the TDC and all digital blocks, which consume relatively much less, are supplied from the SC DC–DC converter that regulates the supply of TDC (and all other digital circuitry) to maintain the ADPLL’s PN performance across voltage and temperature variations. Figure 7.11 shows a detailed block diagram of the ULP/ULV ADPLL with an embedded SC doubling regulator. A ‘÷2’ divider following the DCO is used to generate four phases of a variable carrier clock, CKV[0:3], covering the Bluetooth frequency range of fV =2,402–2,478 MHz. The BLE DPA (acting here as an external 50- load driver) is fed by the differential CKV[0]/CKV[2] clock signal. The ADPLL comprises a reference phase accumulator, which receives a FCW, i.e. a ratio of the desired RF carrier frequency, fV , to the reference frequency, fR . The FCW is accumulated in each FREF cycle and generates a reference phase signal RR [k], which is provided to an arithmetic subtractor. All the four phases of the CKV[0:3] are routed to the phase detection circuitry which selects the phase whose rising clock edge is expected to be the closest to the rising clock edge of FREF. The prediction is done
SPI
TX Modulation data
Data FCW TX Mod. . 8
Doubler e[k]
FREF_Div
4.1–5.1 GHz 2 DCO (VDD=0.5V) ÷2
Σ1
RF out
2 TTW VDD_DIG
Prog. divider
TB
**Supplies all digital blocks
CKV[0]
Cdc TDC
(VDD=0.5V)
CKV[0:3] CKV’
0...3
Calibration unit
FREF
PVT
64 RV[i]
RV[k]
Vin (0.5V)
10
ModM
Σ
DCO Nom.
64
ModL
RR[k]–RV[k]+e[k] RR[k] LF ΦE[k] + –
=
Channel FCW
KDCO
4 (2.05—2.55 GHz)
frac(RR[k])
gm calibration and the design of the TDC with switched-capacitor doubler
Figure 7.11 Detailed block diagram of the implemented ADPLL-based BLE TX with background doubler-assisted PVT-calibration for TDC resolution
268
Digitally enhanced mixed-signal systems
based on a fractional part of RR [k]. TheTDC quantizes a time difference between edges of the selected CKV phase (CKV’) and FREF and generates a fractional error correction [k]. On the other hand, edges of CKV[0] are counted and accumulated as RV [k], which is then combined with [k] before fed to the arithmetic subtractor. The PHE is φE [k] = RR [k] − RV [k] + [k]
(7.6)
After going through the type-II infinite impulse response loop filter, the PHE φE [k] updates the DCO tuning word. The TDC resolution tres is kept constant across PVT via a calibration loop. This not only maintains the expected level of PN performance but it is also instrumental in keeping the TDC length as short as possible by not having, for example, to account during the design phase for the fast PVT corner. Furthermore, if tres is fixed at an integer division of the CKV period TR , then the conventional fineresolution TDC normalizing multiplier could be greatly simplified or even entirely avoided if that division is a power-of-two integer. After the loop is settled, the calibration starts by observing φE [k] and correlating it with RR [k] to obtain a gradient ∇ for an LMS adaptation algorithm [12]. This regulates the supply of TDC to maintain its target resolution until the PHE perturbations due to the normalization error are minimized. Thus, in-band PN performance can be maintained across PVT variations. Details of the implemented calibration scheme will be discussed in the following section. To further lower the power consumption, a dynamic programmable reference clock divider (see Figure 7.11) can be engaged right after the loop is settled to scale down the ADPLL’s effective reference clock rate fR from 40 to 5 MHz or even lower, which reduces the dynamic power drain of digital logic while proportionately deteriorating the in-band PN, as LIB ∝ 1/fR . Since the ADPLL loop bandwidth is proportional to FREF, the loop filter coefficients need to be simultaneously adjusted to keep the bandwidth constant. Additionally, once the ADPLL acquires the lock, the digital part of ADPLL can be shut down, thus ultimately improving the power efficiency [15]. The open-loop operation relies on the system tolerance to frequency drift which must be well below the BLE limit of 400 Hz/μs [20].
7.6 Switching current-source oscillator RF oscillators are considered one of the BLE transceiver’s most power-hungry circuitry and to go towards an ULP and voltage design, they must be very power efficient and preferably operate directly at the energy harvester output [15]. PN and figure of merit (FoM) of any RF oscillator at an offset ω from the resonating frequency ω0 = 2πf0 can be expressed by L (ω) = 10 log10
ω 2 KT 0 · F · ω 2Qt 2 αI αV PDC
(7.7)
Clock generation and
FoM = 10 log10
103 KT ·F 2Qt 2 αI αV
269
(7.8)
where K is the Boltzmann’s constant, T is the absolute temperature, Qt is the LC-tank quality factor, αI is the current efficiency, defined as a ratio of the fundamental current harmonic Iω0 over the oscillator DC current IDC and αV is the voltage efficiency, defined as a ratio of the single-ended oscillation amplitude Vosc /2 over the supply voltage VDD and F is the oscillator’s effective noise factor. The RF oscillator’s PDC is derived by PDC = VDD × IDC =
2VDD 2 αV · Rin αI
(7.9)
where Rin is an equivalent differential input parallel resistance of the tank’s losses. By taking into account the BLE blocking profile in [20], the oscillator’s PN needs to be better than −105 dB c/Hz at 3 MHz offset from 2.45-GHz carrier frequency which makes the requirements for IoT applications quite trivial. Consequently, reducing PDC can be considered the main goal in IoT applications. Lower PDC can be achieved by increasing Rin to some extent which can be limited by the degradation of the inductor’s Q-factor. Furthermore, supplying the oscillator at a lower VDD and minimizing αV /αI can further reduce the oscillator’s power consumption. Figure 7.12 illustrates the implemented oscillator schematic and simulated waveforms. The two-port resonator consists of a step-up 1:2 transformer and tuning capacitors. M1,2 are the current-source transistors, setting the oscillator’s DC current and along with M3,4 enable the switching of the tank current direction which will double αI .
VDD=0.5V
C2 GB
0.8
VB 0.5L2 0.5L2
GA
M4
0.7
DB
C1
DA
GA
DB
0.6
L1
Voltage (V)
M3
VDD
0.5
VB
0.4 0.3
DA
DB
0.2 0.1 M1
M2 GA
GB
0 Time (s)
(b)
(a) Figure 7.12 Switching current-source oscillator [15]: (a) schematic and (b) waveforms
270
Digitally enhanced mixed-signal systems
On the other hand, VDD can be decreased as low as VOD1 + VOD3 ≈ Vt . Consequently, the designed oscillator can be directly powered by an energy harvester output.
7.7 Calibration for PVT-insensitive time-to-digital converter (TDC) Traditionally, TDC gain is adjusted via a digital multiplier [2], controlled by an LMS adaptation algorithm [12]. A recent alternative is to maintain a constant TDC inverter delay with a feedback loop by digitally tuning the inverter loading capacitors [21], as shown in Figure 7.13(a). However, that would require a higher driving strength of the inverter cells, and thus larger power consumption, since the inverter delay is proportional to C/gm . Therefore, it appears less suitable for ULP applications. In [22], equidistant phases of an RO, which is injection-locked to a DCO, are used to quantize the DCO phase. However, a free-running frequency of RO could drift away from the locking range over temperature and cause the ADPLL to lose its lock. To overcome the shift of delay characteristics of the TDC due to PVT, a second feedback path (Figure 7.11), extending from the PHE φE [k] to a calibration unit controlling the SC regulator, is implemented. After the ADPLL is locked, the calibration loop is enabled, and it keeps on working in the background to ensure a fixed tres in face of temperature and voltage changes. As shown in Figure 7.13(b), the PVT compensation is done by means of regulating the TDC supply voltage as an alternative to the tuning of inverter loading capacitors. The calibration mechanism strives to keep a fixed resolution independent of PVT. Each TDC tuning word (TTW) corresponds to a specific TDC resolution value and it is designed in such a way that TTWmax /2 is a function of the desired TDC resolution. The TDC resolution tres can be interpreted as the propagation delay tpd of TDC inverters, which is the average delay time of a loaded inverter with
Δt ~ From ADPLL
L CKV’
IN ; where CLOAD = CINV + Ccalib CLOAD Ccalib
Ccalib
Ccalib
Ccalib
Q’(1)
L Doubler with logic control
CKV’
FREF
(a)
From ADPLL Vin (~0.5V)
VDD_DIG All (~1V) digital blocks
FREF
Q’(2)
Q’(3)
Q’(L)
Q’(1)
L
Q’(2)
Q’(3)
Q’(L)
L
(b)
Figure 7.13 Concept of TDC delay stabilization (a) through regulating capacitive load, CLOAD [11] and (b) proposed regulating doubler output, VDD− DIG, via clock skipping, affecting charging current, IN
Clock generation
271
high-to-low and low-to-high propagation delay (tpHL and tpLH , respectively) and can be stated as [23]: tpHL + tpLH CL · VDD 1 CL · VDD tpd = (7.10) + = 2 2 Kn · (VDD − VthN )α Kp · (VDD − VthP )α where CL is the inverter’s load capacitance, VDD is its supply voltage, VthN and VthP are the threshold voltages of NMOS and PMOS, respectively, Kn and Kp are the NMOS and PMOS trans-conductance, respectively, and α < 2 is suitable for short channel devices [24]. Figure 7.14 shows the SPICE simulated tres across supply voltage. The crossed line indicates the tres predicted by (7.10). Note, as VDD goes very low, the devices spend less time in saturation and (7.10) is no longer an accurate estimation for the inverter delay. Hence, some deviation between the simulations (dots) and equation (crossed line) at 0.7 V is observed. At a specific temperature and process (fixed KN /P , VthN /P and CL ), tres is a function of an instant supply voltage. Any tuning word greater/less than the targeted TTWmax /2 corresponds to a lower/higher tres and a higher/lower supply voltage. Therefore, in order to keep the tres fixed, TTWmax /2 can work as a threshold for skipping the doubler’s switching clock cycles as introduced below. Figure 7.15 explains the principle of the doubler’s clock skipping. In this work, a 1-stage Dickson converter [25,26] is realized to achieve a 1:2 step-up conversion. During φ1, the flying capacitor is charged by the input voltage (VC = Vin − 0 = Vin ). At the beginning of φ2, the bottom-plate is connected to the input and, due to the fact that charge cannot be transferred instantaneously, the top-plate, which is now 26 Simulation Fitted curve Equation
24 20 mV
–95.64 –96.4 –97.23
20 18
–98.14
1.59 ps
16
–99.17
14
–100.33
12
–101.66
10
–103.25
20 mV
–105.19
8 0.43 ps
6 4 0.7
In-band phase noise (dB c)
TDC resolution (ps)
22
0.75
0.8
0.85
0.9
0.95
1
1.05
1.1
–107.69 1.15
–111.2 1.2
TDC supply voltage (V)
Figure 7.14 SPICE simulation results of TDC resolution versus supply voltage, superimposed on (7.10) model. Right y-axis: expected in-band PN calculated by LIB = ((2π)2 /12)(tres /TV )2 (1/fR ) [2], where TV = 417 ps and fR = 40 MHz
272
Digitally enhanced mixed-signal systems CLKswitching
V1 Vin
Voltage(V)
2×Vin
Vin
Vout
Cfly
Φ1 Vout V1 V1
Vin
Cfly
Vout Φ2
Time (s) • Skipping CLK Temperature ResolutionTDC (ps) Decrease Vout • CLK generation Temperature ResolutionTDC (ps) Increase Vout
Switching CLK
Figure 7.15 Principle of regulating doubler’s output via clock skipping
connected to the output, should become ×2Vin so that the voltage across the capacitor remains constant (VC = 2 × Vin − Vin = Vin ). As the decision is made in the calibration unit that the output voltage needs to be decreased, the doubler’s clock generator is momentarily turned off and no more switching takes place in the doubler module and no charge is transferred to the output decoupling capacitor Cdc . The output voltage is then naturally let, to slowly decrease until the tres reaches its target value. Figure 7.16(a) presents a block diagram of the implemented calibration loop. The TDC output is normalized by the targeted resolution. If tres deviates from the target, a periodical PHE perturbation is induced [27]. The PHE is then fed to the calibration unit which works based on an LMS algorithm trying to force φE [k] to zero by adjusting a multi-bit digital TTW: TTW [k + 1] = TTW [k] − TTW [k]
(7.11)
where TTW [k] is an adjustment code determined by φE [k]. Based on TTW, the switcher increases or decreases VDD_DIG by skipping less or skipping more of the switcher’s clock pulses. For example, if temperature decreases/increases, tres increases/decreases and the calibration unit will generate a TTW less/greater than TTWmax /2 to tell the switcher to skip less/more cycles to increase/decrease VDD_DIG in order to maintain tres across PVT (1 → 2 → 3 and 1 → 2 → 3 in Figure 7.16). To gain insight into the calibration loop operation, Figure 7.17 plots the simulated results for two cases of temperature at 120◦ C and −40◦ C. An LMS algorithm which
Clock generation
273
Calibration Enabled
CLK
TTW
–40°C
TDC Norm.
φE Target ∆tres
From ADPLL Loop
0
TTW
VDD_DIG
Calibration unit (From ADPLL)
120°C
40°C
VDD-DIG ∆tres
6ps 8ps 10ps TDC resolution
Switchedcapacitor modules
|φE|
Control logic unit
TTW (N bit)
Temp
Doubler
8ps 1V 0.9V 0.8V Time
(a)
(b) 1
TDC resolution (∆tres)
1
T
2 2'
V
3
V
3'
2
10ps 8ps
T
1
3'
3 –40°C 25°C
6ps 2'
120°C
0.8V
(c)
0.9V 1V VDD_DIG TDC resolution is maintained under different PVT variation
Figure 7.16 Conceptual diagram of the implemented calibration for temperature variations to maintain TDC resolution: (a) block diagram, (b) waveforms, (c) change in TDC resolution across voltage and temperature keeps regulating the TDC supply voltage VDD until the PHE φE perturbation is zero, is implemented as VDD [k + 1] = VDD [k] − μ · φE [k] ·
tR [k] − tV [k] tTARGET
(7.12)
where VDD is the TDC supply voltage, tTARGET is the target tres , μ is a constant convergence step and tR [k] and tV [k] are the reference and CKV timestamps, respectively. In each iteration, the error to the targeted resolution is observed and the supply voltage is regulated. The model uses (7.10) to generate tres in each iteration. An increase/decrease in temperature leads to a decrease/increase in the MOS threshold voltage, which will result in the departure of tres from its targeted value and the induction of PHE perturbations. In each case, the calibration is enabled after 500 FREF clock iterations to force the PHE perturbations to vanish. To gain further insight, the simulations are then carried out with the ADPLL engaged in Figure 7.18. Due to the nature of (7.10) and the fact that for voltages higher
274
Digitally enhanced mixed-signal systems 120° Calibration enabled
40° VDD = 1V
–40° Calibration enabled
Vth(V)
0.6 0.55 0.5 0.45 0.4
40° VDD = 1V
1,000
2,000
3,000
4,000 FREF clocks
0
1,000
2,000
3,000
5,000
6,000
7,000
8,000
VDD_TDC (V)
4,000 FREF clocks
1.1 1.05 1 0.95 0.9
0
1,000
2,000
3,000
4,000 FREF clocks
5,000
6,000
7,000
8,000
0.4 0.2 0 –0.2 –0.4 0
1,000
2,000
3,000
4,000 FREF clocks
5,000
6,000
7,000
8,000
Resolution (ps)
0
Phase error
fV = 2.401GHz fR = 40MHz μ = 0.0001 5,000
6,000
7,000
8,000
10 8 6 4
TDC resolution (ps)
Figure 7.17 Simulated results of the implemented calibration for exaggerated temperature variation and LMS bandwidth
30 25 20 15 10 5 0
VDD_TDC (V)
(a)
Phase error
fV = 2.401GHz fR = 40MHz μ = 0.0001
0
500
1,000
1,500
2,000
V[0] = 0.7V V[0] = 0.9V V[0] = 1.1V V[0] = 1.3V V[0] = 1.5V
2,500
3,000
3,500
4,000
2,000 2,500 FREF clocks
3,000
3,500
4,000
FREF clocks
1.6 1.4 1.2 1 0.8 0.6
(b)
(c)
Calibration enabled
3 2 1 0 –1 –2
0
500
1,000
1,500
V[0] = 0.7V
PHE = 0
0
500
1,000
1,500
V[0] = 1.5V
2,000 2,500 FREF clocks
3,000
3,500
PHE=0 4,000
Figure 7.18 Simulated results of the LMS algorithm with different starting supply voltages: (a) TDC resolution, (b) TDC supply voltage, (c) phase error
Clock generation
275
than 1.1 V the change in resolution is close to zero (see Figure 7.14), it would be harder for the algorithm to converge to the target resolution. Figure 7.18(c) shows the simulated result of φE against of FREF clock iteration. In the case where VDD [0] = 0.7 V, the φE converges to zero after 1,500 clock cycles, while in the case where VDD [0] = 1.5 V it takes more than 3,500 cycles for φE to reach zero. Unlike the sudden temperature jumps in Figure 7.17 simulations, the real temperature change rate in the considered applications is very low ( 0} converging to optimal filter h0 . By definition, this optimal filter is in the intersection of all hyperplans {(n), n > 0}. Thus, LMS algorithm projects vector h[n − 1] on (n) colinearly to x[n] to get h[n] satisfying ||v[n + 1]|| < ||v[n]||. Converging time is related to the angle between (n − 1) and (n) hyperplans corresponding to correlation between x[n] and x[n − 1]. To solve this problem, the APAs have been introduced [16,17]. In these algorithms, the projection is on the intersection of K hyperplans K−1 (n − i). The norm of v[n] is then reduced, and the convergence i=0 time is improved. The convergence can be determined analytically. The desired signal can be written as d[n] = x[n]t h0 + u[n]
(9.18)
where u[n] is a noise vector. Vector v[n] update is equal to v[n + 1] = G[n]v[n] + Q[n]u[n] (9.19) T −1 T T −1 with G[n] = IL − μX[n] X [n]X[n] X [n] and Q[n] = μX[n] X [n]X[n] If we take the mean of this expression, and under independence assumptions, it leads to E[v[n + 1]] = E[G[n]]E[v[n]] + E[Q[n]]E[u[n]]
(9.20)
Noise vector u[n] is supposed to be zero-mean which leads to E[v[n + 1]] = E[G[n]]E[v[n]]
(9.21)
In [17], a model has been developed to compute the convergence of E[G[n]]. It is based on energy conservation and shows that convergence is ensured for 0