English Pages 419 [411] Year 2020
Slawomir Koziel Anna Pietrenko-Dabrowska
Performance-Driven Surrogate Modeling of High-Frequency Structures
Slawomir Koziel School of Science and Engineering Reykjavik University Reykjavik, Iceland
Anna Pietrenko-Dabrowska Faculty of Electronics, Telecommunications and Informatics Gdansk University of Technology Gdansk, Poland
ISBN 978-3-030-38925-3
ISBN 978-3-030-38926-0 (eBook)
https://doi.org/10.1007/978-3-030-38926-0
© Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To our families: Dorota, Anna Halina, Janusz, Kinga, Wladek, Bronek, and Stas
Preface
The development of modern high-frequency structures, including microwave and antenna components, relies heavily on full-wave electromagnetic (EM) simulation models. EM-driven design, however, entails considerable computational expense. This is especially troublesome for tasks that require massive EM analyses, with parametric optimization and uncertainty quantification being representative examples. The employment of fast replacement models, also referred to as surrogates, has been fostered as a way of mitigating these issues. Unfortunately, conventional modeling methods are of limited applicability for handling the nonlinear outputs of high-frequency devices. The reasons are the curse of dimensionality and the fundamental requirement that design-ready surrogates cover wide ranges of the system parameters and operating conditions. This book offers a different methodological perspective on modeling of high-frequency structures, specifically the concept and implementation of constrained, or performance-driven, surrogates. The presented approach addresses the issues of dimensionality and parameter ranges through appropriate confinement of the model domain, which is focused on the regions that are promising from the point of view of the relevant design objectives. The performance-driven paradigm enables the construction of reliable surrogates at a fraction of the cost required by conventional methods and makes it possible to accomplish modeling tasks where other techniques routinely fail. The book provides a broad selection of specific frameworks, extensively illustrated using examples of real-world microwave and antenna structures. Applications, including parametric optimization and multiobjective design, are also discussed, along with the exposition of inverse modeling methods. Furthermore, the book contains introductory material on data-driven and physics-based surrogates. 
Practical aspects of high-frequency surrogate modeling and recommendations concerning particular techniques are discussed as well.
Slawomir Koziel, Reykjavik, Iceland
Anna Pietrenko-Dabrowska, Gdansk, Poland
December 2019
Acknowledgments
We would like to acknowledge the efforts of all those students, researchers, and colleagues who have helped us during the research work presented in this book. We would also like to thank Dassault Systèmes, France, for making CST Microwave Studio available for our research purposes, Sonnet Software Inc. for Sonnet em, and Keysight Technologies for ADS.
Contents

1 Introduction
  References

2 Basics of Data-Driven Surrogate Modeling
  2.1 Overview
  2.2 Design of Experiments
    2.2.1 Factorial Designs
    2.2.2 Space-Filling Designs
    2.2.3 Sequential Sampling
  2.3 Modeling Techniques
    2.3.1 Polynomial Regression Models
    2.3.2 Radial Basis Functions
    2.3.3 Kriging
    2.3.4 Support Vector Regression
    2.3.5 Artificial Neural Networks
    2.3.6 Fuzzy Systems
    2.3.7 Polynomial Chaos Expansion
    2.3.8 Other Methods
  2.4 Model Validation
  References

3 Physics-Based Surrogate Modeling
  3.1 Overview
  3.2 Low-Fidelity Modeling
    3.2.1 Principal Properties and Techniques
    3.2.2 Variable-Resolution and Variable-Accuracy Modeling
    3.2.3 Variable-Fidelity Physics Modeling
    3.2.4 Low-Fidelity Model Selection
  3.3 Physics-Based Surrogates: Basic Concepts
  3.4 Response Correction Models
    3.4.1 Global Modeling Using Multipoint Space Mapping
    3.4.2 Space Mapping with a Function Approximation Layer
    3.4.3 Multipoint Output Space Mapping
    3.4.4 Surrogate Modeling Using Generalized Shape-Preserving Response Prediction
  3.5 Feature-Based Modeling
    3.5.1 Feature-Based Modeling for Statistical Analysis
    3.5.2 Feature-Based Modeling of Antenna Input Characteristics
  3.6 Physics-Based Surrogates for Optimization
    3.6.1 Space Mapping
    3.6.2 Approximation Model Management Optimization
    3.6.3 Manifold Mapping
    3.6.4 Shape-Preserving Response Prediction
    3.6.5 Adaptively Adjusted Design Specifications
    3.6.6 Feature-Based Optimization
    3.6.7 Adaptive Response Scaling
  References

4 Design-Oriented Modeling of High-Frequency Structures
  4.1 Data-Driven Modeling by Constrained Sampling
    4.1.1 Uniform Versus Constrained Sampling
    4.1.2 Modeling Procedure
    4.1.3 Illustration Examples
  4.2 Design-Oriented Constrained Modeling for Operating Frequency and Substrate Parameters
    4.2.1 Modeling Procedure
    4.2.2 Case Study: Ring Slot Antenna
    4.2.3 Application Examples and Experimental Validation
  4.3 Constrained Feature-Based Modeling of Compact Microwave Structures
    4.3.1 Case Study. RRC and Response Features
    4.3.2 Modeling Methodology
    4.3.3 Numerical Verification and Application Case Studies
  References

5 Triangulation-Based Constrained Modeling
  5.1 Reference Designs
  5.2 Surrogate Model Domain Definition
  5.3 Surrogate Model Construction
  5.4 Demonstration Case Studies
    5.4.1 UWB Monopole Antenna
    5.4.2 Uniplanar Dipole Antenna
    5.4.3 Miniaturized Microstrip Coupler
  5.5 Uniform Sampling Methods for Triangulation-Based Modeling
    5.5.1 Uniform Sampling Scheme
    5.5.2 Demonstration Examples
  References

6 Nested Kriging Modeling
  6.1 Modeling Methodology
    6.1.1 Objective Space: Geometry of Optimum Design Set
    6.1.2 Reference Designs and Level I Surrogate
    6.1.3 Surrogate Model Domain
    6.1.4 Level II Surrogate
    6.1.5 Design of Experiments
  6.2 Demonstration Examples
    6.2.1 Uniplanar Dipole Antenna (Antenna I)
    6.2.2 Ring Slot Antenna (Antenna II)
    6.2.3 Miniaturized Rat-Race Coupler (RRC)
    6.2.4 Impedance Matching Transformer
  6.3 Application Case Studies: Design Optimization
    6.3.1 Optimization Methodology and Initial Design
    6.3.2 Results for Antennas I and II
    6.3.3 Design Optimization of RRC and Impedance Transformer
  6.4 Improved Design of Experiments for Nested Kriging Surrogate
    6.4.1 Modified Design of Experiments Procedure
    6.4.2 Demonstration Case Studies
  References

7 Feature-Based Constrained Modeling
  7.1 Modeling Framework: Incorporating Response Features into Nested Kriging Surrogates
    7.1.1 Response Features
    7.1.2 Level I Surrogate
    7.1.3 Surrogate Model Construction
    7.1.4 Modeling Framework
  7.2 Demonstration Case Study I: Dual-Band Dipole Antenna
    7.2.1 Test Case and Problem Statement
    7.2.2 Results
    7.2.3 Applications
  7.3 Demonstration Case Study II: Triple-Band Dipole Antenna
    7.3.1 Test Case and Problem Statement
    7.3.2 Results
    7.3.3 Applications
  7.4 Demonstration Case Study III: Compact Microwave Coupler
    7.4.1 Test Case and Problem Statement
    7.4.2 Results
    7.4.3 Applications
  References

8 Constrained Modeling Using Principal Component Analysis
  8.1 Modeling Using Domain Confinement and Principal Components
    8.1.1 Design Space and Objective Space
    8.1.2 Reference Designs: Principal Components
    8.1.3 Surrogate Model Domain
    8.1.4 Design of Experiments: Surrogate Model Construction
    8.1.5 Surrogate Model Optimization
  8.2 Demonstration Case Study I: Uniplanar Dipole Antenna
    8.2.1 Antenna Structure and Problem Statement
    8.2.2 Numerical Results and Benchmarking
    8.2.3 Application Examples
  8.3 Demonstration Case Study II: Ring Slot Antenna
    8.3.1 Antenna Structure and Problem Statement
    8.3.2 Numerical Results and Benchmarking
    8.3.3 Application Examples
  8.4 Demonstration Case Study III: Impedance Matching Transformer
    8.4.1 Transformer Structure and Problem Statement
    8.4.2 Numerical Results and Benchmarking
    8.4.3 Application Examples
  8.5 Demonstration Case Study IV: Rat-Race Coupler
    8.5.1 Coupler Structure and Problem Statement
    8.5.2 Numerical Results and Benchmarking
    8.5.3 Application Examples
  References

9 Variable-Fidelity Performance-Driven Modeling
  9.1 Variable-Fidelity Performance-Driven Modeling: Procedure Overview
  9.2 Variable-Fidelity Performance-Driven Modeling Using Co-kriging
    9.2.1 Co-kriging
    9.2.2 Illustration Case Study: Impedance Matching Transformer
  9.3 Variable-Fidelity Performance-Driven Modeling Using Two-Level Gaussian Process Regression
    9.3.1 Gaussian Process Regression (GPR) Basics
    9.3.2 Two-Stage GPR Modeling
    9.3.3 Case Studies
  9.4 Variable-Fidelity Performance-Driven Modeling Using Space Mapping
    9.4.1 Model Correction Using Space Mapping
    9.4.2 Demonstration Case Study: Dual-Band Microstrip Dipole Antenna
  9.5 Summary
  References

10 Constrained Modeling for Efficient Multi-objective Optimization
  10.1 Multi-objective Design: Problem Formulation and Solution Methods
  10.2 Multi-objective Optimization Using Surrogate Models
  10.3 Multi-objective Optimization Using Triangulation-Based Modeling
    10.3.1 Surrogate Model Domain
    10.3.2 Demonstration Example and Results
  10.4 Multi-objective Optimization Using Nested Kriging
    10.4.1 Nested Kriging Modeling: Brief Recollection
    10.4.2 Nested Kriging Modeling for Multi-Objective Design
    10.4.3 Application Case Study I: Planar Yagi Antenna
    10.4.4 Application Case Study II: Wideband Monopole Antenna
    10.4.5 Application Case Study III: Impedance Matching Transformer
  References

11 Warm-Start Design Optimization
  11.1 Accelerated Optimization Using Design Database
    11.1.1 Design Database: Initial Design by Database Geometry Exploration
    11.1.2 Optimization Procedure
    11.1.3 Case Study I: Planar Yagi Antenna
    11.1.4 Case Study II: Compact Rat-Race Coupler
  11.2 Warm-Start Optimization Using Kriging Surrogates
    11.2.1 Database Designs and Kriging Surrogates
    11.2.2 Optimization Procedure I: TR Gradient Search
    11.2.3 Optimization Procedure II: Iterative Correction Scheme
  References

12 Inverse Surrogates for Accelerated Simulation-Driven Design
  12.1 Fast Dimension Scaling Using Inverse Surrogates
    12.1.1 Problem Formulation
    12.1.2 Inverse Model Construction
    12.1.3 Scaling Procedure
    12.1.4 Case Study I: Miniaturized Rat-Race Coupler
    12.1.5 Case Study II: Bandwidth-Enhanced Patch Antenna
    12.1.6 Case Study III: Scaling of Band-Notch UWB Antenna
  12.2 Scaling for Multiple Operating Conditions
    12.2.1 Problem Formulation
    12.2.2 Inverse Model Construction
    12.2.3 Scaling Procedure
    12.2.4 Iterative Correction Scheme
    12.2.5 Case Study I: Dual-Band Miniaturized Coupler
    12.2.6 Case Study II: Dual-Band Antenna - Re-design for Substrate Parameters
    12.2.7 Case Study III: Triple-Band Dipole Antenna
    12.2.8 Case Study IV: Four-Objective Scaling of Compact RRC
  12.3 Advanced Correction Schemes
    12.3.1 Fast Re-design of Compact Couplers: Corrected Scaling for Power Split Control
    12.3.2 Optimization-Based Forward Power Split Correction
  References

13 Summary and Conclusion

Index
Chapter 1
Introduction
Computational models have become a backbone of contemporary engineering design. Their advantage over simpler (primarily analytical or semiempirical) methods of describing components and devices lies in their capability to comprehensively handle and quantify the physical phenomena that affect the system operation, as well as to provide reliable values of the performance figures pertinent to the design task at hand. The development of computer hardware and simulation techniques has been unprecedented over the last several decades. Commercial simulation packages utilized in various fields, e.g., mechanical engineering (Fusion 360; Autodesk 2019), aerospace engineering (Inventor; Autodesk 2019), high-frequency electronics (Cadence Allegro 2019; Microwave Office, National Instruments 2019), multi-physics simulation (Multiphysics Simulation; ANSYS 2019; COMSOL Multiphysics, COMSOL Inc. 2018), etc., have reached a genuinely high level of sophistication. As a result, it is possible to evaluate complex structures and large-scale systems, e.g., antenna allocation on military vehicles (Byun et al. 2013), civil aircraft (incorporating a 3D model of the wings and the fuselage) (Liersch and Hepperle 2011), a ship on a random sea surface (Hao and Sheng 2017), or turbulent airflow through dual-rotor wind turbines (Rosenberg et al. 2014), as well as to conduct multi-physics analysis (Keyes et al. 2013; Sharma and Sarris 2016; Cho et al. 2017; Fang et al. 2017; Szakmany et al. 2018; Ravelo 2018). Given reliable data on the material parameters and boundary conditions, as well as a sufficiently dense discretization of the structure at hand, simulation models provide accuracy that allows us to replace costly prototyping. In high-frequency electronics, the main application area of this book, the primary type of computational modeling is full-wave electromagnetic (EM) analysis (Davidson 2010; Sullivan 2013; Sevgi 2014; Swanson and Hoefer 2003; White 2004). 
By numerically solving Maxwell's equations over a selected computational domain, it is possible to find the distribution of the electric and magnetic fields therein and, through appropriate post-processing, acquire relevant system characteristics such as scattering parameters (Nikolova et al. 2006) or radiation patterns (Paulotto et al. 2008). A large number of general-purpose and specialized analysis
techniques have been developed that are suitable for various purposes (e.g., finite-element analysis, Jin 2002; finite-difference time-domain analysis, Webb 2004; the method of moments, Gibson 2007). EM-driven design, including design closure (in particular, adjustment of geometry and/or material parameters to fine-tune the system performance), has become an academic and industry standard. Nowadays, it is possible to conduct the entire design process within the simulation environment. Many commercial simulation software packages are available, including general-purpose EM solvers such as HFSS (HFSS 2019), Altair FEKO (FEKO 2018), CST Microwave Studio (CST 2018), XFDTD (XFDTD 2016), Momentum (Keysight 2019), or Sonnet em (Sonnet 2018), but also specific design tools, e.g., ADS (Advanced Design System, Keysight 2019) or Antenna Magus (Antenna Magus 2019) for antenna design. Initially, the main application of computational models was design verification. The rapid development of simulation techniques, software, and hardware has since made simulation-driven design possible. Undoubtedly, the most common design task is parametric optimization, where the values of selected variables are adjusted in order to improve the system performance (Bubnicki 2005; Zaslavski 2010; Pistikopoulos et al. 2007; Koziel and Bandler 2015; Koziel et al. 2016; Cao et al. 2011; Chakravorty and Mandal 2016; Sadrossadat et al. 2013). The latter is quantified by means of an appropriately defined objective function (Koziel et al. 2013; Sobester and Forrester 2015). Other tasks include statistical design (e.g., Monte Carlo analysis; Styblinski and Opalski 1986) as well as uncertainty quantification (Hosder 2012; Allaire and Willcox 2014). A wide range of specialized algorithms has been developed to perform each of these tasks (Nocedal and Wright 2000; Conn et al. 2009; Gorissen et al. 2010; Yang 2010). 
Some of the methods are generic, i.e., applicable to a number of problems in different areas (Nocedal and Wright 2000); others are problem-specific (e.g., Koziel et al. 2013). It should be emphasized that the utilization of simulation models in the design process has become a practical necessity for a growing number of components and systems. This is because traditional approaches, largely based on design-ready theoretical models, are no longer adequate. One of the reasons is the increasing level of complexity of engineering systems; another is the various system- and component-level interactions that have to be taken into account in the design process (Kozakoff 2010; You et al. 2014; Bekasiewicz and Koziel 2015; Wang et al. 2018; Mandic et al. 2019). Simulation-driven design is nonetheless a challenging problem in the majority of real-world cases. The fundamental issue is the high cost of evaluating the computational models. The simulation times depend strongly on the model complexity and may be just a few seconds per frequency for simple EM analysis of two-dimensional models (e.g., planar microwave filters; Hazdra et al. 2005), a few minutes for computational fluid dynamics (CFD) analysis of two-dimensional airfoil profiles (Siegler et al. 2016) or EM analysis of compact planar antennas (Bekasiewicz and Koziel 2015), and up to a few hours (e.g., CFD analysis of three-dimensional structures such as aircraft wings, or EM analysis of integrated photonic components; Krause and Jäger 2005; Fakhfakh et al. 2015).
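To put the simulation times quoted above in perspective, the following minimal sketch multiplies a per-simulation time by a number of sequential model evaluations. The per-simulation times and the evaluation budgets are illustrative assumptions chosen within the quoted ranges, not figures reported in the cited works, and `total_cost` is a hypothetical helper introduced only for this example.

```python
from datetime import timedelta

# Assumed per-simulation times, loosely following the ranges quoted in the text.
SIMULATION_TIME = {
    "simple 2D EM model": timedelta(seconds=30),
    "compact planar antenna": timedelta(minutes=3),
    "3D structure": timedelta(hours=2),
}


def total_cost(per_simulation: timedelta, n_evaluations: int) -> timedelta:
    """Return the wall-clock time of n_evaluations sequential simulations."""
    return per_simulation * n_evaluations


# Hypothetical evaluation budgets: a modest local search and a long global run.
for name, per_simulation in SIMULATION_TIME.items():
    for n_evaluations in (100, 5000):
        hours = total_cost(per_simulation, n_evaluations) / timedelta(hours=1)
        print(f"{name:>24s}: {n_evaluations:5d} runs -> {hours:9.1f} h")
```

Under these assumptions, even a model that takes a few minutes per run becomes costly once the number of evaluations reaches the thousands, which is the bottleneck the remainder of the chapter addresses.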
Particularly involved structures (a full aircraft, a ship, climate models) may require many hours or even days of analysis time (Wehner et al. 2010; Dennis et al. 2012; Yondo et al. 2018). Interestingly, as engineers consider more and more complex systems, the high-cost bottleneck remains despite all the advancements in hardware and simulation software. Long analysis times can make simulation-driven optimization prohibitive when conventional algorithms are utilized, as the latter typically require a large number of objective function evaluations. The problem is more pronounced for high-dimensional parameter spaces and when global or multi-objective optimization is necessary. The most popular global search procedures involve population-based metaheuristics, which are extremely inefficient in computational terms (the typical number of objective function evaluations ranges from a few thousand to many thousands per algorithm run). Another issue pertinent to computational models is numerical noise, which may result from terminating the simulation process before full convergence or may be related to the adaptive meshing employed by certain solvers. The latter manifests itself through noticeable changes of the simulated system responses caused by even very small changes of the structure geometry, which may lead to considerable changes of the mesh topology. The noise may affect the operation of gradient-based optimization routines, which normally require the objective function to be smooth. The issues related to the high cost of computational models can be mitigated to a certain extent by using adjoint sensitivities (Director and Rohrer 1969; Pironneau 1984; Jameson 1988; El Sabbagh et al. 2006; Papadimitriou and Giannakoglou 2008; Toivanen et al. 2009) or automatic differentiation (Griewank 2000; Bischof et al. 2008). 
These methods allow for a fast evaluation of the gradients of the figures of interest at a small extra computational effort (often only one additional simulation), regardless of the number of designable parameters. Consequently, the benefits of adjoints are particularly evident for higher-dimensional problems. Adjoint sensitivities are currently available in some commercial simulation packages in various areas (e.g., computational fluid dynamics solvers: ANSYS Fluent 2015; Star-CCM+ 2015), as well as in some noncommercial codes (e.g., Stanford University Unstructured; Palacios et al. 2013). In the context of high-frequency modeling, currently only CST Microwave Studio (CST 2018) and ANSYS HFSS (HFSS 2019) support this technology. It should be mentioned that the challenges discussed in the previous paragraph led to the development of various interactive forms of utilizing simulation models in the design work, which are still commonly used in many areas. The key factor is engineering insight, which allows for making reasonable predictions about promising changes of the system parameters. The practical workflows typically involve parameter sweeping (usually, one parameter at a time). Experienced designers are often capable of identifying satisfactory parameter setups using an acceptable number of simulations, even though the interactive procedures are rather laborious and do not guarantee optimum results. As a matter of fact, with the increasing complexity of engineering systems, the efficiency of parameter sweeping has been declining, also because it is not able to handle multiple objectives and constraints. Perhaps the most promising approach to handling expensive computer simulations is the use of fast replacement models, referred to as surrogates (Simpson et al.
2001; Queipo et al. 2005; Bandler et al. 2008; Forrester and Keane 2009; Koziel and Bekasiewicz 2016; Yondo et al. 2019). If a given system or device needs to be repeatedly evaluated, whether the purpose is parametric optimization, sensitivity analysis, yield estimation, or robust design, computationally cheap surrogate models become indispensable. Obviously, the computational burden can be relieved to a certain extent by using more efficient algorithms, e.g., those that exhibit faster convergence rates (in the context of numerical optimization). However, this is insufficient when thousands or tens of thousands of evaluations are required. For the surrogate to be effectively used instead of the original simulation model, it needs to be fast, sufficiently accurate, and, preferably, analytically tractable (e.g., smooth) (Alexandrov and Lewis 2001; Bandler et al. 2008; Jin 2011; Koziel and Leifsson 2013a). Depending on their application areas, the surrogates can be constructed locally (e.g., in the vicinity of the optimization path for the purpose of local optimization, or around the nominal design for the purpose of statistical analysis) or over the entire parameter space pertinent to a given problem (e.g., for the purpose of global optimization).

Although surrogate models have been around for many decades, their importance has been steadily growing over the last two decades or so because of the increasing role of computational models themselves. The literature on surrogate models (also referred to as replacement models or metamodels; Simpson et al. 2001; Muller and Shoemaker 2014) is extensive (Bilicz 2016; Declercq et al. 2013; Leary et al. 2003; Mendes et al. 2013; Du and Roblin 2018; Wang et al. 2018; Yang et al. 2019). This book is not an exposition of surrogate modeling in general but focuses on particular aspects of the construction and application of replacement models in high-frequency electronics. 
Nevertheless, the necessary background material is included for the convenience of the reader. In particular, we provide a general classification of the surrogate models, highlight the stages of model construction and validation, and discuss a number of specific modeling methods. We also discuss various challenges and the ways of addressing them. A more detailed outline of the book content is provided in the last paragraph of this chapter.

As explained before, surrogate modeling emerged from practical necessity: massive evaluations of simulation models of certain kinds, especially those that require the numerical solution of systems of partial differential equations over large computational domains, were (and still are) unmanageable (Hazdra et al. 2005; Siegler et al. 2016; Bekasiewicz and Koziel 2015; Krause and Jäger 2005; Fakhfakh et al. 2015; Yondo et al. 2018). It should be made clear at this point that construction of the replacement models also requires a considerable number of evaluations of the original model; however, the surrogate should ultimately ensure a reduction of the total computational overhead. This applies to both the library type of models for multiple use (e.g., models of individual components being building blocks of larger systems, Queipo et al. 2005) and the models constructed for a one-time use such as parametric optimization (De Tommasi et al. 2010). From that perspective, the appropriate selection of the modeling approach is an important yet nontrivial task.

The surrogate modeling process consists of several steps. The first one is design of experiments, i.e., allocation of the training data set (Kleijnen 2018). Various sampling schemes have been developed, but, nowadays, the preferred approaches
are space-filling designs where the samples are allocated as uniformly as possible (Santner et al. 2018). The training data is then acquired at the selected points, and the surrogate model is identified according to the preferred modeling technique. In some cases, the model parameters may be found analytically; in others, a dedicated optimization process, often referred to as model training, has to be executed (Queipo et al. 2005; Bandler et al. 2008; Forrester and Keane 2009; Brigham and Aquino 2007). The final stage is model validation, where the quality of approximating the training data and/or the generalization error (i.e., the ability of making predictions at locations outside the training set) is estimated (Mack et al. 2007). The entire modeling cycle may be iterated by allocating additional training points according to specified rules (Kleijnen 2018) and re-identifying the model.

Despite a large variety of modeling techniques that can be found in the literature, two major types of surrogates can be distinguished. The first type is the so-called data-driven or approximation models, constructed from sampled high-fidelity simulation data (Simpson et al. 2001). The most popular techniques include polynomial regression (Jin et al. 2001), artificial neural networks (Haykin 1998), radial basis functions (RBF) (Wild et al. 2008), kriging (Jones 2001; Forrester and Keane 2009; Kleijnen 2009), support vector regression (Smola and Schölkopf 2004; Chávez-Hurtado and Rayas-Sánchez 2016), Gaussian process regression (Angiulli et al. 2007; Jacobs 2012), and multidimensional rational approximation (Shaker et al. 2009). The most important advantage of approximation models is their versatility. Because the surrogate is constructed merely using the data acquired from the system of interest, no physical insight is involved, and the mentioned techniques can be potentially applied to any type of problem. 
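The three stages described above (design of experiments, model identification, and validation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from this book: the function `expensive_model` is a made-up stand-in for a high-fidelity simulation, the helper names `latin_hypercube` and `fit_gaussian_rbf` are hypothetical, and the kernel width and the small ridge term are arbitrary choices made for numerical stability.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Space-filling design on [0, 1]^n_dims: one sample per stratum in every dimension."""
    # Row i starts in stratum i of each dimension, jittered inside the stratum.
    points = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle each column independently to decouple the dimensions.
    for j in range(n_dims):
        points[:, j] = rng.permutation(points[:, j])
    return points

def fit_gaussian_rbf(x_train, y_train, width, ridge=1e-8):
    """Identify a Gaussian RBF surrogate: a linear combination of basis functions."""
    def kernel(a, b):
        sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dist / (2.0 * width ** 2))
    # The weights follow from a linear system (found analytically, no iterative training).
    gram = kernel(x_train, x_train) + ridge * np.eye(len(x_train))
    weights = np.linalg.solve(gram, y_train)
    return lambda x_query: kernel(x_query, x_train) @ weights

def expensive_model(x):
    """Hypothetical stand-in for a costly simulation (assumption, for illustration only)."""
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2

rng = np.random.default_rng(0)

# 1. Design of experiments and training data acquisition.
x_train = latin_hypercube(60, 2, rng)
y_train = expensive_model(x_train)

# 2. Model identification.
surrogate = fit_gaussian_rbf(x_train, y_train, width=0.3)

# 3. Validation on points that were not used for training (generalization error).
x_test = rng.random((200, 2))
y_true = expensive_model(x_test)
relative_error = np.linalg.norm(surrogate(x_test) - y_true) / np.linalg.norm(y_true)
print(f"relative error on held-out points: {relative_error:.3e}")
```

If the validation error is unsatisfactory, the cycle would be repeated with additional training points, as noted above.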
At the same time, data-driven surrogates are cheap to evaluate because they are essentially analytical models (e.g., linear combinations of appropriate basis functions in case of RBF; Rozhenko 2018). However, versatility comes at a price: in order to ensure acceptable predictive power of the surrogate, the design space needs to be sampled with adequate density. As the modeling error mainly depends on the average distance between the training points and the nonlinearity of the system responses, one needs to ensure that the said distance is sufficiently small to capture the response changes across the model domain. This becomes the major bottleneck for approximation-based modeling because the average point-to-point distance scales poorly in high-dimensional spaces and large training data sets are necessary to construct usable surrogates (an effect also referred to as the curse of dimensionality; Wu et al. 2019). Depending on the functional landscape to be modeled, a typical number of training data samples ranges from a few hundred to many thousands. Construction of the surrogates within highly dimensional spaces (say, 20 or more dimensions) is only possible if the system responses are weakly nonlinear. In case of high-frequency electronics, practical data-driven modeling of components such as multiband antennas (characterized by sharp, resonant-like responses), filters (featuring multiple poles and transmission zeros), or compact microwave components (e.g., couplers) is limited to a few parameters. In the case of the mentioned structures, modeling within wide ranges of parameters is even more important for the surrogates to be of any utility for design purposes (Feng et al. 2019; Yelten et al. 2012; Koziel et al. 2013; Koziel and Bekasiewicz 2015;
Koziel et al. 2016). This poses even more challenges than the dimensionality issue because characteristic features of the responses (e.g., the resonances) change considerably along the frequency spectrum (Koziel and Bekasiewicz 2017a; Koziel and Bekasiewicz 2017b; Koziel and Bekasiewicz 2018a; Koziel and Bekasiewicz 2018b; Ullah and Koziel 2019; Rossi et al. 2014). Consequently, in practice, globally accurate approximation modeling is usually justified in case of multiple-use library models of components described by a limited number of parameters. Overcoming these issues is one of the main topics of this book.

It should be mentioned that global data-driven modeling, mostly for the purpose of design optimization, is nowadays dominated by surrogates iteratively improved through sequential sampling (Couckuyt et al. 2012). Various ways of incorporating new training points into the model (so-called infill criteria) have been developed, including exploitative models (i.e., models oriented toward improving the design in the vicinity of the current one), explorative models (i.e., models aiming at improving global accuracy), as well as models with balanced exploration and exploitation (Forrester and Keane 2009; Couckuyt et al. 2010). Generally, these techniques are often referred to as efficient global optimization (EGO) methods (Jones et al. 1998) or surrogate-assisted evolutionary algorithms (SAEAs) (Gorissen et al. 2009; Lim et al. 2010; Jin 2011; Yang et al. 2019).

The second major class of surrogates is physics-based models. The term “physics-based” relates to the fundamental structure of these surrogates, which exploit, to a certain extent, the system-specific knowledge (Cervantes-González et al. 2016; Koziel and Leifsson 2016). This, in turn, is most often some sort of simplified physical description of the system in the form of an underlying low-fidelity model (Robinson et al. 2008; Koziel and Leifsson 2013b; Sarkar et al. 2019). 
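As an aside on the infill criteria with balanced exploration and exploitation mentioned above, the sketch below evaluates the expected improvement criterion associated with EGO-type methods (Jones et al. 1998) for a minimization problem. It is a minimal illustration, assuming that the surrogate (e.g., kriging) supplies a predicted mean and a predicted standard deviation at a candidate point; the function name and its arguments are hypothetical.

```python
import math

def expected_improvement(pred_mean, pred_std, best_value):
    """Expected improvement over the best observed value (minimization).

    pred_mean, pred_std: surrogate prediction and its estimated standard
    deviation at a candidate point; best_value: lowest objective found so far.
    """
    if pred_std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(best_value - pred_mean, 0.0)
    z = (best_value - pred_mean) / pred_std
    normal_cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    normal_pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # First term rewards a low predicted value (exploitation),
    # second term rewards high predictive uncertainty (exploration).
    return (best_value - pred_mean) * normal_cdf + pred_std * normal_pdf

# A candidate predicted to be slightly worse than the best design can still
# be attractive if the surrogate is uncertain there.
print(expected_improvement(pred_mean=1.1, pred_std=0.5, best_value=1.0))
print(expected_improvement(pred_mean=1.1, pred_std=0.01, best_value=1.0))
```

The next infill point would be the candidate that maximizes this quantity over the parameter space.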
The low-fidelity model is subsequently corrected using a limited amount of high-fidelity simulation data, typically through linear or nonlinear regression (Bandler et al. 2004). A representative example of a low-fidelity model in the area of microwave engineering is an equivalent network of the structure (e.g., a filter), with the high-fidelity model being evaluated through a full-wave electromagnetic analysis (Zhu 2002). Clearly, the equivalent network based on the circuit theory rules does not ensure accuracy comparable to the full-wave simulation involving numerical solutions to Maxwell’s equations (e.g., it does not account for the cross-coupling effects within the structure), but it is definitely faster. The fundamental benefit of this type of arrangement is that, due to the same physics shared by the low- and high-fidelity models, the surrogate is likely to exhibit a better generalization capability, to be valid over wider ranges of parameters, and to be less prone to the curse of dimensionality. At the same time, the number of high-fidelity training samples is substantially smaller than that required by the approximation models. In other words, physics-based modeling solves some of the issues of data-driven surrogates, which has been the primary reason for its growing popularity (Pantoja et al. 2007; Salleh et al. 2008; Crevecoeur et al. 2010; Koziel and Leifsson 2013a; Cervantes-González et al. 2016; Koziel et al. 2016; Baratta et al. 2018; Zhang et al. 2018).

Unfortunately, the same factors that contribute to the attractiveness of the physics-based surrogates also imply their limitations. The first issue is the low-fidelity model itself. It is
normally problem-specific (Bandler et al. 2004; Koziel and Leifsson 2013a); therefore, physics-based surrogates lack the versatility of the data-driven models. Low-fidelity models can be obtained in various ways: (i) as analytical models (in practice, a set of design-ready equations offering a considerably simplified description of the system; Koziel et al. 2014), (ii) by simulating the system at a different level (e.g., in microwave engineering: equivalent circuit representation evaluated using circuit theory rules versus full-wave electromagnetic simulation; Bandler et al. 2004), and (iii) by lower-fidelity or lower-resolution simulation (e.g., simulation with coarser discretization of the structure and/or relaxed convergence criteria; Koziel and Ogurtsov 2014). Because the low-fidelity models typically involve computer simulation, their evaluation time cannot, in many cases, be neglected, and the aggregated computational cost of the low-fidelity model simulation may be significant in certain applications, e.g., parametric optimization (Zhou et al. 2007; Koziel and Leifsson 2016). Another issue is the trade-off between the low-fidelity model speed and accuracy. While this can be easily adjusted in many situations, e.g., by changing the discretization density in coarse-mesh simulation models (Koziel and Bekasiewicz 2016), selection of a particular model setup might not be a trivial task (Koziel and Ogurtsov 2012).

Surrogate models are finding applications wherever reduction of the computational overhead due to massive evaluations of the computational model is of concern. Probably the most popular application area is design optimization (Booker et al. 1999; Bandler et al. 2004; Queipo et al. 2005; Forrester and Keane 2009; Koziel et al. 2011; Sóbester et al. 2012; Tabatabaei et al. 2015). Similarly, in high-frequency electronics, surrogate models are widely used for optimization purposes (Koziel and Leifsson 2013a; Lim et al. 2015; Lourenço and Lebensztajn 2015; Koziel et al. 
2016; Koziel and Bekasiewicz 2016; Rangel-Patiño et al. 2017; Bramerdorfer and Zăvoianu 2017; Feng et al. 2019). Surrogate-based optimization (SBO) replaces direct optimization of the expensive high-fidelity model by an iterative prediction-correction scheme, in which the surrogate guides the optimization process toward a better design and is subsequently refined using the high-fidelity data acquired along the way (Forrester and Keane 2009; Koziel and Leifsson 2013a). Because most of the operations are executed on the surrogate, the overall cost of the optimization process can be greatly reduced as compared to direct handling of the high-fidelity model (Koziel and Leifsson 2013a; Koziel and Ogurtsov 2014; Koziel and Bekasiewicz 2016). The SBO algorithms may be local (Zhou et al. 2007), with the surrogate constructed along the optimization path, or global (Iuliano and Andrés Pérez 2016), where the model is constructed within a larger portion of, or the entire, parameter space pertinent to the problem at hand. A simple example of the former is sequential approximate optimization (SAO), usually employing simple polynomial surrogates established within a domain that is relocated upon finding new and better designs (Kitayama et al. 2011). Global methods often follow the concept of the efficient global optimization (EGO) mentioned earlier in this chapter (Jones et al. 1998). Physics-based surrogates are typically used for local optimization. One of the most popular physics-based SBO methods in high-frequency electronics is space
mapping (SM) (Bandler et al. 2004; Koziel et al. 2008), which comes in many variations (aggressive SM, Bandler et al. 1995; implicit SM, Koziel et al. 2011; manifold mapping, Echeverria et al. 2006; neural SM, Gutiérrez-Ayala and Rayas-Sánchez 2010; output SM, Ayed et al. 2012; input SM, Khalatpour et al. 2011). For an overview of the methods that emerged from space mapping, the reader is directed, e.g., to Rayas-Sánchez (2016). Because the most straightforward way of constructing the surrogate model from the underlying low-fidelity model is correcting its response, many physics-based SBO algorithms employ such mechanisms. Some of the techniques include approximation model management optimization (AMMO; Alexandrov and Lewis 2001), multipoint correction (Toropov 1989), manifold mapping (Echeverria and Hemker 2005), adaptive response correction (Koziel et al. 2009), shape-preserving response prediction (Koziel 2010), and adaptive response scaling (Koziel and Unnsteinsson 2018).

Apart from design optimization, surrogate models are used for many other purposes as well. An important area is uncertainty quantification (e.g., yield estimation; Bandler et al. 2002; Biernacki et al. 2012) and tolerance-aware (or robust) design (Ko et al. 2011; Koziel and Bandler 2015; Kouassi et al. 2016; Aubry et al. 2016). Statistical analysis is typically performed in order to find the effects of manufacturing tolerances or uncertainties concerning operating conditions on the system performance. Traditional methods such as Monte Carlo (MC) simulation (Hu et al. 2016; Liu 2017) are computationally heavy, and fast surrogates seem to be an ideal way of accelerating the process. Certain types of models, such as polynomial chaos expansion (PCE), are particularly suitable in this context due to their capability of directly assessing the statistical moments of the output probability distributions without the necessity of running MC (Sudret 2008; Xiu 2009; Du and Roblin 2017; Manfredi et al. 2017). 
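The prediction-correction scheme of physics-based SBO discussed above can be illustrated with a toy example in the spirit of input space mapping. Everything in this sketch is an assumption made for illustration: the resonance-shaped `coarse` model, the `fine` model (a shifted and scaled version of it standing in for an expensive simulation), the target frequency, and the brute-force grid searches used both for parameter extraction and for surrogate optimization. It is not the algorithm of any of the cited works.

```python
import numpy as np

FREQ = np.linspace(1.0, 3.0, 201)   # frequency grid (arbitrary units)
TARGET = 2.0                        # required resonant frequency

def coarse(x):
    """Cheap low-fidelity response: a resonance centred at x."""
    return 1.0 / (1.0 + ((FREQ - x) / 0.1) ** 2)

def fine(x):
    """Hypothetical expensive model: same physics, misaligned resonance."""
    return coarse(1.05 * x + 0.1)

def objective(response):
    """Negative response level at the target frequency (to be minimized)."""
    return -response[np.argmin(np.abs(FREQ - TARGET))]

def grid_argmin(fun, low, high, n=2001):
    """Brute-force 1-D minimizer; affordable only because `fun` is cheap."""
    grid = np.linspace(low, high, n)
    return grid[int(np.argmin([fun(g) for g in grid]))]

# Initial design: optimum of the uncorrected low-fidelity model.
x = grid_argmin(lambda v: objective(coarse(v)), 1.5, 2.5)
n_fine = 0
for _ in range(10):
    r_fine = fine(x)                # one expensive evaluation per iteration
    n_fine += 1
    # Correction: extract the input shift q aligning coarse(x + q) with fine(x).
    q = grid_argmin(lambda s: np.sum((coarse(x + s) - r_fine) ** 2), -0.5, 0.5)
    # Prediction: optimize the corrected surrogate coarse(v + q).
    x_new = grid_argmin(lambda v: objective(coarse(v + q)), 1.5, 2.5)
    if abs(x_new - x) < 1e-6:
        break
    x = x_new

print(f"design found: x = {x:.4f} using {n_fine} high-fidelity evaluations")
```

In this contrived setting the high-fidelity model is evaluated only once per iteration, whereas all searches run on the cheap corrected model; applying the same grid search directly to `fine` would require one expensive evaluation per grid point.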
Another important application of surrogates is inverse modeling, which allows us to directly yield the optimum values of design variables corresponding to the required values of performance figures (Akkaram et al. 2007; Kabir et al. 2008; Koziel et al. 2016; Liu et al. 2016; Zhang et al. 2018).

The unquestionable benefits of using surrogates as a way to alleviate the difficulties related to massive evaluation of expensive simulation models should not overshadow the practical problems related to surrogate model construction. As already mentioned, the fundamental issue is the curse of dimensionality, or, more generally, the extremely disadvantageous relations between the predictive power of the surrogate, the size of the parameter space (which depends on both its dimensionality and the parameter ranges), and the number of training samples. Nonlinearity of the system responses to be modeled (as functions of designable parameters) only makes the situation more complex. There have been many attempts and techniques developed to mitigate these problems. One of them is sequential sampling (Xiong et al. 2009; Mukhopadhyay 2011; Wei et al. 2012; Mackman et al. 2013; Xu et al. 2014), nowadays commonly used as a design of experiments approach for data-driven surrogates. The idea is to replace a one-shot (typically uniform) sample allocation by an iterative process in which additional (or infill) samples are distributed based on the feedback obtained from the current sample distribution and/or the current surrogate. In the case of explorative design of experiments schemes (Mandal
et al. 2012), the infill samples are added without referring to the system output but merely based on the data points allocated so far, with the purpose of improving uniformity of the set. In general, especially for the exploitative sampling schemes (Koziel and Ogurtsov 2019), a suitable infill criterion may be maximization of the mean square error, i.e., identifying locations where the error (as predicted by the surrogate, typically, kriging) is the highest (Jiang et al. 2018). In these cases, finding infill points requires global optimization, typically realized using population-based metaheuristics (Beheshti and Shamsuddin 2013; Mehmani et al. 2015; Park et al. 2018; Liu et al. 2018), which makes the procedure slow. The advantage is that more samples may be allocated in the regions corresponding to higher nonlinearity of the system responses, which improves the model accuracy. Unfortunately, this sort of approach rarely pays off for many high-frequency structures where “nonlinear” regions are almost everywhere within the model domain (San et al. 2004; Hajjaj et al. 2017; Shitvov et al. 2014).

Co-kriging (Forrester et al. 2007) and gradient kriging (Han et al. 2013) are other ways of reducing the computational cost of the training data acquisition. Co-kriging uses densely sampled low-fidelity model data blended together with a limited number of high-fidelity points, which allows for constructing a surrogate of accuracy similar to that obtained solely from (densely sampled) high-fidelity simulations. This is of course under the assumption that the low- and high-fidelity models are sufficiently well correlated. Gradient kriging, on the other hand, incorporates sensitivity data into the surrogate, which also potentially reduces the number of necessary training points; however, this approach is only practical if the gradient information can be obtained in a computationally efficient manner (e.g., through adjoints; Giles and Pierce 2000; Allaire 2015). 
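To make the explorative sequential sampling described above more concrete, the sketch below adds infill points using only the locations of the samples allocated so far: among a set of random candidates, it picks the one whose nearest existing sample is farthest away. This maximin rule is one simple possibility assumed here for illustration; the function name, the candidate-set size, and the unit-square domain are hypothetical choices.

```python
import numpy as np

def maximin_infill(samples, candidates):
    """Return the candidate whose nearest existing sample is farthest away."""
    # Pairwise distances, shape (n_candidates, n_samples).
    dists = np.linalg.norm(candidates[:, None, :] - samples[None, :, :], axis=-1)
    return candidates[int(np.argmax(dists.min(axis=1)))]

rng = np.random.default_rng(1)
samples = rng.random((5, 2))            # initial one-shot design in [0, 1]^2

# Sequential sampling: one infill point per iteration, no system output needed.
for _ in range(10):
    candidates = rng.random((500, 2))   # cheap random candidate pool
    new_point = maximin_infill(samples, candidates)
    samples = np.vstack([samples, new_point])

print(samples.shape)
```

An exploitative or error-driven scheme would replace the distance-based score by a quantity computed from the current surrogate, such as its predicted error.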
A different approach to handle dimensionality issues is high-dimensional model representation (HDMR) (Foo and Karniadakis 2010; Ma and Zabaras 2010; Shan and Wang 2011a; Cai et al. 2017; Liu et al. 2018; Wu et al. 2019), where the system responses are represented as a linear combination of functions that account for lower-order effects, specifically cooperative effects of single variables, pairs of variables, etc. Because domain dimensionalities of the functions contributing to the aforementioned expansion are low as compared to that of the original parameter space, considerable savings can be achieved in terms of the training data acquisition assuming that the higher-order interactions are weak and the corresponding expansion terms can be neglected (Shan and Wang 2011b). Clearly, HDMR is only applicable to systems that satisfy this assumption, which is, unfortunately, not the case for many high-frequency structures for that matter. Model order reduction (MOR) is another class of methods that generally refers to various ways of reducing the complexity of large-scale dynamical systems, while preserving (as much as possible) their input-output behavior (Baur et al. 2014; Henneron and Clénet 2014). The reduced models mimic (from the point of view of the input and output) the behavior of the large-scale system so that they can be efficiently used for design automation, parametric optimization, or sensitivity analysis. Particular numerical techniques utilized by MOR include, among others, Krylov subspace methods (Lin et al. 2007), Proper Orthogonal Decomposition (POD; Wilcox and Peraire, 2002),
principal component analysis (PCA; Dray 2008), solution space projection (SSP; Lee and Jin 2007), or rational approximation (Deschrijver et al. 2007). Examples of MOR in high-frequency engineering include parameterized MOR models of complex electromagnetic systems (Burgard et al. 2013; Sato et al. 2015), reduced order macromodels of high-speed microstrip structures (Zhu and Cangellaris 2001), and accelerated frequency sweeps in FEM analysis (de la Rubia et al. 2009). Several methods have also been developed to handle situations when the regression problem (e.g., in construction of response surface approximation models; Khuri and Mukhopadhyay 2010) is heavily underdetermined. This essentially means that the system at hand is characterized by a large number of internal degrees of freedom (e.g., analog or mixed-signal circuits, analog-to-digital converters, or RF front ends), whereas the regression model has to be built using a relatively small number of samples due to the computational budget issues or simply numerical problems pertinent to handling large data sets. Orthogonal matching pursuit (Tropp 2004; Tao et al. 2016; Li 2010; Bishop 2006) is a technique that identifies a small set of basis functions that approximate the model (or function) of interest. In order to ensure fast convergence, a set of basis functions that are normalized and orthogonal is normally adopted (Needell and Tropp 2009). Another method, Bayesian model fusion (Wang et al. 2013; Wang et al. 2016), can be classified as a physics-based surrogate technique because it uses the so-called early-stage data (e.g., schematic-level simulation data) in order to fit the late-stage model (e.g., a post-layout one with extracted parasitic components) (Li et al. 2012). By blending these two-level data, realized using Bayesian inference, the overall cost of model construction can be greatly reduced because very few late-stage samples are typically used (Tao et al. 2019). 
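The greedy selection of a small set of basis functions performed by orthogonal matching pursuit can be sketched as follows. This is a generic textbook-style illustration rather than the implementation used in the cited works; the random dictionary with normalized columns, the sparsity pattern of the synthetic data, and the stopping tolerance are assumptions made for the example.

```python
import numpy as np

def orthogonal_matching_pursuit(basis, y, max_terms, tol=1e-10):
    """Greedy sparse fit of y by a few columns (basis functions) of `basis`."""
    support = []                          # indices of the selected basis functions
    coef = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(max_terms):
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the basis function most correlated with the current residual.
        j = int(np.argmax(np.abs(basis.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(basis[:, support], y, rcond=None)
        residual = y - basis[:, support] @ coef
    weights = np.zeros(basis.shape[1])
    weights[support] = coef
    return weights

# Underdetermined setting: 20 samples, 50 candidate basis functions.
rng = np.random.default_rng(0)
basis = rng.standard_normal((20, 50))
basis /= np.linalg.norm(basis, axis=0)    # normalized basis functions
true_weights = np.zeros(50)
true_weights[[3, 17, 41]] = [1.5, -2.0, 0.7]
y = basis @ true_weights

weights = orthogonal_matching_pursuit(basis, y, max_terms=5)
print("selected basis functions:", np.flatnonzero(weights))
```

The number of retained terms (`max_terms`) plays the role of the model complexity and would in practice be chosen with the available number of samples in mind.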
The primary purpose of this book is a discussion of surrogate modeling techniques oriented toward simulation-based design optimization of high-frequency structures. The typical problems encountered here include vector-valued and highly nonlinear responses of the components and systems, multiple performance figures that have to be controlled, medium- to high-dimensional parameter spaces, as well as wide ranges of the parameters. For these types of problems, conventional surrogate modeling methods are insufficient. The book describes a number of methods, referred to as performance-driven modeling, that offer a way of overcoming the aforementioned difficulties. We discuss both forward and inverse models as well as how to apply them for rapid design purposes. The main theme is the use of a set of reference designs, prepared a priori and optimized for selected values of the performance figures of interest, to define the surrogate model domain (Koziel 2017; Koziel and Bekasiewicz 2017a; Koziel and Sigurðsson 2018; Koziel et al. 2018). This approach permits a dramatic reduction of the parameter space region that needs to be sampled for the purpose of model construction without formally reducing the parameter ranges (Koziel et al. 2019; Koziel and Pietrenko-Dabrowska 2019a; Koziel and Pietrenko-Dabrowska 2019b). In order to be self-contained, the book also contains some background material concerning both data-driven and physics-based surrogate modeling.

The material is organized in the following manner. Chapter 2 provides an overview of the surrogate modeling
process, including design of experiments, model identification, and validation, and discusses a number of popular data-driven modeling methods. Chapter 3 introduces the concept and implementation of physics-based modeling. Chapters 4, 5, 6, 7, 8, and 9 contain an exposition of performance-driven modeling approaches, including, among others, triangulation-based constrained modeling, nested kriging modeling, feature-based constrained modeling, as well as variable-fidelity modeling. Chapters 10 and 11 discuss application of performance-driven modeling for multiobjective design optimization and expedited (warm-start) optimization, respectively. Chapter 12 outlines various physics-based surrogate modeling techniques involving response correction, whereas Chapter 13 focuses on utilization of inverse surrogates for accelerated simulation-driven design of high-frequency structures. Chapter 14 concludes the work.

The book is illustrated with a large number of practical design cases from various areas of high-frequency electronics, especially microwave and antenna engineering. Despite this particular focus, most of the modeling methods discussed here are of a generic nature and can be applied to a variety of problems in other engineering disciplines. The authors believe that the presented material may be helpful for engineers and researchers interested in applying surrogate modeling techniques in their design work, especially when solving tasks such as design optimization or working on other projects that require massive evaluations of expensive computer simulation models.
References

ADS (Advanced Design System). (2019). Keysight Technologies, Fountaingrove Parkway 1400, Santa Rosa, CA 95403–1799. Akkaram, S., Beeson, D., Agarwal, H., & Wiggs, G. (2007). Inverse modeling technology for parameter estimation. Structural and Multidisciplinary Optimization, 34(2), 151–164. Alexandrov, N. M., & Lewis, R. M. (2001). An overview of first-order model management for engineering optimization. Optimization and Engineering, 2(4), 413–430. Allaire, G. (2015). A review of adjoint methods for sensitivity analysis, uncertainty quantification, and optimization in numerical codes. Ingenieurs de l’Automobile, SIA, 836, 33–36. Allaire, D., & Willcox, K. (2014). A mathematical and computational framework for multifidelity design and analysis with computer models. International Journal for Uncertainty Quantification, 4, 1–20. Altair FEKO. (2018). Altair HyperWorks, 1820 E Big Beaver Rd, Troy, MI 48083, USA. Angiulli, G., Cacciola, M., & Versaci, M. (2007). Microwave devices and antennas modelling by support vector regression machines. IEEE Transactions on Magnetics, 43(4), 1589–1592. Antenna Magus. (2019). Magus (Pty) Ltd, Unit 9B Octo Place, Electron Street, Technopark Stellenbosch 7600 South Africa. Aubry, A., De Maio, A., Huang, Y., & Piezzo, M. (2016). Robust design of radar doppler filters. IEEE Transactions on Signal Processing, 64(22), 5848–5860. Ayed, R. B., Gong, J., Brisset, S., Gillon, F., & Brochet, P. (2012). Three-level output space mapping strategy for electromagnetic design optimization. IEEE Transactions on Magnetics, 48(2), 671–674. Bandler, J. W., Biernacki, R. M., Chen, S. H., Hemmers, R. H., & Madsen, K. (1995). Electromagnetic optimization exploiting aggressive space mapping. IEEE Transactions on Microwave Theory and Techniques, 41(12), 2874–2882.
Bandler, J. W., Rayas-Sánchez, J. E., & Zhang, Q. J. (2002). Yield-driven electromagnetic optimization via space mapping-based neuromodels. International Journal of RF and Microwave Computer-Aided Engineering, 12, 79–89. Bandler, J. W., Cheng, Q. S., Dakroury, S. A., Mohamed, A. S., Bakr, M. H., Madsen, K., & Søndergaard, J. (2004). Space mapping: The state of the art. IEEE Transactions on Microwave Theory and Techniques, 52(1), 337–361. Bandler, J. W., Koziel, S., & Madsen, K. (2008). Editorial—Surrogate modeling and space mapping for engineering optimization. Optimization and Engineering, 9(4), 307–310. Baratta, I. A., de Andrade, C. B., de Assis, R. R., & Silva, E. J. (2018). Infinitesimal dipole model using space mapping optimization for antenna placement. IEEE Antennas and Wireless Propagation Letters, 17(1), 17–20. Baur, U., Benner, P., & Feng, L. (2014). Model order reduction for linear and nonlinear systems: A system-theoretic perspective. Archives of Computational Methods in Engineering, 21(4), 331–358. Beheshti, Z., & Shamsuddin, S. M. H. (2013). A review of population-based meta-heuristic algorithm. International Journal of Advances in Soft Computing and its Applications, 5(1), 1–35. Bekasiewicz, A., & Koziel, S. (2015). Structure and computationally efficient simulation-driven design of compact UWB monopole antenna. IEEE Antennas and Wireless Propagation Letters, 14, 1282–1285. Biernacki, R., Chen, S., Estep, G., Rousset, J., & Sifri, J. (2012). Statistical analysis and yield optimization in practical RF and microwave systems. IEEE MTT-S International Microwave Symposium Digest. Montreal. pp. 1–3. Bilicz, S. (2016). Sparse grid surrogate models for electromagnetic problems with many parameters. IEEE Transactions on Magnetics, 52(3), 1–4. Bischof, C., Bücker, H. M., Hovland, P. D., Naumann, U., & Utke, J. (Eds.). (2008). Advances in automatic differentiation (Lecture Notes in Computational Science and Engineering). Berlin/ Heidelberg: Springer. 
Bishop, C. (2006). Pattern recognition and machine learning. New York: Springer. Booker, A. J., Dennis, J. E., Frank, P. D., Serafini, D. B., Torczon, V., & Trosset, M. W. (1999). A rigorous framework for optimization of expensive functions by surrogates. Structural Optimization, 17, 1–13. Bramerdorfer, G., & Zăvoianu, A. (2017). Surrogate-based multi-objective optimization of electrical machine designs facilitating tolerance analysis. IEEE Transactions on Magnetics, 53(8), 1–11. Brigham, J. C., & Aquino, W. (2007). Surrogate-model accelerated random search algorithm for global optimization with applications to inverse material identification. Computer Methods in Applied Mechanics and Engineering, 196(45–48), 4561–4576. Bubnicki, Z. (2005). Parametric optimization. In Modern control theory. Berlin/Heidelberg: Springer. Burgard, S., Farle, O., & Edlinger, R. D. (2013). A novel parametric model order reduction approach with applications to geometrically parameterized microwave devices. The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, 32 (5), 1525–1538. Byun, G., Choo, H., & Ling, H. (2013). Optimum placement of DF antenna elements for accurate DOA estimation in a harsh platform environment. IEEE Transactions on Antennas and Propagation, 61(9), 4783–4791. Cadence Allegro. (2019). Cadence design systems, 2655 Seely Ave, San Jose, CA 95134, USA. Cai, X., Qiu, H., Gao, L., & Shao, X. (2017). Metamodeling for high dimensional design problems by multi-fidelity simulations. Structural and Multidisciplinary Optimization, 56(1), 151–166. Cao, Y., Reitzinger, S., & Zhang, Q. (2011). Simple and efficient high-dimensional parametric modeling for microwave cavity filters using modular neural network. IEEE Microwave and Wireless Components Letters, 21(5), 258–260.
Cervantes-González, J. C., Rayas-Sánchez, J. E., López, C. A., Camacho-Pérez, J. R., Brito-Brito, Z., & Chávez-Hurtado, J. L. (2016). Space mapping optimization of handset antennas considering EM effects of mobile phone components and human body. International Journal of RF and Microwave Computer-Aided Engineering, 26(2), 121–128.
Chakravorty, P., & Mandal, D. (2016). Radiation pattern correction in mutually coupled antenna arrays using parametric assimilation technique. IEEE Transactions on Antennas and Propagation, 64(9), 4092–4095.
Chávez-Hurtado, J. L., & Rayas-Sánchez, J. E. (2016). Polynomial-based surrogate modeling of RF and microwave circuits in frequency domain exploiting the multinomial theorem. IEEE Transactions on Microwave Theory and Techniques, 64(12), 4371–4438.
Cho, C., Yi, X., Li, D., Wang, Y., & Tentzeris, M. M. (2017). An eigenvalue perturbation solution for the multiphysics simulation of antenna strain sensors. IEEE Journal on Multiscale and Multiphysics Computational Techniques, 2, 49–57.
COMSOL Multiphysics. (2018). COMSOL Inc, 1 New England Executive Park, Burlington, MA 01803, USA.
Conn, A. R., Scheinberg, K., & Vicente, L. N. (2009). Introduction to derivative-free optimization, MPS-SIAM Series on Optimization.
Couckuyt, I., Declercq, F., Dhaene, T., Rogier, H., & Knockaert, L. (2010). Surrogate-based infill optimization applied to electromagnetic problems. International Journal of RF and Microwave Computer-Aided Engineering, 20(5), 492–501.
Couckuyt, I., Forrester, A., Gorissen, D., De Turck, F., & Dhaene, T. (2012). Blind Kriging: Implementation and performance analysis. Advances in Engineering Software, 49, 1–13.
Crevecoeur, G., Sergeant, P., Dupre, L., & Van de Walle, R. (2010). A two-level genetic algorithm for electromagnetic optimization. IEEE Transactions on Magnetics, 46(7), 2585–2595.
CST Microwave Studio. (2018). CST AG, Bad Nauheimer Str. 19, D-64289 Darmstadt, Germany.
Davidson, D. B. (2010). Computational electromagnetics for RF and microwave engineering (2nd ed.). Cambridge University Press.
de la Rubia, V., Razafison, U., & Maday, Y. (2009). Reliable fast frequency sweep for microwave devices via the reduced-basis method. IEEE Transactions on Microwave Theory and Techniques, 57(12), 2923–2937.
De Tommasi, L., Gorissen, D., Croon, J. A., & Dhaene, T. (2010). Surrogate modeling of RF circuit blocks. In A. Fitt, J. Norbury, H. Ockendon, & E. Wilson (Eds.), Progress in industrial mathematics at ECMI 2008 (Mathematics in Industry) (Vol. 15). Berlin/Heidelberg: Springer.
Declercq, F., Couckuyt, I., Rogier, H., & Dhaene, T. (2013). Environmental high frequency characterization of fabrics based on a novel surrogate modelling antenna technique. IEEE Transactions on Antennas and Propagation, 61(10), 5200–5213.
Dennis, J. M., Vertenstein, M., Worley, P. H., Mirin, A. A., Craig, A. P., & Jacob, R. (2012). Computational performance of ultra-high-resolution capability in the community earth system model. The International Journal of High Performance Computing Applications, 26(1), 5–16.
Deschrijver, D., Haegeman, B., & Dhaene, T. (2007). Orthonormal vector fitting: A robust macromodeling tool for rational approximation of frequency domain responses. IEEE Transactions on Advanced Packaging, 30(2), 216–225.
Director, S. W., & Rohrer, R. A. (1969). The generalized adjoint network and network sensitivities. IEEE Transactions on Circuit Theory, 16(3), 318–323.
Dray, S. (2008). On the number of principal components: A test of dimensionality based on measurements of similarity between matrices. Computational Statistics and Data Analysis, 52(4), 2228–2237.
Du, J., & Roblin, C. (2017). Statistical modeling of disturbed antennas based on the polynomial chaos expansion. IEEE Antennas and Wireless Propagation Letters, 16, 1843–1846.
Du, J., & Roblin, C. (2018). Stochastic surrogate models of deformable antennas based on vector spherical harmonics and polynomial chaos expansions: Application to textile antennas. IEEE Transactions on Antennas and Propagation, 66(7), 3610–3622.
Echeverria, D., & Hemker, P. W. (2005). Space mapping and defect correction. Computational Methods in Applied Mathematics, 5(2), 107–136.
Echeverria, D., Lahaye, D., Encica, L., Lomonova, E. A., Hemker, P. W., & Vandenput, A. J. A. (2006). Manifold-mapping optimization applied to linear actuator design. IEEE Transactions on Magnetics, 42(4), 1183–1186.
El Sabbagh, M. A., Bakr, M. H., & Nikolova, N. K. (2006). Sensitivity analysis of the scattering parameters of microwave filters using the adjoint network method. International Journal of RF and Microwave Computer-Aided Engineering, 16, 596–606.
em™ Version 16.56 (2018). Sonnet Software, Inc., 126 N. Salina Street, Syracuse, NY 13202, USA.
Fakhfakh, M., Tlelo-Cuautle, E., & Siarry, P. (Eds.). (2015). Computational intelligence in analog and mixed-signal (AMS) and radio-frequency (RF) circuit design. Springer.
Fang, M., Huang, Z., Sha, W. E. I., & Wu, X. (2017). Maxwell–hydrodynamic model for simulating nonlinear terahertz generation from plasmonic metasurfaces. IEEE Journal on Multiscale and Multiphysics Computational Techniques, 2, 194–201.
Feng, F., Zhang, C., Na, W., Zhang, J., Zhang, W., & Zhang, Q. (2019). Adaptive feature zero assisted surrogate-based EM optimization for microwave filter design. IEEE Microwave and Wireless Components Letters, 29(1), 2–4.
FLUENT, ver. 15.0, ANSYS Inc. (2015). Southpointe, 275 Technology Drive, Canonsburg, PA 15317, USA.
Foo, J., & Karniadakis, G. E. (2010). Multi-element probabilistic collocation method in high dimensions. Journal of Computational Physics, 229(5), 1536–1557.
Forrester, A. I. J., & Keane, A. J. (2009). Recent advances in surrogate-based optimization. Progress in Aerospace Sciences, 45(1), 50–79.
Forrester, A. I. J., Sóbester, A., & Keane, A. J. (2007). Multi-fidelity optimization via surrogate modelling. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088).
Fusion 360. (2019). Autodesk, 111 McInnis Parkway San Rafael, 94903 California, USA.
Gibson, W. C. (2007). The method of moments in electromagnetics. Boca Raton: Chapman and Hall/CRC.
Giles, M., & Pierce, N. (2000). An introduction to the adjoint approach to design. Flow, Turbulence and Combustion, 65(3–4), 393–415.
Gorissen, D., Dhaene, T., & De Turck, F. (2009). Evolutionary model type selection for global surrogate modeling. Journal of Machine Learning Research, 10, 2039–2078.
Gorissen, D., Crombecq, K., Couckuyt, I., Dhaene, T., & Demeester, P. (2010). A surrogate modeling and adaptive sampling toolbox for computer based design. Journal of Machine Learning Research, 11, 2051–2055.
Griewank, A. (2000). Evaluating derivatives: Principles and techniques of algorithmic differentiation. Philadelphia: Society for Industrial and Applied Mathematics (SIAM).
Gutiérrez-Ayala, V., & Rayas-Sánchez, J. E. (2010). Neural input space mapping optimization based on nonlinear two-layer perceptrons with optimized nonlinearity. International Journal of RF and Microwave Computer-Aided Engineering, 20, 512–526.
Hajjaj, A. Z., Hafiz, M. A., & Younis, M. I. (2017). Mode coupling and nonlinear resonances of MEMS arch resonators for bandpass filters. Scientific Reports, 7, 41820.
Han, Z.-H., Görtz, S., & Zimmermann, R. (2013). Improving variable-fidelity surrogate modeling via gradient-enhanced kriging and a generalized hybrid bridge function. Aerospace Science and Technology, 25(1), 177–189.
Hao, J., & Sheng, X. (2017). Accurate and efficient simulation model for the scattering from a ship on a sea-like surface. IEEE Geoscience and Remote Sensing Letters, 14(12), 2375–2379.
Haykin, S. (1998). Neural networks: A comprehensive foundation (2nd ed.). Upper Saddle River: Prentice Hall.
Hazdra, P., Polivka, M., & Sokol, V. (2005). Microwave antennas and circuits modeling using electromagnetic field simulator. Radioengineering, 14(4), 2–10.
Henneron, T., & Clénet, S. (2014). Model order reduction of non-linear magnetostatic problems based on POD and DEI methods. IEEE Transactions on Magnetics, 50(2), 33–36.
HFSS. (2019). Release 19.0, ANSYS, http://www.ansoft.com/products/hf/hfss/, 2600 Ansys Dr., Canonsburg, PA 15317, USA.
Hosder, S. (2012). Stochastic response surfaces based on non-intrusive polynomial chaos for uncertainty quantification. International Journal of Mathematical Modelling and Numerical Optimisation, 3(1/2), 117–139.
Hu, X., Chen, X., Parks, G. T., & Yao, W. (2016). Review of improved Monte Carlo methods in uncertainty-based design optimization for aerospace vehicles. Progress in Aerospace Sciences, 86, 20–27.
Inventor. (2019). Autodesk, 111 McInnis Parkway San Rafael, 94903 California, USA.
Iuliano, E., & Andrés, P. E. (2016). Application of surrogate-based global optimization to aerodynamic design (Springer Tracts in Mechanical Engineering book series (STME)). Cham: Springer.
Jacobs, J. P. (2012). Bayesian support vector regression with automatic relevance determination kernel for modeling of antenna input characteristics. IEEE Transactions on Antennas and Propagation, 60(4), 2114–2118.
Jameson, A. (1988). Aerodynamic design via control theory. Journal of Scientific Computing, 3, 233–260.
Jiang, C., Cai, X., Qiu, H., Gao, L., & Li, P. (2018). A two-stage support vector regression assisted sequential sampling approach for global metamodeling. Structural and Multidisciplinary Optimization, 58(4), 1657–1672.
Jin, J. (2002). The finite element method in electromagnetics (2nd ed.). New York: Wiley.
Jin, Y. (2011). Surrogate-assisted evolutionary computation: Recent advances and future challenges. Swarm and Evolutionary Computation, 1(2), 61–70.
Jin, R., Chen, W., & Simpson, T. (2001). Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and Multidisciplinary Optimization, 23(1), 1–3.
Jones, D. R. (2001). A taxonomy of global optimization methods based on response surfaces. Journal of Global Optimization, 21, 345–383.
Jones, D., Schonlau, M., & Welch, W. (1998). Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13, 455–492.
Kabir, H., Wang, Y., Yu, M., & Zhang, Q. J. (2008). Neural network inverse modeling and applications to microwave filter design. IEEE Transactions on Microwave Theory and Techniques, 56(4), 867–879.
Keyes, D. E., McInnes, L. C., Woodward, C., Gropp, W., Myra, E., Pernice, M., & Wohlmuth, B. (2013). Multiphysics simulations: Challenges and opportunities. International Journal of High Performance Computing Applications, 27(1), 4–83.
Khalatpour, A., Amineh, R. K., Cheng, Q. S., Bakr, M. H., Nikolova, N. K., & Bandler, J. W. (2011). Accelerating input space mapping optimization with adjoint sensitivities. IEEE Microwave and Wireless Components Letters, 21(6), 280–282.
Khuri, A. I., & Mukhopadhyay, S. (2010). Response surface methodology: Advanced review. Computational Statistics, 2(2), 128–149.
Kitayama, S., Arakawa, M., & Yamazaki, K. (2011). Sequential approximate optimization using radial basis function network for engineering optimization. Optimization and Engineering, 12(4), 535–557.
Kleijnen, J. P. C. (2009). Kriging metamodeling in simulation: A review. European Journal of Operational Research, 192(3), 707–716.
Kleijnen, J. P. C. (2018). Design and analysis of simulation experiments. In J. Pilz, D. Rasch, V. Melas, & K. Moder (Eds.), Statistics and simulation. IWS 2015. Springer Proceedings in Mathematics & Statistics (Vol. 231). Cham: Springer.
Ko, J., Byun, J., Park, J., & Kim, H. (2011). Robust design of dual band/polarization patch antenna using sensitivity analysis and Taguchi's method. IEEE Transactions on Magnetics, 47(5), 1258–1261.
Kouassi, A., Nguyen-Trong, N., Kaufmann, T., Lalléchère, S., Bonnet, P., & Fumeaux, C. (2016). Reliability-aware optimization of a wideband antenna. IEEE Transactions on Antennas and Propagation, 64(2), 450–460.
Kozakoff, D. J. (2010). Analysis of radome-enclosed antennas. Boston: Artech House.
Koziel, S. (2010). Shape-preserving response prediction for microwave design optimization. IEEE Transactions on Microwave Theory and Techniques, 58(11), 2829–2837.
Koziel, S. (2017). Low-cost data-driven surrogate modeling of antenna structures by constrained sampling. IEEE Antennas and Wireless Propagation Letters, 16, 461–464.
Koziel, S., & Bandler, J. W. (2015). Rapid yield estimation and optimization of microwave structures exploiting feature-based statistical analysis. IEEE Transactions on Microwave Theory and Techniques, 63(1), 107–114.
Koziel, S., & Bekasiewicz, A. (2015). Expedited geometry scaling of compact microwave passives by means of inverse surrogate modeling. IEEE Transactions on Microwave Theory and Techniques, 63(12), 4019–4026.
Koziel, S., & Bekasiewicz, A. (2016). Multi-objective design of antennas using surrogate models. Singapore: World Scientific.
Koziel, S., Bekasiewicz, A., Kurgan, P., & Bandler, J. W. (2016). Rapid multi-objective design optimisation of compact microwave couplers by means of physics-based surrogates. IET Microwaves, Antennas & Propagation, 10(5), 479–486.
Koziel, S., & Bekasiewicz, A. (2017a). On reduced-cost design-oriented constrained surrogate modeling of antenna structures. IEEE Antennas and Wireless Propagation Letters, 16, 1618–1621.
Koziel, S., & Bekasiewicz, A. (2017b). Computationally-efficient surrogate-assisted dimension scaling of compact dual-band couplers. IET Microwaves, Antennas & Propagation, 11(4), 465–470.
Koziel, S., & Bekasiewicz, A. (2018a). Sequential approximate optimisation for statistical analysis and yield optimisation of circularly polarised antennas. IET Microwaves, Antennas & Propagation, 12(13), 2060–2064.
Koziel, S., & Bekasiewicz, A. (2018b). Low-cost and reliable geometry scaling of compact microstrip couplers with respect to operating frequency, power split ratio, and dielectric substrate parameters. IET Microwaves, Antennas & Propagation, 12(9), 1508–1513.
Koziel, S., & Leifsson, L. (Eds.). (2013a). Surrogate-based modeling and optimization. Applications in engineering. New York: Springer.
Koziel, S., & Leifsson, L. (2013b). Surrogate-based aerodynamic shape optimization by variable-resolution models. AIAA Journal, 51(1), 94–106.
Koziel, S., & Leifsson, L. (2016). Simulation-driven design by knowledge-based response correction techniques. Cham: Springer.
Koziel, S., & Ogurtsov, S. (2012). Model management for cost-efficient surrogate-based optimization of antennas using variable-fidelity electromagnetic simulations. IET Microwaves, Antennas & Propagation, 6, 1643–1650.
Koziel, S., & Ogurtsov, S. (2014). Antenna design by simulation-driven optimization. Berlin: Springer.
Koziel, S., & Unnsteinsson, S. D. (2018). Expedited design closure of antennas by means of trust-region-based adaptive response scaling. IEEE Antennas and Wireless Propagation Letters, 17(6), 1099–1103.
Koziel, S., & Sigurðsson, A. T. (2018). Triangulation-based constrained surrogate modeling of antennas. IEEE Transactions on Antennas and Propagation, 66(8), 4170–4179.
Koziel, S., & Ogurtsov, S. (2019). Simulation-based optimization of antenna arrays. London: World Scientific.
Koziel, S., Sigurðsson, A. T., Pietrenko-Dabrowska, A., & Szczepanski, S. (2019). Enhanced uniform data sampling for constrained data-driven modeling of antenna input characteristics. International Journal of Numerical Modelling: Electronic Devices and Fields, 32(5), e2584.
Koziel, S., & Pietrenko-Dabrowska, A. (2019a). Performance-based nested surrogate modeling of antenna input characteristics. IEEE Transactions on Antennas and Propagation, 67(5), 2904–2912.
Koziel, S., & Pietrenko-Dabrowska, A. (2019b). Reduced-cost surrogate modelling of compact microwave components by two-level kriging interpolation. Engineering Optimization. https://doi.org/10.1080/0305215X.2019.1630399.
Koziel, S., Cheng, Q. S., & Bandler, J. W. (2008). Space mapping. IEEE Microwave Magazine, 9(6), 105–122.
Koziel, S., Bandler, J. W., & Madsen, K. (2009). Space mapping with adaptive response correction for microwave design optimization. IEEE Transactions on Microwave Theory and Techniques, 57, 478–486.
Koziel, S., Bandler, J. W., & Cheng, Q. S. (2011). Constrained parameter extraction for microwave design optimisation using implicit space mapping. IET Microwaves, Antennas & Propagation, 5, 1156–1163.
Koziel, S., Yang, X. S., & Zhang, Q. J. (Eds.). (2013). Simulation-driven design optimization and modeling for microwave engineering. London: Imperial College Press.
Koziel, S., Ogurtsov, S., Zieniutycz, W., & Sorokosz, L. (2014). Simulation-driven design of microstrip antenna subarrays. IEEE Transactions on Antennas and Propagation, 62(7), 3584–3591.
Koziel, S., Sigurðsson, A. T., & Szczepanski, S. (2018). Uniform sampling in constrained domains for low-cost surrogate modeling of antenna input characteristics. IEEE Antennas and Wireless Propagation Letters, 17(1), 164–167.
Krause, E., & Jäger, W. (Eds.). (2005). High performance computing in science and engineering. Stuttgart: Transactions of the High Performance Computing Center.
Leary, S., Bhaskar, A., & Keane, A. (2003). Optimal orthogonal-array-based Latin hypercubes. Journal of Applied Statistics, 30, 585–598.
Lee, S. H., & Jin, J. M. (2007). Adaptive solution space projection for fast and robust wideband finite-element simulation of microwave components. IEEE Microwave and Wireless Components Letters, 17(7), 474–476.
Li, X. (2010). Finding deterministic solution from underdetermined equation: Large-scale performance modeling of analog/RF circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 29(11), 1661–1668.
Li, X., Zhang, W., Wang, F., Sun, S., & Gu, C. (2012). Efficient parametric yield estimation of analog/mixed-signal circuits via Bayesian model fusion. 2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). San Jose. pp. 627–634.
Liersch, C. M., & Hepperle, M. (2011). A distributed toolbox for multidisciplinary preliminary aircraft design. CEAS Aeronautical Journal, 2(1–4), 57–68.
Lim, D., Jin, Y., Ong, Y., & Sendhoff, B. (2010). Generalizing surrogate-assisted evolutionary computation. IEEE Transactions on Evolutionary Computation, 14(3), 329–355.
Lim, D., Woo, D., Yeo, H., Jung, J., Ro, S., & Jung, H. (2015). A novel surrogate-assisted multiobjective optimization algorithm for an electromagnetic machine design. IEEE Transactions on Magnetics, 51(3), 1–4.
Lin, Y., Bao, L., & Wei, Y. (2007). A model-order reduction method based on Krylov subspaces for MIMO bilinear dynamical systems. Journal of Applied Mathematics and Computing, 25(1–2), 293.
Liu, B. (2017). Posterior exploration based sequential Monte Carlo for global optimization. Journal of Global Optimization, 69(4), 847–868.
Liu, H., Ong, Y. S., & Cai, J. (2018). A survey of adaptive sampling for global metamodeling in support of simulation-based complex engineering design. Journal of Structural and Multidisciplinary Optimization, 57(1), 416.
Liu, Y., Shi, Y., Zhou, Q., & Xiu, R. (2016). A sequential sampling strategy to improve the global fidelity of metamodels in multi-level system design. Structural and Multidisciplinary Optimization, 53(6), 1295–1313.
Liu, H., Hervas, J. R., Ong, Y. S., Cai, J., & Wang, Y. (2018). An adaptive RBF-HDMR modeling approach under limited computational budget. Structural and Multidisciplinary Optimization, 57(3), 1–18.
Lourenço, J. M., & Lebensztajn, L. (2015). Surrogate modeling and two-level infill criteria applied to electromagnetic device optimization. IEEE Transactions on Magnetics, 51(3), 1–4.
Ma, X., & Zabaras, N. (2010). An adaptive high-dimensional stochastic model representation technique for the solution of stochastic partial differential equations. Journal of Computational Physics, 229, 3884–3915.
Mack, Y., Goel, T., Shyy, W., & Haftka, R. (2007). Surrogate model-based optimization framework: A case study in aerospace design. Studies in Computational Intelligence (SCI), 51, 323–342.
Mackman, T. J., Allen, C. B., Ghoreyshi, M., & Badcock, K. J. (2013). Comparison of adaptive sampling methods for generation of surrogate aerodynamic models. AIAA Journal, 51(4), 797–808.
Mandal, A., Zafar, H., Das, S., & Vasilakos, A. V. (2012). A modified differential evolution algorithm for shaped beam linear array antenna design. Progress in Electromagnetics Research, 125, 439–457.
Mandic, T., Magerl, M., & Baric, A. (2019). Sequential buildup of broadband equivalent circuit model for low-cost SMA connectors. IEEE Transactions on Electromagnetic Compatibility, 61(1), 242–250.
Manfredi, P., Ginste, D. V., Stievano, I. S., De Zutter, D., & Canavero, F. G. (2017). Stochastic transmission line analysis via polynomial chaos methods: An overview. IEEE Electromagnetic Compatibility Magazine, 6(3), 77–84, Third Quarter 2017.
Mehmani, A., Chowdhury, S., Tong, W., & Messac, A. (2015). Adaptive switching of variable-fidelity models in population-based optimization. In N. Lagaros & M. Papadrakakis (Eds.), Engineering and applied sciences optimization (Computational Methods in Applied Sciences) (Vol. 38). Cham: Springer.
Mendes, M. H. S., Soares, G. L., Coulomb, J., & Vasconcelos, J. A. (2013). Appraisal of surrogate modeling techniques: A case study of electromagnetic device. IEEE Transactions on Magnetics, 49(5), 1993–1996.
Momentum. (2019). Keysight Technologies, Fountaingrove Parkway 1400, Santa Rosa, CA 95403–1799.
Mukhopadhyay, N. (2011). Sequential sampling. In M. Lovric (Ed.), International encyclopedia of statistical science. Berlin/Heidelberg: Springer.
Muller, J., & Shoemaker, C. A. (2014). Influence of ensemble surrogate models and sampling strategy on the solution quality of algorithms for computationally expensive black-box global optimization problems. Journal of Global Optimization, 60(2), 123–144.
Multiphysics Simulation. (2019). ANSYS Inc., Southpointe, 275 Technology Drive, Canonsburg, PA 15317, USA.
Needell, D., & Tropp, J. A. (2009). CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3), 301–321.
NI Microwave Office. (2019). National Instruments, 11500 N Mopac Expy, Austin, TX 78759, USA.
Nikolova, N. K., Li, Y., Li, Y., & Bakr, M. H. (2006). Sensitivity analysis of scattering parameters with electromagnetic time-domain simulators. IEEE Transactions on Microwave Theory and Techniques, 54(4), 1598–1610.
Nocedal, J., & Wright, S. J. (2000). Numerical optimization (Springer Series in Operations Research). New York: Springer.
Palacios, F., Colonno, M. R., Aranake, A. C., Campos, A., Copeland, S. R., Economon, T. D., Lonkar, A. K., Lukaczyk, T. W., Taylor, T. W. R., & Alonso, J. J. (2013). Stanford University Unstructured (SU2): An open-source integrated computational environment for multi-physics simulation and design. Grapevine: AIAA Aerospace Sciences Meeting.
Pantoja, M. F., Meincke, P., & Bretones, A. R. (2007). A hybrid genetic-algorithm space-mapping tool for the optimization of antennas. IEEE Transactions on Antennas and Propagation, 55(3), 777–781.
Papadimitriou, D. I., & Giannakoglou, K. C. (2008). Aerodynamic shape optimization using first and second order adjoint and direct approaches. Archives of Computational Methods in Engineering, 15, 447–488.
Park, D., Chung, I. B., & Choi, D. H. (2018). Surrogate based global optimization using adaptive switching infill sampling criterion. In A. Schumacher, T. Vietor, S. Fiebig, K. U. Bletzinger, & K. Maute (Eds.), Advances in structural and multidisciplinary optimization. WCSMO 2017 (pp. 692–699). Cham: Springer.
Paulotto, S., Baccarelli, P., Frezza, F., & Jackson, D. R. (2008). Full-wave modal dispersion analysis and broadside optimization for a class of microstrip CRLH leaky-wave antennas. IEEE Transactions on Microwave Theory and Techniques, 56(12), 2826–2837.
Pironneau, O. (1984). Optimal shape design for elliptic systems. New York: Springer.
Pistikopoulos, E. N., Georgiadis, M. C., & Dua, V. (2007). Multi-parametric programming. Weinheim: Wiley VCH.
Queipo, N. V., Haftka, R. T., Shyy, W., Goel, T., Vaidyanathan, R., & Tucker, P. K. (2005). Surrogate-based analysis and optimization. Progress in Aerospace Sciences, 41(1), 1–28.
Rangel-Patiño, F. E., Chávez-Hurtado, J. L., Viveros-Wacher, A., Rayas-Sánchez, J. E., & Hakim, N. (2017). System margining surrogate-based optimization in post-silicon validation. IEEE Transactions on Microwave Theory and Techniques, 65(9), 3109–3115.
Ravelo, B. (2018). Multiphysics model of microstrip structure under high voltage pulse excitation. IEEE Journal on Multiscale and Multiphysics Computational Techniques, 3, 88–96.
Rayas-Sánchez, J. E. (2016). Power in simplicity with ASM: Tracing the aggressive space mapping algorithm over two decades of development and engineering applications. IEEE Microwave Magazine, 17(4), 64–76.
Robinson, T. D., Eldred, M. S., Willcox, K. E., & Haimes, R. (2008). Surrogate-based optimization using multifidelity models with variable parameterization and corrected space mapping. AIAA Journal, 46(11), 2814–2822.
Rosenberg, A., Selvaraj, S., & Sharma, A. (2014). A novel dual-rotor turbine for increased wind energy capture. Journal of Physics: Conference Series, 524, 1–10.
Rossi, M., Dierck, A., Rogier, H., & Vande Ginste, D. (2014). A stochastic framework for the variability analysis of textile antennas. IEEE Transactions on Antennas and Propagation, 62(12), 6510–6514.
Rozhenko, A. I. (2018). Comparison of radial basis functions. Numerical Analysis and Applications, 11(3), 220–235.
Sadrossadat, S. A., Cao, Y., & Zhang, Q. (2013). Parametric modeling of microwave passive components using sensitivity-analysis-based adjoint neural-network technique. IEEE Transactions on Microwave Theory and Techniques, 61(5), 1733–1747.
Salleh, M. K. M., Prigent, G., Pigaglio, O., & Crampagne, R. (2008). Quarter-wavelength side-coupled ring resonator for bandpass filters. IEEE Transactions on Microwave Theory and Techniques, 56(1), 156–162.
San, H., Kobayashi, H., Kawakami, S., & Kuroiwa, N. (2004). A noise-shaping algorithm of multi-bit DAC nonlinearities in complex bandpass ΔΣAD modulators. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E87-A(4), 792–800.
Santner, T. J., Williams, B. J., & Notz, W. I. (2018). Space-filling designs for computer experiments. In The design and analysis of computer experiments (Springer Series in Statistics). New York: Springer.
Sarkar, T. K., Chen, H., Palma, M. S., & Zhu, M. (2019). Lessons learned using a physics based macro model for analysis of radio wave propagation in wireless transmission. IEEE Transactions on Antennas and Propagation, 67(4), 2150–2157.
Sato, Y., Campelo, F., & Igarashi, H. (2015). Fast shape optimization of antennas using model order reduction. IEEE Transactions on Magnetics, 51(3), 1–4.
Sevgi, L. (2014). Electromagnetic modeling and simulation (IEEE Press Series on Electromagnetic Wave Theory). Hoboken: Wiley.
Shaker, G. S. A., Bakr, M. H., Sangary, N., & Safavi-Naeini, S. (2009). Accelerated antenna design methodology exploiting parameterized Cauchy models. Progress in Electromagnetics Research (PIER B), 18, 279–309.
Shan, S., & Wang, G. (2011a). Survey of modeling and optimization strategies to solve high-dimensional design problems with computationally-expensive black-box functions. Structural and Multidisciplinary Optimization, 41(2), 219–241.
Shan, S., & Wang, G. (2011b). Turning black-box functions into white functions. Journal of Mechanical Design, 133(3), 031003.
Sharma, S., & Sarris, C. D. (2016). A novel multiphysics optimization-driven methodology for the design of microwave ablation antennas. IEEE Journal on Multiscale and Multiphysics Computational Techniques, 1, 151–160.
Shitvov, A., Schuchinsky, A. G., Steer, M. B., & Wetherington, J. M. (2014). Characterisation of nonlinear distortion and intermodulation in passive devices and antennas. 8th European Conference on Antennas and Propagation (EuCAP 2014), The Hague. pp. 1454–1458.
Siegler, J., Ren, J., Leifsson, L., Koziel, S., & Bekasiewicz, A. (2016). Supersonic airfoil shape optimization by variable-fidelity models and manifold mapping. Procedia Computer Science, 80, 1103–1113.
Simpson, T. W., Peplinski, J., Koch, P. N., & Allen, J. K. (2001). Metamodels for computer-based engineering design: Survey and recommendations. Engineering with Computers, 17, 129–150.
Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14, 199–222.
Sóbester, A., & Forrester, A. I. J. (2015). Aircraft aerodynamic design: Geometry and optimization. Chichester: John Wiley & Sons.
Sóbester, A., Forrester, A. I. J., Toal, D. J. J., Tresidder, E., & Tucker, S. (2012). Engineering design applications of surrogate-assisted optimization techniques. Optimization and Engineering, 15(1), 243–265.
Star-CCM+ (2015). CD-adapco Group, 60 Broadhollow Road, Melville, NY 11747, USA.
Styblinski, M. A., & Opalski, L. J. (1986). Algorithms and software tools for IC yield optimization based on fundamental fabrication parameters. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 5(1), 79–89.
Sudret, B. (2008). Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering and System Safety, 93(7), 964–979.
Sullivan, D. M. (2013). Electromagnetic simulation using the FDTD method (2nd ed.). Hoboken: John Wiley & Sons.
Swanson, D. G., & Hoefer, W. J. R. (2003). Microwave circuit modeling using electromagnetic field simulation. Artech House Microwave Library.
Szakmany, G. P., Orlov, A. O., Bernstein, G. H., Lin, M., & Porod, W. (2018). Multiphysics THz antenna simulations. IEEE Journal on Multiscale and Multiphysics Computational Techniques, 3, 289–294.
Tabatabaei, M., Hakanen, J., Hartikainen, M., Miettinen, K., & Sindhya, K. (2015). A survey on handling computationally expensive multiobjective optimization problems using surrogates: Non-nature inspired methods. Structural and Multidisciplinary Optimization, 52(1), 1–25.
Tao, J., Liao, C., Zeng, X., & Li, X. (2016). Harvesting design knowledge from internet: High-dimensional performance trade-off modeling for large-scale analog circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 35(1), 23–36.
Tao, J., Wang, F., Cachecho, P., Zhang, W., Sun, S., Li, X., Kanj, R., Gu, C., & Zeng, X. (2019). Large-scale circuit performance modeling by Bayesian model fusion. In I. Elfadel, D. Boning, & X. Li (Eds.), Machine learning in VLSI computer-aided design. Cham: Springer.
Toivanen, J. I., Makinen, R. A. E., Jarvenpaa, S., Yla-Oijala, P., & Rahola, J. (2009). Electromagnetic sensitivity analysis and shape optimization using method of moments and automatic differentiation. IEEE Transactions on Antennas and Propagation, 57(1), 168–175.
Toropov, V. V. (1989). Simulation approach to structural optimization. Structural Optimization, 1, 37–46.
Tropp, J. A. (2004). Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10), 2231–2242.
Wang, F., Zhang, W., Sun, S., Li, X., & Gu, C. (2013). Bayesian model fusion: Large scale performance modeling of analog and mixed-signal circuits by reusing early-stage data. Design Automation Conference (DAC). Austin.
Wang, F., Cachecho, P., Zhang, W., Sun, S., Li, X., Kanj, R., & Gu, C. (2016). Bayesian model fusion: Large-scale performance modeling of analog and mixed-signal circuits by reusing early-stage data. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, 35(8), 1255–1268.
Wang, K., Ding, D., & Chen, R. (2018). A surrogate modeling technique for electromagnetic scattering analysis of 3-D objects with varying shape. IEEE Antennas and Wireless Propagation Letters, 17(8), 1524–1527.
Wang, J., Yu, H., & Fu, X. (2018). Optimized design of the radome based on the digital beam forming technique. 12th International Symposium on Antennas, Propagation and EM Theory (ISAPE), Hangzhou. pp. 1–3.
Webb, J. P. (2004). Matching a given field using hierarchal vector basis functions. Electromagnetics, 24, 113–122.
Wehner, M. F., Bala, G., Duffy, P., Mirin, A. A., & Romano, R. (2010). Towards direct simulation of future tropical cyclone statistics in a high-resolution global atmospheric model. Advances in Meteorology, 2010, 1–13.
Wei, X., Wu, Y.-Z., & Chen, L.-P. (2012). A new sequential optimal sampling method for radial basis functions. Applied Mathematics and Computation, 218(19), 9635–9646.
White, J. F. (2004). High frequency techniques: An introduction to RF and microwave design and computer simulation. Hoboken: Wiley-IEEE Press.
Wild, S. M., Regis, R. G., & Shoemaker, C. A. (2008). ORBIT: Optimization by radial basis function interpolation in trust-regions. SIAM Journal on Scientific Computing, 30, 3197–3219.
Wu, X., Peng, X., Chen, W., & Zhang, W. (2019). A developed surrogate-based optimization framework combining HDMR-based modeling technique and TLBO algorithm for high-dimensional engineering problems. Structural and Multidisciplinary Optimization, 60(2), 663–680.
XFDTD. (2016). Remcom, Inc., South Allen 315, Suite 416, State College, PA 16801.
Xiong, F., Xiong, Y., Chen, W., & Yang, S. (2009). Optimizing Latin hypercube design for sequential sampling of computer experiments. Engineering Optimization, 41(8), 793–810.
Xiu, D. (2009). Fast numerical methods for stochastic computations: A review. Communications in Computational Physics, 5(2–4), 242–272.
Xu, S., Liu, H., Wang, X., & Jiang, X. (2014). A robust error-pursuing sequential sampling approach for global metamodeling based on Voronoi diagram and cross validation. ASME Journal of Mechanical Design, 136(7), 071009.
Yang, X. S. (2010). Engineering optimization: An introduction with metaheuristic applications. Hoboken: Wiley.
Yang, Z., Qiu, H., Gao, L., Jiang, C., & Zhang, J. (2019). Two-layer adaptive surrogate-assisted evolutionary algorithm for high-dimensional computationally expensive problems. Journal of Global Optimization, 74(2), 327–359.
Yelten, M. B., Zhu, T., Koziel, S., Franzon, P. D., & Steer, M. B. (2012). Demystifying surrogate modeling for circuits and systems. IEEE Circuits and Systems Magazine, 12(1), 45–63.
Yondo, R., Andrés, E., & Eusebio, V. (2018). A review on design of experiments and surrogate models in aircraft real-time and many-query aerodynamic analyses. Progress in Aerospace Sciences, 96, 23–61. Yondo, R., Bobrowski, K., Andrés, E., & Valero, E. (2019). A review of surrogate modeling techniques for aerodynamic analysis and optimization: Current limitations and future challenges in industry. In E. Minisci, M. Vasile, J. Periaux, N. Gauger, K. Giannakoglou, & D. Quagliarella (Eds.), Advances in evolutionary and deterministic methods for design, optimization and control in engineering and sciences (Computational Methods in Applied Sciences) (Vol. 48). Cham: Springer. You, J. W., Tan, S. R., Zhou, X. Y., Yu, W. M., & Cui, T. J. (2014). A new method to analyze broadband antenna-radome interactions in time-domain. IEEE Transactions on Antennas and Propagation, 62(1), 334–344. Zaslavski, A. J. (2010). Parametric optimization. In Optimization on metric and normed spaces (Springer Optimization and Its Applications) (Vol. 44). New York: Springer. Zhang, C., Jin, J., Na, W., Zhang, Q., & Yu, M. (2018). Multivalued neural network inverse modeling and applications to microwave filters. IEEE Transactions on Microwave Theory and Techniques, 66(8), 3781–3797. Zhang, W., Feng, F., Gongal-Reddy, V.-M.-R., Zhang, J., Yan, S., Ma, J., & Zhang, Q.-J. (2018). Space mapping approach to electromagnetic centric multiphysics parametric modeling of microwave components. IEEE Transactions on Microwave Theory and Techniques, 66(7), 3169–3185. Zhou, Z., Ong, Y. S., Nair, P. B., Keane, A. J., & Lum, K. Y. (2007). Combining global and local surrogate models to accelerate evolutionary optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 37(1), 66–76. Zhu, L. (2002). Realistic equivalent circuit model of coplanar waveguide open circuit: Lossy shunt resonator network. IEEE Microwave and Wireless Components Letters, 12(5), 175–177. 
Zhu, Y., & Cangellaris, A. C. (2001). A new finite element model for reduced order electromagnetic modeling. IEEE Microwave and Wireless Components Letters, 11(5), 211–213.
Chapter 2
Basics of Data-Driven Surrogate Modeling
Data-driven models are among the most popular types of surrogates. Their fundamental advantages include versatility, low evaluation cost, ease of handling, and the large number of modeling techniques and ready-to-use implementations available in various programming environments, including MATLAB. This chapter provides a brief introduction to approximation-based surrogate modeling within the scope necessary for other parts of the book. Readers interested in a more detailed exposition can refer to the rich literature on the subject (Biegler et al. 2014; Chugh et al. 2019; Jin 2005; Santana-Quintero et al. 2010; Gorissen et al. 2009). The following sub-sections outline the surrogate modeling flow and design of experiments strategies; highlight selected modeling methods including, among others, polynomial regression, radial basis functions, kriging, support vector regression, and neural networks; and discuss model validation methods.
2.1
Overview
Approximation-based models constitute the largest and the most widely used category of surrogates. A variety of specific techniques have been developed and described in the literature (e.g., Simpson et al. 2001; Søndergaard 2003; Forrester and Keane 2009; Couckuyt 2013; Wang and Shan 2006; Chen et al. 2005). Furthermore, most of these techniques are readily available through various toolboxes (e.g., Lophaven et al. 2002; Gorissen et al. 2010), nowadays primarily implemented in MATLAB. The following appealing features of data-driven models can be distinguished:
• Approximation models are solely based on the data; consequently, they can be constructed without a priori knowledge about the physical system of interest.
© Springer Nature Switzerland AG 2020 S. Koziel, A. Pietrenko-Dabrowska, Performance-Driven Surrogate Modeling of High-Frequency Structures, https://doi.org/10.1007/978-3-030-38926-0_2
• For the same reason, they are generic and therefore easily transferable between various classes of problems and various application areas.
• The models usually involve explicit analytical formulations (e.g., linear combinations of appropriate basis functions, Montegranario and Espinosa 2014); consequently, they are computationally cheap to evaluate.
• Data-driven models are easily accessible through (usually MATLAB) toolboxes and therefore easy to handle even by non-experts.

Unfortunately, the aforementioned versatility comes at a price. Data-driven surrogates require a considerable amount of training data to ensure reasonable predictive power. As a matter of fact, the number of necessary training samples grows very quickly with the dimensionality of the design space (the so-called curse of dimensionality) and, even more importantly, with the ranges of the system parameters (e.g., geometry dimensions in the case of most high-frequency structures such as antennas or microstrip circuits). This is a fundamental problem from the point of view of practical applications of approximation surrogates. It is especially pertinent to the areas where the responses of the components and systems are highly nonlinear. This includes antenna and microwave engineering, where, typically, one considers frequency characteristics that exhibit sharp resonances and spikes and are very sensitive to some of the system parameters (Rayas-Sanchez et al. 2017; Goudos 2017; Rossi and Rizzo 2009; Hausmair et al. 2017). For cases like these, data-driven modeling is often limited to low-dimensional spaces (up to four or five parameters) and narrow parameter ranges (Wu et al. 2019; Koziel and Leifsson 2016). Depending on the particular purpose of the surrogate (e.g., a model for one-time parametric optimization; a model for statistical analysis and yield optimization; a multiple-use library model), the computational expenditures on training data acquisition may or may not be justified. Some of the methods briefly mentioned in Chap. 1, e.g., HDMR (Ma and Zabaras 2010), MOR-type techniques (Baur et al. 2014), and variable-fidelity modeling (Fernández-Godino et al. 2019), might alleviate these difficulties to a certain extent.

Approximation surrogates are data-driven models, i.e., they are constructed entirely from the training samples. The latter are normally obtained from the original (or high-fidelity) simulation model, although other arrangements are also possible (e.g., measurements of a physical system). The modeling process normally consists of the following stages:
• Design of experiments (DOE). DOE is the process of allocating the training samples within the design space pertinent to the problem at hand. The preferred DOE strategy largely depends on the source of the training data. For example, if the data comes from computer simulations, space-filling designs are usually employed (Simpson et al. 2001). The available computational budget is the most important factor limiting the number of samples to be allocated. An outline of popular DOE techniques is provided in Sect. 2.2.
• Training data acquisition. Nowadays, in a vast majority of cases, the training data is obtained through evaluation of computer simulation models.
Fig. 2.1 A surrogate model construction flowchart. The gray box shows the main flow, whereas the dashed box comprises an optional iterative procedure in which additional training data points are allocated (using a suitable infill strategy), and the model is updated accordingly. The feedback from the model validation stage can also be used for identification of the model regularization parameters (Koziel and Ogurtsov 2019)
[Flowchart blocks: Design of Experiments; High-Fidelity Model Training Data Acquisition; Model Identification (Training Data Fitting); Model Validation; decision "Accuracy sufficient?" (Yes: END; No: Allocate Infill Training Points)]
• Model identification. Typically, the approximation technique is selected beforehand so that this stage is primarily about determining the model parameters. In most cases, e.g., kriging (Kleijnen 2009) or neural networks (Rayas-Sanchez 2004), solving a suitably defined minimization problem is required; in some situations, the model parameters can be found using explicit formulas, e.g., by analytically solving an appropriate linear regression problem (polynomial approximation, Queipo et al. 2005). Model identification may also involve selection of the approximation technique itself (Simpson et al. 2001; Hong et al. 2008).
• Model validation. The purpose of this step is to verify the model accuracy. One of the aspects here is approximation capability, i.e., how closely the model fits the particular training data set. Another is generalization capability, i.e., the predictive power at the designs not seen during the identification stage. There is a trade-off between the two, also referred to as bias and variance (Arlot and Celisse 2010): reducing bias normally leads to a larger variance unless the size of the training set is increased, provided that the latter grows faster than the model complexity (James et al. 2013). Because the number of training points is always limited, a proper balance between model approximation and generalization needs to be found in practice.

Figure 2.1 shows the surrogate modeling flowchart. Often, the process is iterative, with the data acquisition, model identification, and validation repeated upon extending the training set using additional (infill) samples. The procedure continues until the accuracy goals are met or the computational budget is exceeded. In the
context of surrogate-assisted optimization, the model update may be oriented toward finding better designs rather than toward ensuring global accuracy of the model (Forrester and Keane 2009). It is worth emphasizing the relationships between the model bias (the expected quality of approximating the training data samples), its variance (sensitivity of the surrogate output to particular data sets), the training data set size, and the model complexity. For example, the bias can be decreased by selecting more complex models, but this would normally increase the model variance. On the other hand, reducing the variance by means of regularization (e.g., smoothing achieved by penalizing the model complexity, Tikhonov and Arsenin 1977) would increase the bias error. In order to reduce both the bias and the variance, it is necessary to enlarge the training data set (in other words, to provide more information about the system at hand), but this has to be done while carefully controlling the model complexity (so that the proper balance between the bias and the variance is maintained). However, increasing the number of training points may not be possible due to a limited computational budget. Also, the effect of the additional data on the modeling error is rather limited, especially for higher-dimensional problems, due to the poor scaling of the point-to-point distance as a function of the data set size (for p uniformly distributed samples in n dimensions, the typical spacing between neighboring samples decreases only as $p^{-1/n}$).
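The iterative flow of Fig. 2.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the book: `high_fidelity` is a hypothetical one-parameter stand-in for an expensive simulation, the surrogate is a simple piecewise-linear interpolant, validation uses a separate set of points, and the infill rule (midpoint of the widest gap) is a purely exploratory choice made for brevity.

```python
import math


def high_fidelity(x):
    """Hypothetical stand-in for an expensive simulation model (one parameter)."""
    return math.sin(3.0 * x) + 0.5 * x


def fit_surrogate(xs, ys):
    """Model identification: piecewise-linear interpolant through the training data."""
    pts = sorted(zip(xs, ys))

    def surrogate(x):
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return (1.0 - t) * y0 + t * y1
        raise ValueError("x lies outside the sampled range")

    return surrogate


def rms_error(surrogate, x_val, y_val):
    """Model validation: RMS error on points that were not used for fitting."""
    sq = [(surrogate(x) - y) ** 2 for x, y in zip(x_val, y_val)]
    return math.sqrt(sum(sq) / len(sq))


def build_surrogate(lo, hi, n_init=4, tol=1e-2, budget=60):
    # Design of experiments: uniform grid that includes both end points.
    xs = [lo + (hi - lo) * i / (n_init - 1) for i in range(n_init)]
    # Training data acquisition.
    ys = [high_fidelity(x) for x in xs]
    # Separate validation set (assumed affordable in this toy setting).
    x_val = [lo + (hi - lo) * (i + 0.5) / 50 for i in range(50)]
    y_val = [high_fidelity(x) for x in x_val]
    while True:
        surrogate = fit_surrogate(xs, ys)
        err = rms_error(surrogate, x_val, y_val)
        # Stop when the accuracy goal is met or the budget is exhausted.
        if err <= tol or len(xs) >= budget:
            return surrogate, err, len(xs)
        # Infill: add the midpoint of the widest gap between existing samples.
        srt = sorted(xs)
        a, b = max(zip(srt, srt[1:]), key=lambda ab: ab[1] - ab[0])
        x_new = 0.5 * (a + b)
        xs.append(x_new)
        ys.append(high_fidelity(x_new))
```

Calling `build_surrogate(0.0, 2.0)` returns the surrogate, the final validation error, and the number of high-fidelity evaluations used. In a realistic setting the validation step would more likely rely on cross-validation, because a dedicated validation set costs additional simulations.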
2.2
Design of Experiments
Design of experiments (DOE) (Giunta et al. 2003; Santner et al. 2003; Koehler and Owen 1996; Kleijnen 2018; Santner et al. 2018) refers to the strategies for allocating the training points in the design space. Although it is only the first step of the modeling process, it is an important one because it determines how the information about the system of interest will be acquired. This, in turn, affects the model predictive power. Upon accomplishing the experimental design, the data from the high-fidelity simulation model is acquired at the selected locations and utilized to construct the surrogate. As mentioned before, there is a clear trade-off between the training data set size and the amount of information about the system that can be obtained. However, in most instances, there is a hard limit on the number of samples that can be assigned due to a typically high computational cost of the simulation models. In the following sub-sections, three groups of DOE strategies are briefly outlined: factorial designs (Sect. 2.2.1), space-filling designs (Sect. 2.2.2), and sequential sampling methods (Sect. 2.2.3). The last group has been attracting considerable attention recently because feeding back information about the current sample distribution (for exploration-based schemes; Santner et al. 2018) or the model performance allows us to allocate the data points in a more efficient manner, e.g., by increasing the sampling density in the regions that exhibit more nonlinear behavior (Couckuyt 2013; Devabhaktuni et al. 2001; Park et al. 2018; Woods and Lewis 2015).
2.2.1
Factorial Designs
Factorial DOEs are traditional sampling strategies (Giunta et al. 2003) that allow for estimating the main effects as well as the cooperative effects of design variables without using excessive numbers of samples. The points are typically allocated in the corners, edges, and/or faces of the design space. Spreading the samples minimizes possible errors in estimating the main trends as well as the interactions between design variables when the data about the system comes from physical measurements. At the same time, some of the factorial designs are rather “economical” in terms of the number of samples, which is important if the computational budget for data acquisition is very limited. One such design, the so-called star distribution, is often used in conjunction with physics-based surrogate modeling procedures, especially space mapping (Cheng et al. 2006). Representative examples of factorial designs, along with some relevant data (training set sizes, the types of estimated effects, and graphical illustrations), are provided in Table 2.1.
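Two of the designs mentioned above are easy to generate directly. The following Python sketch builds a full factorial design (all corners of a box-constrained design space) and a star distribution; the function names are hypothetical, and placing the star points on the lower and upper bounds of each axis is an assumption made here for illustration (in practice the perturbation size is a user choice).

```python
from itertools import product


def full_factorial(lower, upper):
    """Two-level full factorial design: the 2^n corners of the box [lower, upper]."""
    return [list(corner) for corner in product(*zip(lower, upper))]


def star_distribution(lower, upper):
    """Star design: the box center plus 2n points obtained by moving the center
    to the lower and upper bound of one variable at a time (2n + 1 points)."""
    center = [(lo + hi) / 2.0 for lo, hi in zip(lower, upper)]
    points = [center[:]]
    for k in range(len(center)):
        for bound in (lower[k], upper[k]):
            point = center[:]
            point[k] = bound
            points.append(point)
    return points
```

For three parameters, `full_factorial([0, 0, 0], [1, 2, 3])` yields 8 points and `star_distribution([0, 0, 0], [1, 2, 3])` yields 7, which shows how quickly the two sample counts diverge as the dimensionality grows.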
2.2.2
Space-Filling Designs
Due to the widespread use of computer simulation models nowadays, the issues that led to the development of factorial DOE strategies are no longer relevant. In particular, spreading out the samples to reduce the effects of random errors that result from imperfect measurements is not necessary because computer analysis results are deterministic. Consequently, the majority of contemporary DOE algorithms are space-filling designs that attempt to allocate the training points uniformly within the design space (Queipo et al. 2005). This is especially useful for constructing an initial surrogate model when the knowledge about the system is limited.

There are many different space-filling DOE strategies available. The two simplest ones are pseudorandom sampling (Giunta et al. 2003; Fig. 2.2a) and uniform grid sampling (Fig. 2.2b). Unfortunately, randomly allocated data sets exhibit poor uniformity, especially in higher-dimensional spaces. On the other hand, distributing samples on a rectangular grid is only practical for low-dimensional spaces because the number of samples is restricted to $N_1 \cdot N_2 \cdot \ldots \cdot N_n$, where $N_j$ is the number of samples along the jth axis of the design space.

Perhaps the most popular space-filling DOE is Latin hypercube sampling (LHS) (McKay et al. 1979). The procedure works as follows. In order to allocate p samples with LHS, the range of each parameter is divided into p bins, which, for n design variables, yields a total of $p^n$ bins in the design space. The samples are distributed into the bins using the following two rules: (i) each sample is randomly placed inside a bin, and (ii) for all one-dimensional projections of the p samples and bins, there is exactly one sample in each bin. Figure 2.3a shows an exemplary LHS realization of 20 samples in the two-dimensional space, which exhibits an acceptable uniformity. It should be mentioned, however, that relying only on the above rules may lead to exceptionally poor distributions, as illustrated in Fig. 2.3b by an extreme case of the samples allocated on the diagonal: these are still LHS-compliant but obviously neither space-filling nor uniform. This issue has been addressed by various improvements over the standard LHS (e.g., Beachkofski and Grandhi 2002; Leary et al. 2003; Ye 1998; Palmer and Tsui 2001).

Table 2.1 Examples of factorial designs

| Type | Number of points^a | Estimated effects |
|---|---|---|
| Full factorial design | $2^n$ | Main effects and factor interactions |
| Fractional factorial design | $2^{n-p}$ (typically $p = 1$) | Some of main effects and factor interactions |
| Block design | $3^n$ | Main effects, factor interactions, and quadratic effects |
| Central composite design | $2^n + 2n + 1$ or $2^{n-p} + 2n + 1$ | Main effects, factor interactions, and some of quadratic effects |
| Box-Behnken design | n2n + 1 | Some of main effects, factor interactions, and quadratic effects |
| Star distribution design | $2n + 1$ | Main and some of quadratic effects |

^a n stands for the parameter space dimensionality. The original table also contains a graphical illustration of each design for $n = 3$.

Fig. 2.2 Elementary space-filling DOEs: (a) pseudorandom sampling and (b) grid sampling

Fig. 2.3 Latin hypercube sampling (LHS): (a) typical allocation of $p = 20$ samples in two-dimensional space, (b) another set of 20 samples showing poor uniformity but still conforming to the LHS allocation rules; this case indicates the need for improvements of the basic LHS scheme

Another type of space-filling DOE is orthogonal array (OA) sampling (Queipo et al. 2005). It should be mentioned that “orthogonal” here has nothing to do with orthogonality as used in linear algebra (Giunta et al. 2003). OA generates a set of samples that yield a uniform distribution within any t-dimensional projection of the n-dimensional design space (with $t < n$); t is referred to as the strength of the OA. Note that in LHS, $t = 1$; therefore, LHS is effectively a special case of OA sampling. An OA is parameterized by the total number of samples k, the design space dimension n, the number p of bins for each variable (the same for all parameters), and the strength t, i.e., we write OA(k, n, p, t). The index λ of the array is defined through the equality $k = \lambda p^t$. The index corresponds to the number of samples occurring in each bin upon t-dimensional projection of the data set. Orthogonal arrays exhibit some important limitations. In particular, the OA may not exist for arbitrarily chosen parameters; in other words, for $t > 1$, one cannot allocate an arbitrary number of samples. Also, replication of points may occur for OA designs projected onto the subspaces spanned by the most influential design variables (Queipo et al. 2005). On the other hand, OA can be used to improve LHS uniformity (OA-based LHS; Ai et al. 2016, optimal OA-based LHS; Ye et al. 2000).

Apart from the most popular DOEs outlined above, a few other schemes are sometimes used, including quasi-Monte Carlo sampling (Giunta et al. 2003) or Hammersley sampling (Giunta et al. 2003). Space-filling DOE can also be realized as an optimization problem by minimizing a suitably defined nonuniformity measure, e.g.,

$$ \sum_{i=1}^{p} \sum_{j=i+1}^{p} d_{ij}^{-2} $$

(Leary et al. 2003), where $d_{ij}$ is the Euclidean distance between samples i and j. Minimization of nonuniformity measures like this is numerically involved because a large number of parameters have to be optimized (equal to $p \cdot n$).
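The two LHS rules and the inverse-squared-distance nonuniformity measure can be illustrated with a short Python sketch. It assumes a unit hypercube design space; the function names are hypothetical, and the generator is the basic (unoptimized) LHS, not one of the improved variants cited above.

```python
import random


def latin_hypercube(p, n, rng=None):
    """Basic LHS: p samples in [0, 1)^n with exactly one sample per bin
    in every one-dimensional projection."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n):
        bins = list(range(p))
        rng.shuffle(bins)                      # random assignment of bins to samples
        # Rule (i): random position inside the assigned bin.
        columns.append([(b + rng.random()) / p for b in bins])
    return [[columns[k][i] for k in range(n)] for i in range(p)]


def diagonal_design(p, n):
    """Degenerate but LHS-compliant design: all samples on the main diagonal."""
    return [[(i + 0.5) / p] * n for i in range(p)]


def is_latin(samples):
    """Rule (ii): every 1-D projection has exactly one sample in each of the p bins."""
    p, n = len(samples), len(samples[0])
    return all(sorted(int(x[k] * p) for x in samples) == list(range(p))
               for k in range(n))


def nonuniformity(samples):
    """Sum of inverse squared Euclidean distances over all sample pairs."""
    total = 0.0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            d2 = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            total += 1.0 / d2
    return total
```

Both `latin_hypercube(20, 2)` and `diagonal_design(20, 2)` pass `is_latin`, which is exactly the point made by Fig. 2.3b: the allocation rules alone do not guarantee uniformity. A simple, if crude, remedy is to generate several random LHS realizations and keep the one with the smallest `nonuniformity` value.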
2.2.3
Sequential Sampling
Sequential design of experiments (also referred to as adaptive sampling; Lehmensiek et al. 2002, or active learning; Sugiyama 2006) can be considered an improvement over conventional space-filling designs. The one-shot algorithm (cf. Sect. 2.2.2), where all the samples are allocated before evaluating the simulation model, is turned into an iterative process, where the data obtained from the previous iterations (both the surrogate model and the samples) are analyzed in order to allocate the new (infill) samples, primarily in the regions that are more difficult to approximate. As a result, a more efficient distribution of samples is obtained compared to traditional DOEs (Sasena et al. 2002). An important consideration for sequential DOE is the balance between exploration and exploitation (Crombecq et al. 2011). In the context of DOE, exploration aims at identifying new (not explored before) regions of the design space that contain discontinuities, optima, plateaus, etc. In practice, it amounts to filling up the domain as uniformly as possible; therefore, the system response is not involved in the process. However, the locations of the infill samples in each iteration depend on the samples already distributed in the design space, which is the major difference between the sequential and one-shot DOEs. Exploitation, on the other hand, focuses on the regions of interest that have already been identified and tries to allocate infill samples therein. The aim is to obtain a better representation of specific parts of the space, e.g., vicinities of the optima. As opposed to exploration, exploitation involves the system outputs, evaluated at the previously allocated points, to guide the sampling process (Crombecq et al. 2009). There are certain criteria of the experimental designs that are normally taken into account when developing or selecting the sampling strategy (Qian 2009). The first one is granularity.
A fine-grained sequential DOE selects a small number of infill samples in each iteration, preferably just one. Coarse-grained DOEs allocate larger sets of infill samples per iteration. Fine-grained strategies are generally preferred because they allow for avoiding over- or undersampling (Crombecq et al. 2011). The second criterion is space-filling, i.e., the requirement that the set of samples should fill the design space as uniformly as possible. This requirement can be quantified in various ways. Popular measures include the Manhattan criterion (van Dam et al. 2007), defined as

$$ \min_{x^{(i)}, x^{(j)} \in X_B,\ i \neq j} \ \sum_{k=1}^{n} \left| x_k^{(i)} - x_k^{(j)} \right|, $$

where $x^{(j)} = [x_1^{(j)} \ldots x_n^{(j)}]^T$ are the sample points and $X_B$ is the sample set; the Maximin criterion (Joseph and Hung 2008), defined as

$$ \min_{x^{(i)}, x^{(j)} \in X_B,\ i \neq j} \ \left[ \sum_{k=1}^{n} \left| x_k^{(i)} - x_k^{(j)} \right|^2 \right]^{1/2}; $$

and the $\phi_p$-criterion (Viana et al. 2009), which utilizes the formula

$$ \min_{x^{(i)}, x^{(j)} \in X_B,\ i \neq j} \ \left[ \sum_{k=1}^{n} \left| x_k^{(i)} - x_k^{(j)} \right|^p \right]^{1/p}. $$

The third fundamental criterion for space-filling designs is good projective properties (also referred to as the non-collapsing property; van Dam et al. 2007), which basically means that, for every sample point $x^{(j)}$, each coordinate value $x_k^{(j)}$ should be strictly unique. One of the consequences is that when the experimental design $X_B$ is projected from the n-dimensional space onto an (n – 1)-dimensional subspace along one of the axes, no two points are projected onto each other (van Dam et al. 2005). The quality of the sample set with respect to its projective properties can be measured using the minimum projected distance of points from each other,

$$ \| X_B \|_{-\infty} = \min_{x^{(i)}, x^{(j)} \in X_B,\ i \neq j} \ \min_{1 \le k \le n} \left| x_k^{(i)} - x_k^{(j)} \right|. $$

This property is important if it is not known beforehand whether some of the parameters have little or no effect on the system response. If this is the case, evaluating two samples that only differ w.r.t. these parameters would be a waste of resources, and good projective properties allow for avoiding this situation (Xiong et al. 2009).

A considerable variety of sequential DOE strategies has been developed. Probably the most popular ones are sequential LHS (Liu et al. 2016; Wang 2003; Tong 2006). Other techniques include low-discrepancy sequences such as the Halton sequence or the Sobol sequence (Giunta et al. 2003; Chi et al. 2005), techniques based on Delaunay triangulation (Davis and Ierapetritou 2010), as well as techniques involving a Voronoi tessellation (Crombecq et al. 2011). All of the aforementioned methods are exploratory ones. For illustration purposes, Fig. 2.4 shows the concept of Delaunay-triangulation-based sampling.

Fig. 2.4 Sequential sampling using Delaunay triangulation in two-dimensional space. The current experimental design is marked using large gray circles. Solid lines represent its Delaunay triangulation, whereas dashed lines determine the gravity centers of the corresponding simplexes. The new sample (marked using a large black circle) is allocated in the center of the largest-volume simplex. The procedure (triangulation and allocation of the new sample) is iterated until the required number of points has been allocated
[Figure legend: new sample point; current sample set; center of gravity; triangulation lines]

In the case of sequential designs involving exploitation, an important consideration is the surrogate model purpose. In particular, choosing appropriate infill criteria is an important aspect of surrogate-assisted optimization (Forrester and Keane 2009). The two main goals for selecting the infill points are reduction of the objective function value and improvement of the global accuracy of the surrogate (Couckuyt 2013). The simplest infill strategy would be to allocate a single sample at the surrogate model optimum. Assuming that the optimization algorithm is embedded in the trust region framework (Forrester and Keane 2009), and the surrogate model is first-order consistent with the high-fidelity model (Alexandrov et al. 1998), this strategy is
capable of finding at least a local minimum of the high-fidelity model. In general, allocation of the new training point may be oriented toward global search or constructing a globally accurate surrogate. In this context, kriging seems to be the most advantageous type of data-driven surrogate because it provides information about the expected model error (Kleijnen 2018; Jones et al. 1998; Gorissen et al. 2010). The following infill criteria based on this feature are commonly used:
1. Maximization of the expected improvement, i.e., the improvement one expects to achieve at an untried point x (Jones et al. 1998).
2. Minimization of the predicted objective function $\hat{y}(x)$, i.e., surrogate optimization already mentioned above. A reasonable global accuracy of the surrogate has to be assumed (Liu et al. 2012).
3. Minimization of the statistical lower bound, i.e., $LB(x) = \hat{y}(x) - A s(x)$ (Forrester and Keane 2009), where $\hat{y}(x)$ is the surrogate model prediction and $s^2(x)$ is the variance; A is a user-defined coefficient.
4. Maximization of the probability of improvement, i.e., identifying locations that give the highest chance of improving the objective function value (Forrester and Keane 2009).
5. Maximization of the mean square error, i.e., finding locations where the mean square error (predicted by the surrogate) is the highest (Liu et al. 2012).

It should be emphasized that identifying the new samples according to the above infill criteria requires global optimization (Couckuyt 2013). As mentioned before, exploration and exploitation are important practical aspects of sequential DOEs. Putting more focus on design space exploitation usually leads to a reduced computational cost. Design space exploration normally results in a higher cost but also provides global search capability (Forrester and Keane 2009). On the other hand, global exploration is often impractical, especially for expensive functions with a medium/large number of optimization variables (more than a few tens). It is especially in the optimization context where the balance between exploitation and exploration should be maintained. Minimization of the statistical lower bound is an example of achieving such a balance controlled by the constant A (from pure exploitation, i.e., $LB(x) \to \hat{y}(x)$, for $A \to 0$ to pure exploration for $A \to \infty$). However, choosing a good value of A is a nontrivial task (Forrester and Keane 2009).
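Given a prediction $\hat{y}(x)$ and a standard deviation $s(x)$ at a candidate point, three of the criteria listed above reduce to one-line formulas. The Python sketch below uses the closed forms that are standard for a Gaussian predictive distribution in a minimization setting; these expressions are not spelled out in the text above and are stated here as an assumption, with `y_min` denoting the best objective value found so far and all function names being hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_bound(y_hat, s, A):
    """Statistical lower bound LB = y_hat - A * s (to be minimized)."""
    return y_hat - A * s


def probability_of_improvement(y_hat, s, y_min):
    """Probability that the true value at the point is below y_min."""
    if s <= 0.0:
        return 1.0 if y_hat < y_min else 0.0
    return norm_cdf((y_min - y_hat) / s)


def expected_improvement(y_hat, s, y_min):
    """Expected value of max(y_min - Y, 0) for Y ~ N(y_hat, s^2)."""
    if s <= 0.0:
        return max(y_min - y_hat, 0.0)
    z = (y_min - y_hat) / s
    return (y_min - y_hat) * norm_cdf(z) + s * norm_pdf(z)
```

In a typical use, these functions would be evaluated over candidate points, e.g., `max(candidates, key=lambda c: expected_improvement(c[0], c[1], y_min))` for a list of hypothetical `(y_hat, s)` pairs. Note how the uncertainty term rewards unexplored regions: for equal predictions, the point with the larger `s` always has the larger expected improvement.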
2.3
Modeling Techniques
Data-driven surrogate modeling methods include several major and popular techniques, such as polynomial regression (Queipo et al. 2005), radial basis functions (Wild et al. 2008), kriging (Forrester and Keane 2009), neural networks (Haykin 1998), support vector regression (Gunn 1998), Gaussian process regression (Rasmussen and Williams 2006), multidimensional rational approximation (Shaker et al. 2009), or fuzzy systems (van der Herten et al. 2015). In this chapter, some of these
methods are briefly outlined for the convenience of the reader. More detailed information can be found in the literature on the subject or in the review papers (e.g., Simpson et al. 2001; Jin et al. 2001; Chen et al. 2005; Wang and Shan 2006; Goel et al. 2007). For the purpose of this section, the training data samples will be denoted as $\{x^{(i)}\}$, $i = 1, \ldots, p$, whereas the corresponding high-fidelity model evaluations will be denoted as $f(x^{(i)})$. The surrogate model is constructed by approximating the data pairs $\{x^{(i)}, f(x^{(i)})\}$.
2.3.1
Polynomial Regression Models
Polynomial regression belongs to the simplest approximation techniques (Queipo et al. 2005). The surrogate model is defined as sðxÞ ¼
K X
βj vj ðxÞ,
ð2:1Þ
j¼1
where βj are unknown coefficients and vj are the (polynomial) basis functions. The model parameters can be found as a least squares solution to the linear system f ¼ ψ β,
ð2:2Þ
where f ¼ [f(x(1)) f(x(2)) . . . f(x( p))]T, ψ is a p K matrix containing the basis functions evaluated at the sample points, and β ¼ [β1 β2 . . . βK]T. The number of sample points p should be consistent with the number of basis functions considered K (typically p K ). If the sample points and basis functions are taken arbitrarily, some columns of ψ can be linearly dependent. If p K and rank(ψ) ¼ K, a solution to (2.2) in the least squares sense can be computed through ψ +, the pseudoinverse of ψ (Golub and Van Loan 1996), i.e., β ¼ ψ +f ¼ (ψ Tψ)1ψ T. One of the simplest yet useful examples of a regression model is a second-order polynomial one defined as n n X n X X s ð xÞ ¼ s ½ x1 x 2 . . . x n T ¼ β 0 þ β j xj þ βij xi xj , j¼1
i¼1
ð2:3Þ
ji
with the basis functions being monomials: 1, xj, and xixj. Because the flexibility of polynomial regression models is rather limited, this type of surrogates is mostly applicable to situations where the system response is regular and weakly nonlinear (as a function of design parameters). Consequently, they are not suitable for modeling of high-frequency structures whose frequency characteristics contain highly nonlinear features such as resonances (Rayas-Sanchez et al. 2017). Exceptions include local models for surrogate-based optimization (e.g.,
Rangel-Patiño et al. 2017) or statistical analysis (Rayas-Sanchez et al. 2010) as well as low-dimensional parameter spaces (Chávez-Hurtado and Rayas-Sánchez 2016).
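The least squares fit of the second-order model (2.3) can be illustrated with a short NumPy sketch. This is only a minimal illustration under stated assumptions: the function names (`quadratic_basis`, `fit_polynomial`, `predict_polynomial`) and the test function used to generate the training data are hypothetical and do not come from the text.

```python
import numpy as np

def quadratic_basis(X):
    """Evaluate the monomials 1, x_j, and x_i*x_j (j <= i) of (2.3), one row per sample."""
    X = np.atleast_2d(X)
    n = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, j] for j in range(n)]
    cols += [X[:, i] * X[:, j] for i in range(n) for j in range(i + 1)]
    return np.column_stack(cols)

def fit_polynomial(X, f):
    """Least squares solution of f = psi @ beta, cf. (2.2)."""
    psi = quadratic_basis(X)
    beta, *_ = np.linalg.lstsq(psi, f, rcond=None)
    return beta

def predict_polynomial(beta, X):
    """Evaluate the surrogate s(x) of (2.1) at the rows of X."""
    return quadratic_basis(X) @ beta

# Hypothetical training data: p = 30 random samples of a quadratic test function.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(30, 2))
f_train = (1.0 + 2.0 * X_train[:, 0] - X_train[:, 1]
           + 0.5 * X_train[:, 0] * X_train[:, 1])
beta = fit_polynomial(X_train, f_train)
```

Because the assumed test function lies in the span of the basis, the fitted coefficients reproduce it up to round-off; for a resonant response the same fit would leave a large residual, in line with the limitations discussed above.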
2.3.2 Radial Basis Functions

Radial basis function interpolation/approximation surrogates (Forrester and Keane 2009; Wild et al. 2008) are essentially regression models that exploit combinations of $K$ radially symmetric functions $\phi$:

$$s(x) = \sum_{j=1}^{K} \lambda_j \phi(\|x - c^{(j)}\|), \tag{2.4}$$

in which $\lambda = [\lambda_1\ \lambda_2\ \ldots\ \lambda_K]^T$ is the vector of model parameters and $c^{(j)}$, $j = 1, \ldots, K$, are the (known) basis function centers. The model parameters can be calculated as $\lambda = \Phi^{+} f = (\Phi^T \Phi)^{-1} \Phi^T f$, where $f = [f(x^{(1)})\ f(x^{(2)})\ \ldots\ f(x^{(p)})]^T$, and $\Phi = [\Phi_{kl}]_{k = 1, \ldots, p;\ l = 1, \ldots, K}$ is the $p \times K$ matrix with the entries defined as

$$\Phi_{kl} = \phi(\|x^{(k)} - c^{(l)}\|). \tag{2.5}$$
If the number of basis functions is equal to the number of samples, i.e., $p = K$, the centers of the basis functions coincide with the data points, and the data points are all different, then $\Phi$ is a regular (nonsingular) square matrix. In such a case, we have $\lambda = \Phi^{-1} f$. However, finding the model coefficients by directly solving the system $\Phi \lambda = f$ is not practical when the number of training points is large (e.g., a few thousand) because matrix inversion requires computational time of the order of $p^3$ and storage space of the order of $p^2$. Also, sparse methods cannot be used as $\Phi$ is normally non-sparse. Alternative methods of dealing with large numbers of samples include multilevel (Liu 2004) and multipole methods (Löschenbrand and Mecklenbrauker 2016). A popular choice of the basis function is a Gaussian, $\phi(r) = \exp(-cr^2)$, where $c$ is the scaling parameter, typically adjusted using cross-validation (Fasshauer and McCourt 2012). Exemplary types of basis functions are listed in Table 2.2.
Table 2.2 Commonly used radial basis functions

Name                   Formulation                        Parameters
Gaussian               $\phi(r) = e^{-cr^2}$              $r \ge 0$, $c > 0$
Multiquadric           $\phi(r) = (r^2 + c^2)^{1/2}$      $r \ge 0$, $c > 0$
Inverse multiquadric   $\phi(r) = (r^2 + c^2)^{-1/2}$     $r \ge 0$, $c > 0$
Thin plate spline      $\phi(r) = r^2 \log r$             $r \ge 0$
Polyharmonic spline    $\phi(r) = r^k$                    $r \ge 0$, $k = 1, 3, 5, \ldots$
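As a minimal sketch of (2.4) and (2.5), the following NumPy code builds a Gaussian RBF model with the centers placed at the data points ($p = K$). The function names, the one-dimensional test function, and the value of the scaling parameter $c$ are assumptions made for illustration only; in practice $c$ would be tuned, e.g., by cross-validation.

```python
import numpy as np

def gaussian_rbf(r, c=1.0):
    """Gaussian basis function phi(r) = exp(-c r^2) from Table 2.2."""
    return np.exp(-c * r**2)

def rbf_matrix(X, centers, c=1.0):
    """Matrix Phi of (2.5): Phi[k, l] = phi(||x^(k) - c^(l)||)."""
    diff = X[:, None, :] - centers[None, :, :]
    return gaussian_rbf(np.linalg.norm(diff, axis=2), c)

def fit_rbf(X, f, centers, c=1.0):
    """Model parameters lambda as the least squares solution of Phi @ lambda = f."""
    lam, *_ = np.linalg.lstsq(rbf_matrix(X, centers, c), f, rcond=None)
    return lam

def predict_rbf(lam, centers, X_new, c=1.0):
    """Evaluate the surrogate (2.4) at the rows of X_new."""
    return rbf_matrix(X_new, centers, c) @ lam

# Hypothetical 1-D example: centers coincide with the p = 8 data points.
X_train = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
f_train = np.sin(2.0 * np.pi * X_train[:, 0])
lam = fit_rbf(X_train, f_train, X_train, c=20.0)
f_fit = predict_rbf(lam, X_train, X_train, c=20.0)
```

With distinct data points used as centers, the model interpolates the training data; using fewer centers than samples turns the same code into a least squares approximation.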
2.3.3 Kriging
Kriging belongs to the most popular techniques for interpolating deterministic noise-free data (Journel and Huijbregts 1981; Simpson et al. 2001; Kleijnen 2009; O'Hagan 1978). Kriging is a Gaussian process-based modeling method, which is compact and cheap to evaluate (Rasmussen and Williams 2006). In its basic formulation, kriging (Journel and Huijbregts 1981; Simpson et al. 2001) assumes that the function of interest is of the following form:

$$f(x) = g(x)^T \beta + Z(x), \tag{2.6}$$

where $g(x) = [g_1(x)\ g_2(x)\ \ldots\ g_K(x)]^T$ are known (e.g., constant) functions, $\beta = [\beta_1\ \beta_2\ \ldots\ \beta_K]^T$ are the unknown model parameters (hyperparameters), and $Z(x)$ is a realization of a Gaussian random process with zero mean and variance $\sigma^2$. The regression part $g(x)^T \beta$ is a trend function for $f$, and $Z(x)$ takes into account localized variations. The covariance matrix of $Z(x)$ is given as

$$\mathrm{Cov}\left[Z(x^{(i)}) Z(x^{(j)})\right] = \sigma^2 R\left(\left[R(x^{(i)}, x^{(j)})\right]\right), \tag{2.7}$$

where $R$ is a $p \times p$ correlation matrix with $R_{ij} = R(x^{(i)}, x^{(j)})$. Here, $R(x^{(i)}, x^{(j)})$ is the correlation function between sampled data points $x^{(i)}$ and $x^{(j)}$. The most popular choice is the Gaussian correlation function

$$R(x, y) = \exp\left[-\sum_{k=1}^{n} \theta_k |x_k - y_k|^2\right], \tag{2.8}$$

where $\theta_k$ are the unknown correlation parameters and $x_k$ and $y_k$ are the $k$th components of the vectors $x$ and $y$, respectively. The kriging predictor (Simpson et al. 2001; Journel and Huijbregts 1981) is defined as

$$s(x) = g(x)^T \beta + r^T(x) R^{-1} (f - G\beta), \tag{2.9}$$

where

$$r(x) = \left[R(x, x^{(1)})\ \ldots\ R(x, x^{(p)})\right], \tag{2.10}$$

$$f = \left[f(x^{(1)})\ f(x^{(2)})\ \ldots\ f(x^{(p)})\right]^T, \tag{2.11}$$
and $G$ is a $p \times K$ matrix with $G_{ij} = g_j(x^{(i)})$. The vector of model parameters $\beta$ can be computed as

$$\beta = (G^T R^{-1} G)^{-1} G^T R^{-1} f. \tag{2.12}$$

[Figure 2.5: four surface plots over the (x1, x2) plane; panel titles: (a) Function plot, (b) Kriging model for N=10, (c) Kriging model for N=20, (d) Kriging model for N=50]
Fig. 2.5 Exemplary function of two variables and its kriging surrogate obtained for various numbers of training samples: (a) function plot, (b) kriging model with 10 samples, (c) kriging model with 20 samples, and (d) kriging model with 50 samples. Design of experiments: Latin hypercube sampling
Model fitting is accomplished by maximum likelihood for $\theta_k$ (Journel and Huijbregts 1981), i.e., by maximizing

$$-\frac{p \ln(\sigma^2) + \ln|R|}{2}, \tag{2.13}$$
in which both $\sigma^2$ and $R$ are functions of $\theta_k$. An important property of kriging is that the random process $Z(x)$ provides information on the approximation error that can be used for improving the surrogate, e.g., by allocating additional training samples at the locations where the estimated model error is the highest (Forrester and Keane 2009; Journel and Huijbregts 1981). This feature is also utilized in various global optimization methods (see Couckuyt 2013, and references therein). Figure 2.5 illustrates the kriging models for an exemplary function of two parameters and for different numbers of the training data samples.

Kriging can be generalized to include variable-fidelity training data, which is realized by co-kriging (Forrester et al. 2007; Toal and Keane 2011). More specifically, co-kriging first constructs a regular kriging model using densely sampled low-fidelity (and cheap to acquire) data points; subsequently, a separate kriging model is generated on the residuals of the high- and low-fidelity samples (the number of the former is much smaller than for the low-fidelity model). Co-kriging is a rather recent method with relatively few applications in engineering (Toal and Keane 2011; Huang and Gao 2012; Laurenceau and Sagaut 2008; Koziel et al. 2013). A conceptual illustration of co-kriging is provided in Fig. 2.6.

[Figure 2.6: plot of the model response s (vertical axis, 0 to 1) versus x (horizontal axis, 0 to 1)]
Fig. 2.6 Co-kriging modeling concept (Koziel et al. 2013): high-fidelity model (—), low-fidelity model (- - -), high-fidelity model samples (□), and low-fidelity model samples (○). Kriging interpolation of the high-fidelity model samples (- -) is not an adequate representation of the high-fidelity model (due to the limited data set size). Co-kriging interpolation of blended low- and high-fidelity model data provides better accuracy at low computational cost
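The kriging predictor (2.9) with the coefficients (2.12) can be sketched in a few lines of NumPy. The sketch below assumes a constant trend, $g(x) = 1$, and a fixed, hand-picked correlation parameter $\theta$ rather than a maximum likelihood fit of (2.13); the function names and the test data are hypothetical.

```python
import numpy as np

def gauss_corr(A, B, theta):
    """Gaussian correlation (2.8) between the rows of A and the rows of B."""
    d2 = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-(d2 * theta).sum(axis=2))

def fit_kriging(X, f, theta):
    """Constant-trend kriging for fixed theta: beta from (2.12) and R^{-1}(f - G beta)."""
    p = len(X)
    R = gauss_corr(X, X, theta)
    G = np.ones((p, 1))                          # g(x) = 1 (assumption)
    Ri_G = np.linalg.solve(R, G)
    Ri_f = np.linalg.solve(R, f)
    beta = np.linalg.solve(G.T @ Ri_G, G.T @ Ri_f)
    weights = np.linalg.solve(R, f - G @ beta)
    return beta, weights

def predict_kriging(X_new, X, theta, beta, weights):
    """Predictor (2.9): s(x) = g(x)^T beta + r(x)^T R^{-1} (f - G beta)."""
    r = gauss_corr(X_new, X, theta)
    return beta[0] + r @ weights

# Hypothetical 1-D data set with p = 10 samples.
X_train = np.linspace(0.0, 1.0, 10).reshape(-1, 1)
f_train = np.sin(6.0 * X_train[:, 0])
theta = np.array([30.0])
beta, weights = fit_kriging(X_train, f_train, theta)
s_train = predict_kriging(X_train, X_train, theta, beta, weights)
s_mid = predict_kriging(np.array([[0.5]]), X_train, theta, beta, weights)
```

The predictor reproduces the training data exactly (up to round-off), which is the interpolation property mentioned at the beginning of this section. For small $\theta$ or densely clustered samples, $R$ becomes ill-conditioned and a practical implementation would need additional safeguards that are not shown here.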
2.3.4 Support Vector Regression

Support vector regression (SVR) (Gunn 1998) is another popular surrogate modeling technique that has found applications in many engineering areas, including high-frequency electronics (Ceperic and Baric 2004; Rojo-Alvarez et al. 2005; Yang et al. 2005; Meng and Xia 2007; Xia et al. 2007; Andrés et al. 2012; Zhang and Han 2013). SVR exhibits good generalization capability (Angiulli et al. 2007) and easy training by means of quadratic programming (Smola and Schölkopf 2004). SVR exploits the structural risk minimization (SRM) principle, which has been shown to be superior (Gunn 1998) to the traditional empirical risk minimization (ERM) principle employed by, e.g., neural networks. Support vector regression has been gaining popularity in various areas including electrical engineering and aerodynamic design. Here, SVR is formulated for a vector-valued function $f$. Let $f^k = f(x^k)$, $k = 1, 2, \ldots, N$, denote the sampled high-fidelity model responses. The objective is to use SVR to approximate $f^k$ at the base points $x^k$, $k = 1, 2, \ldots, N$. We shall also use the notation $f^k = [f_1^k\ f_2^k\ \ldots\ f_m^k]^T$ to denote the components of the vector $f^k$. For linear regression, we aim at approximating a training data set, here, the pairs $D_j = \{(x^1, f_j^1), \ldots, (x^N, f_j^N)\}$, $j = 1, 2, \ldots, m$, by a linear function $f_j(x) = w_j^T x + b_j$. The optimal regression function is given by the minimum of the following functional (Smola and Schölkopf 2004)

$$\Phi_j(w, \xi) = \frac{1}{2}\|w_j\|^2 + C_j \sum_{i=1}^{N} \left(\xi_{j.i}^{+} + \xi_{j.i}^{-}\right). \tag{2.14}$$
In (2.14), $C_j$ is a user-defined value, whereas $\xi_{j.i}^{+}$ and $\xi_{j.i}^{-}$ are the slack variables representing upper and lower constraints on the output of the system. The typical cost function used in SVR is an $\varepsilon$-insensitive loss function defined as

$$L_\varepsilon(y) = \begin{cases} 0 & \text{for } |f_j(x) - y| < \varepsilon \\ |f_j(x) - y| & \text{otherwise} \end{cases} \tag{2.15}$$
The value of $C_j$ determines the trade-off between the flatness of $f_j$ and the amount up to which deviations larger than $\varepsilon$ are tolerated (Gunn 1998). Here, we describe nonlinear regression employing the kernel approach, in which the linear function $w_j^T x + b_j$ is replaced by the nonlinear function $\sum_i \gamma_{j.i} K(x^i, x) + b_j$, where $K$ is a kernel function. Thus, the SVR surrogate model is defined as

$$s(x) = \begin{bmatrix} \sum_{i=1}^{N} \gamma_{1.i} K(x^i, x) + b_1 \\ \vdots \\ \sum_{i=1}^{N} \gamma_{m.i} K(x^i, x) + b_m \end{bmatrix}, \tag{2.16}$$

with parameters $\gamma_{j.i}$ and $b_j$, $j = 1, \ldots, m$, $i = 1, \ldots, N$, obtained according to a general SVR methodology. In particular, Gaussian kernels of the form $K(x, y) = \exp(-0.5\|x - y\|^2 / c^2)$ with $c > 0$ can be used, where $c$ is the scaling parameter. Both $c$ and the parameters $C_j$ and $\varepsilon$ can be adjusted to minimize the generalization error calculated using a cross-validation method (Queipo et al. 2005).
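The following sketch only evaluates the quantities defined above: the loss (2.15) as written in this section, the Gaussian kernel, and the surrogate (2.16). It does not train the model, since finding $\gamma_{j.i}$ and $b_j$ requires solving a quadratic program; the coefficient values used in the example are arbitrary placeholders, and all function names are assumptions.

```python
import numpy as np

def eps_insensitive_loss(f_x, y, eps):
    """Loss (2.15) as written above: zero inside the eps-tube, |f_j(x) - y| outside."""
    dev = np.abs(np.asarray(f_x, dtype=float) - np.asarray(y, dtype=float))
    return np.where(dev < eps, 0.0, dev)

def gaussian_kernel(X, Y, c):
    """K(x, y) = exp(-0.5 ||x - y||^2 / c^2) for all pairs of rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / c**2)

def svr_surrogate(x_new, X_base, gamma, b, c):
    """Model (2.16): s_j(x) = sum_i gamma[j, i] K(x^i, x) + b[j]."""
    K = gaussian_kernel(X_base, np.atleast_2d(x_new), c)   # shape (N, number of new points)
    return gamma @ K + b[:, None]                          # shape (m, number of new points)

# Placeholder (untrained) parameters for N = 2 base points and m = 2 outputs.
X_base = np.array([[0.0], [1.0]])
gamma = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([0.5, -1.0])
s = svr_surrogate(np.array([0.0]), X_base, gamma, b, c=1.0)
loss = eps_insensitive_loss([1.0, 1.3], [1.05, 1.0], eps=0.1)
```

In an actual workflow, the coefficients would come from an SVR solver, and $c$, $C_j$, and $\varepsilon$ would be selected by cross-validation as described above.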
2.3.5 Artificial Neural Networks

Artificial neural networks (ANNs) constitute a large area of research (Haykin 1998). The primary application areas of ANNs include pattern classification (Ou and Murphey 2007), prediction and financial analysis (Takahashi et al. 2019), as well as control and optimization (Yan and Wang 2015). However, in the context of this book, ANNs can be considered as yet another way of approximating sampled high-fidelity model data to create a surrogate model. Neural networks have been popular choices in modeling electromagnetic systems (Christodoulou and Georgiopoulos 2001; Mishra 2001): for modeling of antennas (Rawat et al. 2012; Mishra et al. 2015), passive microwave components (Zhang et al. 2003; Rayas-Sanchez 2004), as well as active devices such as RF power amplifiers (Fang et al. 2000; Xu et al. 2002).

[Figure 2.7: (a) neuron with inputs $x_1, x_2, x_3, \ldots, x_n$ and output $y$, where $h = \sum_i w_i x_i + \beta$ and $y = (1 + e^{-h/T})^{-1}$; (b) network with input units, a hidden layer, and output units]
Fig. 2.7 Basic concepts of artificial neural networks: (a) structure of a neuron; (b) two-layer feed-forward neural network architecture

The most important component of a neural network (Haykin 1998; Minsky and Papert 1969) is the neuron (or single-unit perceptron). A neuron realizes a nonlinear operation illustrated in Fig. 2.7a, where $w_1$ through $w_n$ are regression coefficients, $\beta$ is the bias value of the neuron, and $T$ is a user-defined slope parameter. The most common neural network architecture is the multilayer feed-forward network shown in Fig. 2.7b. The construction of a neural network model involves selection of its architecture and the training, i.e., the assignment of the values to the regression parameters. Choosing the network architecture (the number of layers, the number of neurons in each layer, etc.) is a nontrivial task (Zhang et al. 2003), which can be automated using dedicated software frameworks (Neuromodeler; Zhang and Gupta 2000). The network training can be formulated as a nonlinear least squares regression problem. A popular technique for solving this regression problem is the error backpropagation algorithm (Simpson et al. 2001; Haykin 1998). Assuming the training data to be $\{(x^{(1)}, y_1), (x^{(2)}, y_2), \ldots, (x^{(p)}, y_p)\}$, the proper values of the network weights are determined to minimize the approximation error

$$E = \sum_{j} (y_j - \hat{y}_j)^2, \tag{2.17}$$

where $\hat{y}_j$ is the network output given the input $x^{(j)}$. The weights are adjusted based on the chain-rule product $(\partial E/\partial \hat{y})(\partial \hat{y}/\partial w_{ij})$. For large networks with complex architectures, the use of global optimization methods might be necessary (Alba and Marti 2006).
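A small NumPy sketch may help to connect the neuron of Fig. 2.7a, the error (2.17), and the chain rule used by backpropagation. The architecture (one hidden layer of sigmoid neurons and a linear output unit), the function names, and the random data are assumptions chosen for illustration; they are not prescribed by the text.

```python
import numpy as np

def neuron(x, w, beta, T=1.0):
    """Single neuron: h = sum_i w_i x_i + beta, y = 1 / (1 + exp(-h / T))."""
    h = np.dot(w, x) + beta
    return 1.0 / (1.0 + np.exp(-h / T))

def forward(X, W1, b1, w2, b2, T=1.0):
    """Feed-forward pass: hidden layer of sigmoid neurons, linear output unit (assumption)."""
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1) / T))     # hidden activations, shape (p, q)
    return H @ w2 + b2, H

def error_and_gradients(X, y, W1, b1, w2, b2, T=1.0):
    """Error E of (2.17) and its gradients with respect to all weights (backpropagation)."""
    y_hat, H = forward(X, W1, b1, w2, b2, T)
    res = y_hat - y
    E = np.sum(res**2)
    dE_dy = 2.0 * res                                 # dE / d y_hat
    g_w2 = H.T @ dE_dy                                # output-layer weights
    g_b2 = dE_dy.sum()
    dE_dh = np.outer(dE_dy, w2) * H * (1.0 - H) / T   # propagate back through the sigmoid
    g_W1 = X.T @ dE_dh                                # hidden-layer weights
    g_b1 = dE_dh.sum(axis=0)
    return E, (g_W1, g_b1, g_w2, g_b2)

# Hypothetical data and randomly initialized weights (2 inputs, 5 hidden neurons).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(20, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2
W1 = rng.normal(size=(2, 5))
b1 = np.zeros(5)
w2 = rng.normal(size=5)
b2 = 0.0
E0, grads = error_and_gradients(X, y, W1, b1, w2, b2)
```

A basic training loop would repeatedly subtract a small multiple of these gradients from the weights; the step size and stopping rule are problem dependent and are left out of the sketch.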
2.3.6 Fuzzy Systems
Fuzzy systems are commonly used in machine control (Passino and Yurkovich 1998) where the expert knowledge and a set of sampled input-output (state-control) pairs recorded from successful control are translated into a set of "IF-THEN" rules that specify actions to be taken in particular situations (Wang and Mendel 1992). Because of the incomplete and qualitative character of such information, it is represented using fuzzy set theory (Zadeh 1965), where a given piece of information (element) belongs to a given (fuzzy) subset of an input space with a certain degree, according to a so-called membership function (Li and Lan 1989). The process of converting a crisp input value to a fuzzy value is called "fuzzification." Given the specific input state, the "IF-THEN" rules that apply are invoked, using the membership functions and the truth values obtained from the inputs, to determine the result of the rule. This result is subsequently mapped into a membership function and the truth value controlling the output variable. These are combined to give a specific answer, using a procedure known as "defuzzification." The "centroid" defuzzification method is very popular, in which the "center of mass" of the result provides the crisp output value. Fuzzy systems can also be used as universal function approximators (Wang and Mendel 1992). In particular, given a set of numerical data pairs, it is possible to obtain a fuzzy-rule-based mapping from the input space (here, design variables) to the output space (here, surrogate model response). The mapping is realized by dividing the input and output spaces into fuzzy regions, generating fuzzy rules from given desired input-output data pairs, assigning a degree to each generated rule and forming a combined fuzzy rule base, and, finally, performing defuzzification.
It can be shown that under certain conditions, such a mapping is capable of approximating any real continuous function over a compact domain to arbitrary accuracy (Wang and Mendel 1992). For the purpose of explaining the fuzzy surrogate as a function approximator (here, for vector-valued functions), we assume a training set consisting of the data pairs $\{x^k, f^k\}$, where $x^k \in X_B$, $k = 1, 2, \ldots, N$, and $f^k \in R^m$. The membership functions for the $i$th variable are defined as shown in Fig. 2.8. Each interval $[x_{0.i} - \delta_{0.i}, x_{0.i} + \delta_{0.i}]$, $i = 1, 2, \ldots, n$, is divided into $K$ subintervals (fuzzy regions). The number $K$ corresponds to the number of base points $N$ and is given by $K = \lfloor N^{1/n} \rfloor - 1$. In particular, if $X_B$ consists of the base points uniformly distributed in the region of interest $X_R$, then $K + 1$ is exactly the number of points of this uniform grid along any of the design variable axes. In general, $K$ is chosen in such a way that the number of $n$-dimensional subintervals (and, consequently, the maximum number of rules) is not larger than the number of base points. Division of $[x_{0.i} - \delta_{0.i}, x_{0.i} + \delta_{0.i}]$ into $K$ subintervals creates $K + 1$ values $x_{i.k}$, $k = 0, 1, \ldots, K$. In the case of a uniform base set, the points $x_q = [x_{1.q_1}\ \ldots\ x_{n.q_n}]^T$, $q \in \{0, 1, \ldots, K\}^n$, coincide with the base points. The value $x_{i.k}$ corresponds to the fuzzy region $[x_{i.k-1}, x_{i.k+1}]$ for $k = 1, \ldots, K - 1$ ($[x_{i.0}, x_{i.1}]$ for $k = 0$, and $[x_{i.K-1}, x_{i.K}]$ for $k = K$). We also use the symbol $x_q$ to denote the $n$-dimensional fuzzy region $[x_{1.q_1}\ \ldots\ x_{n.q_n}]^T$. For any given $x$, the value of the membership function $m_{i.k}(x)$ determines the degree of $x$ in the fuzzy region $x_{i.k}$. Triangular membership functions are widely used. Here, one vertex lies at the center of the region and has membership value unity, and the other two vertices lie at the centers of the two neighboring regions, respectively, and have membership values equal to zero.

[Figure 2.8: membership functions $m_{i.0}(x), m_{i.1}(x), m_{i.2}(x), \ldots, m_{i.K-1}(x), m_{i.K}(x)$ with values $m_i(x)$ between 0.0 and 1.0, associated with the points $x_{i.0}, x_{i.1}, x_{i.2}, \ldots, x_{i.K-1}, x_{i.K}$ on the interval from $x_{0.i} - \delta_{0.i}$ to $x_{0.i} + \delta_{0.i}$]
Fig. 2.8 Division of the input interval $[x_{0.i} - \delta_{0.i}, x_{0.i} + \delta_{0.i}]$ into fuzzy regions and the corresponding membership functions

Having defined the membership functions, we need to generate the fuzzy rules from given data pairs. We use "IF-THEN" rules of the form IF $x^k$ is in $x_q$ THEN $y = f^k$, where $y$ is the response of the rule. At the level of vector components, it means

IF $x^{k.1}$ is in $x_{1.q_1}$ AND $x^{k.2}$ is in $x_{2.q_2}$ AND $\ldots$ AND $x^{k.n}$ is in $x_{n.q_n}$ THEN $y = f^k$, (2.18)
where $x^{k.i}$, $i = 1, \ldots, n$, are the components of the vector $x^k$. In general, some conflicting rules may occur, i.e., rules that have the same IF part but a different THEN part. Such conflicts are resolved by assigning a degree to each rule and accepting only the rule from a conflict group that has a maximum degree. A degree is assigned to a rule in the following way. For the rule "IF $x^{k.1}$ is in $x_{1.q_1}$ AND $x^{k.2}$ is in $x_{2.q_2}$ AND $\ldots$ AND $x^{k.n}$ is in $x_{n.q_n}$ THEN $y = f^k$", the degree of this rule, denoted by $D(x^k, x_q)$, is defined as

$$D(x^k, x_q) = \prod_{i=1}^{n} m_{i.q_i}(x^{k.i}). \tag{2.19}$$
Having resolved the conflicts, we have a set of non-conflicting rules, which we denote as $s_i$, $i = 1, 2, \ldots, L$. We denote $s: X_R \to R^m$ as the output of the fuzzy system, which is determined using centroid defuzzification

$$s(x) = \frac{\sum_{i=1}^{L} D(x, x_i)\, y_i}{\sum_{i=1}^{L} D(x, x_i)}, \tag{2.20}$$

where $x_i$ is an $n$-dimensional fuzzy region corresponding to the $i$th rule and $y_i$ is the output of the $i$th rule.
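The rule generation and the centroid defuzzification (2.20) can be sketched for a single design variable ($n = 1$). The sketch assumes a uniform grid of region centers and triangular membership functions, and it assigns each data pair to the region in which its membership is the largest; this assignment rule, the function names, and the data are assumptions made for illustration.

```python
import numpy as np

def memberships(x, grid):
    """Triangular membership values: 1 at grid[k], 0 at the neighbouring grid points."""
    h = grid[1] - grid[0]                       # uniform grid spacing (assumption)
    return np.clip(1.0 - np.abs(x - grid) / h, 0.0, None)

def build_rules(X, F, grid):
    """Keep, for each fuzzy region, the data pair with the highest degree, cf. (2.19)."""
    best = {}
    for xk, fk in zip(X, F):
        m = memberships(xk, grid)
        q = int(np.argmax(m))                   # region with the largest membership (assumption)
        if q not in best or m[q] > best[q][0]:  # conflict: same IF part, keep maximum degree
            best[q] = (m[q], fk)
    return {q: fk for q, (_, fk) in best.items()}

def fuzzy_surrogate(x, rules, grid):
    """Centroid defuzzification (2.20) over the non-conflicting rules."""
    m = memberships(x, grid)
    num = sum(m[q] * y for q, y in rules.items())
    den = sum(m[q] for q in rules)
    return num / den

# Hypothetical data: five base points on a uniform grid plus one conflicting sample at 0.3.
grid = np.linspace(0.0, 1.0, 5)
X = np.append(grid, 0.3)
F = X**2
rules = build_rules(X, F, grid)
s_node = fuzzy_surrogate(0.25, rules, grid)
s_between = fuzzy_surrogate(0.125, rules, grid)
```

The sample at 0.3 falls into the region centered at 0.25 with degree 0.8 and is discarded in favor of the base point with degree 1. The sketch does not guard against inputs for which no rule is active (zero denominator in (2.20)).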
2.3.7 Polynomial Chaos Expansion
Polynomial chaos expansion (PCE) is a popular surrogate modeling technique that attempts to build a model of stochastic variations of the system of interest (Xiu and Karniadakis 2002). The model inputs are probability distributions of the considered random parameters such as manufacturing tolerances, uncertainties concerning operating conditions, material parameters, etc. (Kaintura et al. 2018). Although PCE can be traced back to the 1930s (Wiener 1938), the technique has gained considerable popularity over the last 20 years or so owing to some recent developments (Kim et al. 2013; Du and Roblin 2017; Manfredi et al. 2017). PCE is a powerful tool that allows us to directly estimate the statistical moments of the output probability distributions without the necessity of conducting Monte Carlo simulations (Gong et al. 2012). Here, a brief outline of the method is provided with a discussion of its basic properties, truncation schemes, as well as methods for calculating the expansion coefficients.

Let $X \in R^n$ be a random vector described by the joint probability density function $f_X$. The output $Y$ of the system of interest is described by a map $Y = M(X)$ such that the second-order moments of $Y$ are finite. The polynomial chaos expansion of $M(X)$ is defined as (Blatman and Sudret 2010)

$$Y = M(X) = \sum_{\alpha \in N^n} a_\alpha \Psi_\alpha(X), \tag{2.21}$$

in which $\Psi_\alpha(X)$ are multivariate polynomials (orthonormal with respect to $f_X$), $\alpha \in N^n$ are multi-indices identifying polynomial components, whereas $a_\alpha \in R$ are the expansion coefficients (Kaintura et al. 2018). In practice, the infinite sum (2.21) has to be truncated so that

$$M(X) \approx M^{PC}(X) = \sum_{\alpha \in A} a_\alpha \Psi_\alpha(X), \tag{2.22}$$

where $A$ is the finite set of multi-indices. The polynomial basis construction starts from univariate orthonormal polynomials $\phi_k^{(i)}(x_i)$ that fulfill the conditions

$$\left\langle \phi_j^{(i)}(x_i), \phi_k^{(i)}(x_i) \right\rangle = \int_{D_{X_i}} \phi_j^{(i)}(x_i)\, \phi_k^{(i)}(x_i)\, f_{X_i}(x_i)\, dx_i = \delta_{jk}, \tag{2.23}$$

where $i$ is the index of the input variable, $j$ and $k$ are the polynomial degrees, $f_{X_i}(x_i)$ is the $i$th marginal distribution, whereas $\delta_{jk}$ is the Kronecker symbol (Sudret 2008). The multivariate polynomials are then

$$\Psi_\alpha(x) = \prod_{i=1}^{n} \phi_{\alpha_i}^{(i)}(x_i). \tag{2.24}$$
Table 2.3 Families of univariate orthogonal polynomials for PCE applications

Distribution   Polynomials               PDF                                        Hilbertian basis $\psi_k(x)$                      Support range
Gaussian       Hermite $He_k(x)$         $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$         $He_k(x) / \sqrt{k!}$                             $[-\infty, \infty]$
Uniform        Legendre $P_k(x)$         $\frac{1}{2}$                              $P_k(x) / \sqrt{\frac{1}{2k+1}}$                  $[-1, 1]$
Gamma          Laguerre $L_k^a(x)$       $x^a e^{-x}$                               $L_k^a(x) / \sqrt{\frac{\Gamma(k+a+1)}{k!}}$      $[0, \infty]$
Beta           Jacobi $J_k^{a,b}(x)$     $\frac{(1-x)^a (1+x)^b}{B(a) B(b)}$        $J_k^{a,b}(x) / J_{a,b,k}$                        $[-1, 1]$

In the last row, $J_{a,b,k}^2 = \frac{2^{a+b+1}}{2k+a+b+1} \cdot \frac{\Gamma(k+a+1)\,\Gamma(k+b+1)}{\Gamma(k+a+b+1)\,\Gamma(k+1)}$.
Table 2.3 shows the most popular families of univariate polynomials as well as the distributions with respect to which these polynomials are orthonormal. One of the important reasons behind the popularity of PCE is that the stochastic moments of the system output can be conveniently obtained from the expansion coefficients. More specifically, the mean and the variance can be calculated as

$$\mu^{PC} = E\left[M^{PC}(X)\right] = a_0, \tag{2.25}$$

$$\left(\sigma^{PC}\right)^2 = E\left[\left(M^{PC}(X) - \mu^{PC}\right)^2\right] = \sum_{\alpha \in A,\ \alpha \ne 0} a_\alpha^2. \tag{2.26}$$
As mentioned before, practical PCE can only use a finite number of basis functions. The most straightforward truncation scheme (or the standard truncation scheme) corresponds to using all polynomials in the $n$ input variables of the total degree less than or equal to $p$, i.e., $A^{n,p} = \{\alpha \in N^n : |\alpha| \le p\}$. The number of basis functions (and, therefore, expansion coefficients) is then

$$\mathrm{card}\, A^{n,p} = P = \binom{n+p}{p} \tag{2.27}$$

and grows quickly with both $n$ and $p$. Other schemes are often used such as the maximum interaction scheme involving multi-indices $\alpha$ that have at most $r$ non-zero elements (Blatman 2009), or hyperbolic truncation, where $A^{n,p,q} = \{\alpha \in A^{n,p} : \|\alpha\|_q \le p\}$, where the $q$-norm is defined as

$$\|\alpha\|_q = \left(\sum_{i=1}^{n} \alpha_i^q\right)^{1/q}. \tag{2.28}$$
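The truncation sets can be enumerated directly for small $n$ and $p$, which makes the growth described by (2.27) and the effect of the $q$-norm (2.28) easy to inspect. The sketch below uses only the Python standard library; the function names and the round-off tolerance are assumptions.

```python
from itertools import product
from math import comb

def standard_truncation(n, p):
    """All multi-indices alpha in N^n with total degree |alpha| = sum(alpha) <= p."""
    return [a for a in product(range(p + 1), repeat=n) if sum(a) <= p]

def hyperbolic_truncation(n, p, q):
    """Multi-indices of the standard set whose q-norm (2.28) does not exceed p."""
    tol = 1e-12   # guards against round-off when evaluating the q-norm (assumption)
    return [a for a in standard_truncation(n, p)
            if sum(ai**q for ai in a) ** (1.0 / q) <= p + tol]

n_std = len(standard_truncation(3, 4))          # compare with comb(3 + 4, 4)
hyp = hyperbolic_truncation(2, 3, 0.5)          # keeps mostly single-variable terms
```

For $n = 2$, $p = 3$, and $q = 0.5$, the hyperbolic set retains only the multi-indices with a single non-zero entry (plus the zero index), which illustrates how small values of $q$ suppress interaction terms.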
Note that for $q = 1$, the hyperbolic truncation is identical with the standard truncation. There are several strategies available to calculate the polynomial chaos coefficients. Here, only the non-intrusive methods are outlined, i.e., those that determine the coefficients based on post-processing of a set of system evaluations obtained through sampling the input random variables. The first technique is a projection method, directly following the PCE definition (2.21). We have

$$a_\alpha = E\left[\Psi_\alpha(X) \cdot M(X)\right]. \tag{2.29}$$

Equation (2.29) can be cast as a numerical integration problem solved by means of quadrature methods, e.g., the Gaussian quadrature (Gander and Gautschi 2000). We have

$$a_\alpha = \int_{\Omega_X} M(x)\, \Psi_\alpha(x)\, f_X(x)\, dx \approx \sum_{i=1}^{N} w^{(i)} M(x^{(i)})\, \Psi_\alpha(x^{(i)}), \tag{2.30}$$

where the weights $w^{(i)}$ and the quadrature points $x^{(i)}$ (experimental design) come from Lagrange polynomial interpolation to guarantee evaluation exactness of the integral of functions of polynomial complexity (Gander and Gautschi 2000). Another approach to PCE coefficient determination is least squares regression. More specifically, one can write

$$Y = M(X) = \sum_{j=0}^{P-1} a_j \Psi_j(X) + \varepsilon_P = a^T \Psi(X) + \varepsilon_P, \tag{2.31}$$

where $P = \mathrm{card}\, A^{n,p}$, $\varepsilon_P$ stands for the truncation error, $a = [a_0\ \ldots\ a_{P-1}]^T$ is the coefficient vector, and $\Psi(x) = [\Psi_0(x)\ \ldots\ \Psi_{P-1}(x)]^T$ is the vector of all orthonormal polynomials in $X$ (Berveiller et al. 2006). The least squares problem is then defined as

$$\hat{a} = \arg\min E\left[\left(a^T \Psi(X) - M(X)\right)^2\right], \tag{2.32}$$
and can be solved using ordinary least squares (OLS) (Berveiller et al. 2006). In most practical problems, low-order variable interactions are typically dominant; therefore, low-rank truncation schemes are preferred. An alternative strategy is penalizing the least squares problem (2.32) using a regularization term that favors low-rank solutions (Blatman 2009; Blatman and Sudret 2010; Kaintura et al. 2018). One of the popular realizations of this idea is the LAR (least-angle regression) algorithm (Efron et al. 2004), in which the basis functions are iteratively added to the so-called active set based on their correlation with the current residual (i.e., the difference between the data set and the model constructed using the basis functions selected so far). The LAR algorithm may allow for obtaining sparse PCE models that exhibit good accuracy even for small training data sets (Ahadi et al. 2016).
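The regression approach (2.32) and the moment formulas (2.25) and (2.26) can be illustrated for a single uniformly distributed input, for which the orthonormal basis consists of scaled Legendre polynomials (Table 2.3). The map $M(x) = x^2$, the sample size, and the function names are assumptions chosen so that the exact moments (mean $1/3$, variance $4/45$) are known.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def orthonormal_legendre(x, degree):
    """Columns psi_k(x) = sqrt(2k + 1) P_k(x), orthonormal for X ~ U(-1, 1)."""
    k = np.arange(degree + 1)
    return legvander(x, degree) * np.sqrt(2 * k + 1)

def fit_pce(x_samples, y_samples, degree):
    """Ordinary least squares estimate of the expansion coefficients, cf. (2.32)."""
    Psi = orthonormal_legendre(x_samples, degree)
    a, *_ = np.linalg.lstsq(Psi, y_samples, rcond=None)
    return a

def pce_moments(a):
    """Mean (2.25) and variance (2.26) read directly from the coefficients."""
    return a[0], np.sum(a[1:] ** 2)

# Hypothetical system: M(x) = x^2 with X uniformly distributed on [-1, 1].
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
a = fit_pce(x, x**2, degree=3)
mean, var = pce_moments(a)
```

Since the assumed map is a polynomial of degree two, the truncated expansion is exact and the moments are recovered without any Monte Carlo averaging; for a general map the result depends on the truncation and on the sampling.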
2.3.8 Other Methods

This chapter provided formulations and brief characterization of the selected data-driven modeling techniques. The number of approximation methods utilized in various applications is much larger, and some of these are mentioned below.

Moving least squares (MLS) (Levin 1998; Breitkopf et al. 2002) is a technique for reconstructing continuous functions from a set of sample points through the calculation of a weighted least squares measure biased toward the vicinity of the point at which the reconstruction is requested. The surrogate model is defined as

$$s(x) = \sum_{j=1}^{K} \beta_j(x)\, v_j(x), \tag{2.33}$$
where, unlike in the conventional regression surrogates (cf. Sect. 2.3.1), the coefficients are functions of $x$. The local objective at $x$ is to minimize

$$\sum_{i=1}^{p} \omega_i\left(\|x^{(i)} - x\|\right) \left(\sum_{j=1}^{K} \beta_j(x)\, v_j(x^{(i)}) - f(x^{(i)})\right)^2, \tag{2.34}$$

where the weighting factors $\omega_i$ are functions of the distance between $x$ and $x^{(i)}$, designed to be maximum at zero and to decrease monotonically with the distance. A typical choice for the weights is

$$\omega_i\left(\|x - x^{(i)}\|\right) = \exp\left(-\|x - x^{(i)}\|^2\right). \tag{2.35}$$
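A moving least squares prediction at a single point can be sketched as a weighted linear least squares fit followed by an evaluation of (2.33). The sketch assumes the weights (2.35), a linear basis, and coefficients that are fitted at the evaluation point with the basis functions evaluated at the samples; the function names and data are hypothetical.

```python
import numpy as np

def linear_basis(x):
    """Basis functions v(x) = [1, x_1, ..., x_n] (assumed choice)."""
    return np.concatenate(([1.0], x))

def mls_predict(x, X, f, basis=linear_basis):
    """Fit coefficients beta(x) by weighted least squares around x, then evaluate s(x)."""
    w = np.exp(-np.sum((X - x) ** 2, axis=1))        # weights (2.35)
    V = np.array([basis(xi) for xi in X])            # basis functions at the samples
    sw = np.sqrt(w)
    # Minimizing sum_i w_i (V_i beta - f_i)^2 equals an ordinary fit of sqrt(w)-scaled rows.
    beta, *_ = np.linalg.lstsq(sw[:, None] * V, sw * f, rcond=None)
    return basis(x) @ beta

# Hypothetical data: 15 random samples of a linear function of two variables.
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(15, 2))
f = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
s = mls_predict(np.array([0.3, -0.2]), X, f)
```

A new least squares problem is solved for every evaluation point, which is the source of the computational overhead of MLS. A linear function is reproduced exactly by the linear basis regardless of the weights, which gives a simple sanity check.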
Adding weights improves the flexibility of the model, however, at the expense of the increased computational complexity, since computing the approximation for each point $x$ requires solving a new optimization problem.

Gaussian process regression (GPR) (Rasmussen and Williams 2006) is another surrogate modeling technique that, as kriging, addresses the approximation problem from a stochastic point of view. From this perspective, and since Gaussian processes are mathematically tractable, it is relatively easy to compute error estimations for GPR-based surrogates in the form of uncertainty distributions. Under certain conditions, GPR models can be shown to be equivalent to large neural networks (Rasmussen and Williams 2006) while requiring far fewer regression parameters than NNs. A basic formulation of GPR is given below.

A Gaussian process (GP) describes a distribution over functions. It can be notated as $f(x) \sim GP(m(x), k(x, x'))$, with $x, x' \in R^n$ and $m(x)$ and $k(x, x')$ being the mean and covariance functions, respectively (Rasmussen and Williams 2006). The GP encapsulates all possible functions in the space of functions that subscribe to $m(x)$ and $k(x, x')$. The GPR model is semi-parametric in the sense that any sample function is not specified in terms of a finite number of parameters (such as weights in the case of a linear model), but directly in the space of functions. For a finite (practical) training data set of $p$ observations, $D = \{(x_i, y_i) \mid i = 1, \ldots, p\}$, where $y_i$ are scalars, the corresponding Gaussian process $f(x)$ would be implemented as the collection of random variables $f_i = f(x_i)$, with any $p$-dimensional point under their jointly Gaussian distribution representing $p$ values of a sample function with index set of inputs $\{x_i\}$. The only parameterization that takes place is the specification of hyperparameters which determine the properties of the mean and covariance functions. A popular function for calculating the covariance between the output random variables $f(x)$ and $f(x')$ is the squared-exponential (SE) covariance function with automatic relevance determination (ARD) (MacKay 1993),

$$k_{SE}(x_i, x_j) = \sigma_f^2 \exp\left(-\frac{1}{2} \sum_{k=1}^{n} \frac{(x_{i,k} - x_{j,k})^2}{\tau_k^2}\right). \tag{2.36}$$
In (2.36), $x_{i,k}$ is the $k$th component of $x_i$, $\tau_k$ is the (positive) characteristic length-scale parameter corresponding to the $k$th components of the input vectors, and $\sigma_f^2$ is the signal variance; $\sigma_f^2$ and $\tau_k$ are the hyperparameters of the covariance function. The hyperparameters may be found through a principled methodology which involves a process similar to Bayesian model selection. It entails finding the hyperparameters for which the negative log marginal likelihood is a minimum, using gradient-based optimization. The log marginal likelihood in the noise-free case is given by (Belyaev et al. 2015)

$$\log P(y \mid X) = -\frac{1}{2} y^T K^{-1} y - \frac{1}{2} \log|K| - \frac{p}{2} \log 2\pi, \tag{2.37}$$
where $K = K(X, X)$ is the $p \times p$ matrix of covariances evaluated between all possible pairs of $p$ training outputs using the covariance function, $X$ is the $n \times p$ matrix of training input (column) vectors $x_i$, $|K|$ is the determinant of $K$, and $y$ is the training target (column) vector. The GPR predictions are made by assuming a jointly Gaussian (normal) distribution of zero mean over the $p$ random variables representing the training outputs and contained in a column vector $f$ and the $n$ random variables representing the test outputs contained in $f_*$. This is the prior distribution:

$$\begin{bmatrix} f \\ f_* \end{bmatrix} \sim N\left(0, \begin{bmatrix} K(X, X) & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix}\right), \tag{2.38}$$
in which $K(X, X_*)$ is the $p \times n$ matrix of covariances evaluated between all possible pairs of $p$ training and $n$ test outputs, with $X_*$ being a matrix containing the test input vectors (other sub-matrices are similarly defined). The distribution of the test outputs conditioned on the known training outputs $y$, or the posterior distribution, can then be expressed as $f_* \mid X_*, X, y \sim N(m, \Sigma)$ (Belyaev et al. 2015) with the mean $m$ and covariance matrix $\Sigma$ given by

$$m = K(X_*, X)\, K(X, X)^{-1} y, \tag{2.39}$$

$$\Sigma = K(X_*, X_*) - K(X_*, X)\, K(X, X)^{-1} K(X, X_*). \tag{2.40}$$
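The posterior formulas (2.39) and (2.40) together with the covariance function (2.36) and the log marginal likelihood (2.37) translate almost directly into NumPy. In the sketch below the inputs are stored row-wise (one sample per row, unlike the column-wise convention of the text), the hyperparameters are fixed by hand rather than optimized, and a small diagonal jitter is added for numerical stability; these choices and all names are assumptions.

```python
import numpy as np

def k_se(A, B, sigma_f, tau):
    """Squared-exponential ARD covariance (2.36) between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) / tau) ** 2
    return sigma_f**2 * np.exp(-0.5 * d2.sum(axis=2))

def gpr_posterior(X, y, X_star, sigma_f, tau, jitter=1e-10):
    """Posterior mean (2.39) and covariance (2.40) in the noise-free case."""
    K = k_se(X, X, sigma_f, tau) + jitter * np.eye(len(X))   # jitter: numerical safeguard
    K_s = k_se(X_star, X, sigma_f, tau)                      # K(X_*, X)
    mean = K_s @ np.linalg.solve(K, y)
    cov = k_se(X_star, X_star, sigma_f, tau) - K_s @ np.linalg.solve(K, K_s.T)
    return mean, cov

def log_marginal_likelihood(X, y, sigma_f, tau, jitter=1e-10):
    """Log marginal likelihood (2.37) for given hyperparameters."""
    K = k_se(X, X, sigma_f, tau) + jitter * np.eye(len(X))
    _, logdet = np.linalg.slogdet(K)
    return (-0.5 * y @ np.linalg.solve(K, y) - 0.5 * logdet
            - 0.5 * len(X) * np.log(2.0 * np.pi))

# Hypothetical 1-D data set with p = 8 training points and hand-picked hyperparameters.
X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y = np.cos(4.0 * X[:, 0])
sigma_f, tau = 1.0, np.array([0.15])
mean_train, cov_train = gpr_posterior(X, y, X, sigma_f, tau)
mean_star, cov_star = gpr_posterior(X, y, np.array([[0.5]]), sigma_f, tau)
lml = log_marginal_likelihood(X, y, sigma_f, tau)
```

At the training inputs, the predictive mean coincides with the targets and the predictive variance is practically zero. Hyperparameter fitting would wrap `log_marginal_likelihood` in a gradient-based optimizer, which is not shown here.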
The predictive mean $m$ contains the most likely values of the test outputs associated with the test input vectors in $X_*$, whereas the diagonal of the covariance matrix $\Sigma$ gives the corresponding predictive variances. Conditioning on the known training data can be interpreted as retaining in the posterior distribution only functions that pass through the training data points.

Response surface modeling (RSM; Khuri and Mukhopadhyay 2010) has been playing an important role in relieving the computational cost of simulation-driven design tasks including design optimization (Dorica and Giannacopoulos 2006; Liu and Fu 2016), worst-case analysis (Sengupta et al. 2005; Dharchoudhury and Kang 1995), or parametric yield optimization (Bandler et al. 1993; Li et al. 2007). In certain areas such as design of analog and mixed-signal (AMS; Rutenbar et al. 2007) circuits, conventional RSM using least squares regression is very challenging due to extremely high-dimensional parameter spaces (e.g., thousands or even tens of thousands of variables necessary to model AMS systems consisting of multiple transistors; Wang et al. 2016) as well as expensive circuit simulation. Although a large number of basis functions are necessary to span the high-dimensional variable space, the majority of these functions are of little importance, i.e., many RSM model coefficients are close to zero (Wang et al. 2016). This renders a sparse structure which may be exploited in order to reduce the computational cost of surrogate model construction. In particular, a large number of model coefficients (say, from $10^5$ up to $10^6$) can be identified from a small set (e.g., $10^2$ to $10^3$) of samples without overfitting by means of orthogonal matching pursuit (OMP) or L1-regularization (Tao et al. 2016; Li 2010). An alternative is Bayesian model fusion (BMF), where reduction of the cost is obtained by re-using the data obtained at the level of a simpler model, utilized at the early stages of the design process, when fitting a late-stage performance model (Wang et al. 2016). In order to give the reader some flavor of the aforementioned methodologies, a short outline of the OMP algorithm is given below. The RSM model is of the form

$$s(x) = \sum_{j=1}^{K} b_j v_j(x), \tag{2.41}$$
where $v_j$ are the basis functions, whereas $b = [b_1\ \ldots\ b_K]^T$ are unknown model coefficients found by solving

$$Vb = f, \tag{2.42}$$

in which $f = [f(x^{(1)})\ \ldots\ f(x^{(p)})]^T$ are the system responses at the base designs $x^{(i)}$ and $V = [v_i(x^{(j)})]_{i = 1, \ldots, K;\ j = 1, \ldots, p}$. The basis functions are assumed to be orthonormal (e.g., Hermite polynomials in case $x$ represents a set of independent random variables following normal distribution; Li 2010). The OMP algorithm works as follows (Tropp and Gilbert 2007):
1. Start from (2.42) generated from $\{v_j\}$ and an integer $\lambda$ being the total number of basis functions to be selected.
2. Initialize the residual $r = f$, the number of selected basis functions $M_S = 0$, the index set of selected functions $I_S = \emptyset$, and the index set of all basis functions $I_C = \{1, 2, \ldots, K\}$.
3. while $M_S < \lambda$
4. For each $m \in I_C$, calculate the inner product $\xi_m$ between $r$ and the basis vector $v_m$.
5. Set $m_S = \arg\max_{m \in I_C} |\xi_m|$.
6. Update $M_S = M_S + 1$, $I_S = I_S \cup \{m_S\}$, and remove $m_S$ from $I_C$.
7. Solve
   $$\min_{\{b_m : m \in I_S\}} \sum_{j=1}^{p} \left(\sum_{m \in I_S} b_m v_m(x^{(j)}) - f(x^{(j)})\right)^2$$
   to determine the optimal model coefficients $\{b_m : m \in I_S\}$.
8. Update the residual
   $$r = f - \sum_{m \in I_S} b_m v_m^T$$
9. end while
10. Set the model coefficients $\{b_m;\ m \in I_C\}$ to zero.

The last technique mentioned in this section is the tolerant Cauchy approximation (Shaker et al. 2009), which is suitable for modeling resonating responses of high-frequency structures such as filters or antennas. Given the system response $f(x)$, $x = [x_1\ \ldots\ x_n]^T$, the multidimensional Cauchy approximation is defined as

$$s(x) = \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_1 x_2 + a_5 x_2^2 + \ldots}{b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1^2 + b_4 x_1 x_2 + b_5 x_2^2 + \ldots}, \tag{2.43}$$
where a ¼ [a0 a1 . . . aM]T and b ¼ [b0 b1 . . . bM]T are the unknown coefficients, which are found by solving a linear program of the form min cT v v
subject to AðδÞv d,
ð2:44Þ
in which v ¼ [t aT bT]T, with t being an auxiliary variable of the linear program. The matrix A is a function of the training data set {(xi, f(xi))}i ¼ 1, . . . , p; the vectors c and d are constant vectors of the dimension p 1. The p 1 vector δ is the vector of tolerances for the Cauchy approximation. The allowed model tolerance for the ith sample is εi, i.e., we have f(xi) – εi s(xi) f(xi) + εi. The tolerances are utilized to account for the errors of the electromagnetic simulation or physical measurements (Basl et al. 2010).
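Returning to the OMP procedure listed earlier in this section, its steps can be illustrated with a minimal NumPy sketch. This is not the implementation of the cited works. The function name `omp_fit`, the argument name `n_select` (playing the role of λ), and the layout of `V` (one column per basis function, one row per training design) are assumptions made only for this illustration.

```python
import numpy as np


def omp_fit(V, f, n_select):
    """Greedy OMP sketch for the sparse RSM model s(x) = sum_j b_j v_j(x).

    V        : (p, K) array; column m holds basis function v_m evaluated at
               the p training designs (this layout is an assumption).
    f        : (p,) array of system responses at the training designs.
    n_select : number of basis functions to select (lambda in the text).
    Returns the length-K coefficient vector b and the selected indices.
    """
    V = np.asarray(V, dtype=float)
    f = np.asarray(f, dtype=float)
    K = V.shape[1]
    residual = f.copy()              # step 2: r = f
    selected = []                    # I_S
    candidates = list(range(K))      # I_C
    coef = np.zeros(0)
    while len(selected) < n_select and candidates:      # step 3
        xi = V[:, candidates].T @ residual              # step 4
        m_s = candidates[int(np.argmax(np.abs(xi)))]    # step 5
        selected.append(m_s)                            # step 6
        candidates.remove(m_s)
        # step 7: least-squares fit restricted to the selected functions
        coef, *_ = np.linalg.lstsq(V[:, selected], f, rcond=None)
        residual = f - V[:, selected] @ coef            # step 8
    b = np.zeros(K)                  # step 10: unselected coefficients are zero
    if selected:
        b[selected] = coef
    return b, selected


# Toy data: orthonormal basis vectors (columns of Q) and a 3-sparse model.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((50, 20)))
b_true = np.zeros(20)
b_true[[2, 7, 11]] = [1.5, -3.0, 0.7]
b_hat, support = omp_fit(Q, Q @ b_true, n_select=3)
print(sorted(support))
```

In the toy data the basis vectors are exactly orthonormal, so the greedy selection picks the nonzero coefficients in order of decreasing magnitude. In the underdetermined setting discussed in the text (far more basis functions than samples), recovery depends on the sparsity of the true model and is not guaranteed by this sketch.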
2.4 Model Validation
Model validation is an important stage of the modeling process in which one assesses the quality of the surrogate, in particular its predictive power. The important components of the validation process include selection of the error function as well as choosing a measure for estimating the generalization capability of the model. Both are briefly discussed in this section.

Selection of the error function and the target accuracy of the surrogate are generally problem dependent. Normally, certain knowledge about the structure of the system response is required along with sufficient understanding of the meaning of a particular generalization estimator to be used (Ling and Mahadevan 2013). There are two categories of error functions: absolute and relative. Absolute errors are not unit-free and depend on the particular prediction value of the response. Although these features are undesirable, absolute errors are quite popular. A representative example is the root mean square error (RMSE), defined as (Goel et al. 2009)

$$ \mathrm{RMSE}(y, \tilde{y}) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \tilde{y}_i)^2}, \tag{2.45} $$
where $y_i$ and $\tilde{y}_i$ stand for the actual and surrogate-predicted system responses, respectively. RMSE penalizes large errors and virtually ignores small errors (i.e., it is overly pessimistic; Li and Zhao 2006), and it is unintuitive to interpret. Consequently, it is not recommended. Another absolute error is the average Euclidean error (AEE), defined by (Stuart and Ord 1994)

$$ \mathrm{AEE}(y, \tilde{y}) = \frac{1}{N} \sum_{i=1}^{N} \sqrt{(y_i - \tilde{y}_i)^2}. \tag{2.46} $$
AEE is also pessimistic but not to the extent of RMSE. Other options include the geometric average error (GAE) and the harmonic average error (HAE) (Couckuyt 2013)

$$ \mathrm{GAE}(y, \tilde{y}) = \left( \prod_{i=1}^{N} \sqrt{(y_i - \tilde{y}_i)^2} \right)^{1/N}, \tag{2.47} $$

$$ \mathrm{HAE}(y, \tilde{y}) = \left( \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\sqrt{(y_i - \tilde{y}_i)^2}} \right)^{-1}. \tag{2.48} $$
Both GAE and HAE are optimistic error functions as they are dominated by the small error terms. More properties of both functions can be found in the literature (Li and Zhao 2006; Bullen 2003; Mitchell 2004).
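The four absolute error functions (2.45) through (2.48) can be compared on a small example. The following is a minimal sketch in NumPy. The function names are chosen for this illustration only, and the sample values are made up. Note that GAE collapses to zero and HAE is undefined as soon as a single error term is exactly zero, so the sketch assumes nonzero errors.

```python
import numpy as np


def _abs_errors(y, y_pred):
    # sqrt((y_i - y~_i)^2) is simply the absolute error of each sample
    return np.abs(np.asarray(y, dtype=float) - np.asarray(y_pred, dtype=float))


def rmse(y, y_pred):
    """Root mean square error, Eq. (2.45)."""
    return float(np.sqrt(np.mean(_abs_errors(y, y_pred) ** 2)))


def aee(y, y_pred):
    """Average Euclidean error, Eq. (2.46)."""
    return float(np.mean(_abs_errors(y, y_pred)))


def gae(y, y_pred):
    """Geometric average error, Eq. (2.47); zero if any error is zero."""
    e = _abs_errors(y, y_pred)
    return float(np.prod(e) ** (1.0 / e.size))


def hae(y, y_pred):
    """Harmonic average error, Eq. (2.48); assumes all errors are nonzero."""
    e = _abs_errors(y, y_pred)
    return float(1.0 / np.mean(1.0 / e))


# Made-up responses with absolute errors 1, 2, and 4.
y_true = [1.0, 2.0, 4.0]
y_model = [2.0, 4.0, 8.0]
for fn in (hae, gae, aee, rmse):
    print(fn.__name__, fn(y_true, y_model))
```

For this data the values are ordered HAE ≤ GAE ≤ AEE ≤ RMSE, which is consistent with the remarks above: the harmonic and geometric averages are dominated by the small error terms, whereas RMSE is dominated by the largest one.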
Relative error measures are generally preferred because they are more context independent. A large variety of error functions are used in practice, including, among others, the average relative error (ARE), root relative square error (RRSE), and Bayesian estimation error quotient (BEEQ) (Gorissen et al. 2010)

$$ \mathrm{ARE}(y, \tilde{y}) = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \tilde{y}_i|}{|y_i|}, \tag{2.49} $$

$$ \mathrm{RRSE}(y, \tilde{y}) = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \tilde{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}}, \tag{2.50} $$

$$ \mathrm{BEEQ}(y, \tilde{y}) = \left( \prod_{i=1}^{N} \frac{\sum_{i=1}^{N} |y_i - \tilde{y}_i|}{\sum_{i=1}^{N} |y_i - \bar{y}|} \right)^{1/N}, \tag{2.51} $$

where $\bar{y}$ denotes the mean of the actual responses $y_i$.
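As an illustration of the relative measures, ARE (2.49) and RRSE (2.50) can be sketched in a few lines of NumPy. The function names and the sample values are assumptions made for this example. BEEQ is left out of the sketch.

```python
import numpy as np


def are(y, y_pred):
    """Average relative error, Eq. (2.49); breaks down if any y_i is zero."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y - y_pred) / np.abs(y)))


def rrse(y, y_pred):
    """Root relative square error, Eq. (2.50): error relative to the mean fit."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    num = np.sum((y - y_pred) ** 2)
    den = np.sum((y - np.mean(y)) ** 2)
    return float(np.sqrt(num / den))


# Made-up responses: every prediction is off by 10 percent.
y_true = np.array([1.0, 2.0, 3.0])
y_model = np.array([1.1, 1.8, 3.3])
print(are(y_true, y_model), rrse(y_true, y_model))

# A "model" that always predicts the mean response has RRSE equal to one.
y_mean_model = np.full_like(y_true, y_true.mean())
print(rrse(y_true, y_mean_model))
```

An RRSE below one therefore indicates that the surrogate fits the data better than the constant mean predictor, and a value above one indicates that it fits worse.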
The meaning of ARE is intuitively clear; however, it becomes problematic when the true system outputs are close to zero. RRSE provides a solution to this issue, and it is intuitively attractive because it measures the improvement of the model fit over the mean (Couckuyt 2013). Yet, RRSE gives a pessimistic estimate if the response to be fitted is very smooth (i.e., the mean is already a good fit). BEEQ offers some improvements over both ARE and RRSE, e.g., it is less sensitive to large errors (Gorissen et al. 2010).

Having selected the error function, the model generalization capability has to be estimated, which can be realized using several approaches. Clearly, it is not a good idea to evaluate the model quality merely based on its performance on the training set where, in particular, interpolative models (such as kriging) exhibit zero error by definition. Some of the techniques described above identify a surrogate model together with some estimation of the attendant approximation error (e.g., kriging or Gaussian process regression). Alternatively, there are procedures that can be used in a stand-alone manner to validate the prediction capability of a given model beyond the set of training points. A simple and probably the most popular way of validating a model is the split-sample method (Queipo et al. 2005), also referred to as the validation set method, where part of the available data set (the training set) is used to construct the surrogate, whereas the second part (the test set) serves purely for model validation. However, the error estimated by the split-sample method depends strongly on how the set of data samples is partitioned. In particular, it may give extremely biased results if the testing set is selected poorly or if only a few points are available. A more accurate estimation of the model generalization error can be obtained using cross-validation (Queipo et al. 2005; Geisser 1993).
In cross-validation, the data set is divided into K subsets, and each of these subsets is sequentially used as the testing set for a surrogate constructed on the other K − 1 subsets. The prediction error can be estimated from all the K error measures obtained in this process (e.g., as an average value). Its extreme version, the leave-one-out error (Vehtari et al. 2017), uses only a single point at a time for error estimation. Cross-validation provides an error estimation that is less biased than with the split-sample method. The disadvantage of this method is that the surrogate has to be constructed more than once. Figure 2.9 shows a conceptual illustration of the cross-validation process.

Fig. 2.9 Graphical illustration of the cross-validation process. In each experiment, 1/K of the available data set is used for testing purposes, whereas the remaining samples serve as the training set for temporary model construction. The generalization error estimate is rendered by averaging the errors obtained in all K experiments over the respective testing sets

Yet another approach is bootstrapping (Hall 1986), which is also an iterative procedure but differs from cross-validation in that the testing set is randomly selected from the available data at each iteration. It has been shown to work better than cross-validation in many cases (Efron and Tibshirani 1993).

As mentioned before, selection of the error function and the method for estimating the generalization capability, as well as the target accuracy, are all nontrivial tasks. In practice, some initial guesses concerning these factors are made, and the model is constructed upon acquiring the training data. The decisive stage is typically visual inspection of the model responses and their agreement with the system outputs. Upon this assessment, the modeling procedure may need to be repeated with a changed setup (Couckuyt 2013).

The surrogate modeling flow, i.e., the procedure of allocating samples, acquiring data, model identification, and validation, can be repeated until the prescribed surrogate accuracy level is reached. In each repetition, a new set of training samples is added to the existing ones. Some of the strategies of allocating the new samples, especially the explorative ones (Forrester and Keane 2009), usually aim at improving the global accuracy of the model, i.e., inserting new samples at the locations where the estimated modeling error is the highest. The details concerning these strategies have been provided in Sect. 2.2.3.

From the perspective of simulation-driven design, the main advantage of the approximation surrogates is that their evaluation cost is typically very low. Unfortunately, as outlined in Chap. 1 (see also Sect.
2.1), acquisition of the training data may incur considerable computational expenses, often unmanageable. As a matter of fact, especially when the system responses are highly nonlinear (a typical situation
for high-frequency structures), practical usefulness of data-driven surrogates is limited to just a few parameters with relatively narrow ranges. One of the main goals of this book is to introduce the ways of overcoming these issues in the context of models constructed for design and design optimization purposes. This will be discussed from Chap. 4 on.
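The K-fold cross-validation procedure described in this section can be summarized in a short sketch. The helper `k_fold_error` and its callable arguments `fit`, `predict`, and `error_fn` are hypothetical names introduced for this illustration. A simple polynomial fit stands in for an actual surrogate, and the data are synthetic. Setting K equal to the number of samples yields the leave-one-out estimate.

```python
import numpy as np


def rmse(y, y_pred):
    # root mean square error over one testing subset
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_pred)) ** 2)))


def k_fold_error(X, y, fit, predict, error_fn, K, seed=0):
    """Average testing error over K folds (sketch, assumes 2 <= K <= len(y))."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if not 2 <= K <= len(y):
        raise ValueError("K must satisfy 2 <= K <= number of samples")
    # random partition of the sample indices into K subsets
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), K)
    errors = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        model = fit(X[train], y[train])          # temporary surrogate
        errors.append(error_fn(y[test], predict(model, X[test])))
    return float(np.mean(errors))


# Synthetic one-parameter example: the response is exactly linear.
x = np.linspace(0.0, 1.0, 12)
y = 2.0 * x + 1.0


def fit_line(xs, ys):
    return np.polyfit(xs, ys, 1)                 # stand-in surrogate


def eval_line(coeffs, xs):
    return np.polyval(coeffs, xs)


cv_line = k_fold_error(x, y, fit_line, eval_line, rmse, K=4)
loo_line = k_fold_error(x, y, fit_line, eval_line, rmse, K=len(y))
cv_const = k_fold_error(x, y, lambda xs, ys: ys.mean(),
                        lambda m, xs: np.full(len(xs), m), rmse, K=4)
print(cv_line, loo_line, cv_const)
```

The surrogate is rebuilt K times, which is the cost of cross-validation noted in the text. In this toy case the linear stand-in reproduces the data up to rounding error, whereas the constant model leaves a clearly nonzero cross-validation error.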
References
Ahadi, M., Prasad, A. K., & Roy, S. (2016). Hyperbolic polynomial chaos expansion (HPCE) and its application to statistical analysis of nonlinear circuits. Proceeding of IEEE 20th Workshop on Signal and Power Integrity (SPI). Turin. pp. 1–4. Ai, M., Kong, X., & Li, K. (2016). A general theory for orthogonal array based latin hypercube sampling. Statistica Sinica, 26(2), 761–777. Alba, E., & Marti, R. (Eds.). (2006). Metaheuristic procedures for training neural networks. New York: Springer. Alexandrov, N. M., Dennis, J. E., Lewis, R. M., & Torczon, V. (1998). A trust-region framework for managing the use of approximation models in optimization. Structural Optimization, 15(1), 16–23. Andrés, E., Salcedo-Sanz, S., Monge, F., & Pérez-Bellido, A. M. (2012). Efficient aerodynamic design through evolutionary programming and support vector regression algorithms. International Journal of Expert Systems with Applications, 39, 10700–10708. Angiulli, G., Cacciola, M., & Versaci, M. (2007). Microwave devices and antennas modelling by support vector regression machines. IEEE Transactions on Magnetics, 43(4), 1589–1592. Arlot, S., & Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40–79. Bandler, J. W., Biernacki, R. M., Chen, S. H., Grobelny, P. A., & Ye, S. (1993). Yield-driven electromagnetic optimization via multilevel multidimensional models. IEEE Transactions on Microwave Theory and Techniques, 41(12), 2269–2278. Basl, P. A. W., Gohary, R. H., Bakr, M. H., & Mansour, R. R. (2010). Modelling of electromagnetic responses using a robust multi-dimensional Cauchy interpolation technique. IET Microwaves, Antennas and Propagation, 4(11), 1955–1964. Baur, U., Benner, P., & Feng, L. (2014). Model order reduction for linear and nonlinear systems: A system-theoretic perspective. Archives of Computational Methods in Engineering, 21(4), 331–358. Beachkofski, B., & Grandhi, R. (2002).
Improved distributed hypercube sampling, American Institute of Aeronautics and Astronautics, Paper AIAA, 2002–1274. Belyaev, M., Burnaev, E., & Kapushev, Y. (2015). Gaussian process regression for structured data sets. In A. Gammerman, V. Vovk, & H. Papadopoulos (Eds.), Statistical learning and data sciences (Lecture Notes in Computer Science) (Vol. 9047). Cham: Springer. Berveiller, M., Sudret, B., & Lemaire, M. (2006). Stochastic finite elements: A non intrusive approach by regression. European Journal of Computational Mechanics, 15(1–3), 81–92. Biegler, L. T., Lang, Y., & Lin, W. (2014). Multi-scale optimization for process systems engineering. Computers & Chemical Engineering, 60(10), 17–30. Blatman, G. (2009). Adaptive sparse polynomial chaos expansions for uncertainty propagation and sensitivity analysis, PhD Thesis. Universite Blaise Pascal, Clermont-Ferrand, France. Blatman, G., & Sudret, B. (2010). An adaptive algorithm to build up sparse polynomial chaos expansions for stochastic finite element analysis. Probabilistic Engineering Mechanics, 25(2), 183–197. Breitkopf, P., Rassineux, A., & Villon, P. (2002). An introduction to moving least squares meshfree methods. Revue Europ. Elements Finis, 11(7–8), 825–867.
Bullen, P. S. (2003). Handbook of means and their inequalities (Mathematics and its Applications) (Vol. 560). Dordrecht/Boston/London: Kluwer Academic. Ceperic, V., & Baric, A. (2004). Modeling of analog circuits by using support vector regression machines. Proceedings of the 2004 11th IEEE International Conference on Electronics, Circuits and Systems. Tel-Aviv. pp. 391–394. Chávez-Hurtado, J. L., & Rayas-Sánchez, J. E. (2016). Polynomial-based surrogate modeling of RF and microwave circuits in frequency domain exploiting the multinomial theorem. IEEE Transactions on Microwave Theory and Techniques, 64(12), 4371–4438. Chen, V. C. P., Tsui, K.-L., Barton, R. R., & Meckesheimer, M. (2005). A review on design, modeling and applications of computer experiments. IIE Transactions, 38(4), 273–291. Cheng, Q. S., Koziel, S., & Bandler, J. W. (2006). Simplified space mapping approach to enhancement of microwave device models. International Journal of RF and Microwave Computer-Aided Engineering, 16(5), 518–535. Chi, H., Mascagni, M., & Warnock, T. (2005). On the optimal Halton sequence. Mathematics and Computers in Simulation, 70, 9–21. Christodoulou, C., & Georgiopoulos, M. (2001). Applications of neural networks in electromagnetics. Norwood: Artech House. Chugh, T., Sindhya, K., Hakanen, J., & Miettinen, K. (2019). A survey on handling computationally expensive multiobjective optimization problems with evolutionary algorithms. Soft Computing, 23(9), 3137–3166. Couckuyt, I. (2013). Forward and inverse surrogate modeling of computationally expensive problems, PhD Thesis. Ghent University. Crombecq, K., Gorissen, D., Tommasi, L. D., & Dhaene, T. (2009). A novel sequential design strategy for global surrogate modeling. Proceeding 41st Winter Simulation Conference. pp. 731–742. Crombecq, K., Laermans, E., & Dhaene, T. (2011). Efficient space-filling and non-collapsing sequential design strategies for simulation-based modeling. 
European Journal of Operational Research, 214(3), 683–696. Davis, E., & Ierapetritou, M. (2010). A centroid-based sampling strategy for kriging global modeling and optimization. AICHE Journal, 56(1), 220–240. Devabhaktuni, V. K., Yagoub, M. C. E., & Zhang, Q. J. (2001). A robust algorithm for automatic development of neural-network models for microwave applications. IEEE Transactions on Microwave Theory and Techniques, 49(12), 2282–2291. Dharchoudhury, A., & Kang, S. M. (1995). Worst-case analysis and optimization of VLSI circuit performances. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 14(4), 481–492. Dorica, M., & Giannacopoulos, D. D. (2006). Response surface space mapping for electromagnetic optimization. IEEE Transactions on Magnetics, 42(4), 1123–1126. Du, J., & Roblin, C. (2017). Statistical modeling of disturbed antennas based on the polynomial chaos expansion. IEEE Antennas and Wireless Propagation Letters, 16, 1843–1846. Efron, B., & Tibshirani, R. (1993). Introduction to the bootstrap. New York: Chapman & Hall. Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32(2), 407–499. Fang, Y. H., Yagoub, M. C. E., Wang, F., & Zhang, Q. J. (2000). A new macromodeling approach for nonlinear microwave circuits based on recurrent neural networks. IEEE Transactions on Microwave Theory and Techniques, 48(12), 2335–2344. Fasshauer, G. E., & McCourt, M. J. (2012). Stable evaluation of Gaussian radial basis function interpolants. SIAM Journal on Scientific Computing, 34(2), A737–A762. Fernández-Godino, M. G., Park, C., Kim, N. H., & Haftka, R. T. (2019). Issues in deciding whether to use multifidelity surrogates. AIAA Journal, 57(5), 2039–2054. Forrester, A. I. J., & Keane, A. J. (2009). Recent advances in surrogate-based optimization. Progress in Aerospace Sciences, 45(1), 50–79. Forrester, A. I. J., Sóbester, A., & Keane, A. J. (2007). 
Multi-fidelity optimization via surrogate modelling. Proceeding of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088).
Gander, W., & Gautschi, W. (2000). Adaptive quadrature revisited. BIT Numerical Mathematics, 40(1), 84–101. Geisser, S. (1993). Predictive inference. New York/London: Chapman and Hall. Giunta, A. A., Wojtkiewicz, S. F., & Eldred, M. S. (2003). Overview of modern design of experiments methods for computational simulations. Paper AIAA. pp. 2003–0649. Goel, T., Haftka, R. T., Shyy, W., & Queipo, N. V. (2007). Ensemble of surrogates. Structural and Multidisciplinary Optimization, 33(3), 199–216. Goel, T., Haftka, R. T., & Shyy, W. (2009). Comparing error estimation measures for polynomial and kriging approximation of noise-free functions. Structural and Multidisciplinary Optimization, 38(5), 429–442. Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Baltimore: Johns Hopkins University Press. Gong, F., Liu, X., Yu, H., Tan, S. X. D., Ren, J., & He, L. (2012). A fast non-Monte-Carlo yield analysis and optimization by stochastic orthogonal polynomials. ACM Transactions on Design Automation of Electronic Systems, 17(10), 1–23. Gorissen, D., Dhaene, T., & De Turck, F. (2009). Evolutionary model type selection for global surrogate modeling. Journal of Machine Learning Research, 10, 2039–2078. Gorissen, D., Crombecq, K., Couckuyt, I., Dhaene, T., & Demeester, P. (2010). A surrogate modeling and adaptive sampling toolbox for computer based design. Journal of Machine Learning Research, 11, 2051–2055. Goudos, S. (Ed.). (2017). Microwave systems and applications. London: IntechOpen. Gunn, S. R. (1998). Support vector machines for classification and regression, Technical Report. School of Electronics and Computer Science, University of Southampton. Hall, P. (1986). On the bootstrap and confidence intervals. The Annals of Statistics, 14, 1432–1452. Hansen, P. C. (1992). Analysis of discrete ill-posed problems by means of the L-curve. SIAM Review, 34, 561–580. Hausmair, K., Gustafsson, S., Sanchez Perez, C., Landin, P. 
N., Gustavsson, U., Eriksson, T., & Fager, C. (2017). Prediction of nonlinear distortion in wideband active antenna arrays. IEEE Transactions on Microwave Theory and Techniques, 65(11), 4550–4563. Haykin, S. (1998). Neural networks: A comprehensive foundation (2nd ed.). Upper Saddle River: Prentice Hall. Hong, X., Mitchell, R. J., Chen, S., Harris, C. J., Li, K., & Irwin, G. W. (2008). Model selection approaches for non-linear system identification: A review. International Journal of Systems Science, 39(10), 925–946. Huang, L., & Gao, Z. (2012). Wing-body optimization based on multi-fidelity surrogate model. 28th International Congress of the Aeronautical Sciences. Brisbane. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. New York: Springer. Jin, Y. (2005). A comprehensive survey of fitness approximation in evolutionary computation. Soft Computing, 9(1), 3–12. Jin, R., Chen, W., & Simpson, T. (2001). Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and Multidisciplinary Optimization, 23(1), 1–3. Jones, D., Schonlau, M., & Welch, W. (1998). Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13, 455–492. Joseph, V. R., & Hung, Y. (2008). Orthogonal-maximin latin hypercube designs. Statistica Sinica, 18, 171–186. Journel, A. G., & Huijbregts, C. J. (1981). Mining geostatistics. London: Academic Press. Kaintura, A., Dhaene, T., & Spina, D. (2018). Review of polynomial chaos-based methods for uncertainty quantification in modern integrated circuits. Electronics, 7(30), 1–21. Khuri, A. I., & Mukhopadhyay, S. (2010). Response surface methodology: Advanced review. Computational Statistics, 2(2), 128–149. Kim, K. K., Shen, D. E., Nagy, Z. K., & Braatz, R. D. (2013). Wiener’s polynomial chaos for the analysis and control of nonlinear dynamical systems with probabilistic uncertainties [historical perspectives]. 
IEEE Control Systems Magazine, 33(5), 58–67.
Kleijnen, J. P. C. (2009). Kriging metamodeling in simulation: A review. European Journal of Operational Research, 192(3), 707–716. Kleijnen, J. P. C. (2018). Design and analysis of simulation experiments. In J. Pilz, D. Rasch, V. Melas, & K. Moder (Eds.), Statistics and simulation. IWS 2015. Springer Proceedings in Mathematics & Statistics (Vol. 231). Cham: Springer. Koehler, J. R., & Owen, A. B. (1996). Computer experiments. In S. Ghosh & C. R. Rao (Eds.), Handbook of statistics (Vol. 13, pp. 261–308). Elsevier Science B.V. Koziel, S., & Leifsson, L. (2016). Simulation-driven design by knowledge-based response correction techniques. Cham: Springer. Koziel, S., Ogurtsov, S., Couckuyt, I., & Dhaene, T. (2013). Variable-fidelity electromagnetic simulations and co-kriging for accurate modeling of antennas. IEEE Transactions on Antennas and Propagation, 61(3), 1301–1308. Koziel, S., & Ogurtsov, S. (2019). Simulation-based optimization of antenna arrays. London: World Scientific. Laurenceau, J., & Sagaut, P. (2008). Building efficient response surfaces of aerodynamic functions with kriging and cokriging. AIAA Journal, 46, 498–507. Leary, S., Bhaskar, A., & Keane, A. (2003). Optimal orthogonal-array-based latin hypercubes. Journal of Applied Statistics, 30, 585–598. Lehmensiek, R., Meyer, P., & Muller, M. (2002). Adaptive sampling applied to multivariate, multiple output rational interpolation models with application to microwave circuits. International Journal of RF and Microwave Computer-Aided Engineering, 12(4), 332–340. Levin, D. (1998). The approximation power of moving least-squares. Mathematics of Computation, 67, 1517–1531. Li, X. (2010). Finding deterministic solution from underdetermined equation: Largescale performance modeling of analog/RF circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 29(11), 1661–1668. Li, Y. F., & Lan, C. C. (1989). Development of fuzzy algorithms for servo systems. 
IEEE Control Systems Magazine, 9(3), 65–72. Li, X. R., & Zhao, Z. (2006). Evaluation of estimation algorithms part I: Incomprehensive measures of performance. IEEE Transactions on Aerospace and Electronic Systems, 42(4), 1340–1358. Li, X., Le, J., Gopalakrishnan, P., & Pileggi, L. (2007). Asymptotic probability extraction for nonnormal performance distributions. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 26(1), 16–37. Ling, Y., & Mahadevan, S. (2013). Quantitative model validation techniques: New insights. Reliability Engineering & System Safety, 111, 217–231. Liu, J. S. (2004). Multilevel sampling and optimization methods. In Monte Carlo strategies in scientific computing (Springer Series in Statistics) (pp. 205–244). New York: Springer. Liu, X., & Fu, W. N. (2016). A dynamic dual-response-surface methodology for optimal design of a permanent-magnet motor using finite-element method. IEEE Transactions on Magnetics, 52(3), 1–4. Liu, J., Han, Z., & Song, W. (2012). Comparison of infill sampling criteria in kriging-based aerodynamic optimization, 28th International Congress of the Aeronautical Sciences. Brisbane. Liu, Z., Yang, M., & Li, W. (2016). A sequential Latin hypercube sampling method for metamodeling. In L. Zhang, X. Song, & Y. Wu (Eds.), Theory, methodology, tools and applications for modeling and simulation of complex systems (AsiaSim 2016, Communication in Computer and Information Science) (Vol. 643, pp. 176–185). New York: Springer. Lophaven, S. N., Nielsen, H. B., & Søndergaard, J. (2002). DACE: A Matlab kriging toolbox. Lyngby: Technical University of Denmark. Löschenbrand, D., & Mecklenbrauker, C. (2016). Fast antenna characterization via a sparse spherical multipole expansion. 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing. Aachen. pp. 212–216. Ma, X., & Zabaras, N. (2010). 
An adaptive high-dimensional stochastic model representation technique for the solution of stochastic partial differential equations. Journal of Computational Physics, 229, 3884–3915.
MacKay, D. J. C. (1993). Bayesian methods for backpropagation networks. In J. L. van Hemmen, E. Domany, & K. Schulten (Eds.), Models of neural networks II. New York: Springer. Manfredi, P., Ginste, D. V., Stievano, I. S., De Zutter, D., & Canavero, F. G. (2017). Stochastic transmission line analysis via polynomial chaos methods: an overview. IEEE Electromagnetic Compatibility Magazine, 6(3), 77–84, Third Quarter 2017. McKay, M., Conover, W., & Beckman, R. (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21, 239–245. Meng, J., & Xia, L. (2007). Support-vector regression model for millimeter wave transition. International Journal of Infrared and Milimeter Waves, 28, 413–421. Minsky, M. I., & Papert, S. A. (1969). Perceptrons: An introduction to computational geometry. Cambridge, MA: The MIT Press. Mishra, R. K. (2001). An overview of neural network methods in computational electromagnetics. International Journal of RF and Microwave Computer-Aided Engineering, 12(1), 98–108. Mishra, S., Yadav, R. N., & Singh, R. P. (2015). Directivity estimations for short dipole antenna arrays using radial basis function neural networks. IEEE Antennas and Wireless Propagation Letters, 14, 1219–1222. Mitchell, D. W. (2004). More on spreads and non-arithmetic means. Mathematical Gazette, 88, 142–144. Montegranario, H., & Espinosa, J. (2014). Radial basis functions. In Variational regularization of 3D Data (Springer Briefs in Computer Science). New York: Springer. O’Hagan, A. (1978). Curve fitting and optimal design for predictions. Journal of Royal Statistical Society B, 40, 1–42. Ou, G., & Murphey, Y. L. (2007). Multi-class pattern classification using neural networks. Pattern Recognition, 40(1), 4–18. Palmer, K., & Tsui, K.-L. (2001). A minimum bias latin hypercube design. IIE Transactions, 33, 793–808. Park, D., Chung, I. B., & Choi, D. H. (2018). 
Surrogate based global optimization using adaptive switching infill sampling criterion. In A. Schumacher, T. Vietor, S. Fiebig, K. U. Bletzinger, & K. Maute (Eds.), Advances in structural and multidisciplinary optimization. WCSMO 2017 (pp. 692–699). Cham: Springer. Passino, K. M., & Yurkovich, S. (1998). Fuzzy control. Menlo Park: Addison Wesley Longman Inc. Qian, P. Z. G. (2009). Nested Latin hypercube designs. Biometrika, 96(4), 957–970. Queipo, N. V., Haftka, R. T., Shyy, W., Goel, T., Vaidynathan, R., & Tucker, P. K. (2005). Surrogate-based analysis and optimization. Progress in Aerospace Sciences, 41(1), 1–28. Rangel-Patiño, F. E., Chávez-Hurtado, J. L., Viveros-Wacher, A., Rayas-Sánchez, J. E., & Hakim, N. (2017). System margining surrogate-based optimization in post-silicon validation. IEEE Transactions on Microwave Theory and Techniques, 65(9), 3109–3115. Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian processes for machine learning. Cambridge, MA: MIT Press. Rawat, A., Yadav, R. N., & Shrivastava, S. C. (2012). Neural network applications in smart antenna arrays: A review. AEU - International Journal of Electronics and Communications, 66(11), 903–912. Rayas-Sanchez, J. E. (2004). EM-based optimization of microwave circuits using artificial neural networks: The state-of-the-art. IEEE Transactions on Microwave Theory and Techniques, 52 (1), 420–435. Rayas-Sanchez, J. E., Aguilar-Torrentera, J., & Jasso-Urzúa, J. A. (2010). Surrogate modeling of microwave circuits using polynomial functional interpolants. IEEE MTT-S International Microwave Symposium. Anaheim. pp. 197–200. Rayas-Sanchez, J. E., Chávez-Hurtado, J. L., & Brito-Brito, Z. (2017). Optimization of full-wave EM models by low-order low-dimension polynomial surrogate functionals. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 30(3–4), e2094.
Chapter 3
Physics-Based Surrogate Modeling
Physics-based models constitute the second major class of surrogates. Although they are not as popular as the data-driven models outlined in Chap. 2, their importance is growing because of the challenges related to the construction and handling of approximation surrogates for many real-world problems. The high cost of evaluating computational models, nonlinearity of system responses, dimensionality issues, and combinations of these factors may lead to a situation where setting up a data-driven model is not possible. On the other hand, incorporating problem-specific knowledge, typically in the form of a lower-fidelity computational model, often alleviates these difficulties. Enhancement of low-fidelity models using a limited amount of high-fidelity data is the essence of physics-based surrogate modeling. This chapter provides a brief characterization of this class of surrogates, explains the concept and various types of low-fidelity models, and outlines several specific modeling approaches, also in the context of surrogate-assisted optimization.
3.1 Overview
Data-driven surrogates outlined in Chap. 2 are convenient solutions for many practical modeling problems. Their major advantages include versatility, low evaluation cost, a wide selection of well-established techniques, and the availability of ready-to-use toolboxes implementing these methods within popular programming environments, especially Matlab. However, this class of models exhibits some serious limitations, as already mentioned in Chaps. 1 and 2. The major issues result from the high cost of training data acquisition, the curse of dimensionality, and the wide parameter ranges over which the models should be valid in order to represent reasonable ranges of operating conditions, material parameters, etc. In the case of high-frequency structures such as microwave and antenna components, additional difficulties are the high nonlinearity of the system responses and the fact that vector-valued outputs have to be handled most of the time (Rayas-Sanchez et al. 2017; Goudos 2017; Koziel and Ogurtsov 2014; Petosa 2007).
© Springer Nature Switzerland AG 2020 S. Koziel, A. Pietrenko-Dabrowska, Performance-Driven Surrogate Modeling of High-Frequency Structures, https://doi.org/10.1007/978-3-030-38926-0_3
Physics-based models offer at least partial alleviation of the aforementioned difficulties (Bandler et al. 2003; Cheng et al. 2006; Koziel et al. 2016; Dorica and Giannacopoulos 2006; Zhang et al. 2018). The phrase “physics-based” originates from the fact that an important component of the modeling process is the problem-specific knowledge embedded into the surrogate, most often in the form of a lower-fidelity model that undergoes a suitable correction or enhancement (Bandler et al. 2004). The latter is typically realized using a limited amount of high-fidelity data (Bandler et al. 2008), which, in extreme cases, may be just a single sample point (Fernández-Godino et al. 2019). The correction process aims at improving the alignment between the low-fidelity model and the high-fidelity model, either locally or across the entire design space. In the case of high-frequency structures, the low-fidelity model may be implemented as an equivalent circuit (Bandler et al. 2004), a coarse-discretization EM simulation model (Koziel and Ogurtsov 2014), or, in rare cases, analytical or semiempirical formulas (Koziel et al. 2014). A more detailed exposition of low-fidelity modeling is presented in Sect. 3.2.
The fundamental features of physics-based surrogates can be summarized as follows:
• The models incorporate problem-specific knowledge, typically in the form of an underlying low-fidelity model.
• The surrogate is constructed by aligning the low-fidelity model with the high-fidelity one using a limited amount of information from the latter, usually by means of appropriately formulated nonlinear regression.
• The low- and high-fidelity models are normally well correlated, which translates into good generalization capability of the surrogate.
• Because the low-fidelity model often involves computer simulation, physics-based surrogates are not as cheap to evaluate as data-driven ones.
• Physics-based models are more immune to the curse of dimensionality and can often be established over wider ranges of parameters than approximation models.
• Their versatility is limited (i.e., the surrogates are not easily transferable between application areas) because each problem entails the development of dedicated low-fidelity models.
• Physics-based surrogates are more difficult to handle than data-driven models because establishing the low-fidelity model and its appropriate correction requires a certain amount of experience and engineering insight.
• Availability of surrogates of this class is rather limited (e.g., there are no general-purpose Matlab toolboxes, etc.).
Thus, the fundamental advantage of physics-based surrogates is that they overcome (to a certain extent) the issues affecting approximation models, at the expense of limited versatility and computational efficiency. As a matter of fact, the primary application area of this class of models is (typically local) surrogate-based optimization, where the alignment between the coarse model and the high-fidelity model is of concern mostly along the optimization path (Koziel et al. 2011). In this sort of setup, the surrogate is often updated before each iteration of the optimization algorithm using a very small number of high-fidelity data points (often just one) (Bandler et al. 2003; Bandler et al. 2004; Salleh et al. 2008). This is discussed at some length in Sect. 3.6.
Construction of the physics-based surrogate model involves several steps:
• Selection of the low-fidelity model. This step normally requires engineering insight and experience because sufficient accuracy of the low-fidelity model may be difficult to quantify and often entails visual assessment of the system outputs, grid convergence studies, etc. (Salleh et al. 2008; Leifsson et al. 2012; Zhu et al. 2007).
• Selection of the appropriate correction technique. This step is also area- and experience-dependent. The major types of correction techniques for surrogate model construction are outlined in Sect. 3.3. Among them, response correction is one of the most popular methods (cf. Sect. 3.4).
• Design of experiments. In the case of physics-based surrogates, simple and economical DOE strategies are often utilized, such as factorial designs (Kleijnen 2018), with the star distribution being a notable example (Giunta et al. 2003).
• Acquisition of the high-fidelity model data and model identification.
• Model validation, typically carried out in the same way as for data-driven surrogates (cf. Chap. 2). However, for surrogates constructed for the purpose of local optimization, validation is often omitted as long as the models ensure (e.g., by formulation) satisfaction of zero- and first-order consistency conditions (Alexandrov and Lewis 2001; Eldred and Dunlavy 2006).
In this section the main concepts of physics-based surrogate modeling are outlined. More detailed treatment of the subject can be found in the literature (Koziel et al. 2011; Koziel and Ogurtsov 2014; Bekasiewicz et al. 2014; Robinson et al. 2008; Sarkar et al. 2019; Bakr et al. 2000).
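The star distribution mentioned in the design-of-experiments step can be sketched in a few lines. The snippet below assumes the common form of this design, namely a center point plus one pair of symmetric perturbations per design parameter (2n + 1 points for n parameters); the function name and the numerical values are hypothetical and serve only as an illustration.

```python
def star_distribution(center, deltas):
    """Return the 2n+1 points of a star design around `center`.

    center: list of n nominal parameter values
    deltas: list of n positive perturbation sizes (one per parameter)
    """
    if len(center) != len(deltas):
        raise ValueError("center and deltas must have the same length")
    points = [list(center)]  # the central point itself
    for k, d in enumerate(deltas):
        for sign in (-1.0, 1.0):
            p = list(center)
            p[k] += sign * d  # perturb one parameter at a time
            points.append(p)
    return points


# Hypothetical three-parameter design and perturbation sizes
design = star_distribution([10.0, 2.0, 0.5], [1.0, 0.2, 0.05])
for p in design:
    print(p)
```

With n parameters, only 2n + 1 high-fidelity evaluations are required, which is why designs of this kind are attractive when each simulation is expensive.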
3.2 Low-Fidelity Modeling
The fundamental component of any physics-based surrogate is the low-fidelity (or coarse) model, which is a simplified representation of the original, high-fidelity (or fine) computational model of the system of interest. The simplifications are intentional and are introduced to make the model computationally cheaper than the high-fidelity one. This is, however, achieved at the expense of a certain loss of accuracy. Maintaining an appropriate balance between speed and accuracy is one of the important considerations in physics-based modeling and optimization (Koziel 2017; Guan et al. 2008; Hazaveh et al. 2017). This section provides a general overview of low-fidelity modeling and illustrates model development through examples from high-frequency electronics (including microwave and antenna engineering) and aerodynamics.
3.2.1 Principal Properties and Techniques
The primary reason for using a low-fidelity model is its reduced evaluation cost. At the same time, the low-fidelity model should adequately represent the important features of the system of interest, which translates into the reliability of the physics-based surrogate. For example, in design optimization, information from the surrogate is used to navigate the high-fidelity design space and search for the optimum. Therefore, the low-fidelity model needs to capture the changes in the measures of merit with respect to the design parameters in the same way as the high-fidelity model. In particular, it should properly represent the trends: if a measure of merit of the high-fidelity model decreases due to design variable changes, the measure of merit at the low-fidelity model level has to decrease as well for the same design variable changes. However, the amount by which the measure of merit changes does not need to be the same for both models. In other words, the low- and high-fidelity models have to be sufficiently well correlated (Perdikaris et al. 2017). The purpose of the surrogate model, in general, is not to replace any missing physics of the high-fidelity model (Smith 2014). Hence, it is up to the low-fidelity model to be capable of following the high-fidelity model characteristics within the design space.
There are several generic approaches to low-fidelity model development that can be clearly distinguished (Alexandrov et al. 1998):
1. Simplified physics: The governing equations pertinent to the high-fidelity model are replaced by a set of simplified equations. These are often referred to as variable-fidelity physics models.
2. Coarse discretization: The same governing equations are used as in the high-fidelity model but with a coarser computational grid discretization. These are often referred to as variable-resolution models.
3. Relaxed convergence criteria: The solver convergence criteria are reduced or relaxed. These are sometimes referred to as variable-accuracy models.
4. A combination of (1)–(3).
Clearly, the above list is just a general classification. In practice, implementation of (1)–(4) is domain-specific. For example, in the area of high-frequency electronics (in the context of full-wave electromagnetic analysis), reducing the discretization density of the structure at hand can be accompanied by a number of other simplifications, including:
1. Reduction of the computational domain and application of simple absorbing boundaries with the finite-volume methods implemented in full-wave EM simulations (Baumann et al. 2004; Liu and Gedney 2000).
2. Use of low-order basis functions with the finite-element and moment method solvers (Schmidthausler and Clemens 2012; Kolundzija and Sumic 2004).
3. Use of relaxed solution termination criteria, such as the scattering parameter error for the frequency-domain methods with adaptive meshing and the residue energy for the time-domain solvers (Petrides and Demkowicz 2017; Caratelli and Yarovoy 2010).
Moreover, simplifications of the physics for modeling of high-frequency structures, aside from using different governing physics, include (Obaidat et al. 2019):
1. Ignoring dielectric and metal losses as well as material dispersion if their impact on the simulated response is not significant.
2. Setting metallization thickness to zero for traces, strips, and patches.
3. Ignoring moderate anisotropy of substrates.
4. Energizing the antenna with discrete sources rather than waveguide ports.
The lists provided above are pertinent to full-wave electromagnetic models; in general, the particular simplification approaches depend on the engineering discipline and the specific type of simulation utilized there. Sections 3.2.2 and 3.2.3 demonstrate various low-fidelity modeling approaches using examples from computational electromagnetics and aerodynamics.
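The requirement stated earlier in this subsection, that the low- and high-fidelity models follow the same trends without necessarily changing by the same amounts, can be checked numerically on a handful of designs. The sketch below is only one possible way of quantifying it: the Pearson correlation and the "trend agreement" ratio are illustrative choices rather than measures prescribed in the text, and the merit values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def trend_agreement(a, b):
    """Fraction of consecutive design changes for which both
    models move in the same direction (both up or both down)."""
    same = 0
    for i in range(1, len(a)):
        if (a[i] - a[i - 1]) * (b[i] - b[i - 1]) > 0:
            same += 1
    return same / (len(a) - 1)


# Hypothetical merit values at five designs along a parameter sweep
f_vals = [1.00, 0.80, 0.65, 0.70, 0.90]  # high-fidelity model
c_vals = [1.30, 1.15, 1.05, 1.08, 1.20]  # low-fidelity model

print("correlation    :", pearson(f_vals, c_vals))
print("trend agreement:", trend_agreement(f_vals, c_vals))
```

In this made-up data set the low-fidelity values are offset and compressed relative to the high-fidelity ones, yet every change has the same sign, which is the situation described in the text as acceptable.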
3.2.2 Variable-Resolution and Variable-Accuracy Modeling
For the sake of illustration, let us consider the microstrip antenna shown in Fig. 3.1. Figure 3.2 shows the high- and low-fidelity computational models of the structure, both defined, discretized, and simulated using CST Microwave Studio (CST 2018). The finer model is discretized with about 410,000 mesh cells and evaluates in about 70 min, whereas the coarser model contains about 20,000 mesh cells and its simulation time is only 1 min. The responses of the models (reflection characteristics) are shown in Fig. 3.3a, along with responses of other models with various discretization levels. The figure indicates that the two “finest” coarse-discretization models (with ~400,000 and ~740,000 mesh cells) represent the high-fidelity model response (the model with ~1,600,000 mesh cells, shown as a thick solid line) reasonably well. The model with ~270,000 cells can be considered a borderline one. The two remaining models are poor, particularly the model with ~20,000 cells, whose response is essentially unreliable. The simulation time increases rapidly with the number of mesh cells, as can be seen in Fig. 3.3b. It takes around 120 min to simulate the high-fidelity model, whereas the second finest model (~740,000 cells) takes 42 min, and the coarsest model needs roughly 1 min.
Fig. 3.1 Microstrip antenna: top/side views, substrates shown transparent (Koziel and Ogurtsov 2012)
Fig. 3.2 Variable-resolution models of the microstrip antenna in Fig. 3.1: (a) high-fidelity model shown with a fine tetrahedral mesh (~410,000 cells) and (b) low-fidelity model shown with a much coarser mesh (~20,000 cells). Both models are developed in CST Microwave Studio (Koziel and Ogurtsov 2012)
Fig. 3.3 Microstrip antenna of Fig. 3.1 at a selected design simulated with the CST transient solver (CST 2018): (a) reflection responses (|S11| [dB] versus frequency [GHz]) at different discretization densities, 19,866 cells (▪▪▪), 40,068 cells (∙ – ∙), 266,396 cells (– –), 413,946 cells (∙∙∙), 740,740 cells (—), and 1,588,608 cells (—), and (b) the antenna simulation time (evaluation time [sec]) versus the number of mesh cells (Koziel and Ogurtsov 2012)
As another example, consider two-dimensional transonic air flow past an airfoil (such as sections of a typical transport aircraft wing; Siegler et al. 2016). Typical computational meshes for the airfoil flow are shown in Fig. 3.4. Variations of the measures of merit (the lift (Cl) and drag (Cd) coefficients) with the mesh discretization are shown in Fig. 3.5a. Mesh f is the high-fidelity mesh (~408,000 cells), and meshes $c_i$, $i = 1, 2, 3, 4$, are the coarse meshes. The simulation time for the high-fidelity mesh is approximately 33 min (Fig. 3.5b). By visual inspection of Fig. 3.5a, one can see that mesh c4 yields the most inaccurate model and is only slightly faster than c3. In this case, one can select c3 as the low-fidelity model. This particular grid has roughly 32,000 cells and takes about 2 min to run. The flow solution history for the low-fidelity model is shown in Fig. 3.6a, and it indicates that the lift and drag coefficients are nearly converged after 80–100 iterations. The maximum number of iterations can, therefore, be set to 100 for the low-fidelity model. This reduces the overall simulation time to 45 s. A comparison of the pressure distributions of the high- and low-fidelity models, shown in Fig. 3.6b, indicates that the low-fidelity model, in spite of being based on a much coarser mesh and a reduced number of flow solver iterations, captures the main features of the high-fidelity model pressure distribution quite well. The biggest discrepancy in the distributions is around the shock on the upper surface, leading to an overestimation of both lift and drag (see Fig. 3.5a).
Fig. 3.4 Structured computational mesh close to the surface of a supercritical airfoil: (a) fine mesh and (b) coarse mesh (Koziel and Leifsson 2016)
Fig. 3.5 Variation of the measures of merit with the number of grid cells: (a) lift (Cl) and drag (Cd) coefficients and (b) simulation time (Koziel and Leifsson 2016)
Fig. 3.6 Low-fidelity model (model c3 in Fig. 3.5) responses: (a) evolution of the lift and drag coefficients during flow solver solution and (b) comparison of the pressure distribution with the high-fidelity model (Koziel and Leifsson 2016)
The ratio of the simulation times of the high- and low-fidelity models in this case is 44. However, in many cases, the solver does not fully converge with respect to the residuals and continues up to the (user-specified) maximum allowable number of iterations. In those cases, the ratio of simulation times of the high- and low-fidelity models exceeds 100, i.e., the low-fidelity model is over two orders of magnitude faster than the high-fidelity one.
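The time ratio quoted above follows directly from the simulation times given in this subsection (about 33 min for the high-fidelity mesh, about 2 min for mesh c3 run to convergence, and 45 s with the iteration cap). A minimal sketch of the arithmetic, with a hypothetical helper function, is:

```python
def speedup(t_high_s, t_low_s):
    """Ratio of high- to low-fidelity simulation time (both in seconds)."""
    return t_high_s / t_low_s


t_high = 33 * 60    # high-fidelity mesh f: about 33 min
t_c3_full = 2 * 60  # mesh c3 run to convergence: about 2 min
t_c3_100 = 45       # mesh c3 capped at 100 solver iterations: about 45 s

print(speedup(t_high, t_c3_full))  # 16.5
print(speedup(t_high, t_c3_100))   # 44.0
```

The comparison shows that, for these figures, most of the saving comes from combining the coarser mesh with the relaxed termination criterion rather than from the coarser mesh alone.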
3.2.3 Variable-Fidelity Physics Modeling
Variable-fidelity physics models are constructed by replacing the set of high-fidelity governing equations by a set of simplified or modified equations. Let us consider two illustrative examples, demonstrating this process for specific case studies in aerospace and microwave engineering.
In the case of aerodynamics, models with lower-fidelity governing equations are generated by making certain assumptions about the flow field (Koziel and Leifsson 2016). As an example, the Euler equations can be used as the high-fidelity model, and, by assuming weak shocks and small disturbances, the transonic small disturbance equation (TSDE) can be used as the low-fidelity one. Figure 3.7 compares these models for transonic flow past a given airfoil shape at given operating conditions. The measures of merit compare well for small perturbations (small angles of attack), as can be seen from Fig. 3.7a. The airfoil surface responses also compare well, aside from an area around the upper surface shock (Fig. 3.7b).
Fig. 3.7 Responses of the Euler equations and the transonic small disturbance equation (TSDE) for flow past the NACA 0012 airfoil in transonic flow: (a) lift (Cl) and drag (Cd) coefficients and (b) pressure coefficient (Cp) distributions (Koziel and Leifsson 2016)
The central issue is whether the low-fidelity model exhibits sufficient correlation with the high-fidelity one. Again, let us consider two-dimensional transonic flow past a supercritical airfoil. The compressible Reynolds-averaged Navier-Stokes (RANS) equations with the Spalart-Allmaras turbulence model (Kim et al. 2000) are utilized as the high-fidelity model. The low-fidelity model solves the compressible Euler equations coupled with a two-equation integral boundary-layer formulation using the displacement thickness concept (Lund et al. 1998). On a typical desktop computer, the high-fidelity model evaluates in roughly 40 min, whereas the low-fidelity model evaluates in less than 15 s, i.e., roughly 160 times faster (Koziel and Leifsson 2016). Figure 3.8 shows a comparison of the output of these models for several (locally) perturbed shapes of the original airfoil, indicating that the lower-fidelity model captures (at least locally) the main physics of the higher-fidelity model.
In a similar manner, variable-fidelity models can be developed in the area of microwave engineering. Figure 3.9 shows an example of a dual-band microstrip filter (Guan et al. 2008).
The high-fidelity model of this device determines the scattering parameters of the filter using the commercial planar-3D solver Sonnet em (Sonnet 2018). Its simulation time is a few minutes per design. The low-fidelity model (Fig. 3.9b) is an equivalent circuit implemented in ADS (Keysight ADS 2019). It is a simplified representation of the filter, in which lumped microstrip line models are connected together according to circuit theory rules. The low-fidelity model is very fast (its simulation time is in the range of milliseconds). A comparison of the model responses is given in Fig. 3.9c. It should be emphasized that, while equivalent circuit models are often acceptable for relatively simple structures such as the filter in Fig. 3.9a, they are usually not sufficiently accurate for complex circuits featuring considerable electromagnetic cross-couplings (e.g., highly miniaturized microstrip passives; Koziel and Kurgan 2015).
Fig. 3.8 Measures of merit for several (locally) perturbed shapes of a supercritical airfoil at a transonic flow condition computed using variable-fidelity physics models (Koziel and Leifsson 2016)
Fig. 3.9 Dual-band bandpass filter (Guan et al. 2008): (a) high-fidelity model, (b) low-fidelity model, (c) responses at two designs: dashed line = high-fidelity model at design 1, thin dashed line = low-fidelity at design 2, solid line = high-fidelity model at design 2, thick horizontal lines = minimax constraints (Koziel and Leifsson 2016)
3.2.4 Low-Fidelity Model Selection
As illustrated in this section, the low-fidelity model can be set up in various ways. Often, there are many options available, characterized by different trade-offs between accuracy and evaluation cost. While choosing the faster model may be tempting, one needs to make sure that the model accuracy is sufficient, in particular, that the model properly represents all of the important characteristics of the system output. Also, relaxed accuracy has to be compensated for by a larger number of high-fidelity data points used for subsequent model correction.
In the particular context of surrogate-based optimization, the selection of the model coarseness strongly affects the simulation time and, therefore, the performance of the design optimization process. Coarser models are faster, which translates into a lower cost per iteration of the optimization process. However, the lower accuracy of coarser models typically results in a larger number of iterations necessary to find a satisfactory design. Furthermore, there is an increased risk that the optimization algorithm fails to find a satisfactory design. Finer models, on the other hand, are more expensive, but they are more likely to produce a useful design with a smaller number of iterations. Experimental studies have been carried out to investigate low-fidelity model selection and its impact on the performance of the optimization process (Leifsson and Koziel 2015a).
It should be mentioned that, up to now, no general-purpose methods for automated selection of the physics-based low-fidelity model are available. The decision about the “right” low-fidelity model setup is normally made based on grid convergence studies such as those presented in Figs. 3.3 and 3.5, engineering experience, and visual inspection of the low- and high-fidelity model responses. Again, in the context of surrogate-assisted optimization, some results reported by Leifsson et al. (2014a) indicate that certain parameters of the low-fidelity model (e.g., those controlling grid density) may be automatically adjusted using optimization methods. Also, multifidelity methods (e.g., Koziel et al. 2011; Leifsson and Koziel 2015b), where the design optimization process is conducted using an entire family of models of increasing accuracy, may, to some extent, mitigate the risk of improper model setup by automatically switching to a higher-fidelity model in case the lower-fidelity one fails to improve the design (Koziel and Ogurtsov 2013; Koziel and Leifsson 2013b; Tesfahunegn et al. 2015).
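Although the text stresses that no general-purpose automated selection method exists, the bookkeeping behind a grid convergence study can still be written down explicitly. The following sketch is merely an illustration of that bookkeeping, not a published selection procedure: the discrepancy measure (maximum absolute difference from the finest model), the tolerance, and all response values and timings are hypothetical assumptions.

```python
def max_abs_diff(response, reference):
    """Largest pointwise deviation between two responses on the same sweep."""
    return max(abs(a - b) for a, b in zip(response, reference))


def select_low_fidelity(candidates, reference, tol):
    """Pick the cheapest candidate whose response stays within `tol`
    of the reference; return None if no candidate is accurate enough.

    candidates: list of dicts with keys 'name', 'time_s', 'response'
    reference:  response of the finest model on the same sweep
    """
    ok = [c for c in candidates
          if max_abs_diff(c["response"], reference) <= tol]
    return min(ok, key=lambda c: c["time_s"]) if ok else None


# Hypothetical |S11| samples in dB at five frequencies
reference = [-3.0, -8.0, -20.0, -9.0, -4.0]
candidates = [
    {"name": "A", "time_s": 60,   "response": [-2.0, -5.0, -11.0, -14.0, -6.0]},
    {"name": "B", "time_s": 300,  "response": [-2.8, -7.2, -17.5, -10.0, -4.3]},
    {"name": "C", "time_s": 1500, "response": [-3.0, -7.9, -19.4, -9.2, -4.1]},
]

chosen = select_low_fidelity(candidates, reference, tol=3.0)
print(chosen["name"] if chosen else "no candidate within tolerance")
```

A single scalar tolerance cannot replace the visual inspection and engineering judgment emphasized above; in particular, it says nothing about whether the trends with respect to the design parameters are preserved.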
3.3 Physics-Based Surrogates: Basic Concepts
This section discusses some basic concepts pertinent to physics-based surrogate modeling. More detailed treatment of the subject can be found in the literature (Koziel et al. 2011; Koziel and Ogurtsov 2014; Bekasiewicz et al. 2014; Robinson et al. 2008; Sarkar et al. 2019; Bakr et al. 2000). The low-fidelity (coarse) model of the system of interest will be denoted as $c(x)$. Because physics-based surrogates are very often used within surrogate-assisted optimization frameworks, such a context will be considered when explaining some of the modeling methodologies. In particular, let us consider an iterative surrogate-based optimization (SBO) procedure
$$x^{(i+1)} = \arg\min_{x} U\left(s^{(i)}(x)\right), \tag{3.1}$$
which produces a series $x^{(i)}$, $i = 0, 1, \ldots$, of approximations to the solution $x^* = \mathrm{argmin}\{x : U(f(x))\}$ of the original problem. The high-fidelity model is denoted as $f$, whereas $U$ is the merit function to be minimized; $s^{(i)}$ stands for the surrogate model established at the current iteration point $x^{(i)}$ (i.e., the starting point for the next iteration of (3.1)).
First, let us discuss a simple case of multiplicative response correction. To ensure convergence of the sequence $\{x^{(i)}\}$ to $x^*$, it is imperative to have the surrogate and the high-fidelity model well aligned, at least locally. The surrogate $s^{(i)}(x)$ at iteration $i$ can be constructed with response correction
$$s^{(i)}(x) = \beta_k(x)\, c(x), \tag{3.2}$$
where $\beta_k(x) = \beta(x^{(i)}) + \nabla\beta(x^{(i)})^T (x - x^{(i)})$ and $\beta(x) = f(x)/c(x)$. This construction ensures both zero-order consistency, $s^{(i)}(x^{(i)}) = f(x^{(i)})$, and first-order consistency, $\nabla s^{(i)}(x^{(i)}) = \nabla f(x^{(i)})$ (Alexandrov and Lewis 2001). At the same time, the problem-specific knowledge embedded in the low-fidelity model ensures that the surrogate is more accurate than the alternative data-driven model established using the same high-fidelity model information (e.g., the first-order Taylor model). Figure 3.10 illustrates this for an exemplary one-dimensional problem.
A generalization of this concept is a multipoint response correction involving a larger set of high-fidelity data points. For example (Koziel and Leifsson 2013b), the surrogate can be defined for vector-valued models as
$$s^{(i)}(x) = \Lambda^{(i)} \circ c(x) + \Delta_r^{(i)} + \delta^{(i)}, \tag{3.3}$$
with column vectors $\Lambda^{(i)}$, $\Delta_r^{(i)}$, and $\delta^{(i)}$, and $\circ$ denoting component-wise multiplication. The global response correction parameters $\Lambda^{(i)}$ and $\Delta_r^{(i)}$ are obtained as
$$\left[\Lambda^{(i)}, \Delta_r^{(i)}\right] = \arg\min_{[\Lambda, \Delta_r]} \sum_{k=0}^{i} \left\| f\left(x^{(k)}\right) - \left[\Lambda \circ c\left(x^{(k)}\right) + \Delta_r\right] \right\|^2. \tag{3.4}$$
Fig. 3.10 Visualization of the response correction (3.2) for the example analytical functions c (low-fidelity model) and f (high-fidelity model). The correction is established at x0 = 1. Note that the surrogate exhibits good alignment with the high-fidelity model in a relatively wide vicinity of x0, especially compared to the first-order Taylor model set up using the same data from f (the value and the gradient at x0)
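The behavior summarized in Fig. 3.10 can be reproduced with a few lines of code. The following is a minimal sketch of the correction (3.2) in one dimension, not the setup used for the figure: the analytic functions `c` and `f`, the finite-difference helper `derivative`, and the point `x_i` are all hypothetical choices made for illustration.

```python
import math

def c(x):
    """Low-fidelity model (hypothetical analytic stand-in)."""
    return math.exp(x)

def f(x):
    """High-fidelity model (hypothetical): c scaled by a slowly varying factor."""
    return (1.0 + 0.1 * x + 0.02 * x * x) * math.exp(x)

def derivative(g, x, h=1e-6):
    """Central finite-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def make_surrogate(x_i):
    """Return s(x) = beta_k(x) * c(x), with beta = f / c linearized at x_i as in (3.2)."""
    beta = lambda x: f(x) / c(x)
    beta0 = beta(x_i)
    dbeta = derivative(beta, x_i)
    return lambda x: (beta0 + dbeta * (x - x_i)) * c(x)

def make_taylor(x_i):
    """First-order Taylor model of f built from the same data (value and slope at x_i)."""
    f0, df = f(x_i), derivative(f, x_i)
    return lambda x: f0 + df * (x - x_i)

x_i = 1.0
s, t = make_surrogate(x_i), make_taylor(x_i)
for x in (1.0, 1.5, 2.0):
    print(f"x={x:.1f}  f={f(x):.4f}  surrogate={s(x):.4f}  taylor={t(x):.4f}")
```

Both models agree with f in value and slope at `x_i`; away from it, the corrected low-fidelity model stays closer to f in this example because the ratio f/c varies much more slowly than f itself.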
The response scaling (3.4) is intended to improve the alignment of the models at all previous iteration points. The (local) additive term δ(i) is then defined as
$$\delta^{(i)} = f\left(x^{(i)}\right) - \left[\Lambda^{(i)} \circ c\left(x^{(i)}\right) + \Delta_r^{(i)}\right] \tag{3.5}$$
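Because Λ and Δr act component-wise, the problem (3.4) decouples into one simple linear regression per response component. The sketch below illustrates this under that reading; the function names and the numerical data are hypothetical, and the fallback used when a component of c does not vary over the history is an assumption of this example only.

```python
def fit_multipoint_correction(c_hist, f_hist):
    """Fit Lambda and Delta_r of (3.4) component-wise, then delta of (3.5).

    c_hist, f_hist: lists of response vectors c(x^(k)), f(x^(k)), k = 0..i,
    the last entry belonging to the current iteration point x^(i).
    """
    n_pts, m = len(c_hist), len(c_hist[0])
    lam, dr = [], []
    for j in range(m):
        cj = [c[j] for c in c_hist]
        fj = [f[j] for f in f_hist]
        c_mean, f_mean = sum(cj) / n_pts, sum(fj) / n_pts
        var = sum((v - c_mean) ** 2 for v in cj)
        cov = sum((v - c_mean) * (w - f_mean) for v, w in zip(cj, fj))
        slope = cov / var if var > 0.0 else 1.0   # assumption: pure shift if c_j is constant
        lam.append(slope)
        dr.append(f_mean - slope * c_mean)
    # Local additive term: removes the residual left at the current design.
    delta = [f_hist[-1][j] - (lam[j] * c_hist[-1][j] + dr[j]) for j in range(m)]
    return lam, dr, delta

def surrogate_response(c_x, lam, dr, delta):
    """Evaluate (3.3): s = Lambda o c(x) + Delta_r + delta."""
    return [l * cv + d + e for l, cv, d, e in zip(lam, c_x, dr, delta)]

# Hypothetical two-component responses at three iteration points.
c_hist = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]]
f_hist = [[2.2, 4.1], [3.9, 6.2], [6.1, 9.8]]
lam, dr, delta = fit_multipoint_correction(c_hist, f_hist)
s_current = surrogate_response(c_hist[-1], lam, dr, delta)
```

By construction, `s_current` coincides with the last entry of `f_hist`, which is the zero-order consistency provided by δ(i).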
The term (3.5) ensures a perfect match between the surrogate and the high-fidelity model at the current design x(i), i.e., s(i)(x(i)) = f(x(i)). The correction terms Λ(i), Δr(i), and δ(i) can be obtained analytically by solving appropriate linear regression problems (Koziel and Leifsson 2013b).
For the sake of illustration, consider the dielectric resonator antenna (DRA) shown in Fig. 3.11. The DRA is suspended above the ground plane in order to enhance its impedance bandwidth. The high-fidelity model f is simulated using the CST transient solver (CST 2018) (~800,000 mesh cells, evaluation time 20 min). The low-fidelity model c is also evaluated in CST (~30,000 mesh cells, evaluation time 40 s). Figure 3.12a shows the responses of the low- and high-fidelity DRA models at several designs. As shown in Fig. 3.12b, conventional one-point response correction perfectly aligns c and f at the design where it is established, but the alignment is not as good for other designs. On the other hand, as shown in Fig. 3.12c, the multipoint response correction improves model alignment at all designs involved in the model construction (note that Fig. 3.12c only shows the global part Λ(i)∘c(x) + Δr(i), without δ(i), which would give a perfect alignment at x(i)).
Another approach to low-fidelity model correction is to apply the correction at the level of the model domain. Perhaps the most popular example of such a procedure is input space mapping (ISM) (Bandler et al. 2004), where the surrogate is created as
Fig. 3.11 Suspended DRA: (a) 3D view of its housing, (b) top view, and (c) front view
$$s^{(i)}(x) = c\left(x + q^{(i)}\right) \tag{3.6}$$
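A minimal sketch of (3.6) for a single design variable is given below. The shift q is found by minimizing the misalignment between f and the shifted low-fidelity model at a reference design. Everything in the example is hypothetical: the resonance-type responses `c` and `f`, the frequency sweep `FREQS`, and the golden-section search, which merely stands in for whatever optimizer is used for parameter extraction in practice.

```python
def c(x, w):
    """Low-fidelity response (hypothetical): unit-width resonance centered at w = x."""
    return 1.0 / (1.0 + (w - x) ** 2)

def f(x, w):
    """High-fidelity response (hypothetical): same shape, resonance shifted by 0.3."""
    return 1.0 / (1.0 + (w - (x + 0.3)) ** 2)

FREQS = [0.1 * k for k in range(101)]   # frequency sweep from 0 to 10

def misalignment(q, x_i):
    """Squared norm || f(x_i) - c(x_i + q) ||^2 over the sweep."""
    return sum((f(x_i, w) - c(x_i + q, w)) ** 2 for w in FREQS)

def extract_shift(x_i, lo=-1.0, hi=1.0, tol=1e-8):
    """Golden-section search for q; assumes the misalignment is unimodal on [lo, hi]."""
    g = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        q1, q2 = b - g * (b - a), a + g * (b - a)
        if misalignment(q1, x_i) < misalignment(q2, x_i):
            b = q2          # minimum lies in [a, q2]
        else:
            a = q1          # minimum lies in [q1, b]
    return 0.5 * (a + b)

x_i = 5.0
q_i = extract_shift(x_i)
surrogate = lambda x, w: c(x + q_i, w)   # s(x) = c(x + q), as in (3.6)
```

In this constructed case the extracted shift recovers the offset built into `f`, so the surrogate also matches `f` at designs other than `x_i`. Real models are rarely related by a pure shift, and the quality of generalization has to be verified case by case.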
Again, in (3.6), both the low-fidelity model and the surrogate are vector-valued. The model parameters q(i) are obtained by minimizing the misalignment between the surrogate and the high-fidelity model, ||f(x(i)) – c(x(i) + q(i))||; x(i) is a reference design (e.g., the most recent design encountered during the optimization run) at which the surrogate is established. The advantages of ISM are demonstrated in Fig. 3.13 for microwave filter design (Hong and Lancaster 2001). The high-fidelity model is evaluated using EM simulation, whereas the low-fidelity model is an equivalent circuit. The response of interest is the reflection coefficient |S11| as a function of frequency. For this particular example, ISM offers both good approximation and generalization capability (cf. Fig. 3.13d).
An alternative way of low-fidelity model correction is to exploit parameters that are normally fixed in the high-fidelity model such as microstrip substrate height and/or dielectric permittivity (Koziel et al. 2008; Bandler et al. 2004). Because the surrogate model is just an auxiliary tool and it is not supposed to be built or measured, such parameters can be freely adjusted in the low-fidelity model to improve its alignment with the high-fidelity one. Implicit space mapping (Bandler et al. 2003, 2004; Koziel et al. 2010) is a technique that utilizes this concept. More specifically, the surrogate is obtained as
$$s^{(i)}(x) = c_I\left(x, p^{(i)}\right) \tag{3.7}$$
The vector p in (3.7) denotes a set of implicit space mapping (preassigned) parameters, and cI is the low-fidelity model with the explicit dependence on these parameters. When the surrogate is established at the current design x(i), the vector p(i) is typically obtained by minimizing the norm-wise discrepancy between the models, ||f(x(i)) – cI(x(i), p)||. Figure 3.14 provides an illustration of implicit space mapping. The mapping parameters are dielectric permittivities of the microstrip line components (rectangle elements in Fig. 3.14b).
Fig. 3.12 Suspended DRA of Fig. 3.11: (a) low- () and high- (- - - and —) fidelity model responses at three designs; (b) OSM-corrected low- () and high- (- - - and —) fidelity model responses at the same designs (OSM correction at the design marked —); (c) multipoint-corrected low- () and high- (- - - and —) fidelity model responses
Fig. 3.13 Low-fidelity model correction through parameter shift (input space mapping): (a) microstrip filter geometry (high-fidelity model f evaluated using EM simulation); (b) low-fidelity model c (equivalent circuit); (c) response of f (—) and c (), as well as response of the surrogate model s (– –) created using input space mapping; (d) surrogate model verification at a different design (other than that at which the model was created) f (—), c (), and s (– –). Good alignment indicates excellent generalization of the model (Koziel and Bekasiewicz 2016b)
Fig. 3.14 Low-fidelity model correction through parameter shift (input space mapping): (a) microstrip filter geometry (high-fidelity model f evaluated using EM simulation); (b) response of f (—) and c (), as well as response of the surrogate model s (- - -) created using implicit space mapping; (c) surrogate model verification at a different design (other than that at which the model was created) f (—), c (), and s (- - -) (Koziel and Leifsson 2016)
Another type of low-fidelity model enhancement is derived from the observation of the vector-valued responses of the system, which are evaluations of the same design but at different values of certain parameters such as the time, frequency, or a specific geometry parameter. Subsequently, and based on such observations, the low-fidelity model is enhanced by applying a linear or nonlinear scaling to these parameters. A representative example of such a correction method is frequency scaling. It is popular in electrical engineering where the figures of interest are often frequency characteristics, e.g., S-parameters or gain (Koziel et al. 2006a; Koziel and Ogurtsov 2014).
Let us assume that f(x) = [f(x, ω1) f(x, ω2) ... f(x, ωm)]T, where f(x, ωk) is the evaluation of the high-fidelity model at a frequency ωk; similarly, c(x) = [c(x, ω1) c(x, ω2) ... c(x, ωm)]T. The frequency-scaled surrogate model sF(x) is then defined as
$$s_F(x, [F_0\ F_1]) = \left[c(x, F_0 + F_1\omega_1)\ \ \ldots\ \ c(x, F_0 + F_1\omega_m)\right]^T \tag{3.8}$$
The scaling parameters F0 and F1 are obtained as [F0, F1] = argmin{[F0, F1] : ||f(x(j)) – sF(x(j), [F0 F1])||}. Figure 3.15 shows an example of frequency scaling applied to the low-fidelity model of a substrate-integrated cavity antenna (Bekasiewicz and Koziel 2016). In this example, the low- and high-fidelity models are evaluated using coarse- and fine-discretization EM simulations, respectively. The interested reader can find a more extensive discussion of physics-based surrogates in the literature (e.g., Koziel and Ogurtsov 2014; Bandler et al. 2004; Koziel et al. 2013; Leifsson and Koziel 2015b).
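The affine frequency scaling (3.8) and the extraction of [F0, F1] can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for Fig. 3.15: the resonance-type models `c` and `f`, the sweep `FREQS`, and the coarse grid search (chosen only because it needs no external optimizer) are hypothetical.

```python
def c(x, w):
    """Low-fidelity response at frequency w (hypothetical resonance at w = x)."""
    return 1.0 / (1.0 + ((w - x) / 0.2) ** 2)

def f(x, w):
    """High-fidelity response (hypothetical): c evaluated on a warped frequency axis."""
    return c(x, 0.2 + 0.95 * w)

FREQS = [4.0 + 0.03 * k for k in range(101)]   # sweep from 4 to 7

def scaled_surrogate(x, F0, F1):
    """Frequency-scaled surrogate (3.8) evaluated over the sweep."""
    return [c(x, F0 + F1 * w) for w in FREQS]

def extract_scaling(x_ref):
    """Grid search for [F0, F1] minimizing || f(x_ref) - s_F(x_ref, [F0 F1]) ||^2."""
    target = [f(x_ref, w) for w in FREQS]
    best = (float("inf"), 0.0, 1.0)
    for i in range(41):
        F0 = -0.5 + 0.025 * i              # F0 in [-0.5, 0.5]
        for k in range(41):
            F1 = 0.80 + 0.01 * k           # F1 in [0.8, 1.2]
            resp = scaled_surrogate(x_ref, F0, F1)
            err = sum((a - b) ** 2 for a, b in zip(target, resp))
            if err < best[0]:
                best = (err, F0, F1)
    return best[1], best[2]

F0, F1 = extract_scaling(5.5)
```

Here the true scaling lies on the search grid, so it is recovered up to rounding. In practice a continuous optimizer would replace the grid, and the scaling would only approximately account for the frequency shift between c and f.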
3.4 Response Correction Models
This section discusses physics-based models involving response correction. As mentioned before, physics-based surrogates are most often used for design optimization purposes where the primary purpose of the surrogate is to ensure a good local alignment with the high-fidelity model, whereas global accuracy of the model is not of major concern. In a more general setting, i.e., global or quasi-global modeling, the surrogate is to be valid within a larger portion of the design space. This is important for creating multiple-use library models and for applications such as statistical analysis, uncertainty quantification, or global optimization. In this section, quasi-global modeling using space mapping, space mapping enhanced by function approximation layers, and surrogate modeling with shape-preserving response prediction are outlined. The high-fidelity model (and, consequently, the surrogate) is, in general, vector-valued, i.e., f : Xf → Rm, Xf ⊆ Rn. The surrogate s is to match f as well as possible in the region of interest XR ⊆ Xf, which is typically an n-dimensional interval in Rn with center at the reference point x0 = [x0.1 ... x0.n]T ∈ Rn and the size δ = [δ1 ... δn]T.
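Training sets for quasi-global models are usually spread over the region of interest XR. The sketch below draws a Latin hypercube sample from such an interval. It assumes that δ holds half-widths, so that XR = [x0 – δ, x0 + δ]; this interpretation, the function name, and the seed are assumptions of the example, while the numerical values of x0 and δ are those of the filter example discussed later in Sect. 3.4.2.

```python
import random

def lhs_in_region(x0, delta, n_points, seed=0):
    """Latin hypercube sample of the interval [x0 - delta, x0 + delta] (delta as half-widths)."""
    rng = random.Random(seed)
    n_dim = len(x0)
    # One random permutation of the strata 0..n_points-1 per design variable.
    strata = [rng.sample(range(n_points), n_points) for _ in range(n_dim)]
    points = []
    for k in range(n_points):
        point = []
        for d in range(n_dim):
            u = (strata[d][k] + rng.random()) / n_points   # position in [0, 1)
            point.append(x0[d] - delta[d] + 2.0 * delta[d] * u)
        points.append(point)
    return points

x0 = [7.0, 0.1, 0.5, 1.5]          # reference point, in mm
delta = [0.25, 0.05, 0.1, 0.5]     # region size, in mm
X_B = lhs_in_region(x0, delta, 100)
```

Each design variable is divided into as many strata as there are points, and every stratum is used exactly once, which gives a more uniform coverage than independent random sampling with the same budget.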
3.4.1 Global Modeling Using Multipoint Space Mapping
Let us denote by c : Xc → Rm, Xc ⊆ Rn, the (vector-valued) low-fidelity model. The training set is denoted as XB = {x1, x2, ..., xN} ⊂ XR with the high-fidelity model responses known at all xj, j = 1, 2, ..., N. The goal is to enhance the low-fidelity model c and create a space mapping surrogate s using auxiliary transformations, whose parameters are determined so that s matches the high-fidelity model as well as possible at all base points. A standard SM model (SM-standard) is defined as
Fig. 3.15 Low-fidelity model correction with frequency scaling: (a) antenna view; (b) antenna geometry (both f and c evaluated using EM simulation, coarse discretization used for c). Response of f (—) and c (), as well as response of the surrogate model s (- - -) created using frequency scaling at (c) a certain reference design and (d) another (test) design. Note that the surrogate properly accounts for the frequency shifts between c and f (Koziel and Bekasiewicz 2016b)
$$s_{SM}(x) = \bar{s}_{SM}(x, p) \tag{3.9}$$
where the SM parameters p are obtained using the parameter extraction process
$$p = \arg\min_{r} \sum_{k=1}^{N} \left\| f\left(x^k\right) - \bar{s}_{SM}\left(x^k, r\right) \right\| \tag{3.10}$$
while $\bar{s}_{SM}$ is a generic space mapping model, i.e., the low-fidelity model composed with some suitable mappings. A model often used in practice has the form
$$\bar{s}_{SM}(x, p) = \bar{s}_{SM}(x, A, B, q, d) = A \cdot c(B \cdot x + q) + d \tag{3.11}$$
where A = diag{a1, ..., am}, B is an n × n matrix, q is an n × 1 vector, and d is an m × 1 vector (Koziel and Bandler 2007a, b). The flexibility of the model represented by (3.9)–(3.11) can be enhanced in many ways, e.g., by exploiting so-called preassigned or implicit parameters. These parameters (e.g., dielectric constants or substrate height in the case of high-frequency structures; Koziel et al. 2008) are fixed in the high-fidelity model but can
be freely modified in the low-fidelity model in order to allow better alignment between the high-fidelity model and the surrogate (Koziel et al. 2006a; Bandler et al. 2004; Cheng et al. 2008). Note that the standard SM surrogate is simple. However, linear mappings such as (3.11) may not be able to provide sufficient accuracy. Also, (3.11) may only provide a limited modification of the range of the low-fidelity model, and this modification is basically independent of the design variables. Finally, because of the finite number of parameters, which are extracted in one shot for the entire region of interest, the surrogate is, in fact, a nonlinear regression model. Consequently, the modeling error might not decrease below certain, problem dependent, non-zero limits even if the number of base points becomes unlimited (cf. Koziel et al. 2006b). In Koziel and Bandler 2006, an approach with variable weight coefficients was proposed which provides better accuracy than the standard method, however, at the expense of a significant increase in the evaluation time, which is due to a separate parameter extraction required for each evaluation of the surrogate model. This limits the practical applicability of the method.
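To make the structure of the standard SM model concrete, the following sketch evaluates the mapping (3.11) and the extraction objective (3.10) for given parameters. The low-fidelity model `coarse`, the parameter values, and the base points are hypothetical; the nonlinear optimizer that would minimize the objective over the parameters is deliberately left out.

```python
def coarse(x):
    """Hypothetical low-fidelity model c: R^2 -> R^3."""
    return [x[0] + x[1], x[0] * x[1], x[0] - x[1]]

def sm_model(x, A_diag, B, q, d, c=coarse):
    """Evaluate (3.11): A c(B x + q) + d, with A = diag(A_diag)."""
    z = [sum(B[r][k] * x[k] for k in range(len(x))) + q[r] for r in range(len(q))]
    return [a * cv + dv for a, cv, dv in zip(A_diag, c(z), d)]

def extraction_objective(params, base_points, f_responses):
    """Objective of (3.10): sum over base points of || f(x^k) - s_SM(x^k, r) ||."""
    A_diag, B, q, d = params
    total = 0.0
    for x, f_x in zip(base_points, f_responses):
        s_x = sm_model(x, A_diag, B, q, d)
        total += sum((fv - sv) ** 2 for fv, sv in zip(f_x, s_x)) ** 0.5
    return total

# Identity mapping: the surrogate reduces to the low-fidelity model itself.
identity = ([1.0, 1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [0.0, 0.0, 0.0])
# Hypothetical "true" mapping used here to generate synthetic high-fidelity data.
true_p = ([1.1, 0.9, 1.0], [[1.0, 0.05], [0.0, 1.0]], [0.1, -0.2], [0.0, 0.5, 0.0])
base_points = [[1.0, 2.0], [2.0, 1.0], [0.5, 0.5]]
f_responses = [sm_model(x, *true_p) for x in base_points]
```

With these synthetic data the objective vanishes at `true_p` and is positive at `identity`, which is the usual starting point for parameter extraction. For real high-fidelity data the minimum is generally non-zero, in line with the limitations of the standard SM surrogate noted above.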
3.4.2 Space Mapping with a Function Approximation Layer
A straightforward way of working around the limitations of the standard SM surrogate, without compromising the computational cost, is to incorporate an auxiliary approximation layer. An enhanced SM surrogate is defined as
$$s(x) = s_{SM}(x) + \tilde{s}(x) \tag{3.12}$$
where sSM is the standard SM model (3.9)–(3.11), whereas $\tilde{s}$ is a data-driven model. One can consider sSM as a trend function and $\tilde{s}$ as an output space mapping term that models the residuals between the high-fidelity model and sSM at all base points (Koziel and Bandler 2007a). The advantages of this approach are as follows:
1. A relatively good modeling accuracy can be obtained using a limited amount of high-fidelity model data due to the underlying physics-based low-fidelity model.
2. The resulting surrogate is computationally as cheap as the low-fidelity model because the function approximation layer typically exploits analytical formulas.
3. It is possible to take advantage of any amount of available high-fidelity model data, so that modeling accuracy can be as good as required provided that the base set is sufficiently "dense."
Several implementations of (3.12) have been proposed, including $\tilde{s}$ realized through radial basis function interpolation (Koziel and Bandler 2007c), fuzzy systems (Koziel and Bandler 2007b), and kriging interpolation (Koziel and Bandler 2012). It has been demonstrated that the modeling accuracy of the model (3.12) is better than the accuracy of the standard space mapping surrogate and, at the same
time, better than the accuracy of the function approximation model used alone, provided that in each case the same amount of high-fidelity model data was used to set up the model.
For illustration, let us consider the microstrip bandpass filter with two transmission zeros shown in Fig. 3.16a (Hsieh and Chang 2003). The design parameters are x = [L g s d]T. The high-fidelity model is simulated in Altair FEKO (Altair FEKO 2018). The region of interest is defined by the reference point x0 = [7.0 0.1 0.5 1.5]T mm and the region size δ = [0.25 0.05 0.1 0.5]T mm. The low-fidelity model (Fig. 3.16b) is implemented in ADS (Keysight ADS 2019). The base set contains 100 points allocated using the LHS algorithm. The standard SM surrogate sSM (SM-standard) is the model (3.9)–(3.11) enhanced by implicit SM with two preassigned parameters: the dielectric constant (initial value 10.2) and the substrate height (initial value 0.635 mm). The average and maximum modeling errors for the models are given in Table 3.1 (see also Fig. 3.17). For comparison, the results concerning the SM model enhanced by RBF (Koziel and Bandler 2007a) and by a fuzzy system surrogate (Koziel and Bandler 2007b) are also included in Table 3.1. The SM-kriging model is utilized to optimize the filter with respect to the following design specifications: |S21| ≥ -1 dB for 1.75 GHz ≤ ω ≤ 2.25 GHz and |S21| ≤ -20 dB for 1.0 GHz ≤ ω ≤ 1.5 GHz and 2.5 GHz ≤ ω ≤ 3.0 GHz, using x0 as a starting point (specification error +9.5 dB). The optimized surrogate model design is x* = [22.98 19.78 26.80 0.183 0.053]T mm (high-fidelity model specification error -0.34 dB). Figure 3.18 shows the high-fidelity model response at x0 and at x*.
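The structure of the enhanced surrogate (3.12) can be sketched in one dimension as follows. The trend `s_sm` stands in for an already extracted standard SM model, and the residual layer is a Gaussian radial basis function interpolant. The stand-in models, the base points, and the basis width are hypothetical; RBF is used here only because it is one of the implementations mentioned above and is easy to write without external libraries.

```python
import math

def solve(M, b):
    """Solve M w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (aug[r][n] - sum(aug[r][k] * w[k] for k in range(r + 1, n))) / aug[r][r]
    return w

def s_sm(x):
    """Stand-in for an already extracted standard SM model (hypothetical)."""
    return 1.0 + 0.8 * x

def f(x):
    """Stand-in for the high-fidelity model (hypothetical)."""
    return 1.0 + 0.8 * x + 0.3 * math.sin(2.0 * x)

def phi(r, width=0.5):
    """Gaussian radial basis function."""
    return math.exp(-(r / width) ** 2)

base = [0.0, 0.5, 1.0, 1.5, 2.0]
residuals = [f(x) - s_sm(x) for x in base]             # what the trend fails to capture
weights = solve([[phi(abs(a - b)) for b in base] for a in base], residuals)

def s_enhanced(x):
    """Enhanced surrogate (3.12): SM trend plus RBF model of the residuals."""
    return s_sm(x) + sum(w * phi(abs(x - b)) for w, b in zip(weights, base))
```

The enhanced model interpolates f at the base points and, in this example, is also closer to f than the trend alone between them. Once the weights are known, evaluating the residual layer costs only a handful of exponentials.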
Fig. 3.16 Bandpass filter with two transmission zeros (Hsieh and Chang 2003): (a) geometry, (b) low-fidelity model (Keysight ADS)
Table 3.1 Modeling results for the filter of Fig. 3.16
Model type               Model name    Average error (%)   Maximum error (%)
Space mapping            SM-standard   3.8                 6.8
Enhanced space mapping   SM-RBF        1.8                 5.9
                         SM-fuzzy      3.4                 6.7
                         SM-kriging    1.5                 4.1
Data-driven surrogates   RBF           6.9                 24.8
                         Fuzzy         9.5                 22.0
                         Kriging       6.1                 20.1
Fig. 3.17 Bandpass filter with two transmission zeros: high-fidelity model (solid line) and surrogate model (circles) responses at the three selected test points for the SM-kriging model (Koziel and Bandler 2012)
Fig. 3.18 Bandpass filter with two transmission zeros: high-fidelity model responses at the reference point x0 (dashed line) and at the optimal solution x* of the SM-kriging surrogate model (solid line) (Koziel and Bandler 2012)
3.4.3 Multipoint Output Space Mapping
The multipoint output space mapping (OSM) (Shah et al. 2015) applies correction terms directly to the low-fidelity model output components. Below, the technique is explained assuming a scalar low-fidelity model output c(x). The OSM surrogate model is defined as (Leifsson and Koziel 2015a)
$$s(x) = A(x)\,c(x) + D(x) \tag{3.13}$$
Both the multiplicative and additive correction terms are design-variable-dependent and take the form
$$A(x) = a_0 + [a_1\ a_2\ \ldots\ a_n]\left(x - x^0\right) \tag{3.14}$$
$$D(x) = d_0 + [d_1\ d_2\ \ldots\ d_n]\left(x - x^0\right) \tag{3.15}$$
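where x0 is the center of the design space. The response correction parameters A and D are obtained as
$$[A, D] = \arg\min_{[A, D]} \sum_{k=1}^{N} \left\| f\left(x^k\right) - \left(A\left(x^k\right) c\left(x^k\right) + D\left(x^k\right)\right) \right\|^2 \tag{3.16}$$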
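i.e., the response scaling is supposed to globally improve the matching for all training points xk, k = 1, ..., N. The problem (3.16) is equivalent to a linear regression problem C[a0 a1 ... an d0 d1 ... dn]T = F, the least-squares solution of which can be found as (Leifsson and Koziel 2015a)
$$[a_0\ a_1\ \ldots\ a_n\ d_0\ d_1\ \ldots\ d_n]^T = \left(C^T C\right)^{-1} C^T F \tag{3.17}$$
where
$$C = \begin{bmatrix} c(x^1) & c(x^1)(x^1_1 - x^0_1) & \cdots & c(x^1)(x^1_n - x^0_n) & 1 & (x^1_1 - x^0_1) & \cdots & (x^1_n - x^0_n) \\ c(x^2) & c(x^2)(x^2_1 - x^0_1) & \cdots & c(x^2)(x^2_n - x^0_n) & 1 & (x^2_1 - x^0_1) & \cdots & (x^2_n - x^0_n) \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ c(x^N) & c(x^N)(x^N_1 - x^0_1) & \cdots & c(x^N)(x^N_n - x^0_n) & 1 & (x^N_1 - x^0_1) & \cdots & (x^N_n - x^0_n) \end{bmatrix} \tag{3.18}$$
$$F = \left[f(x^1)\ \ f(x^2)\ \ \ldots\ \ f(x^N)\right]^T \tag{3.19}$$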
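The regression (3.16)–(3.19) can be sketched for a scalar response as follows. The example solves the normal equations directly and assumes that enough distinct training points are available for C to have full column rank. The models `c1` and `f1`, the training points, and the helper names are hypothetical, and `f1` is deliberately constructed to lie in the model class (3.13) so that the fit can be checked.

```python
def solve(M, b):
    """Solve M w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (aug[r][n] - sum(aug[r][k] * w[k] for k in range(r + 1, n))) / aug[r][r]
    return w

def fit_osm(X, c_vals, f_vals, x0):
    """Least-squares estimate of [a0 a1..an d0 d1..dn] for the model (3.13)-(3.15)."""
    rows = []
    for x, cv in zip(X, c_vals):
        dx = [xi - x0i for xi, x0i in zip(x, x0)]
        rows.append([cv] + [cv * t for t in dx] + [1.0] + dx)   # one row of C in (3.18)
    p = len(rows[0])
    # Normal equations (C^T C) theta = C^T F, cf. (3.17).
    CtC = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    CtF = [sum(r[i] * fv for r, fv in zip(rows, f_vals)) for i in range(p)]
    return solve(CtC, CtF)

def osm_surrogate(x, c_x, theta, x0):
    """Evaluate s(x) = A(x) c(x) + D(x) with A and D from (3.14) and (3.15)."""
    n = len(x0)
    dx = [xi - x0i for xi, x0i in zip(x, x0)]
    A = theta[0] + sum(a * t for a, t in zip(theta[1:n + 1], dx))
    D = theta[n + 1] + sum(d * t for d, t in zip(theta[n + 2:], dx))
    return A * c_x + D

c1 = lambda x: x * x + 1.0                                           # low-fidelity model
f1 = lambda x: (2.0 + 0.5 * (x - 1.0)) * c1(x) + (1.0 - 0.3 * (x - 1.0))
x0 = [1.0]
X = [[0.0], [0.4], [0.8], [1.2], [1.6], [2.0]]
theta = fit_osm(X, [c1(x[0]) for x in X], [f1(x[0]) for x in X], x0)
```

For this synthetic case the coefficients used to build `f1` are recovered. For larger or poorly conditioned problems, a QR-based least-squares solver would be a safer choice than forming CTC explicitly.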
Note that the matrix CTC is non-singular for N > n + 1 assuming that all training points are distinct. Choosing a star distribution training set (Giunta et al. 2003) satisfies this condition and is sufficient in many modeling cases. The star distribution training set consists of N = 2n + 1 points allocated at the center of the design space x0 = (l + u)/2 (l and u being the lower and upper bounds for the design variables, respectively) and the centers of its faces, i.e., points with all coordinates but one equal to those of x0, and the remaining one equal to the corresponding component of l or u. Generalization of the model (3.13)–(3.19) for vector-valued outputs is straightforward.
Multipoint OSM is illustrated using an example involving robust design of transonic airfoil shapes (Leifsson et al. 2013). The goal in this case is to find an airfoil shape with minimum drag and a given lift coefficient that is least sensitive to changes in the operating conditions. The problem is formulated such that the operating condition (in this case, only the free-stream Mach number) is treated as an input uncertainty and represented as a uniform random variable with bounds. The objective is to reduce the mean drag coefficient (μCd) and the standard deviation of the drag coefficient (σCd) subject to a given minimum mean lift coefficient (μCl). The airfoil shape parameters are taken as the deterministic design variables. The high-fidelity model solves steady, two-dimensional, compressible RANS equations and the Spalart-Allmaras turbulence model on a structured
C-grid (Tannehill et al. 1997). The model has around 400,000 mesh cells, and the simulation time is around 2 h. The low-fidelity model is constructed in the same way as the high-fidelity one but with a coarser mesh (~32,000 cells) and is around 80 times faster than the high-fidelity one. The airfoil shape is parameterized using NACA four-digit airfoils with three parameters. In order to satisfy the lift constraint, the angle of attack is taken as a design variable. Thus, in total, there are four deterministic design variables. The free-stream Mach number (M∞) is the only uncertain variable, and it is bounded as follows: 0.7 ≤ M∞ ≤ 0.8. The minimum lift coefficient is set to Cl = 0.5. The statistical properties are calculated based on stochastic expansions derived from the non-intrusive polynomial chaos (NIPC) technique (see, e.g., Hosder 2012; Zhang et al. 2012). In NIPC, a stochastic response surface approximation (RSA) model is created based on high-fidelity data. This case requires at least 42 high-fidelity model evaluations to set up the stochastic RSA model. To reduce the cost, the multipoint OSM is used to create a globally accurate surrogate model which is used in place of the high-fidelity one to construct the stochastic RSA. Given the problem size, 53 low-fidelity and 11 high-fidelity model evaluations are needed, corresponding to less than 12 equivalent high-fidelity model evaluations in total. Figure 3.19a shows the optimized airfoil shapes obtained using the high-fidelity model, the surrogate, and the low-fidelity model. The shapes produced by means of the high-fidelity model and the surrogate have the same thickness but different camber. As a result, the angle of attack necessary to attain the prescribed lift coefficient is different. The shape produced by using the low-fidelity model differs from the other two. Figure 3.19b shows the variation of the drag coefficients of the shapes with respect to the Mach number.
The comparator shape, NACA 2412, has a significant drag rise over this Mach number range, whereas the optimized shapes maintain lower drag coefficient values. Furthermore, the variation of the drag coefficient values of the shapes obtained by the high-fidelity and the OSM-based method is very similar. The drag variation of the airfoil obtained using the low-fidelity model is significantly higher than that of the shapes obtained by the other two methods.
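The star-distribution training set described earlier in this section (the center of the design space plus the centers of its faces) is straightforward to generate. The sketch below follows that verbal description; the function name and the example bounds are hypothetical.

```python
def star_distribution(lower, upper):
    """Center of the box [lower, upper] plus the centers of its 2n faces (N = 2n + 1 points)."""
    center = [(l + u) / 2.0 for l, u in zip(lower, upper)]
    points = [center[:]]
    for d in range(len(center)):
        for bound in (lower[d], upper[d]):
            p = center[:]
            p[d] = bound        # move one coordinate to its lower or upper bound
            points.append(p)
    return points

pts = star_distribution([0.0, 0.0], [2.0, 4.0])
```

For n design variables the set has 2n + 1 points, so its size grows only linearly with the dimension of the design space.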
3.4.4 Surrogate Modeling Using Generalized Shape-Preserving Response Prediction
The shape-preserving response prediction (SPRP) technique was initially developed for numerical optimization purposes (cf. Sect. 3.6) (Koziel 2010a). SPRP is based on so-called characteristic points that describe the most critical parts of the system output and are employed to set up a mapping between the low- and high-fidelity models. The mapping is further utilized to predict the response of the latter based on the actual response of the low-fidelity model. Although the primary purpose of SPRP was local optimization and the surrogate was normally set up using a single high-fidelity model evaluation (at the most recent design along
Fig. 3.19 Characteristics of the initial and optimized airfoils: (a) optimized shapes, (b) variation of the drag coefficient with Mach number at a lift coefficient of Cl = 0.5 (Leifsson et al. 2013)
the optimization path), SPRP has also been utilized for quasi-global modeling (e.g., Leifsson and Koziel 2016; Koziel and Szczepanski 2011). In this section, the generalized SPRP (GSPRP; Koziel and Leifsson 2012) is outlined. GSPRP permits arbitrary allocation of the training points xk, k = 1, ..., N, although a uniform distribution is preferred. For the sake of subsequent considerations, let us denote the training set as XB = {x1, x2, ..., xN} ⊂ XR. The high-fidelity model responses at the base designs, f(xk), are assumed to be known. The GSPRP concept will be explained for high-frequency structures with the outputs of interest being frequency characteristics. Consequently, f(x) = [f(x, ω1) ... f(x, ωm)]T, where ωi, i = 1, ..., m, is the frequency sweep. As mentioned above, the SPRP technique is based on processing the sets of so-called characteristic points that describe the response of the structure under consideration. In particular, these characteristic points can correspond to specific levels of the response, local response minima and/or maxima, etc. (see also Sect. 3.6 for more details). The main assumption, and the major limitation, is that the characteristic points of the low-fidelity and high-fidelity model responses are in
Fig. 3.20 Example response of the high-fidelity model f at several training points (solid lines). Circles indicate the characteristic points of the responses. Ellipses indicate the groups of corresponding characteristic points (Koziel and Leifsson 2012)
one-to-one correspondence (see Koziel (2010a) for details and Koziel (2010b) for generalizations). The SPRP model, created using the high-fidelity model response at a specific reference design, its characteristic points, and similar data from the low-fidelity model, allows us to predict the high-fidelity model response at other designs of interest (Koziel 2010b). Figure 3.20 shows the response of the high-fidelity model of an exemplary microstrip filter at a certain number of training points (only a few points are shown for clarity). A set of characteristic points is distinguished on each of the plots, in this case corresponding to |S21| = -6 dB and -15 dB, as well as local |S21| minima within the passband. A discussion on selecting a set of characteristic points for a given design case can be found in (Koziel 2010b). The notation $p_j^k = [\omega_j^k\ \lambda_j^k]^T$, j = 1, ..., K, is used to denote the characteristic points of f(xk), where $\omega_j^k$ and $\lambda_j^k$ denote the frequency and magnitude components of $p_j^k$, respectively.
GSPRP predicts the response of the high-fidelity model at any design variable vector x using the information contained in the training points. The model is initialized by constructing auxiliary models sω.j(x) and sλ.j(x), j = 1, ..., K, of the sets of corresponding characteristic points for all training points, $\{p_j^1, p_j^2, \ldots, p_j^N\}$, j = 1, ..., K. For this purpose, kriging interpolation is used (Lophaven et al. 2002). The flow diagram of the initialization process is shown in Fig. 3.21a. Evaluation of the GSPRP model is a three-step process (cf. the flow diagram of Fig. 3.21b). In the first step, the characteristic points corresponding to the vector x are obtained as
$$p_j(x) = \left[s_{\omega.j}(x)\ \ s_{\lambda.j}(x)\right]^T \tag{3.20}$$
where j = 1, ..., K. In the second step, the index kmin(x) of the training point closest to x is identified, i.e.,
$$k_{\min}(x) = \arg\min_{k \in \{1, \ldots, N\}} \left\| x - x^k \right\| \tag{3.21}$$
and then the translation vectors defined as
Fig. 3.21 Flowchart of the GSPRP-based surrogate modeling methodology; (a) model initialization, EM solver is used to evaluate the high-fidelity model response at the training points, auxiliary models are constructed using kriging interpolation, and the characteristic points of the training points are found; (b) model evaluation at a design x, the characteristic points corresponding to x are obtained using (3.20), then the index kmin(x) of the training point closest to x is identified as in (3.21), and the corresponding translation vectors are calculated through (3.22), and, finally, the surrogate model response s(x) is calculated using (3.23, 3.24, and 3.25) (Koziel and Leifsson 2016)
$$t_j = \left[\omega_j^t\ \ \lambda_j^t\right]^T = \left[s_{\omega.j}(x) - \omega_j^{k_{\min}(x)}\ \ \ s_{\lambda.j}(x) - \lambda_j^{k_{\min}(x)}\right]^T \tag{3.22}$$
j = 1, ..., K, are calculated. These vectors indicate the change of the characteristic points of the response of f while moving from $x^{k_{\min}(x)}$ to x. Figure 3.22 shows a conceptual illustration of a training set, an example vector x, as well as the corresponding vector $x^{k_{\min}(x)}$. Figure 3.23 shows the high-fidelity model responses at $x^{k_{\min}(x)}$ and x, the translation vectors $t_j$, as well as the GSPRP model response at x.
Fig. 3.22 Illustration of the region of interest XR, the training points (filled circles), as well as an example vector of interest x (evaluation point). The training point that is the closest to x is denoted as $x^{k_{\min}(x)}$ (Koziel and Leifsson 2016)
Fig. 3.23 The high-fidelity model response at $x^{k_{\min}(x)}$ (the training point closest to the evaluation point x), $f(x^{k_{\min}(x)})$ (solid line), the characteristic points of $f(x^{k_{\min}(x)})$ (circles), the characteristic points $p_j(x)$ corresponding to x obtained using (3.20) (squares), as well as the translation vectors $t_j$ (3.22) (short line segments). The GSPRP-predicted high-fidelity model response at x, s(x) (dashed line), is obtained from $p_j(x)$, $t_j$, and $f(x^{k_{\min}(x)})$ using (3.20)–(3.25)
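Using the translation vectors $t_j$ defined in (3.22), the GSPRP surrogate model s of f can be defined as
$$s(x) = \left[s(x, \omega_1)\ \ \ldots\ \ s(x, \omega_m)\right]^T \tag{3.23}$$
where s is determined at the frequencies $\omega_j^{k_{\min}(x)} + \omega_j^t$, j = 0, 1, ..., K, K + 1, as (with $\omega_0^{k_{\min}(x)} = \omega_1$, $\omega_{K+1}^{k_{\min}(x)} = \omega_m$, and $\omega_0^t = \omega_{K+1}^t = 0$)
$$s\left(x, \omega_j^{k_{\min}(x)} + \omega_j^t\right) = f\left(x^{k_{\min}(x)}, \omega_j^{k_{\min}(x)}\right) + \lambda_j^t \tag{3.24}$$
for j = 1, ..., K. For other frequencies, the model s is obtained through linear interpolation
$$s(x, \omega) = \bar{f}\left(x^{k_{\min}(x)}, (1 - \alpha)\,\omega_j^{k_{\min}(x)} + \alpha\,\omega_{j+1}^{k_{\min}(x)}\right) + (1 - \alpha)\,\lambda_j^t + \alpha\,\lambda_{j+1}^t \tag{3.25}$$
where $\omega_j^{k_{\min}(x)} + \omega_j^t \le \omega \le \omega_{j+1}^{k_{\min}(x)} + \omega_{j+1}^t$ and $\alpha = \left[\omega - \left(\omega_j^{k_{\min}(x)} + \omega_j^t\right)\right] / \left[\left(\omega_{j+1}^{k_{\min}(x)} + \omega_{j+1}^t\right) - \left(\omega_j^{k_{\min}(x)} + \omega_j^t\right)\right]$. Here, $\bar{f}(x^{k_{\min}(x)}, \omega)$ is an interpolation of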
Fig. 3.24 Filter structures for GSPRP modeling: (a) stacked slotted resonators filter (Huang et al. 2008), (b) bandpass filter with open stub inverter (Lee et al. 2000)
Table 3.2 GSPRP modeling results for filters 1 and 2
                           Average error (filter 1)       Average error (filter 2)
Number of training         GSPRP    Kriging               GSPRP    Kriging
points (a)                 (%)      interpolation (%)     (%)      interpolation (%)
20                         3.9      15.6                  7.7      11.6
50                         1.8      11.9                  2.7      8.8
100                        0.9      11.0                  1.8      6.1
200                        0.6      9.3                   1.4      4.8
400                        0.4      6.7                   1.2      3.1
(a) Training points allocated using Latin hypercube sampling (Beachkofski and Grandhi 2002)
$\{f(x^{k_{\min}(x)}, \omega_1), \ldots, f(x^{k_{\min}(x)}, \omega_m)\}$ onto the frequency interval [ω1, ωm]. This interpolation is necessary because the original frequency sweep is a discrete set.
The operation and performance of GSPRP are demonstrated using two examples of microstrip filters: the stacked slotted resonators bandpass filter (Huang et al. 2008) shown in Fig. 3.24a (Filter 1) and the microstrip bandpass filter with open stub inverter (Lee et al. 2000) shown in Fig. 3.24b (Filter 2). Design variables are x = [L1 L2 W1 S1 S2 d]T (Filter 1) and (Filter 2). Filter 1 is simulated in Sonnet em (Sonnet 2018) using a grid of 0.05 mm × 0.05 mm. Filter 2 is evaluated in Altair FEKO (Altair FEKO 2018) with the total mesh number of 432. The region of interest for Filter 1 is defined by the reference point x0 = [6 9.6 1 1 2 2]T mm and the region size δ = [0.8 0.8 0.2 0.2 0.4 0.4]T mm. For Filter 2, x0 = [24 10 2 0.6 0.2 0.5]T mm and δ = [2 2 1 0.4 0.1 0.4]T mm. Table 3.2 shows the average modeling error for the GSPRP model, as well as for kriging interpolation (Forrester and Keane 2009; Salleh et al. 2008), for training sets of 20 to 400 points. For both filters, the accuracy of the GSPRP model is better than the accuracy of the kriging surrogate for the corresponding number of training points. Figure 3.25 shows the high-fidelity and GSPRP model responses at selected test points for Filters 1 and 2. A comparison with kriging interpolation of high-fidelity model data reveals that GSPRP reaches a comparable error level with a training set that is several times smaller. The results are consistent for all considered examples.
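The GSPRP evaluation steps (3.20)–(3.25) can be sketched for a single design variable and a single characteristic point (K = 1) as follows. Several simplifications are assumptions of this example and are not part of the method as described: the notch-type response `f` is hypothetical, piecewise-linear interpolation replaces the kriging models of the characteristic points, and the level shifts at the two ends of the sweep are set to zero.

```python
import bisect

FREQS = [0.01 * k for k in range(401)]           # discrete sweep on [0, 4]

def f(x, w):
    """Hypothetical high-fidelity response: a notch at w = x with depth 10 + 5 x."""
    return -(10.0 + 5.0 * x) / (1.0 + ((w - x) / 0.2) ** 2)

def characteristic_point(resp):
    """Single characteristic point (K = 1): location and level of the response minimum."""
    k = min(range(len(resp)), key=resp.__getitem__)
    return FREQS[k], resp[k]

def interp(xs, ys, x):
    """Piecewise-linear interpolation of (xs, ys) at x, using the end intervals outside the range."""
    k = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    a = (x - xs[k]) / (xs[k + 1] - xs[k])
    return (1.0 - a) * ys[k] + a * ys[k + 1]

# Initialization: sample f at the training designs and store the characteristic points.
X_TRAIN = [1.0, 1.5, 2.0, 2.5, 3.0]
RESP = [[f(x, w) for w in FREQS] for x in X_TRAIN]
CHAR = [characteristic_point(r) for r in RESP]

def gsprp(x):
    """Predict the response at design x over FREQS following (3.20)-(3.25)."""
    # (3.20): characteristic point at x (linear interpolation stands in for kriging).
    w_x = interp(X_TRAIN, [p[0] for p in CHAR], x)
    lam_x = interp(X_TRAIN, [p[1] for p in CHAR], x)
    # (3.21): nearest training design.
    k = min(range(len(X_TRAIN)), key=lambda i: abs(X_TRAIN[i] - x))
    w_k, lam_k = CHAR[k]
    # (3.22): translation vector, padded with the (unshifted) ends of the sweep.
    src = [FREQS[0], w_k, FREQS[-1]]             # characteristic frequencies at x^kmin
    dst = [FREQS[0], w_x, FREQS[-1]]             # the same frequencies after translation
    lam_t = [0.0, lam_x - lam_k, 0.0]            # assumption: zero level shift at the ends
    out = []
    for w in FREQS:
        j = 0 if w <= dst[1] else 1
        a = (w - dst[j]) / (dst[j + 1] - dst[j])
        w_src = (1.0 - a) * src[j] + a * src[j + 1]
        level = (1.0 - a) * lam_t[j] + a * lam_t[j + 1]
        out.append(interp(FREQS, RESP[k], w_src) + level)   # (3.24)-(3.25)
    return out

s = gsprp(1.2)
```

The predicted response is the stored response at the nearest training design, stretched along the frequency axis and shifted in level so that its characteristic point lands where the auxiliary models place it. At a training design the translation vector vanishes and the stored response is reproduced.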
Fig. 3.25 High-fidelity (solid line) and GSPRP surrogate model responses (o) obtained for 100 base points at the selected test designs: (a) Filter 1, (b) Filter 2 (Koziel and Leifsson 2012). Both panels show |S21| [dB] versus frequency [GHz]
3.5 Feature-Based Modeling
Surrogate modeling of high-frequency structures entails handling vector-valued outputs, typically frequency characteristics (Koziel and Ogurtsov 2019). In microwave engineering, these are often scattering parameters such as return loss or transmission responses (Pozar 2012), whereas in antenna design, the outputs of interest may include reflection characteristics (Koziel and Bekasiewicz 2017b), gain (Manshari et al. 2019), or axial ratio (for circularly polarized antennas; Nosrati and Tavassolian 2017). Typically, the responses are highly nonlinear functions of both the frequency and the geometry and/or material parameters of the structure at hand. This poses significant challenges when constructing surrogate models. Feature-based modeling (Koziel and Bekasiewicz 2015; Koziel and Bekasiewicz 2018) works around these issues by shifting the focus from the entire characteristics to a set of carefully defined points, so-called response features, which are sufficient for the particular design task considered, yet depend on the structure parameters in a much less nonlinear manner than the original characteristics (Koziel and Bekasiewicz 2017a). Response feature modeling belongs to the category of physics-based models because the allocation of the feature points is strictly related to the particular characteristics of the structure (e.g., resonances of the antenna or the poles of the microwave filter), which are, in turn, determined by the physical operation of the device.

In this section, the concept of feature-based modeling is briefly outlined and illustrated. First, a construction of local surrogates for the purpose of statistical analysis is discussed, followed by quasi-global modeling of antenna input characteristics.
3.5.1 Feature-Based Modeling for Statistical Analysis
Reliable design of microwave components and circuits has to account for manufacturing tolerances and uncertainties. In many cases, the objective is a robust design, i.e., maximization of the probability that the fabricated structure satisfies given performance specifications under assumed deviations from the nominal values of geometry and/or material parameters (yield-driven design or design centering; see, e.g., Bandler et al. 1976a, b; Abdel-Malek and Bandler 1978; Styblinski and Opalski 1986; Scotti et al. 2005). In this context, statistical analysis and yield estimation are indispensable steps of the design process (Bandler and Chen 1988; Biernacki et al. 2012; Swidzinski and Chang 2000).

Rapid and reliable statistical analysis and yield estimation of EM-simulated microwave structures can be realized using a feature-based surrogate. Below, this is illustrated using filters (Koziel and Bandler 2015). The response features are chosen so that they uniquely determine whether or not the structure satisfies given performance requirements. The approximation model is constructed using a few training designs (and, consequently, only a few EM simulations of the structure are necessary for its setup), the number of which grows only linearly with the dimensionality of the design space.

Let $x^0 = [x_1^0 \; x_2^0 \ldots x_n^0]^T$ be a nominal design (typically, an optimum design with respect to given performance specifications). It is assumed that due to manufacturing uncertainties, the actual parameters of the fabricated device are $x^0 + dx$, where the random deviation $dx = [dx_1 \; dx_2 \ldots dx_n]^T$ is described by a given probability distribution, such as a Gaussian distribution with zero mean and a certain standard deviation or a uniform distribution with specified lower and upper bounds, e.g., $dx_k \in [-d_{k.\max}, d_{k.\max}]$, $k = 1, \ldots, n$. Let us define an auxiliary function H(x) as follows (Bandler and Chen 1988)

\[
H(x) = \begin{cases} 1 & \text{if } f(x) \text{ satisfies the design specifications} \\ 0 & \text{otherwise} \end{cases} \tag{3.26}
\]

Fig. 3.26 Reflection response of the bandpass filter (solid line) at the optimum design with respect to given minimax specifications (marked with horizontal lines), as well as the response at a perturbed design (- - -). Circles and squares denote feature points for both responses corresponding to the –1 and –20 dB levels as well as the response maxima in the passband. Design specifications are |S11| ≤ –20 dB for 10.55–11.45 GHz and |S11| ≥ –1 dB for frequencies lower than 10.3 GHz and higher than 11.7 GHz (Koziel and Bandler 2015). The plot shows |S11| versus frequency (GHz)

Then, the yield at the nominal design $x^0$ can be estimated as

\[
Y(x^0) = \frac{\sum_{j=1}^{N} H(x^0 + dx^j)}{N}, \tag{3.27}
\]
where $dx^j$, $j = 1, \ldots, N$, are random vectors sampled according to the assumed probability distribution. Evaluating (3.27) by means of multiple simulations of the perturbed nominal design may be extremely expensive, particularly because reliable yield estimation requires a large number of samples (typically, a few hundred or more). In the case of small yield values, the number of samples has to be even larger (a few thousand or more) in order to avoid high variance of the estimator.

The yield can instead be estimated using the so-called feature points. The concept was introduced in Koziel and Szczepanski (2011) in the context of the SPRP technique. Let us consider |S11| responses of a bandpass filter (cf. Fig. 3.26). The plot shows the response at the nominal design (i.e., a typically desired minimax optimum with respect to the design specifications marked with horizontal lines, with |S11| ≤ –20 dB for 10.55–11.45 GHz and |S11| ≥ –1 dB for frequencies lower than 10.3 GHz and higher than 11.7 GHz), as well as a set of feature points, in this case the points at the –1 and –20 dB levels and the peaks of the response in the passband. The location of these points is sufficient to determine whether the response violates or satisfies the given design specifications. In particular, assuming small design perturbations, the feature points corresponding to –1 and –20 dB may move toward lower or higher frequencies, violating (in some cases) the specifications regarding passband and/or stopband frequencies; the feature points corresponding to |S11| maxima in the passband may move up, leading to a violation of the |S11| ≤ –20 dB requirement. The choice of feature points for a given problem is straightforward. Figure 3.26 also shows the response and the corresponding feature points at a perturbed design
(which, in this case, violates the specifications). As indicated in Koziel (2012), modeling feature points is easier than constructing response surfaces for the entire responses. This is because the dependence of both the frequency and vertical locations of those points on the respective designable parameters is much less nonlinear than for the S-parameters modeled (conventionally) as functions of frequency. As a result, only a limited number of training samples is necessary for creating such models, particularly if only local approximations are of interest (i.e., around the nominal design). It should also be emphasized that an accurate prediction of the entire response of the structure is not attempted. The focus is on the critical parts of the response where the design specifications can potentially be violated. This significantly simplifies the modeling process.

In order to construct the model, let us consider 2n + 1 evaluations of the original model f: at the nominal design, $f(x^0)$, and at the perturbed designs $x^k = [x_{0.1} \ldots x_{0.|k|} + \mathrm{sign}(k)\,\delta_{|k|} \ldots x_{0.n}]^T$, $k = -n, \ldots, -1, 1, \ldots, n$, where $\delta_k$ may be, e.g., a maximum assumed deviation of the kth parameter from its nominal value. The feature points of the response vector $f(x^k)$ are denoted as $p_k^j = [\omega_k^j \; \lambda_k^j]^T$, $j = 1, \ldots, K$, where ω and λ are the frequency and magnitude components of the respective point and K is the total number of feature points. The aim is to predict the position of the feature points corresponding to a perturbed vector $x^0 + dx$, using the available training set $\{x^k; p_{-n}^j, \ldots, p_{-1}^j, p_0^j, p_1^j, \ldots, p_n^j\}$. For any given dx, a subset $X_S$ of the base set $\{x^k\}$ is found that defines an area containing $x^0 + dx$. The surrogate model is set up using all the points from $X_S$, as shown in Fig. 3.27 for n = 2. Without loss of generality, it can be assumed that $X_S = \{x^0, x^1, \ldots, x^n\}$. Let us define

\[
p^j = p_0^j + \beta_1 \left(p_1^j - p_0^j\right) + \beta_2 \left(p_2^j - p_0^j\right) + \ldots + \beta_n \left(p_n^j - p_0^j\right), \tag{3.28}
\]
where $\beta_1, \beta_2, \ldots, \beta_n$ determine a unique representation of dx using the vectors $v_i = x^i - x^0$, $i = 1, \ldots, n$. The coefficients $\beta_i$ can be explicitly found as

\[
[\beta_1 \; \beta_2 \ldots \beta_n]^T = [v_1 \; v_2 \ldots v_n]^{-1} \left(x - x^0\right). \tag{3.29}
\]

Fig. 3.27 Deviation vector dx and its expansion using star-distributed training vectors $x^0$ and $x^k$, k = –2, –1, 1, 2 (denoted as •) (Koziel and Bandler 2015). The shaded area denotes an area defined by a subset $X_S$ of points being the closest to $x^0 + dx$, which is represented as a linear combination of vectors $x^k - x^0$. The feature points at $x^0 + dx$ are calculated using the coefficients of this linear combination and the feature points of $f(x^k)$ for $x^k \in X_S$, with $p^j = p_0^j + \beta_1 (p_1^j - p_0^j) + \beta_2 (p_2^j - p_0^j)$
The approximation model of the feature points can be defined as

\[
s(x^0 + dx) = \left[ p^1(x^0 + dx) \; \ldots \; p^K(x^0 + dx) \right] \tag{3.30}
\]

where $p^j(x^0 + dx) = [\omega^j(x^0 + dx) \; \lambda^j(x^0 + dx)]^T$, $j = 1, \ldots, K$. Having $s(x^0 + dx)$, one can estimate yield in a way similar to (3.26) and (3.27). The fundamental difference is that the satisfaction/violation of the design specification frequencies/levels is verified for the feature points only rather than for the entire responses.

It should be emphasized that using response features for estimating yield rather than constructing, e.g., a linear model of the entire S-parameter response, is critical to accuracy. Let us consider a simple, first-order Taylor expansion of the filter model around the nominal design $x^0$

\[
f_L(x) = f(x^0) + J_f(x^0) \cdot \left(x - x^0\right), \tag{3.31}
\]
where $J_f(x^0)$ is an estimated Jacobian of f at the nominal design. The estimate can be obtained using evaluations of f at the perturbed designs $x^k$. Figure 3.28 shows an S-parameter prediction obtained by evaluating a linear surrogate $f_L$ constructed from the filter responses evaluated for the same training set used for the feature-based model. The lack of accuracy coming from the very sharp responses (as functions of frequency) is reflected in underestimated yield predictions. This indicates that the feature-based yield estimation, although based on the same data set, is fundamentally different from simple linear modeling. It should be mentioned that a number of sophisticated methods for parametric macromodeling (e.g., Ferranti et al. 2009; Ferranti et al. 2011) or stochastic macromodeling (e.g., Sumant et al. 2010, 2012) can be found in the literature that allow for avoiding the presence of abnormal responses of the simple linear model (3.31) through, e.g., passivity enforcement. Here, the model (3.31) was only used in order to indicate that the "naive" utilization of the small data set exploited by the feature-based model leads to very poor predictions.

Fig. 3.28 Fifth-order bandpass filter of Section III.A: the filter response at the nominal design (solid line) and the response obtained from a linear model (3.31) constructed using a perturbed design (- - -) (at the selected reference design) (Koziel and Bandler 2015). The spikes that appear due to the linear modeling of sharp responses lead to considerable yield underestimation (cf. Table 3.3). The plot shows |S11| versus frequency (GHz)

Consider the X-band waveguide filter with nonsymmetrical irises (Hauth et al. 1993) shown in Fig. 3.29. The design variables are $x = [z_1 \; z_2 \; z_3 \; d_1 \; d_2 \; d_3 \; t_1 \; t_2 \; t_3]^T$. The filter is simulated in CST Microwave Studio (CST 2018) (~140,000 tetrahedrons, simulation time about 8 min). The nominal design, $x^0 = [12.08 \; 14.21 \; 14.69 \; 13.98 \; 11.69 \; 10.81 \; 1.55 \; 3.07 \; 2.46]^T$ mm, is a minimax optimum with respect to the following design specifications: |S11| ≤ –20 dB for 10.55 GHz ≤ ω ≤ 11.45 GHz and |S11| ≥ –1 dB for ω ≤ 10.4 GHz and ω ≥ 11.7 GHz. The minimax optimization is understood as minimization of a maximum violation of the aforementioned design specifications within the respective frequency sub-bands.

Fig. 3.29 Fifth-order waveguide bandpass filter (Hauth et al. 1993)

Yield estimation has been carried out for four scenarios of geometry parameter deviations: a uniform probability distribution with a maximum deviation of 0.01 mm and 0.02 mm (Cases 1 and 2) and a normal distribution with zero mean and a standard deviation of 0.01 mm and 0.02 mm (Cases 3 and 4). The deviations are taken as uncorrelated. The yield has been estimated as in (3.26) and (3.27), using the eight feature points shown in Fig. 3.26. For comparison, the yield was also estimated using conventional Monte Carlo (MC) analysis with 500 random samples (the number of samples is limited due to the computational cost of the EM simulation). The results are shown in Table 3.3. Figure 3.30 presents a visualization of the yield estimation for Case 2. The agreement between the yield estimates obtained using response features and conventional Monte Carlo analysis is excellent. As a matter of fact, the results obtained using the feature-based approach are more reliable than MC: the uncertainty in the latter is relatively large due to the small number of samples used in the process to keep the cost low. Feature-based yield estimation was executed for N = 5,000.
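The procedure of this section, i.e., 2n + 1 training evaluations, the feature-point model (3.28)–(3.30), and the estimator (3.27), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_features` is a made-up analytical stand-in for the extraction of feature points from an EM-simulated |S11| response, n = 2, only three feature points are used, and all names and numerical values are hypothetical. Because the training vectors $v_i$ are axis-aligned, (3.29) reduces to $\beta_i = |dx_i| / \delta_i$ with the training point taken on the same side as $dx_i$.

```python
import random

def toy_features(x):
    """Hypothetical stand-in for feature extraction from an EM-simulated |S11|.

    Returns [(freq_GHz, level_dB), ...]: lower -20 dB crossing, upper -20 dB
    crossing, and the highest in-band peak. Made up for illustration only.
    """
    d1, d2 = x[0] - 12.0, x[1] - 14.0
    f_lo = 10.50 + 2.0 * d1 + 1.0 * d2 + 5.0 * d1 * d1
    f_hi = 11.50 + 2.0 * d1 - 1.5 * d2 - 5.0 * d2 * d2
    peak = -22.0 + 40.0 * (d1 - d2) + 2000.0 * d1 * d1
    return [(f_lo, -20.0), (f_hi, -20.0), (11.0, peak)]

def meets_specs(features):
    """H of (3.26), checked on the feature points only."""
    (f_lo, _), (f_hi, _), (_, peak) = features
    return f_lo <= 10.55 and f_hi >= 11.45 and peak <= -20.0

def build_training_set(x0, delta, extract):
    """2n + 1 evaluations: nominal design plus a +/- delta_k step per axis."""
    data = {0: extract(x0)}
    for k in range(1, len(x0) + 1):
        for sign in (+1, -1):
            xk = list(x0)
            xk[k - 1] += sign * delta[k - 1]
            data[sign * k] = extract(xk)
    return data

def predict_features(dx, delta, data):
    """Model (3.28): p^j = p_0^j + sum_i beta_i (p_i^j - p_0^j)."""
    p0 = data[0]
    pred = [list(p) for p in p0]
    for i, dxi in enumerate(dx, start=1):
        key = i if dxi >= 0 else -i        # training point on the same side as dx_i
        beta = abs(dxi) / delta[i - 1]     # (3.29) for axis-aligned vectors v_i
        for j, (pk, p0j) in enumerate(zip(data[key], p0)):
            pred[j][0] += beta * (pk[0] - p0j[0])
            pred[j][1] += beta * (pk[1] - p0j[1])
    return [tuple(p) for p in pred]

def estimate_yield(n_samples, delta, classify, seed=0):
    """Estimator (3.27) for uniform deviations dx_k in [-delta_k, delta_k]."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_samples):
        dx = [rng.uniform(-d, d) for d in delta]
        passed += classify(dx)             # True counts as 1
    return passed / n_samples

x0, delta = [12.0, 14.0], [0.02, 0.02]
data = build_training_set(x0, delta, toy_features)   # 2n + 1 = 5 "simulations"
y_model = estimate_yield(
    5000, delta, lambda dx: meets_specs(predict_features(dx, delta, data)))
y_direct = estimate_yield(
    5000, delta,
    lambda dx: meets_specs(toy_features([a + b for a, b in zip(x0, dx)])))
```

In this toy setting `y_model` uses five evaluations of the expensive function, whereas `y_direct` uses 5,000; with an actual EM solver only the former would be affordable.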
Table 3.3 Yield estimation: Fifth-order waveguide filter

Case  Distribution                   Yield estimation method   Estimated yield   CPU cost (a)
1     Uniform (max. dev. 0.01 mm)    Feature points            0.97              19
                                     EM-based Monte Carlo      0.97              500
                                     Linear modeling (b)       0.63              19
2     Uniform (max. dev. 0.02 mm)    Feature points            0.56              19
                                     EM-based Monte Carlo      0.55              500
                                     Linear modeling (b)       0.12              19
3     Gaussian (std. dev. 0.01 mm)   Feature points            0.69              19
                                     EM-based Monte Carlo      0.69              500
                                     Linear modeling (b)       0.19              19
4     Gaussian (std. dev. 0.02 mm)   Feature points            0.25              19
                                     EM-based Monte Carlo      0.24              500
                                     Linear modeling (b)       0.02              19

(a) Estimation cost in number of EM analyses. Feature-based yield estimation utilizes N = 5000 random samples
(b) Estimation based on a linear model of the S-parameter response around the nominal design
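The sampling uncertainty of the two estimates can be gauged with the standard binomial error formula, assuming independent samples. The worked numbers below use the Case 2 entries of Table 3.3; the formula covers the sampling error only and says nothing about the approximation error of the feature-point model itself.

\[
\sigma_Y \approx \sqrt{\frac{Y(1 - Y)}{N}}, \qquad
\sigma_Y \Big|_{Y = 0.55,\; N = 500} \approx \sqrt{\frac{0.2475}{500}} \approx 0.022, \qquad
\sigma_Y \Big|_{Y = 0.56,\; N = 5000} \approx \sqrt{\frac{0.2464}{5000}} \approx 0.007 .
\]

Under this assumption, the 500-sample Monte Carlo estimate carries a standard error of about two percentage points, roughly three times that of the 5000-sample feature-based estimate.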
Fig. 3.30 Fifth-order waveguide bandpass filter: yield estimation for Case 2 (Koziel and Bandler 2015). Gray lines correspond to 500 EM-simulated random samples for Monte Carlo analysis; circles represent corresponding feature points calculated using the approximation model (3.30). The plot shows |S11| versus frequency (GHz)
3.5.2 Feature-Based Modeling of Antenna Input Characteristics
In this section, modeling reflection responses of antenna structures is considered. The high-fidelity model is obtained from full-wave electromagnetic (EM) simulation and denoted as f(x). It represents the reflection coefficient |S11| of the antenna evaluated at m frequencies, $\omega_1$ to $\omega_m$, thus $f(x) = [f(x, \omega_1) \ldots f(x, \omega_m)]^T$. The objective is to build a replacement model (surrogate) s. The surrogate should represent the EM model over a given region X of the design space. The set of training samples is denoted as $X_B = \{x^1, x^2, \ldots, x^N\} \subset X$. The corresponding EM model responses $f(x^k)$ are acquired beforehand.

Fig. 3.31 Exemplary responses of the dielectric resonator antenna considered in Sect. 3.3 (reflection coefficient |S11| in dB). The responses are evaluated in the region 7.0 ≤ $a_x$ ≤ 9.0 and 13.0 ≤ $a_y$ ≤ 15.0 at the frequencies of 5.3 GHz (a), 5.5 GHz (b), and 5.7 GHz (c). Other variables are fixed to the following values: $a_z$ = 9, $a_c$ = 0, $u_s$ = 2, $w_s$ = 10, $y_s$ = 8 (all in mm) (Koziel and Bekasiewicz 2017a)

According to the conventional approach to data-driven surrogate construction, the responses $f(x, \omega_j)$, $j = 1, \ldots, m$, are approximated directly (either separately for each frequency or by treating the frequency as an additional input parameter of the model). The fundamental problem is nonlinearity of the responses, particularly for narrowband antennas (Koziel and Bekasiewicz 2017a; Koziel et al. 2016). Typical responses of such structures are shown in Fig. 3.31 (reflection responses of the dielectric resonator antenna of Fig. 3.34). For clarity, the responses are shown over the region 7.0 ≤ $a_x$ ≤ 9.0 and 13.0 ≤ $a_y$ ≤ 15.0 (see Fig. 3.34 for parameter explanation) at three frequencies: 5.3 GHz, 5.5 GHz, and 5.7 GHz. Accurate modeling of such nonlinear landscapes is only possible within limited parameter ranges and requires a large number of training samples.

The primary objective is to reduce the number of training data samples necessary to construct an accurate surrogate model (and thus to reduce the cost of training data acquisition). It is achieved by reformulating the modeling process and conducting it at the level of appropriately defined response features. Figure 3.32 clarifies the definition of the feature points in the case of narrowband antennas. The characteristic point set is constructed sequentially as follows: (i) identification of the primary point, which corresponds to the center frequency (antenna resonance) and the response level at that frequency; (ii) allocation of the supplemental points (in this case, uniformly with respect to the level and separately on the left- and right-hand side of the primary point); and (iii) allocation of the infill points uniformly in frequency in between the supplemental points. Clearly, one needs to ensure that the number of characteristic points is sufficient to allow a reliable synthesis of the antenna response (through interpolation).
Fig. 3.32 Definition of response features in the case of a narrowband antenna reflection characteristic (|S11| [dB] versus frequency [GHz]): the primary point (corresponding to the antenna resonance) is represented as O; □ represent supplemental points distributed equally with respect to response level; (○) denote infill points distributed equally in frequency between the main and supplemental points (note that the number of points may be different for various intervals) (Koziel and Bekasiewicz 2017a)
On the other hand, although it is important that the major features of the response (e.g., antenna resonance) are accounted for, the particular point allocation is not critical. The response features, once defined, can be easily extracted using simple postprocessing of the EM simulation results.

For subsequent considerations, the jth feature point of $f(x^k)$ ($j = 1, \ldots, K$, $k = 1, \ldots, N$) will be denoted as $f_k^j = [\omega_k^j \; \lambda_k^j]$. Let $\omega_k^j$ and $\lambda_k^j$ represent the frequency and the magnitude (level) components of $f_k^j$, respectively. For the sake of illustration, the frequency and level components of the selected feature points are shown in Fig. 3.33. The considered design space region is the same as the one shown in Fig. 3.31. Importantly, the functional landscapes of the feature points are not as nonlinear as those shown in Fig. 3.31. This is particularly the case for the frequency component. It is therefore to be expected that construction of a reliable surrogate model at the feature point level will require a smaller number of training samples than modeling the reflection response in a traditional manner.

Having the response features defined, the surrogate modeling process works as follows. First, the data-driven models $s_{\omega.j}(x)$ and $s_{\lambda.j}(x)$, $j = 1, \ldots, K$, of the sets of corresponding feature points are constructed using the available training designs, $\{f_1^j, f_2^j, \ldots, f_N^j\}$, $j = 1, \ldots, K$ (Koziel and Bekasiewicz 2017a). At this stage, kriging interpolation is utilized (Kleijnen 2009). The surrogate model itself is defined as

\[
s(x) = [s(x, \omega_1) \; \ldots \; s(x, \omega_m)]^T, \tag{3.32}
\]

where its jth component is given as (Koziel and Bekasiewicz 2017a)

\[
s(x, \omega_j) = I\left(\Omega(x), \Lambda(x), \omega_j\right). \tag{3.33}
\]
Fig. 3.33 Frequency (left panel, in GHz) and level (right panel, |S11| in dB) components as functions of geometry parameters $a_x$ and $a_y$ (with other variables fixed) for the three selected feature points (a), (b), and (c). The responses are evaluated over the same design space region as that considered in Fig. 3.31. Note that the functional landscapes of the feature point coordinates are considerably less nonlinear than those for original responses (cf. Fig. 3.31) (Koziel and Bekasiewicz 2017a)
In (3.33), $\Lambda(x) = [s_{\lambda.1}(x) \; s_{\lambda.2}(x) \ldots s_{\lambda.K}(x)]$ and $\Omega(x) = [s_{\omega.1}(x) \; s_{\omega.2}(x) \ldots s_{\omega.K}(x)]$ are the predicted feature point locations corresponding to the evaluation design x. Because the antenna response is to be evaluated at a discrete set of frequencies $\omega_1$ through $\omega_m$, it is necessary to interpolate both the level vector Λ and the frequency vector Ω into the response at that set of frequencies. This interpolation is represented as I(Ω, Λ, ω).

For demonstration purposes, let us consider a rectangular dielectric resonator antenna (DRA) (Sim et al. 2014). The antenna geometry is shown in Fig. 3.34. The dielectric resonator is implemented using a material with permittivity $\varepsilon_r$ = 10. The substrate is Rogers RO4003 (h = 0.5 mm, $\varepsilon_r$ = 3.3). The computational model of the antenna also includes a polycarbonate housing with dielectric permittivity of 2.8. There are seven geometry parameters describing the DRA structure, $x = [a_x \; a_y \; a_z \; a_c \; u_s \; w_s \; y_s]^T$. Furthermore, there are four fixed parameters: $d_x = d_y = d_z$ = 1 mm and $w_0$ = 1.15 mm. Interested readers may find a comprehensive discussion of dielectric resonator antennas in Petosa (2007).

Fig. 3.34 Dielectric resonator antenna (Sim et al. 2014): (a) 3D view of the structure with housing, (b) visualization of the structure cross section with highlight on mesh density, (c) top and (d) front views with marked dimensions

The computational model of the DRA is implemented and simulated in the full-wave electromagnetic solver CST Microwave Studio (CST 2018). The physics-based model of the structure, denoted as f(x), has been implemented in the package and then discretized using a tetrahedral mesh (~420,000 mesh cells). The discretization slice of the antenna is shown in Fig. 3.34b. The antenna response has been obtained using a time-domain EM solver. The average simulation time of the model on a dual Intel Xeon E5540 machine with 6 GB RAM is 19 min. The frequency range is 4 GHz to 7 GHz.

The objective is to construct the surrogate model of the reflection coefficient |S11|. The model domain is defined as $X = [x^{(0)} - \delta, x^{(0)} + \delta]$, with the center $x^{(0)} = [8.0 \; 14.0 \; 9.0 \; 0.0 \; 2.0 \; 10.0 \; 8.0]^T$ mm and size $\delta = [2.0 \; 2.0 \; 2.0 \; 2.0 \; 1.0 \; 2.0 \; 2.0]^T$ mm. An important characteristic of any surrogate modeling technique is its scalability with respect to the number of training samples. For the sake of model construction, six training sets of various sizes, from 20 to 800 samples, are utilized. The design of experiments technique selected for sample allocation is modified Latin hypercube sampling (LHS) (Beachkofski and Grandhi 2002). The error measure utilized for model verification is defined as $\|f(x) - s(x)\| / \|f(x)\|$ and expressed in percent. There is a separate set of 100 LHS-allocated testing samples for error estimation.
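The feature-level workflow of this section (steps (i)–(iii) of the point allocation, the interpolation I(Ω, Λ, ω) of (3.33), and the relative error measure) can be sketched on a toy response as follows. Everything in the block is an illustrative assumption: the Lorentzian-shaped `toy_s11_db` replaces the EM-simulated |S11|, the interpolation I is taken as piecewise linear, the sweep end points are added as extra points so that the synthesized response covers the whole frequency range, and the kriging models of the feature coordinates are omitted (the features are extracted and re-synthesized at a single design).

```python
import math

def toy_s11_db(freq, f0=5.5, half_width=0.15, depth=25.0):
    """Made-up single-resonance |S11| in dB (a Lorentzian dip), illustration only."""
    return [-depth / (1.0 + ((f - f0) / half_width) ** 2) for f in freq]

def interp(xq, xs, ys):
    """Piecewise-linear interpolation on increasing xs; end values held outside."""
    if xq <= xs[0]:
        return ys[0]
    if xq >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if xq <= xs[i]:
            t = (xq - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])

def crossing(freq, resp, level, idx):
    """Frequency at which resp crosses `level` within consecutive indices idx."""
    for a, b in zip(idx[:-1], idx[1:]):
        if (resp[a] - level) * (resp[b] - level) <= 0.0 and resp[a] != resp[b]:
            t = (level - resp[a]) / (resp[b] - resp[a])
            return freq[a] + t * (freq[b] - freq[a])
    return None

def extract_features(freq, resp, n_levels=6, n_infill=2):
    """Steps (i)-(iii): primary point, supplemental points, infill points."""
    i0 = min(range(len(resp)), key=resp.__getitem__)       # (i) resonance
    lam0, top = resp[i0], min(resp[0], resp[-1])
    pts = [(freq[0], resp[0]), (freq[i0], lam0), (freq[-1], resp[-1])]
    left, right = list(range(i0 + 1)), list(range(i0, len(resp)))
    for q in range(1, n_levels + 1):                        # (ii) uniform in level
        level = lam0 + (top - lam0) * q / (n_levels + 1)
        for side in (left, right):
            fc = crossing(freq, resp, level, side)
            if fc is not None:
                pts.append((fc, level))
    pts.sort()
    infill = []                                             # (iii) uniform in frequency
    for (fa, _), (fb, _) in zip(pts[:-1], pts[1:]):
        for r in range(1, n_infill + 1):
            fi = fa + (fb - fa) * r / (n_infill + 1)
            infill.append((fi, interp(fi, freq, resp)))
    return sorted(pts + infill)

def synthesize(features, freq):
    """I(Omega, Lambda, omega) of (3.33), here as piecewise-linear interpolation."""
    omega = [p[0] for p in features]
    lam = [p[1] for p in features]
    return [interp(f, omega, lam) for f in freq]

def relative_error(f_resp, s_resp):
    """Error measure ||f - s|| / ||f|| used for model verification."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_resp, s_resp)))
    den = math.sqrt(sum(a * a for a in f_resp))
    return num / den

freq = [4.0 + 0.01 * k for k in range(301)]                 # 4 GHz to 7 GHz
resp = toy_s11_db(freq)
features = extract_features(freq, resp)
error_percent = 100.0 * relative_error(resp, synthesize(features, freq))
```

In a full model, each coordinate of `features` would be predicted by its own data-driven model of x before `synthesize` is called; the residual error of the synthesis step alone indicates whether the number of characteristic points is sufficient.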
Table 3.4 Modeling results of the DRA (Koziel and Bekasiewicz 2017a)

                              Average error (%) for the given number of training points
Model                         20      50      100     200     400     800
Feature-based surrogate       23.4    10.8    7.8     6.7     3.9     3.0
Kriging interpolation (a)     43.2    36.9    29.1    24.0    11.1    8.0

(a) Direct kriging interpolation of high-fidelity model |S11| responses
Fig. 3.35 High-fidelity (solid line) and feature-based model (set up with 400 high-fidelity training points) (o) at the selected test designs. The plot shows |S11| [dB] versus frequency [GHz]
The model error values averaged over the testing set are gathered in Table 3.4. For additional verification, the feature-based surrogate has been compared to conventional modeling of |S11|. Kriging interpolation has been selected as a representative benchmark method. The first observation is that the feature-based technique allows for a considerable reduction of the training data set while ensuring comparable predictive power of the surrogate. Furthermore, even for the largest considered training set (800 samples), the accuracy of the conventional model is still insufficient for conducting practical design tasks such as parametric optimization. At the same time, the accuracy of the feature-based model is already acceptable when constructed using 200 samples. Figure 3.35 shows the EM-simulated antenna responses and the surrogate model responses for several selected test designs. Visual inspection indicates that the alignment between the surrogate and the electromagnetic model is excellent.

An additional verification has been conducted by using the feature-based model for design optimization of the DRA. The design objective is to minimize the antenna reflection (i.e., |S11|) in the frequency range of 0.2 GHz around a given operating frequency. The maximum acceptable level is –10 dB. Three cases with three different operating frequencies are considered: 5.0 GHz, 5.5 GHz, and 6.0 GHz. The initial design is the same for all instances: $x^{init} = [8.0 \; 12.0 \; 8.0 \; 0.0 \; 1.0 \; 11.0 \; 7.0]^T$.
Fig. 3.36 Application of feature-based surrogate for antenna optimization: responses of the high-fidelity model at the initial design (- - -) and at the design obtained by optimizing the feature-based surrogate (solid line). The surrogate model response is marked as (o). There are three cases considered, corresponding to the operating frequency of 5.0 GHz (a), 5.5 GHz (b), and 6.0 GHz (c) (Koziel and Bekasiewicz 2017a). All panels show |S11| [dB] versus frequency [GHz]
The responses of the optimized antenna for all three cases are shown in Fig. 3.36. It can be observed that the responses of the EM simulation model satisfy the prescribed specifications and are well aligned with the responses of the surrogate model. Therefore, no further correction is necessary.
3.6 Physics-Based Surrogates for Optimization
One of the most important applications of physics-based models is (local) surrogate-assisted optimization (Koziel et al. 2016; Feng et al. 2019; Koziel and Leifsson 2013a; Koziel et al. 2008; Robinson et al. 2008; Eldred and Dunlavy 2006; Leifsson et al. 2014b). The considered problem is of the form

\[
x^* = \arg\min_{x \in X} U(f(x)), \tag{3.34}
\]

where f(x) is the (generally vector-valued) output of the high-fidelity model, U is a scalar objective function, and $x \in X \subseteq R^n$. The surrogate-assisted algorithm is an iterative procedure producing a series $x^{(i)}$, $i = 0, 1, 2, \ldots$, of approximations to $x^*$ as (a vector-valued version of (3.1))

\[
x^{(i+1)} = \arg\min_{x} U\left(s^{(i)}(x)\right), \tag{3.35}
\]

where $s^{(i)}$ is the surrogate model at the iteration i. The starting point for solving (3.35) is $x^{(i)}$, and the surrogate is supposed to be at least zero-order consistent with the high-fidelity model, i.e., $s^{(i)}(x^{(i)}) = f(x^{(i)})$ (Alexandrov et al. 1998). First-order consistency is normally required to ensure the algorithm convergence in classical terms; however, the problem-specific knowledge embedded in the underlying low-fidelity model (from which the surrogate is constructed) is often sufficient to ensure satisfactory performance (Koziel and Leifsson 2013a; Koziel 2010a; Robinson et al. 2008).

This section gives a brief exposition of several modeling techniques in this particular context. Their common feature is that the surrogate is usually constructed using a single high-fidelity model sample (typically evaluated at the most recent design along the optimization path). The specific methods discussed here include space mapping (SM), approximation model management optimization (AMMO), manifold mapping (MM), shape-preserving response prediction (SPRP), adaptively adjusted design specifications (AADS), feature-based optimization (FBO), and adaptive response scaling (ARS).
3.6.1 Space Mapping
Space mapping (SM) (Bandler et al. 1994, 2004; Koziel et al. 2006a) refers to a family of design optimization techniques originally developed for solving expensive problems in computational electromagnetics, in particular, microwave engineering. In recent years, SM has also found applications across other engineering disciplines (Redhe and Nilsson 2004; Bandler et al. 2004; Priess et al. 2011; Marheineke et al. 2012; Koziel and Leifsson 2012; Tu et al. 2013; Echeverria and Hemker 2005; Feng et al. 2003; Feng and Huang 2003; Crevecoeur et al. 2009). The initial versions of SM were exclusively based on the transformation of the low-fidelity model domain (aggressive SM, input SM; Bandler et al. 1995, 1994), which was sufficient for handling many engineering problems, especially in electrical engineering, where the ranges of both the low- and high-fidelity model responses are similar. Other SM variations were developed to handle situations where the low- and high-fidelity models are severely misaligned in terms of the response levels (output SM; Bandler et al. 2003, 2004; Koziel et al. 2014, 2006a, 2008; Robinson et al. 2008, see also Sect. 3.4).

Original SM assumes the existence of a mapping P between the high- and low-fidelity model domains (Bandler et al. 1994), such that $x_c = P(x_f)$ and $c(P(x_f)) \approx f(x_f)$. Given P, the direct solution of the original problem (3.1) can be replaced by finding $x_f^{\#} = P^{-1}(x_c^*)$. Here, $x_c^*$ is the optimal design of c defined as $x_c^* = \arg\min\{x_c : U(c(x_c))\}$; $x_f^{\#}$ can be considered as a reasonable estimate of $x_f^*$. In other words, the problem (3.34) can be reformulated as

\[
x_f^* = \arg\min_{x_f} U\left(c(P(x_f))\right), \tag{3.36}
\]

where $c(P(x_f))$ is a surrogate model. However, P is not given explicitly: it can only be evaluated at any $x_f$ by means of the parameter extraction (PE) procedure, $P(x_f) = \arg\min\{x_c : \|f(x_f) - c(x_c)\|\}$. One of the practical issues is a possible nonuniqueness of the solution to (3.36) (Bandler et al. 1995). Another issue is the assumption on the similarity of high- and low-fidelity model ranges, which is a very strong assumption (Alexandrov and Lewis 2001). These and other issues led to numerous improvements, including parametric SM, which is outlined later in this section.

Aggressive SM (ASM) (Bandler et al. 1995) is one of the first versions of SM and probably the most popular one in microwave engineering (Sans et al. 2014; Rayas-Sanchez 2016). Assuming uniqueness of $x_c^*$, the solution to (3.36) is equivalent to reducing the residual vector $\mathbf{f} = \mathbf{f}(x_f) = P(x_f) - x_c^*$ to zero (the residual is set in bold here to distinguish it from the high-fidelity model f). The first step of the ASM algorithm is to find $x_c^*$. Next, ASM iteratively solves the nonlinear system $\mathbf{f}(x_f) = 0$ for $x_f$. At the jth iteration, the calculation of the error vector $\mathbf{f}^{(j)}$ requires an evaluation of $P^{(j)}(x_f^{(j)}) = P(x_f^{(j)}) = \arg\min\{x_c : \|f(x_f^{(j)}) - c(x_c)\|\}$. The quasi-Newton step in the high-fidelity model space is given by

\[
B^{(j)} h^{(j)} = -\mathbf{f}^{(j)}, \tag{3.37}
\]
where $B^{(j)}$ is the approximation of the space mapping Jacobian $J_P = J_P(x_f) = [\partial P^T/\partial x_f]^T = [\partial (x_c^T)/\partial x_f]^T$. Solving (3.37) for $h^{(j)}$ gives the next iterate $x_f^{(j+1)} = x_f^{(j)} + h^{(j)}$. The algorithm terminates if $\|\mathbf{f}^{(j)}\|$ is sufficiently small. The output of the algorithm is an approximation to $x_f^{\#} = P^{-1}(x_c^*)$. A popular way of obtaining the matrix $B$ is through a rank-one Broyden update (Broyden 1965) of the form
\[ B^{(j+1)} = B^{(j)} + \frac{\big(\mathbf{f}^{(j+1)} - \mathbf{f}^{(j)} - B^{(j)} h^{(j)}\big)\, h^{(j)T}}{\|h^{(j)}\|^2}. \tag{3.38} \]
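The ASM iteration (3.37) and (3.38) can be sketched in a few lines of Python. The sketch below is illustrative only: the function `mapping_P` is a hypothetical closed-form stand-in for the parameter extraction step (which in practice requires solving a nonlinear regression problem involving an EM simulation), the problem has two design variables, and the helper names are not taken from the cited literature.

```python
def mapping_P(xf):
    """Hypothetical stand-in for the parameter extraction result P(x_f).

    In practice P(x_f) = argmin over x_c of ||f(x_f) - c(x_c)|| is computed
    numerically; a closed-form toy mapping keeps the sketch self-contained.
    """
    return [xf[0] + 0.1 * xf[1] ** 2 + 0.3,
            0.05 * xf[0] + 0.9 * xf[1] - 0.2]


def solve_2x2(B, rhs):
    """Solve B h = rhs for a 2 x 2 matrix B using Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(rhs[0] * B[1][1] - B[0][1] * rhs[1]) / det,
            (B[0][0] * rhs[1] - B[1][0] * rhs[0]) / det]


def aggressive_sm(xc_opt, tol=1e-10, max_iter=50):
    """Drive the residual P(x_f) - xc_opt to zero using Broyden updates."""
    xf = list(xc_opt)                      # start from x_f = x_c*
    B = [[1.0, 0.0], [0.0, 1.0]]           # initial Jacobian estimate (identity)
    res = [p - t for p, t in zip(mapping_P(xf), xc_opt)]
    for _ in range(max_iter):
        if max(abs(r) for r in res) < tol:
            break
        h = solve_2x2(B, [-r for r in res])            # quasi-Newton step (3.37)
        xf = [x + d for x, d in zip(xf, h)]
        new_res = [p - t for p, t in zip(mapping_P(xf), xc_opt)]
        Bh = [B[r][0] * h[0] + B[r][1] * h[1] for r in range(2)]
        hh = h[0] ** 2 + h[1] ** 2
        for r in range(2):                             # rank-one update (3.38)
            num = new_res[r] - res[r] - Bh[r]
            for col in range(2):
                B[r][col] += num * h[col] / hh
        res = new_res
    return xf, res


xf_sharp, residual = aggressive_sm([1.0, 2.0])
```

Each loop pass needs only one new evaluation of the mapping, which is why ASM is attractive when every evaluation of $P$ involves a high-fidelity simulation.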
Several improvements of the ASM algorithm have been proposed (Bakr et al. 1998, 1999). ASM remains the most popular SM approach in microwave engineering (Rayas-Sanchez 2016).
Parametric SM is another type of space mapping. It is more generic and is characterized by an explicit analytical form of the surrogate model. In parametric SM, the optimization algorithm is the iterative process (3.35). Two simple examples, input and implicit SM, were shown in Sect. 3.4. In general, the input SM surrogate model can take the form (Koziel et al. 2006a)
\[ s^{(i)}(x) = c\big(B^{(i)} x + q^{(i)}\big). \tag{3.39} \]
The matrix $B^{(i)}$ and the vector $q^{(i)}$ are obtained by minimizing the misalignment between the surrogate and the high-fidelity model as
\[ \big[B^{(i)}, q^{(i)}\big] = \arg\min_{[B,\,q]} \sum_{k=0}^{i} w_{i.k} \big\| f\big(x^{(k)}\big) - c\big(B x^{(k)} + q\big) \big\|. \tag{3.40} \]
The problem (3.40) is, in fact, a nonlinear regression task, with $w_{i.k}$ being the weighting factors. Common choices are $w_{i.k} = 1$ for all $i$ and all $k$ (all previous designs contribute equally) or $w_{i.i} = 1$ and $w_{i.k} = 0$ for $k < i$ (the surrogate depends on the most recent design only).
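As an illustration of the extraction problem (3.40), the following Python sketch fits a scalar $B$ and $q$ for a one-variable toy problem. Everything in it is an assumption made for the example: the models `coarse` and `fine` are hypothetical analytical functions, the misalignment is measured with the squared Euclidean norm (chosen here so that the objective is smooth), and a basic coordinate pattern search stands in for whatever regression solver is used in practice.

```python
FREQS = [0.0, 1.0, 2.0, 3.0]               # hypothetical frequency samples


def coarse(z):
    """Hypothetical low-fidelity model: resonance-like response centered at z."""
    return [1.0 / (1.0 + (w - z) ** 2) for w in FREQS]


def fine(x):
    """Hypothetical high-fidelity model: a distorted copy of the coarse one."""
    return coarse(1.1 * x + 0.2)


def compass_search(objective, start, step=0.1, tol=1e-6, max_evals=50000):
    """Minimal derivative-free pattern search along the coordinate directions."""
    point, best = list(start), objective(start)
    evals = 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                evals += 1
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step *= 0.5                    # refine the mesh after a failed poll
    return point, best


designs = [0.5, 1.0, 1.5]                  # x^(0), x^(1), x^(2)
weights = [1.0, 1.0, 1.0]                  # all designs contribute equally
targets = [fine(x) for x in designs]       # high-fidelity data, computed once


def misalignment(params):
    """Weighted misfit of (3.40), using squared norms (an assumption)."""
    B, q = params
    total = 0.0
    for x, w, target in zip(designs, weights, targets):
        model = coarse(B * x + q)
        total += w * sum((a - b) ** 2 for a, b in zip(target, model))
    return total


(B_fit, q_fit), misfit = compass_search(misalignment, [1.0, 0.0])
```

Setting `weights = [0.0, 0.0, 1.0]` would correspond to the second choice mentioned above, in which only the most recent design is used; with a single design, however, $B$ and $q$ are no longer uniquely determined in this toy problem.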
In general, the SM surrogate model is constructed as follows:
\[ s^{(i)}(x) = s\big(x, p^{(i)}\big), \tag{3.41} \]
where $s$ is a generic SM surrogate model, i.e., the low-fidelity model $c$ composed with suitable (usually linear) transformations. The parameters $p$ are obtained in an extraction process similar to (3.40). More information about specific SM surrogates can be found in the literature (Bandler et al. 2004; Koziel et al. 2006a).
Despite its simplicity and demonstrated efficiency, some practical issues still prevent SM from being widely accepted by industry designers. These include (i) finding an appropriate balance between the quality and the speed of the underlying low-fidelity model and (ii) selecting appropriate SM transformations (Koziel and Bandler 2007c, d). Resolving these issues is not a trivial task (Koziel et al. 2008). In fact, similar issues are common to the majority of physics-based SBO methods. On the other hand, the efficacy of SM has been demonstrated by solving design problems in various engineering disciplines (Redhe and Nilsson 2004; Bandler et al. 2004; Marheineke et al. 2012; Echeverria and Hemker 2005; Feng and Huang 2003; Crevecoeur et al. 2009). A number of enhancements of SM algorithms have been suggested to alleviate some of the difficulties, such as potential convergence problems (Koziel et al. 2010a, c).
3.6.2 Approximation Model Management Optimization
AMMO (Alexandrov and Lewis 2001) is a simple algorithmic framework in which the surrogate model is constructed through a response correction. It exploits sensitivity data to ensure first-order consistency conditions, in particular, $s^{(i)}(x^{(i)}) = f(x^{(i)})$ and $\nabla s^{(i)}(x^{(i)}) = \nabla f(x^{(i)})$ (note that scalar model outputs are assumed here). Additionally, AMMO utilizes the trust-region methodology (Conn et al. 2000) to guarantee convergence of the optimization process to the high-fidelity model optimum. Assuming that $\beta(x) = f(x)/c(x)$ is the correction function and
\[ \beta_i(x) = \beta\big(x^{(i)}\big) + \nabla\beta\big(x^{(i)}\big)^T \big(x - x^{(i)}\big), \tag{3.42} \]
the surrogate model is defined as
\[ s^{(i)}(x) = \beta_i(x)\, c(x). \tag{3.43} \]
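The surrogate (3.43) can be shown to satisfy the aforementioned consistency conditions: $\beta_i(x^{(i)}) = \beta(x^{(i)})$ gives $s^{(i)}(x^{(i)}) = f(x^{(i)})$, and the product rule gives $\nabla s^{(i)}(x^{(i)}) = \nabla\beta(x^{(i)})\, c(x^{(i)}) + \beta(x^{(i)})\, \nabla c(x^{(i)}) = \nabla f(x^{(i)})$. Evaluating $\nabla\beta$ in (3.42) requires both the low- and high-fidelity model derivatives.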
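The consistency conditions can also be checked numerically. The short Python sketch below builds the AMMO surrogate for a single design variable; the analytical models `f` and `c` and their derivatives are hypothetical and serve only to make the example self-contained.

```python
import math


def f(x):
    """Hypothetical high-fidelity model (scalar output)."""
    return math.exp(0.5 * x) + 0.2 * x * x


def df(x):
    """Derivative of the hypothetical high-fidelity model."""
    return 0.5 * math.exp(0.5 * x) + 0.4 * x


def c(x):
    """Hypothetical low-fidelity model."""
    return 1.0 + 0.4 * x


def dc(x):
    """Derivative of the hypothetical low-fidelity model."""
    return 0.4


def make_ammo_surrogate(xi):
    """Return s(x) = beta_i(x) * c(x) built at the design xi, cf. (3.42)-(3.43)."""
    beta = f(xi) / c(xi)
    # Quotient rule for beta = f / c; needs derivatives of both models.
    dbeta = (df(xi) * c(xi) - f(xi) * dc(xi)) / c(xi) ** 2

    def surrogate(x):
        return (beta + dbeta * (x - xi)) * c(x)

    return surrogate


xi = 1.0
s = make_ammo_surrogate(xi)
step = 1e-6
ds_numeric = (s(xi + step) - s(xi - step)) / (2.0 * step)   # central difference
```

At `xi` the surrogate reproduces both the value `f(xi)` and the slope `df(xi)`; away from `xi` the two models generally differ, which is why the trust-region safeguard is needed.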
3.6.3 Manifold Mapping
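Manifold mapping (MM) (Echeverria and Hemker 2005, 2008) is a response correction technique capable of comprehensive exploitation of the available high-fidelity model data. Here, the basic version of MM described in (Echeverria and Hemker 2005) is discussed. The MM surrogate model is defined as
\[ s^{(i)}(x) = f\big(x^{(i)}\big) + S^{(i)} \big[c(x) - c\big(x^{(i)}\big)\big], \tag{3.44} \]
with $S^{(i)}$ being the $m \times m$ correction matrix, defined as
\[ S^{(i)} = \Delta F \cdot \Delta C^{\dagger}, \tag{3.45} \]
where
\[ \Delta F = \big[ f\big(x^{(i)}\big) - f\big(x^{(i-1)}\big) \;\; \ldots \;\; f\big(x^{(i)}\big) - f\big(x^{(\max\{i-n,0\})}\big) \big], \tag{3.46} \]
\[ \Delta C = \big[ c\big(x^{(i)}\big) - c\big(x^{(i-1)}\big) \;\; \ldots \;\; c\big(x^{(i)}\big) - c\big(x^{(\max\{i-n,0\})}\big) \big]. \tag{3.47} \]
The pseudoinverse, denoted by $\dagger$, is defined as
\[ \Delta C^{\dagger} = V_{\Delta C}\, \Sigma_{\Delta C}^{\dagger}\, U_{\Delta C}^T, \tag{3.48} \]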
where $U_{\Delta C}$, $\Sigma_{\Delta C}$, and $V_{\Delta C}$ are the factors in the singular value decomposition of the matrix $\Delta C$. The matrix $\Sigma_{\Delta C}^{\dagger}$ is the result of inverting the nonzero entries in $\Sigma_{\Delta C}$, leaving the zeros invariant. Figure 3.37 shows the effect of applying the mapping (3.44) to the low-fidelity model. Upon convergence, the linear correction $S^*$ (being the limit of $S^{(i)}$ with $i \to \infty$) maps the point $c(x^*)$ to $f(x^*)$ and the tangent plane for $c(x)$ at $c(x^*)$ to the tangent plane for $f(x)$ at $f(x^*)$ (Echeverría and Hemker 2008). It should be noted that although MM does not explicitly use sensitivity information, the surrogate and the high-fidelity model Jacobians become more and more similar to each other toward the end of the MM optimization process (i.e., when $\|x^{(i)} - x^{(i-1)}\| \to 0$), so that the surrogate (approximately) satisfies both zero- and first-order consistency conditions (Alexandrov and Lewis 2001) with $f$. This allows for a more precise identification of the high-fidelity model optimum. On the other hand, the correction matrix $S^{(i)}$ can be defined using exact Jacobians of the low- and high-fidelity models if available.
The MM algorithm is illustrated using an ultra-wideband (UWB) monopole shown in Fig. 3.38. Design variables are $x = [h_0\ w_0\ a_0\ s_0\ h_1\ w_1\ l_{gnd}\ w_s]^T$. Other parameters are fixed: $l_s = 25$, $w_m = 1.25$, and $h_p = 0.75$ (all in mm). The microstrip input of the monopole is fed through an edge mount SMA connector (SMA 2013).
Fig. 3.37 The concept of the MM model alignment for a least-squares optimization problem: $x_c^*$ is the low-fidelity model minimizer, and $y$ is the vector of design specifications. The straight lines denote the tangent planes for $f$ and $c$ at their respective optimal designs. By the linear correction $S^*$, the point $c(x^*)$ is mapped to $f(x^*)$, and the tangent plane for $c(x)$ at $c(x^*)$ to the tangent plane for $f(x)$ at $f(x^*)$ (Koziel and Ogurtsov 2011)
Fig. 3.38 UWB monopole: top view, substrate shown transparent. The magnetic symmetry wall is shown with the dash-dot line (Koziel and Leifsson 2016)
Both the high- and low-fidelity models are implemented in CST Microwave Studio (CST 2018). The simulation time ratio for the models is around 20. The design specifications for antenna reflection are $|S_{11}| \le -10$ dB for 3.1 GHz to 10.6 GHz. The initial design is $x^{(0)} = [18\ 12\ 2\ 0\ 5\ 1\ 15\ 40]^T$ mm. Because the low-fidelity model is relatively expensive, the MM algorithm uses an underlying kriging interpolation model $c_{kr}$ created in the vicinity of the approximate optimum of $c$ (obtained at the cost of 100 evaluations of $c$). Optimization performed using the MM algorithm yields the final design $x^* = [19.13\ 20.13\ 1.95\ 1.33\ 1.79\ 6.32\ 15.03\ 36.36]^T$ mm ($|S_{11}| < -15$ dB in the frequency band of interest). The total design cost is about 21 high-fidelity model evaluations. Figure 3.39a shows the reflection responses of the high- and low-fidelity models at the initial design as well as the high-fidelity model response at the final design. The convergence plot for the MM algorithm is shown in Fig. 3.39b.
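To make the correction (3.44) and (3.45) concrete, the Python sketch below builds the MM surrogate for the simplest case in which only one previous design is available, so that $\Delta F$ and $\Delta C$ each consist of a single column. For a single nonzero column, the pseudoinverse reduces to $\Delta C^{\dagger} = \Delta C^T / (\Delta C^T \Delta C)$, which avoids the need for a full singular value decomposition. The models `f` and `c` are hypothetical three-sample responses of a single design variable and are not related to the monopole example.

```python
def c(x):
    """Hypothetical low-fidelity response vector (m = 3 samples)."""
    return [x, x * x, 1.0 + 0.5 * x]


def f(x):
    """Hypothetical high-fidelity response vector."""
    return [1.1 * x + 0.1, 0.9 * x * x + 0.2 * x, 1.2 + 0.4 * x]


def mm_surrogate(x_cur, x_prev):
    """Build s(x) = f(x_cur) + S [c(x) - c(x_cur)] from two designs."""
    f_cur, c_cur = f(x_cur), c(x_cur)
    dF = [a - b for a, b in zip(f_cur, f(x_prev))]     # single column of (3.46)
    dC = [a - b for a, b in zip(c_cur, c(x_prev))]     # single column of (3.47)
    norm2 = sum(v * v for v in dC)
    # S = dF * pinv(dC); for one column, pinv(dC) = dC^T / (dC^T dC).
    S = [[df_r * dc_k / norm2 for dc_k in dC] for df_r in dF]

    def surrogate(x):
        shift = [a - b for a, b in zip(c(x), c_cur)]
        return [f_cur[r] + sum(S[r][k] * shift[k] for k in range(len(shift)))
                for r in range(len(f_cur))]

    return surrogate


x_prev, x_cur = 1.0, 1.5
s = mm_surrogate(x_cur, x_prev)
s_at_cur = s(x_cur)       # reproduces f(x_cur) by construction
s_at_prev = s(x_prev)     # also reproduces f(x_prev), since S * dC = dF
```

The second property is what allows MM to absorb information from previous iterates: as more columns are accumulated, the surrogate matches the high-fidelity model along more directions in the response space.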
Fig. 3.39 UWB monopole: (a) high-fidelity (dashed line) and low-fidelity model responses at the initial design, as well as the high-fidelity model response (solid line) at the final design $x^*$, plotted as $|S_{11}|$ [dB] versus frequency [GHz]; (b) convergence plot of the MM algorithm, $\|x^{(i)} - x^{(i-1)}\|$ versus iteration index $i$ (Koziel and Leifsson 2016)
3.6.4 Shape-Preserving Response Prediction
Shape-preserving response prediction (SPRP) is a parameter-less approach (Koziel 2010a, 2012), where the surrogate model is constructed assuming that the change of the high-fidelity model response due to the adjustment of the design variables can be predicted using the actual changes of the low-fidelity model response. Utilization of SPRP for quasi-global modeling has been discussed in Sect. 3.4.4. In SPRP-based optimization, it is critically important that the low-fidelity model is physics-based, which ensures that the effect of the design parameter variations on the model response is similar for both models. The change of the low-fidelity model output is described by the translation vectors corresponding to a certain (finite) number of characteristic points of the model's response. These translation vectors are subsequently used to predict the change of the high-fidelity model response, with the actual response of $f$ at the current iteration point, $f(x^{(i)})$, treated as a reference.
Figure 3.40 explains the concept of SPRP using an example of a microstrip bandstop filter (Koziel 2010b). Figure 3.40a shows an example low-fidelity model response $|S_{21}|$ (transmission coefficient) in the frequency range from 8 GHz to 18 GHz at the design $x^{(i)}$, as well as the low-fidelity model response at some other design $x$. The circles denote five characteristic points of $c(x^{(i)})$ selected to represent $|S_{21}| = -3$ dB, $|S_{21}| = -20$ dB, and the local $|S_{21}|$ maximum (at about 13 GHz). The squares denote the corresponding characteristic points for $c(x)$, whereas the small line segments represent the translation vectors that determine the "shift" of the characteristic points of $c$ when changing the design variables from $x^{(i)}$ to $x$. Because the low-fidelity model is physics-based, the high-fidelity model response at the given design $x$ can be predicted using the same translation vectors applied to the corresponding characteristic points of the fine model output at $x^{(i)}$, $f(x^{(i)})$, cf. Fig. 3.40b.
Fig. 3.40 SPRP concept (Koziel 2010a), with $|S_{21}|$ [dB] plotted versus frequency [GHz] in all panels: (a) example low-fidelity model response at the design $x^{(i)}$, $c(x^{(i)})$ (solid line), and at another design $x$, $c(x)$, together with the characteristic points of $c(x^{(i)})$ (circles) and $c(x)$ (squares) and the translation vectors (short lines); (b) high-fidelity model response at $x^{(i)}$, $f(x^{(i)})$ (solid line), and the predicted fine model response at $x$ obtained using SPRP based on the characteristic points of Fig. 3.40a; the characteristic points of $f(x^{(i)})$ (circles) and the translation vectors (short lines) were used to find the characteristic points (squares) of the predicted high-fidelity model response; the low-fidelity model responses $c(x^{(i)})$ and $c(x)$ are plotted using thin solid and dotted lines, respectively; (c) predicted and actual (solid line) high-fidelity model response at $x$
As indicated in Fig. 3.40c, the predicted high-fidelity model response at the design $x$ is in very good agreement with the actual response, $f(x)$. A rigorous formulation of SPRP can be found in the literature (Koziel 2010a, b). An important assumption of SPRP is that the overall shapes of the high- and low-fidelity model responses are similar. This implies that the characteristic points of the coarse model $c$ and the fine model $f$ responses are in one-to-one correspondence. If this assumption does not hold, the SPRP surrogate cannot be evaluated because the translation vectors are not well defined. Generalizations of SPRP that relax this assumption are available for some cases
(Koziel 2010a). In the context of optimization of high-frequency structures, handling characteristics of narrowband and multiband structures as well as array radiation pattern optimization (e.g., for sidelobe reduction) are examples of problems SPRP is well suited for.
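The core step of SPRP, transferring the translation vectors from the low-fidelity to the high-fidelity characteristic points, can be sketched as follows. The sketch covers only the prediction of the characteristic points; the complete surrogate also interpolates the response between those points, which is omitted here. All point coordinates (frequency in GHz, level in dB) are made-up illustrative values, and the function name is hypothetical.

```python
def translate_points(c_ref_pts, c_new_pts, f_ref_pts):
    """Apply the low-fidelity translation vectors to the high-fidelity points.

    Each argument is a list of (frequency, level) pairs; the three lists must
    be in one-to-one correspondence.
    """
    if not (len(c_ref_pts) == len(c_new_pts) == len(f_ref_pts)):
        raise ValueError("characteristic points are not in one-to-one correspondence")
    predicted = []
    for (fc0, lc0), (fc1, lc1), (ff0, lf0) in zip(c_ref_pts, c_new_pts, f_ref_pts):
        # Translation vector of c between the reference and the new design.
        shift_freq, shift_level = fc1 - fc0, lc1 - lc0
        predicted.append((ff0 + shift_freq, lf0 + shift_level))
    return predicted


# Illustrative characteristic points: -3 dB, -20 dB, local maximum, -20 dB, -3 dB
c_ref = [(10.0, -3.0), (11.5, -20.0), (13.0, -12.0), (14.5, -20.0), (16.0, -3.0)]
c_new = [(10.5, -3.0), (12.0, -20.0), (13.5, -10.0), (15.0, -20.0), (16.5, -3.0)]
f_ref = [(10.25, -3.0), (11.75, -20.0), (13.25, -14.0), (14.75, -20.0), (16.25, -3.0)]

f_pred = translate_points(c_ref, c_new, f_ref)
```

The length check mirrors the one-to-one correspondence requirement: when the two responses do not share the same set of characteristic points, the translation vectors are undefined and the prediction cannot be carried out.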
3.6.5 Adaptively Adjusted Design Specifications
The physics-based surrogates discussed so far exploit the idea of low-fidelity model correction. Adaptively adjusted design specifications (AADS; Koziel 2010c) offer an alternative way of exploiting the system-specific knowledge embedded in the low-fidelity model by modifying the design specifications so as to account for the model discrepancies. AADS is not universally applicable, but it is extremely simple to implement as no changes of the low-fidelity model are required. AADS consists of two basic steps:
1. Modify the design specifications of the original problem to account for the differences between the responses of the high-fidelity model $f$ and the low-fidelity model $c$ at their characteristic points.
2. Obtain a new design by optimizing the low-fidelity model with respect to the modified specifications.
The way AADS is formulated allows for handling minimax-type specifications (Koziel 2010c), so the characteristic points of the responses should correspond to the relevant design specification levels. The characteristic points may also include local maxima/minima of the respective responses at which the specifications may not be satisfied. Figure 3.41a shows the high- and low-fidelity model responses at the optimal design of $c$ for the same bandstop filter example as considered in Sect. 3.6.4; the design specifications are indicated using horizontal lines, with $|S_{21}| \le -30$ dB for 12 GHz to 14 GHz, and $|S_{21}| \ge -3$ dB for 8 GHz to 9 GHz and 17 GHz to 18 GHz. Figure 3.41b shows the characteristic points of $f$ and $c$, i.e., the points corresponding to the –3 dB and –30 dB levels as well as to the local maxima of the responses.
In the first step of the AADS optimization procedure, the design specifications are modified so that the level of satisfying/violating the modified specifications by the low-fidelity model response corresponds to the satisfaction/violation levels of the original specifications by the high-fidelity model response. In the example of Fig. 3.41, for each edge of the specification line, the edge frequency is shifted by the difference of the frequencies of the corresponding characteristic points, e.g., the left edge of the –30 dB specification line is moved 0.7 GHz to the right. This shift is equal to the length of the line connecting the corresponding characteristic points in Fig. 3.41b. Similarly, the specification levels are shifted by the difference between the local maxima/minima values for the respective points, e.g., the –30 dB level is shifted 8.5 dB down because of the
Fig. 3.41 AADS concept, with $|S_{21}|$ [dB] plotted versus frequency [GHz] in all panels (responses of $f$ and $c$ are marked with solid and dashed lines, respectively): (a) high- and low-fidelity model responses at the initial design (optimum of $c$) as well as the original design specifications, (b) characteristic points of the responses corresponding to the specification levels (–3 dB and –30 dB) and to the local response maxima, (c) high- and low-fidelity model responses at the initial design and the modified design specifications (Koziel 2010c)
difference of the local maxima of the corresponding characteristic points of $f$ and $c$. Modified design specifications are shown in Fig. 3.41c. Subsequently, the low-fidelity model is optimized with respect to the modified specifications, and the new design obtained this way is treated as an approximate solution to the original design problem. Steps 1 and 2 can be iterated if necessary. If the correlation between the low- and high-fidelity models is good, a substantial design improvement is typically observed after the first iteration; however, additional iterations may bring further design improvements (Koziel 2010c). Figure 3.42 illustrates an AADS iteration applied to microstrip-to-SIW transition design (Ogurtsov and Koziel 2011). Note that optimizing the low-fidelity model with respect to the modified specifications results in improving the high-fidelity model
Fig. 3.42 AADS for optimization of a microstrip-to-SIW transition (Ogurtsov and Koziel 2011), with $|S_{11}|$ and $|S_{22}|$ [dB] plotted versus frequency [GHz] in all panels: high- and low-fidelity model responses are denoted by solid and dashed lines, respectively; $|S_{22}|$ is distinguished from $|S_{11}|$ using circles; design specifications are denoted by thick horizontal lines. (a) Model responses at the beginning of the iteration and original design specifications; (b) model responses and modified design specifications that reflect the differences between the responses; (c) low-fidelity model optimized with respect to the modified specifications; (d) high-fidelity model at the low-fidelity model optimum shown versus original specifications
design with respect to the original specifications. Because the model discrepancies may change somewhat from one design to another, a few iterations may be necessary to find an optimal high-fidelity design.
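Step 1 of AADS amounts to simple arithmetic on the characteristic points, as the Python sketch below illustrates for one edge and one level of the stopband specification. The characteristic point values are invented so that the resulting shifts match the 0.7 GHz and 8.5 dB quoted above, and the sign convention (specification plus the low-fidelity value minus the high-fidelity value) is an assumption chosen so that $c$ violates the modified specification by the same margin by which $f$ violates the original one.

```python
def adjust_edge(edge_ghz, f_cross_ghz, c_cross_ghz):
    """Shift a specification edge by the frequency offset between c and f."""
    return edge_ghz + (c_cross_ghz - f_cross_ghz)


def adjust_level(level_db, f_peak_db, c_peak_db):
    """Shift a specification level by the level offset between c and f."""
    return level_db + (c_peak_db - f_peak_db)


# Hypothetical characteristic points of the two responses
f_cross, c_cross = 12.25, 12.95     # -30 dB crossings of f and c (GHz)
f_peak, c_peak = -25.0, -33.5       # local stopband maxima of f and c (dB)

new_edge = adjust_edge(12.0, f_cross, c_cross)      # about 12.7 GHz
new_level = adjust_level(-30.0, f_peak, c_peak)     # -38.5 dB

# Violation margins are preserved by construction
margin_f = f_cross - 12.0           # f versus the original edge
margin_c = c_cross - new_edge       # c versus the modified edge
```

Step 2 then reuses any available optimizer on the low-fidelity model with `new_edge` and `new_level` in place of the original values, so no modification of the model itself is required.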
3.6.6 Feature-Based Optimization
Feature-based optimization (FBO) is another approach, already discussed in Sect. 3.5 in the context of quasi-global modeling. Again, instead of handling complete responses, FBO processes only selected response features (Glubokov and Koziel 2014a, b; Koziel and Bekasiewicz 2015). The dependence of the feature point coordinates on the designable parameters is usually less nonlinear than that of the original responses, and the lower nonlinearity of the features speeds up the optimization process. The concept of response features is explained in Fig. 3.43, which shows a family of reflection responses of an example antenna considered in (Koziel 2015). The response is evaluated along a line segment in the design space, $x = t x_a + (1 - t) x_b$, $0 \le t \le 1$, where $x_a$ and $x_b$ are arbitrarily selected vectors. Figure 3.43a shows the
Fig. 3.43 Response features: (a) family of reflection responses of the example antenna structure ($|S_{11}|$ [dB] versus frequency [GHz]), evaluated along a certain line segment in the design space, with example features corresponding to the −15 dB levels (circles) and the center frequency (squares); (b) response variability ($|S_{11}|$ [dB] versus $t$) at selected frequencies, 1.9 GHz (solid line), 2.0 GHz (dashed line), and 2.1 GHz (dotted line), indicating their high nonlinearity; (c) variability of the response features (frequency components, in GHz, versus $t$), here corresponding to the −15 dB levels as well as the center frequency; (d) variability of the level components (in dB, versus $t$) of the response features of (c) (Koziel 2015)
plots of the corresponding features as a function of the parameter $t$. Figure 3.43b shows the reflection characteristic $|S_{11}|$ versus the line segment parameter $t$ at three frequencies: 1.9 GHz, 2.0 GHz, and 2.1 GHz. As indicated in the plots, the original responses are highly nonlinear, whereas the dependence of the response features on the system parameters is much less nonlinear. Consequently, the feature point coordinates are easier to handle, which results in faster convergence of an optimization algorithm working at the level of features.
Let $\omega_k(x)$ and $\lambda_k(x)$ denote the frequency and the level of the $k$th feature, $k = 1, \ldots, p$, where $p$ is the number of features for a given problem. The response features should be selected in accordance with the given specifications so that the original formulation of the design problem (3.1) can be replaced by an equivalent problem formulated for the features
\[ x_f^* = \arg\min_x U_F\big(f_F(x)\big), \tag{3.49} \]
where $f_F \in R^{2p}$ denotes the feature-based model, $U_F$ is the objective function at the feature level, and $f_F(x) = [\omega_1(x)\ \omega_2(x)\ \ldots\ \omega_p(x)\ \lambda_1(x)\ \lambda_2(x)\ \ldots\ \lambda_p(x)]^T$.
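As an illustration of how the feature vector $f_F(x)$ may be obtained from a simulated response, the Python sketch below extracts three features from a sampled $|S_{11}|$ characteristic: the two crossings of a given level (found by linear interpolation between samples) and the response minimum. The function name, the choice of features, and the synthetic parabolic response are assumptions made for this example only.

```python
def extract_features(freqs, s11_db, level_db=-15.0):
    """Return [w1, w2, w3, l1, l2, l3] for the two level crossings and the minimum."""
    crossings = []
    for k in range(len(freqs) - 1):
        a = s11_db[k] - level_db
        b = s11_db[k + 1] - level_db
        if a == 0.0:                       # sample lies exactly on the level
            crossings.append(freqs[k])
        elif a * b < 0.0:                  # sign change: interpolate linearly
            t = a / (a - b)
            crossings.append(freqs[k] + t * (freqs[k + 1] - freqs[k]))
    if len(crossings) != 2:
        raise ValueError("expected exactly two crossings of the level")
    k_min = min(range(len(freqs)), key=lambda k: s11_db[k])
    omegas = [crossings[0], freqs[k_min], crossings[1]]
    lambdas = [level_db, s11_db[k_min], level_db]
    return omegas + lambdas


# Synthetic narrowband response: minimum of -30 dB near 2.003 GHz
freqs = [1.0 + 0.01 * k for k in range(201)]
s11_db = [-30.0 + 60.0 * (w - 2.003) ** 2 for w in freqs]

features = extract_features(freqs, s11_db)
fractional_bw = (features[2] - features[0]) / features[1]
```

The explicit check on the number of crossings reflects a practical limitation of feature-based methods: the feature set has to exist, and be the same, at every design that is evaluated.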
Figure 3.44 shows a typical situation for a narrowband antenna. In this specific example, the goal is to maximize the fractional bandwidth at the −15 dB level. For illustration, three response features are utilized, corresponding to the −15 dB level and to the response minimum. It can be observed that the feature locations predicted using the feature model gradients $\nabla\omega_k(x)$ and $\nabla\lambda_k(x)$ (i.e., $\omega_k(x + dx) \approx \omega_k(x) + \nabla\omega_k(x)\,dx$ and $\lambda_k(x + dx) \approx \lambda_k(x) + \nabla\lambda_k(x)\,dx$) for a specific search step size $dx$ agree very well with the actual feature locations (verified by EM simulation). This is not the case when the same search step is utilized for the first-order Taylor model constructed for the frequency-based response, i.e., $f(x + dx) \approx f(x) + J_f(x)\,dx$, where $J_f$ denotes the model Jacobian.
The FBO algorithm is formulated as an iterative process generating a series $x^{(i)}$ of approximations to $x^*$ (the solution to the original problem (3.34)) as follows:
\[ x^{(i+1)} = \arg\min_x U_F\big(\lambda^{(i)}(x)\big), \tag{3.50} \]
where $\lambda^{(i)}$ is a linear model of the response features set up at the current design $x^{(i)}$ using finite differentiation; the model $\lambda^{(i)}$ is defined as
\[ \lambda^{(i)}(x) = \begin{bmatrix} \begin{bmatrix} \omega_1\big(x^{(i)}\big) \\ \vdots \\ \omega_p\big(x^{(i)}\big) \end{bmatrix} + \begin{bmatrix} \dfrac{\omega_1\big(x_{d.1}^{(i)}\big) - \omega_1\big(x^{(i)}\big)}{d_1^{(i)}} \\ \vdots \\ \dfrac{\omega_p\big(x_{d.p}^{(i)}\big) - \omega_p\big(x^{(i)}\big)}{d_p^{(i)}} \end{bmatrix} \circ \big(x - x^{(i)}\big) \\[3ex] \begin{bmatrix} \lambda_1\big(x^{(i)}\big) \\ \vdots \\ \lambda_p\big(x^{(i)}\big) \end{bmatrix} + \begin{bmatrix} \dfrac{\lambda_1\big(x_{d.1}^{(i)}\big) - \lambda_1\big(x^{(i)}\big)}{d_1^{(i)}} \\ \vdots \\ \dfrac{\lambda_p\big(x_{d.p}^{(i)}\big) - \lambda_p\big(x^{(i)}\big)}{d_p^{(i)}} \end{bmatrix} \circ \big(x - x^{(i)}\big) \end{bmatrix}, \tag{3.51} \]
where $x^{(i)} = [x_1^{(i)}\ x_2^{(i)}\ \ldots\ x_n^{(i)}]^T$ is the current design, whereas $x_{d.k}^{(i)} = [x_1^{(i)}\ \ldots\ x_k^{(i)} + d_k^{(i)}\ \ldots\ x_n^{(i)}]^T$ are the perturbed designs. Here, $d^{(i)} = [d_1^{(i)}\ d_2^{(i)}\ \ldots\ d_n^{(i)}]^T$ is the vector of perturbation sizes. The symbol $\circ$ denotes component-wise multiplication. Frequency and level coordinates of the feature points are modeled independently. To ensure convergence, the algorithm is embedded in the trust-region framework (Conn et al. 2000), thus
\[ x^{(i+1)} = \arg\min_{\|x - x^{(i)}\| \le \delta^{(i)}} U_F\big(\lambda^{(i)}(x)\big), \tag{3.52} \]
Fig. 3.44 Feature prediction using estimated feature gradients ($|S_{11}|$ [dB] versus frequency [GHz]): high-fidelity model at the reference design $x$ (dotted line) and the corresponding feature points, high-fidelity model at the perturbed design $x + dx$ (solid line) and the corresponding feature points (circles), linear expansion model constructed at the frequency-based response level and evaluated at $x + dx$ (dashed line) together with the corresponding feature points, as well as the high-fidelity feature points predicted using feature gradients (squares). Feature point prediction obtained using linear expansion (at the frequency-based response level) is inaccurate; feature prediction using feature gradients is much more reliable
where $\lambda^{(i)}$ is optimized within the trust region defined as $\|x - x^{(i)}\| \le \delta^{(i)}$; $\delta^{(i)}$ is the trust-region size updated using conventional rules (Conn et al. 2000). The new design $x^{(i+1)}$ produced by (3.52) is only accepted if it leads to an improvement of the feature-level objective, i.e., $U_F(f_F(x^{(i+1)})) < U_F(f_F(x^{(i)}))$.
In order to speed up the optimization process, the gradients of the features can be estimated using low-fidelity model evaluations rather than high-fidelity ones. This is justified as long as the low- and high-fidelity models are sufficiently well correlated; then their absolute discrepancies are not critical (Koziel 2015).
For the sake of illustration, consider the planar inverted-F antenna (PIFA) (Koziel 2015) shown in Fig. 3.45. The design variables are $x = [v_0\ v_1\ v_2\ v_3\ v_4\ v_5\ v_6]^T$. The initial design is $x^{(0)} = [2\ 8\ 5\ 2\ 10\ 25\ 45]^T$ mm. Fixed parameters are $[u_0\ u_1\ u_2\ u_3\ u_4\ u_5\ u_6\ u_7\ w_0\ r_0]^T = [6.15\ 50.0\ 15.0\ 10.5\ 29.35\ 11.65\ 5.0\ 1.0\ 0.5]^T$ mm; the 0.508 mm substrate (Fig. 3.45b, on the right) and the box (on the left) are made of Rogers TMM4 and TMM 6. The high-fidelity model $f$ is evaluated in CST Microwave Studio (~650,000 mesh cells, simulation time 13 min). The low-fidelity model $c$ is also evaluated in CST Microwave Studio (~22,000 mesh cells, simulation time 50 s).
The antenna was optimized using the FBO algorithm to obtain as wide a fractional bandwidth as possible at $|S_{11}| = -15$ dB (symmetrically) around 2.0 GHz. The final design $x^* = [2.028\ 8.226\ 6.587\ 2.618\ 9.134\ 22.87\ 45.24]^T$ mm was obtained in 6 iterations, with the total cost corresponding to about 11 evaluations of the high-fidelity model (8 $f$ and 50 $c$). Figure 3.46 shows the high-fidelity model responses at the initial and the final designs, as well as the evolution of the fractional bandwidth and the convergence plot. It should be noted that the presented approach is also efficient in terms of low-fidelity model evaluations, which is important when the evaluation time ratio between $f$ and $c$ is low, as in the present example.
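The finite-difference feature model and the trust-region bookkeeping can be sketched in Python as follows. Two caveats apply. First, the sketch estimates the full gradient of every feature with respect to all $n$ design variables (one perturbed evaluation per variable), which is one common reading of the linear model and not necessarily identical to the compact notation of (3.51). Second, the thresholds and scaling factors in `update_trust_region` are typical textbook values assumed here; the source only states that conventional rules are used. The feature function is a hypothetical linear map so that the result can be verified exactly.

```python
def linear_feature_model(features, x_ref, perturb):
    """First-order model of a feature vector around the design x_ref.

    Uses one extra call of `features` per design variable (forward differences).
    """
    base = features(x_ref)
    jac = []                                   # jac[k][j] = d feature_j / d x_k
    for k, d in enumerate(perturb):
        x_pert = list(x_ref)
        x_pert[k] += d
        jac.append([(fp - fb) / d for fp, fb in zip(features(x_pert), base)])

    def model(x):
        out = list(base)
        for k in range(len(x_ref)):
            for j in range(len(base)):
                out[j] += jac[k][j] * (x[k] - x_ref[k])
        return out

    return model


def update_trust_region(delta, gain_ratio):
    """Conventional-style radius update (constants are assumed values)."""
    if gain_ratio > 0.75:
        return 2.0 * delta                     # model predicted well: expand
    if gain_ratio < 0.25:
        return delta / 3.0                     # poor prediction: shrink
    return delta


def toy_features(x):
    """Hypothetical feature vector [omega_1, lambda_1] of two design variables."""
    return [2.0 + 0.1 * x[0] - 0.05 * x[1], -20.0 + 2.0 * x[0] + 1.0 * x[1]]


model = linear_feature_model(toy_features, [1.0, 1.0], [0.01, 0.01])
predicted = model([1.5, 0.5])
actual = toy_features([1.5, 0.5])
```

Replacing `features` by a routine based on the low-fidelity model when computing the perturbed evaluations corresponds to the cost-saving variant described above, in which only `base` is obtained from the high-fidelity model.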
Fig. 3.45 Planar inverted-F antenna: (a) top and side views; (b) perspective view. Substrate is shown transparent (Koziel 2015)
Fig. 3.46 PIFA optimization (Case 1): (a) initial (dashed line) and optimized (solid line) response, $|S_{11}|$ [dB] versus frequency [GHz] (fractional bandwidth around 2 GHz of 15 percent); (b) evolution of the fractional bandwidth [%] and the center frequency [GHz] versus iteration index; (c) convergence plot, $\|x^{(i)} - x^{(i-1)}\|$ versus iteration index
For the sake of comparison, the antenna was also optimized using the two benchmark methods: (i) space mapping (SM) (Koziel et al. 2008) and (ii) the pattern search algorithm (Kolda et al. 2003). The SM algorithm exploits the low-fidelity model c as the underlying coarse model and two types of model correction, specifically, frequency scaling (Koziel and Ogurtsov 2014) and additive response correction (Koziel and Leifsson 2012). The SM surrogate is reset at every iteration and re-optimized using pattern search. The pattern search algorithm is used in both direct
Table 3.5 Planar inverted-F antenna: design optimization cost
Algorithm | Algorithm component | Number of model evaluations | CPU cost (absolute) | CPU cost (relative to f)
Direct f optimization (pattern search) | Evaluating f | 205 f | 2665 min | 205.0
Space mapping | Evaluating c | 345 c | 288 min | 22.2
Space mapping | Evaluating f | 5 f | 65 min | 5.0
Space mapping | Total cost | N/A | 353 min | 27.2
Feature-based optimization | Evaluating c | 50 c | 42 min | 4.0
Feature-based optimization | Evaluating f | 8 f | 104 min | 9.0
Feature-based optimization | Total cost | N/A | 146 min | 13.0
search and space mapping procedure based on the implementation described in (Koziel 2010d). The final designs obtained using all three methods are comparable; however, the design costs differ considerably (cf. Table 3.5).
An important prerequisite of FBO is that the feature point set is consistent along the optimization path. This consistency is necessary to establish the linear model (3.51) as well as to compare the quality of the designs at subsequent iterations. For certain problems, it may be difficult to maintain due to considerable changes of the system response during the optimization process. In some situations, it is possible to work around this issue, e.g., by neglecting the feature points that are not present in one (or more) of the designs that are being compared (Glubokov and Koziel 2014b).
3.6.7 Adaptive Response Scaling
The last technique outlined in this section is adaptive response scaling (ARS; Koziel and Bekasiewicz 2016a). ARS has been designed to fully exploit the knowledge about the high-frequency structure of interest embedded in its low-fidelity model, yet to be as generic as possible (as opposed to, e.g., SPRP, whose operation depends on satisfying rather strict requirements concerning the shape of the model responses). ARS preserves zero-order consistency, $s^{(i)}(x^{(i)}) = f(x^{(i)})$, and exhibits good generalization by accounting for both frequency and amplitude changes of the low-fidelity model responses during the optimization process.
Figure 3.47 shows the high- and low-fidelity model responses (here, $|S_{11}|$) of an exemplary sixth-order microstrip filter at the reference design $x^{(i)}$ and at another design $x$. Note that the models are significantly misaligned yet relatively well correlated. ARS attempts to exploit these correlations as described below. The response scaling is realized at the level of complex S-parameter responses, separately for the respective real and imaginary parts. First, the frequency relationships between the low- and
Fig. 3.47 Responses of the sixth-order microstrip bandpass filter, $|S_{11}|$ (dB) versus frequency (GHz): at a reference design $x^{(i)}$ (top plot) and at another design $x$ (bottom plot); high- and low-fidelity models shown using solid and dashed lines, respectively (Koziel and Bekasiewicz 2016a)
high-fidelity model at the reference design $x^{(i)}$ (i.e., the design being the starting point for the subsequent iteration of (3.35)) are retrieved by solving the following nonlinear regression problem
\[ F^{(i)}(\omega) = \arg\min_F \int_{\omega_{\min}}^{\omega_{\max}} \big| r_f\big(x^{(i)}, \omega\big) - r_c\big(x^{(i)}, F(\omega)\big) \big| \, d\omega, \tag{3.53} \]
where $F$ is a nonlinear frequency scaling function, in this case implemented using cubic splines with 20 control points within the frequency range of interest; $r_f$/$r_c$ are the high-/low-fidelity model responses of interest (e.g., Re($S_{11}$), etc.). The problem (3.53) is solved for all relevant responses, separately for the real and imaginary parts. That is why in (3.53) and the following equations the symbols $r_f$/$r_c$ are used rather than $f$/$c$, i.e., to account for the fact that the operations are performed independently for each response. The problem (3.53) is solved with the control point locations along the frequency axis as the unknowns. Upon extraction, $F^{(i)}$ minimizes the discrepancy between the responses within the frequency range of interest, $\omega_{\min}$ to $\omega_{\max}$. As mentioned before, the main purpose is the identification of the frequency-wise model misalignment. As indicated in Fig. 3.48, the frequency-scaled low-fidelity model at the reference design is well aligned with the high-fidelity one in terms of the frequency location of the response minima/maxima.
At the prediction stage, the objective is to account for the changes of the low-fidelity model between the reference design $x^{(i)}$ and the current design $x$ (encountered in the course of the surrogate model optimization run (3.35)).
Fig. 3.48 Adaptive response scaling (ARS), Stage I (reference scaling): (a) frequency scaling function $F^{(i)}(\omega)$ (GHz) versus frequency $\omega$ (GHz) (solid line), extracted using (3.53) for the sixth-order filter at the design shown in Fig. 3.47. The plot is restricted to the range of 4 GHz to 7 GHz. The identity function is shown for comparison; (b) Re($S_{11}$) versus frequency (GHz) of the high-fidelity (solid line) and low-fidelity model at the reference design $x^{(i)}$ as well as the frequency-scaled low-fidelity model response (circles). Good frequency alignment of the response minima/maxima can be observed (Koziel and Bekasiewicz 2016a)
In order to do this, first, a scaling function similar to (3.53) is computed to identify the frequency changes of the low-fidelity model response as follows
\[ F(x, \omega) = \arg\min_F \int_{\omega_{\min}}^{\omega_{\max}} \big| r_c(x, \omega) - r_c\big(x^{(i)}, F(\omega)\big) \big| \, d\omega. \tag{3.54} \]
In other words, (3.54) determines the nonlinear frequency scaling function between the rc responses evaluated at the designs x(i) and x. Both (3.53) and (3.54) are solved with the scaling coefficients as the unknowns. The response of the frequency-scaled low-fidelity model in (3.53) and (3.54) is obtained by interpolating the known response at the original frequency sweep, so the cost of solving both problems is negligible (in practice, a fraction of a second). Furthermore, the amplitude changes of the low-fidelity model between the design x and the design x(i), the latter scaled in frequency using (3.54), are calculated as

$$A(x, \omega) = \left[ r_c(x, \omega) + 1 \right] \oslash \left[ r_c\left(x^{(i)}, F(x, \omega)\right) + 1 \right] \tag{3.55}$$

Here, ⊘ denotes component-wise division with respect to frequency. The shift by +1 is introduced in order to avoid division by zero (for frequencies for which rc(x(i),
F(x, ω)) = 0) and to avoid excessively large values of |A| (for rc close to zero). The choice of this particular value is not critical, although the shift should be sufficiently large to ensure that rc(x(i), F(x, ω)) + 1 > 0 over the entire frequency range of interest. The prediction stage of the surrogate model is then realized as follows. First, the reference high-fidelity model response rf(x(i), ω) is scaled in frequency using F(x, ω) in order to account for the changes of the low-fidelity model while moving from x(i) to x (the low- and high-fidelity models are assumed to be well correlated, although perhaps misaligned in absolute terms). Then, the amplitude scaling function A(x, ω) is scaled in frequency using F(i)(ω) in order to accommodate the frequency relationships between the low- and high-fidelity models at the reference design. Finally, it is applied to correct the surrogate response in amplitude. Formally, the surrogate model is defined as

$$r_s(x, \omega) = A\left(x, F^{(i)}(\omega)\right) \circ \left[ r_f\left(x^{(i)}, F(x, \omega)\right) + 1 \right] - 1 \tag{3.56}$$
where ∘ denotes component-wise multiplication. In other words, in (3.56), the amplitude changes determined by (3.55) are rescaled using F(i) from (3.53) to account for the frequency misalignment between the high- and low-fidelity models. Note that the scaling function F(i) is only calculated once per iteration (3.35). In contrast, (3.54) and (3.55) are computed for each evaluation of the surrogate model, which allows for better utilization of the knowledge embedded in the low-fidelity model by tracking its changes along the optimization path. Figure 3.49 shows the plots of the scaling function F(x, ω) corresponding to the model responses of Figs. 3.47 and 3.48, the effect of the low-fidelity model scaling with F(x, ω), as well as the amplitude scaling function A(x, ω). Figure 3.50 shows Re(S11) of the responses presented in Fig. 3.47, as well as the response of the surrogate model constructed according to ARS, i.e., using (3.56). It can be observed that the predictive power of the model is very good, especially given the significant misalignment between the low- and high-fidelity models as well as the considerable change of the responses between the designs x(0) and x. Figure 3.51 shows how this translates to |S11| prediction. At the same time, the quality of the conventional output SM prediction is poor, as it does not account for the model response changes in frequency. Figures 3.52 and 3.53 show the flowcharts of the ARS modeling process.

ARS is illustrated using the eighth-order bandpass filter shown in Fig. 3.54. The structure is implemented on a 0.762-mm-thick Taconic RF-35 dielectric substrate (εr = 3.5, tanδ = 0.0018). The design variables are x = [w1 w2 w3 w4 d1 d2 d3 d4 l1 l2 l3 l4]T. The EM model is implemented in CST Microwave Studio (~160,000 mesh cells, simulation time 6 min). The low-fidelity model is an equivalent circuit model implemented in Keysight ADS. The design specifications are |S11| ≤ –20 dB for 4 GHz ≤ ω ≤ 7 GHz, and |S11| ≥ –3 dB for ω ≤ 3.92 GHz and ω ≥ 7.08 GHz.
The initial design is x(0) = [1.0 1.0 1.0 1.0 5.0 5.0 5.0 5.0 17.0 17.0 17.0 17.0]T. The final design is x = [0.61 1.16 1.49 1.97 6.90 3.14 3.03 2.38 15.51 15.53 16.87 16.87]T. The filter responses at the initial design and at the design optimized using ARS are shown in Fig. 3.55.
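Design specifications of this kind are commonly checked with a minimax-type violation measure. The sketch below assumes the usual bandpass reading of the specifications (|S11| at most –20 dB in the 4 GHz to 7 GHz passband, at least –3 dB below 3.92 GHz and above 7.08 GHz); the function name and the sample responses are hypothetical and are not taken from the filter of Fig. 3.54.

```python
def spec_violation(freq_ghz, s11_db):
    """Worst-case specification violation in dB (<= 0 means all specs are met).

    Passband (4 GHz to 7 GHz):   |S11| should not exceed -20 dB.
    Stopbands (<= 3.92, >= 7.08): |S11| should not drop below -3 dB.
    Frequencies in the transition bands are not constrained.
    """
    worst = float("-inf")
    for f, s in zip(freq_ghz, s11_db):
        if 4.0 <= f <= 7.0:
            worst = max(worst, s - (-20.0))   # positive if above -20 dB
        elif f <= 3.92 or f >= 7.08:
            worst = max(worst, -3.0 - s)      # positive if below -3 dB
    return worst

# Illustrative (made-up) |S11| samples in dB
freq = [3.5, 3.9, 3.95, 4.0, 5.5, 7.0, 7.05, 7.1, 7.5]
good = [-0.5, -1.0, -8.0, -21.0, -30.0, -22.0, -9.0, -1.5, -0.4]
bad = [-0.5, -1.0, -8.0, -17.5, -30.0, -22.0, -9.0, -1.5, -0.4]

print(spec_violation(freq, good))  # negative: specifications satisfied
print(spec_violation(freq, bad))   # positive: passband violated at 4.0 GHz
```

A measure of this form can serve as the objective minimized over the surrogate model, with a non-positive value indicating that the design meets the specifications at all sampled frequencies.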
[Figure 3.49: (a) F(x, ω) versus frequency ω (GHz); (b) Re(S11) versus frequency (GHz); (c) amplitude scaling A(x, ω) versus frequency (GHz); all panels span 4 GHz to 7 GHz]
Fig. 3.49 Adaptive response scaling (ARS), Stage II (prediction): (a) frequency scaling function F(x, ω) (—) extracted using (3.54), shown together with the identity function; (b) Re(S11) of the low-fidelity model at the reference design x(i) and at the design x (—), and the low-fidelity model response at x(i) scaled using F(x, ω) (○); (c) amplitude scaling function A(x, ω) (—) computed with (3.55) (Koziel and Bekasiewicz 2016a)
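The prediction stage defined by (3.55) and (3.56) can be illustrated with a short sketch. It is a toy example under stated assumptions: the design is a single scalar, the low- and high-fidelity models are hypothetical analytic functions, and the two frequency scaling functions are passed in as known callables rather than extracted by solving (3.53) and (3.54). The names `ars_predict`, `rc_model` and `rf_model` are made up for illustration.

```python
import math
from bisect import bisect_right

def interp(xq, xs, ys):
    """Piecewise-linear interpolation, clamped outside [xs[0], xs[-1]]."""
    if xq <= xs[0]:
        return ys[0]
    if xq >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, xq) - 1
    t = (xq - xs[k]) / (xs[k + 1] - xs[k])
    return ys[k] + t * (ys[k + 1] - ys[k])

def ars_predict(w, rf_ref, rc_ref, rc_x, F_ref, F_x, shift=1.0):
    """Surrogate response on the grid w, following (3.55) and (3.56).

    rf_ref, rc_ref: high-/low-fidelity responses at the reference design
    rc_x:           low-fidelity response at the current design
    F_ref, F_x:     frequency scaling functions F(i)(w) and F(x, w)
    """
    # (3.55): amplitude change of the low-fidelity model; the shift keeps
    # the denominator away from zero
    A = [(rc_x[j] + shift) / (interp(F_x(wj), w, rc_ref) + shift)
         for j, wj in enumerate(w)]
    # (3.56): frequency-scaled reference response, corrected in amplitude
    return [interp(F_ref(wj), w, A) * (interp(F_x(wj), w, rf_ref) + shift) - shift
            for wj in w]

# Hypothetical toy models: the design x shifts and slightly rescales the
# response; the high-fidelity model is offset by 0.1 GHz in frequency.
def g(w):
    return 0.2 * math.sin(2 * math.pi * (w - 4.0) / 1.5)

def rc_model(x, w):
    return (1 + 0.2 * x) * g(w - x)

def rf_model(x, w):
    return rc_model(x, w - 0.1)

w = [4.0 + 0.05 * j for j in range(61)]
x_ref, x_new = 0.0, 0.15
rf_ref = [rf_model(x_ref, wj) for wj in w]
rc_ref = [rc_model(x_ref, wj) for wj in w]
rc_x = [rc_model(x_new, wj) for wj in w]

rs = ars_predict(w, rf_ref, rc_ref, rc_x,
                 F_ref=lambda wq: wq - 0.1,
                 F_x=lambda wq: wq - (x_new - x_ref))

# Compare away from the lower band edge, where clamping affects the result
inner = [j for j, wj in enumerate(w) if wj >= 4.3]
err_ars = max(abs(rs[j] - rf_model(x_new, w[j])) for j in inner)
err_lowfi = max(abs(rc_x[j] - rf_model(x_new, w[j])) for j in inner)
print(f"max error: ARS {err_ars:.2e}, uncorrected low-fidelity {err_lowfi:.2e}")
```

In this idealized setting the two models differ only by a frequency offset, so the surrogate reproduces the high-fidelity response at the new design up to rounding, whereas the uncorrected low-fidelity response does not. Real structures will not satisfy these assumptions exactly, and the prediction is then only approximate.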
The ARS algorithm has been compared with various SM-based algorithms in terms of reliability and computational cost. The benchmark routines include implicit space mapping (ISM), a combination of frequency SM and output SM, and input SM. For the sake of a fair comparison, each SBO process has been carried out in an unattended manner. The results are gathered in Table 3.6. The only techniques that converged to a design satisfying the specifications are ARS and ISM with 16 preassigned parameters. Nonetheless, the latter
[Figure 3.50: two panels of Re(S11) versus frequency (GHz), 4 GHz to 7 GHz]
Fig. 3.50 Responses of the sixth-order filter at x(i) (top plots) and at x (bottom plots) (cf. Fig. 3.47): f (—) and c (- - -); surrogate model response determined by (3.56) shown using circles (Koziel and Bekasiewicz 2016a)
[Figure 3.51: |S11| (dB) versus frequency (GHz), 4 GHz to 7 GHz]
Fig. 3.51 High-fidelity model response at x (thick line) and surrogate model responses obtained using adaptive response scaling (○) and conventional output SM for the example sixth-order microstrip filter (Koziel and Bekasiewicz 2016a)
required more iterations and was over two times more expensive than ARS. The reason is that ARS does not require a parameter extraction step, which is mandatory for ISM. It should be noted that ISM with eight preassigned parameters, as well as the combination of frequency and output SM, were both terminated prematurely due to divergence. The input SM converged, but its final response violates the design specifications. Apart from limited reliability and higher computational cost, there are other issues related to the benchmark techniques reported here. In particular, the performance of SM-based algorithms is problem-specific and depends heavily on the selected SM transformations (or their combinations) (Koziel and Bandler 2007c). Due to this ambiguity, any particular SM model selection may or may not be successful. In particular, an inappropriate setup resulting from, e.g., the use of an inadequate model or the selection of an insufficient (or excessive) number of degrees of freedom for parameter
[Figure 3.52: flowchart. The reference design x(i) is evaluated with the EM solver (high-fidelity model, Rf(x(i))) and with the circuit simulator (low-fidelity model, Rc(x(i))); F(i)(ω) is then extracted. Outputs: Rf(x(i)), Rc(x(i)), F(i)(·)]
Fig. 3.52 Adaptive response scaling, Stage I (reference design scaling). The frequency relationships between the high- and low-fidelity model responses at the reference design x(i), represented by the scaling function F(i), are obtained as in (3.53), separately for the real and imaginary parts of each response of interest (S11, S21, etc.). Stage I is only executed once for each iteration of the SBO algorithm (3.35)
[Figure 3.53: flowchart. Inputs: x, F(i)(·), Rf(x(i)), Rc(x(i)). The circuit simulator evaluates the low-fidelity model Rc(x); F(x, ω) is extracted; the high-fidelity response is scaled to Rf(x(i), F(x, ω)); A(x, ω) is computed; the surrogate model response Rs(i)(x) is computed]
Fig. 3.53 Adaptive response scaling, Stage II (model prediction). The input arguments are the current design x, as well as the high- and low-fidelity model responses at the reference design and the reference scaling function F(i). The frequency and amplitude relations between the low-fidelity models at x(i) and x are obtained using (3.54) and (3.55), respectively. Finally, the scaling functions F(x, ω) and A(x, ω) are applied to compute the surrogate model response as in (3.56)
[Figure 3.54: filter layout with dimension labels w0, w1, w2, w3, w4, l1, l2, l3, l4, d1, d2, d3, d4]
Fig. 3.54 Geometry of the eighth-order microstrip bandpass filter (Koziel and Bekasiewicz 2016a)
[Figure 3.55: |S11| (dB) versus frequency (GHz), 3.5 GHz to 7.5 GHz]
Fig. 3.55 Eighth-order microstrip bandpass filter: initial (- - -) and final (—) filter responses obtained using the adaptive response scaling technique (Koziel and Bekasiewicz 2016a)

Table 3.6 Eighth-order filter: ARS vs. benchmark methods

| Optimization algorithm | Design optimization cost^a: number of EM simulations | Design optimization cost^a: total cost^b |
| --- | --- | --- |
| ARS | 6 | 10.9 |
| Implicit SM^c | 9^d | 26.2 |
| Implicit SM^e | 8 | 23.7 |
| Frequency + output SM | 4^d | 13.8 |
| Input SM^f | 9^g | 31.7 |

a Algorithm terminated upon satisfying design specifications
b Cost including low-fidelity model evaluations, expressed in terms of the equivalent number of EM filter simulations
c Eight implicit parameters related to substrate thickness
d Algorithm terminated due to divergence
e Sixteen implicit parameters related to substrate thickness and permittivity
f Eight additive and eight multiplicative parameters
g Algorithm converged but the final response violates specifications
extraction cannot be identified until the algorithm is executed. In practice, several variations have to be tried in order to find one that is suitable for a given problem. As indicated by the data gathered in Table 3.6, this wastes computational resources on optimization runs that end in divergence of the algorithm or produce designs with unsatisfactory characteristics. In contrast, ARS does not exhibit this sort of problem because it is parameterless.
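The cost comparison can be made concrete with a few lines of arithmetic on the Table 3.6 figures. The snippet below is only a bookkeeping sketch: the dictionary keys and outcome labels are shorthand introduced here, and the conversion to minutes assumes the quoted 6 min per EM simulation applies uniformly.

```python
# (EM simulations, total cost in equivalent EM simulations, outcome)
table_3_6 = {
    "ARS": (6, 10.9, "ok"),
    "Implicit SM, 8 parameters": (9, 26.2, "diverged"),
    "Implicit SM, 16 parameters": (8, 23.7, "ok"),
    "Frequency + output SM": (4, 13.8, "diverged"),
    "Input SM": (9, 31.7, "specs violated"),
}
EM_MINUTES = 6.0  # simulation time of the EM model quoted in the text

ars_em, ars_cost, _ = table_3_6["ARS"]

# Cost of the only other successful method relative to ARS
ism16_ratio = table_3_6["Implicit SM, 16 parameters"][1] / ars_cost

# Part of the ARS cost attributable to low-fidelity model evaluations
ars_low_fidelity_overhead = ars_cost - ars_em

# Cost spent on runs that diverged or violated the specifications
unsuccessful_cost = sum(cost for _, cost, outcome in table_3_6.values()
                        if outcome != "ok")

print(f"ISM (16 parameters) / ARS cost ratio: {ism16_ratio:.2f}")
print(f"ARS low-fidelity overhead: {ars_low_fidelity_overhead:.1f} EM equivalents")
print(f"Cost of unsuccessful runs: {unsuccessful_cost:.1f} EM equivalents")
print(f"ARS wall-clock estimate: {ars_cost * EM_MINUTES:.1f} min")
```

The ratio of about 2.17 matches the statement that ISM with 16 preassigned parameters was over two times more expensive than ARS, and the three unsuccessful runs together cost 71.7 equivalent EM simulations.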
References Abdel-Malek, H. L., & Bandler, J. W. (1978). Yield estimation for efficient design centering assuming arbitrary statistical distributions. International Journal of Circuit Theory and Applications, 6(3), 289–303. ADS (Advanced Design System). (2019). Keysight Technologies, Fountaingrove Parkway 1400, Santa Rosa, CA 95403–1799. Alexandrov, N. M., & Lewis, R. M. (2001). An overview of first-order model management for engineering optimization. Optical Engineering, 2(4), 413–430. Alexandrov, N. M., Dennis, J. E., Lewis, R. M., & Torczon, V. (1998). A trust-region framework for managing the use of approximation models in optimization. Structural Optimization, 15(1), 16–23. Altair FEKO. (2018). Altair HyperWorks, 1820 E Big Beaver Rd, Troy, MI 48083, USA. Bakr, M. H., Bandler, J. W., Biernacki, R. M., Chen, S. H., & Madsen, K. (1998). A trust region aggressive space mapping algorithm for EM optimization. IEEE Transactions on Microwave Theory and Techniques, 46(12), 2412–2425. Bakr, M. H., Bandler, J. W., Georgieva, N. K., & Madsen, K. (1999). A hybrid aggressive spacemapping algorithm for EM optimization. IEEE Transactions on Microwave Theory and Techniques, 47(12), 2440–2449. Bakr, M. H., Bandler, J. W., Madsen, K., Rayas-Sanchez, J. E., & Sondergaard, J. (2000). Spacemapping optimization of microwave circuits exploiting surrogate models. IEEE Transactions on Microwave Theory and Techniques, 48(12), 2297–2306. Bandler, J. W., & Chen, S. H. (1988). Circuit optimization: The state of the art. IEEE Transactions on Microwave Theory and Techniques, 36(2), 424–443. Bandler, J. W., Liu, P. C., & Tromp, H. (1976a). A nonlinear programming approach to optimal design centering, tolerancing and tuning. IEEE Transactions on Circuits and Systems, CAS-23 (3), 155–165. Bandler, J. W., Liu, P. C., & Tromp, H. (1976b). Integrated approach to microwave design. IEEE Transactions on Microwave Theory and Techniques, MTT-24(9), 584–591. Bandler, J. W., Biernacki, R. M., Chen, S. 
H., Grobelny, P. A., & Hemmers, R. H. (1994). Space mapping technique for electromagnetic optimization. IEEE Transactions on Microwave Theory and Techniques, 42(12), 2536–2544. Bandler, J. W., Biernacki, R. M., Chen, S. H., Hemmers, R. H., & Madsen, K. (1995). Electromagnetic optimization exploiting aggressive space mapping. IEEE Transactions on Microwave Theory and Techniques, 41(12), 2874–2882. Bandler, J. W., Cheng, Q. S., Gebre-Mariam, D. H., Madsen, K., Pedersen, F., & Søndergaard, J. (2003). EM-based surrogate modeling and design exploiting implicit, frequency and output space mappings (pp. 1003–1006). Philadelphia: IEEE International Microwave Symposium Digest. Bandler, J. W., Cheng, Q. S., Dakroury, S. A., Mohamed, A. S., Bakr, M. H., Madsen, K., & Søndergaard, J. (2004). Space mapping: The state of the art. IEEE Transactions on Microwave Theory and Techniques, 52(1), 337–361. Bandler, J. W., Cheng, Q. S., Nikolova, N. K., & Ismail, M. A. (2004). Implicit space mapping optimization exploiting preassigned parameters. IEEE Transactions on Microwave Theory and Techniques, 52(11), 378–385. Bandler, J. W., Koziel, S., & Madsen, K. (2008). Editorial—Surrogate modeling and space mapping for engineering optimization. Optimization and Engineering, 9(4), 307–310. Baumann, D., Fumeaux, C., Leuchtmann, P., & Vahldieck, R. (2004). Finite-volume time-domain (FVTD) modelling of a broadband double-ridged horn antenna. International Journal of Numerical Modelling, 17(3), 285–298. Beachkofski, B., & Grandhi, R. (2002). Improved distributed hypercube sampling, American Institute of Aeronautics and Astronautics, Paper AIAA, 2002–1274. Bekasiewicz, A., Koziel, S., & Zieniutycz, W. (2014). Design space reduction for expedited multiobjective design optimization of antennas in highly-dimensional spaces. In S. Koziel,
L. Leifsson, & X.-S. Yang (Eds.), Solving computationally expensive engineering problems: Methods and applications (pp. 113–147). New York: Springer. Bekasiewicz, A., & Koziel, S. (2016). Cost-efficient design optimization of compact patch antennas with improved bandwidth. IEEE Antennas and Wireless Propagation Letters, 15, 270–273. Biernacki, R., Chen, S., Estep, G., Rousset, J., & Sifri, J. (2012). Statistical analysis and yield optimization in practical RF and microwave systems. IEEE MTT-S International Microwave Symposium Digest. Montreal. pp. 1–3. Broyden, C. G. (1965). A class of methods for solving nonlinear simultaneous equations. Mathematics of Computation, 19, 577–593. Caratelli, D., & Yarovoy, A. (2010). Unified time- and frequency-domain approach for accurate modeling of electromagnetic radiation processes in ultrawideband antennas. IEEE Transactions on Antennas and Propagation, 58(10), 3239–3255. Cheng, Q. S., Koziel, S., & Bandler, J. W. (2006). Simplified space mapping approach to enhancement of microwave device models. International Journal of RF and Microwave Computer-Aided Engineering, 16(5), 518–535. Cheng, Q. S., Bandler, J. W., & Koziel, S. (2008). Combining coarse and fine models for optimal design. IEEE Microwave Magazine, 9, 79–88. Conn, A. R., Gould, N. I. M., & Toint, P. L. (2000). Trust region methods, MPS-SIAM Series on Optimization, Philadelphia, MPS-SIAM. Crevecoeur, G., Hallez, H., Dupre, L., Van de Walle, R., Boon, P., & Lemahieu, I. (2009). Validation of the two-level approach for the solution of the EEG inverse problem in an anisotropic realistic head model. IEEE Transactions on Magnetics, 45(3), 1670–1673. CST Microwave Studio. (2018). CST AG, Bad Nauheimer Str. 19, D-64289 Darmstadt, Germany. Dorica, M., & Giannacopoulos, D. D. (2006). Response surface space mapping for electromagnetic optimization. IEEE Transactions on Magnetics, 42(4), 1123–1126. Echeverria, D., & Hemker, P. W. (2005). Space mapping and defect correction. 
Computational Methods in Applied. Mathematics, 5(2), 107–136. Echeverría, D., & Hemker, P. W. (2008). Manifold mapping: A two-level optimization technique. Computing and Visualization in Science, 11(4–6), 193–206. Eldred, M. S., & Dunlavy, D. M. (2006). Formulations for surrogate-based optimization with data fit, multifidelity, and reduced-order models. 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference. Portsmouth. AIAA–2006–7117. em™ Version 16.56 (2018). Sonnet Software, Inc., Sonnet Software, Inc., 126 N. Salina Street, Syracuse, NY 13202, USA. Feng, N.-N., & Huang, W.-P. (2003). Modeling and simulation of photonic devices by generalized space mapping technique. Journal of Lightwave Technology, 21(6), 1562. Feng, N.-N., Zhou, G.-R., & Huang, W.-P. (2003). Space mapping technique for design optimization of antireflection coatings in photonic devices. Journal of Lightwave Technology, 21(1), 281–285. Feng, F., Zhang, C., Na, W., Zhang, J., Zhang, W., & Zhang, Q. (2019). Adaptive feature zero assisted surrogate-based EM optimization for microwave filter design. IEEE Microwave and Wireless Components Letters, 29(1), 2–4. Ferranti, F., Deschrijver, D., Knockaert, L., & Dhaene, T. (2009). Hybrid algorithm for compact and stable macromodelling of parameterized frequency responses. IEEE Electronics Letters, 45 (10), 493–495. Ferranti, F., Knockaert, L., & Dhaene, T. (2011). Passivity-preserving parametric macromodelling by means of scaled and shifted state-space systems. IEEE Transactions on Microwave Theory and Techniques, 59(10), 2394–2403. Fernández-Godino, M. G., Park, C., Kim, N. H., & Haftka, R. T. (2019). Issues in deciding whether to use multifidelity surrogates. AIAA Journal, 57(5), 2039–2054. Forrester, A. I. J., & Keane, A. J. (2009). Recent advances in surrogate-based optimization. Progress in Aerospace Sciences, 45(1), 50–79. Giunta, A. A., Wojtkiewicz, S. F., & Eldred, M. S. (2003). 
Overview of modern design of experiments methods for computational simulations. Paper AIAA 2003–0649.
Glubokov, O., & Koziel, S. (2014a). Substrate integrated waveguide microwave filter tuning through variable-fidelity feature space optimization. International Review of Progress in Applied Computational Electromagnetics. Glubokov, O., & Koziel, S. (2014b). EM-driven tuning of substrate integrated waveguide filters exploiting feature-space surrogates. IEEE International Microwave Symposium Digest (IMS). Tampa. pp. 1–3. Goudos, S. (Ed.). (2017). Microwave systems and applications. London: IntechOpen. Guan, X., Ma, Z., Cai, P., Anada, T., & Hagiwara, G. (2008). A microstrip dual-band bandpass filter with reduced size and improved stopband characteristics. Microwave and Optical Technology Letters, 50, 618–620. Hauth, W., Keller, R., Papziner, U., Ihmels, R., Sieverding, T., & Arndt, F. (1993). Rigorous CAD of multipost coupled rectangular waveguide components. Proceeding of 23rd European Microwave Conference. Madrid. pp. 611–614. Hazaveh, P. K., Bergstrom, P. L., & Jaszczak, J. A. (2017). Efficient physics-based modeling of a representative semiconducting quantum dot single electron device. IEEE 17th International Conference on Nanotechnology (IEEE-NANO). Pittsburgh. pp. 739–744. Hong, J.-S., & Lancaster, M. (2001). Microstrip filters for RF/microwave applications. Hoboken: Wiley. Hosder, S. (2012). Stochastic response surfaces based on non-intrusive polynomial chaos for uncertainty quantification. International Journal of Mathematical Modelling and Numerical Optimisation, 3(1/2), 117–139. Hsieh, L. H., & Chang, K. (2003). Tunable microstrip bandpass filters with two transmission zeros. IEEE Transactions on Microwave Theory and Techniques, 51(2), 520–525. Huang, C. L., Chen, Y. B., & Tasi, C. F. (2008). New compact microstrip stacked slotted resonators bandpass filter with transmission zeros using high-permittivity ceramics substrate. Microwave and Optical Technology Letters, 50(5), 1377–1379. Kim, S., Alonso, J., & Jameson, A. (2000). 
Two-dimensional high-lift aerodynamic optimization using the continuous adjoint method, Paper AIAA 2000–4741. Kleijnen, J. P. C. (2009). Kriging metamodeling in simulation: A review. European Journal of Operational Research, 192(3), 707–716. Kleijnen, J. P. C. (2018). Design and analysis of simulation experiments. In J. Pilz, D. Rasch, V. Melas, & K. Moder (Eds.), Statistics and simulation. IWS 2015. Springer Proceedings in Mathematics & Statistics (Vol. 231). Cham: Springer. Kolda, T. G., Lewis, R. M., & Torczon, V. (2003). Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45(3), 385–482. Kolundzija, B., & Sumic, D. (2004). Hierarchical conjugate gradient method applied to MoM analysis of electrically large structures. IEEE Antennas and Propagation Society International Symposium (APS), 2004. Monterey, (Vol. 4). pp. 4455–4458. Koziel, S. (2010a). Shape-preserving response prediction for microwave design optimization. IEEE Transactions on Microwave Theory and Techniques, 58(11), 2829–2837. Koziel, S. (2010b). Shape-preserving response prediction for microwave circuit modeling. IEEE MTT-S International Microwave Symposium Digest. Anaheim. pp. 1660–1663. Koziel, S. (2010c). Adaptively adjusted design specifications for efficient optimization of microwave structures. Progress In Electromagnetics Research B, 21, 219–234. Koziel, S. (2010d). Computationally efficient multi-fidelity multi-grid design optimization of microwave structures. Applied Computational Electromagnetics Society Journal, 25(7), 578–586. Koziel, S. (2012). Accurate low-cost microwave component models using shape-preserving response prediction. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 25(2), 152–162. Koziel, S. (2015). Fast simulation-driven antenna design using response-feature surrogates. International Journal of RF and Microwave Computer-Aided Engineering, 25(5), 394–402. Koziel, S. (2017). 
Space mapping: Performance, reliability, open problems and perspectives. IEEE MTT-S International Microwave Symposium (IMS). Honolulu. pp. 1512–1514.
Koziel, S., & Bandler, J. W. (2006). Space-mapping-based modeling utilizing parameter extraction with variable weight coefficients and a data base. IEEE MTT-S International Microwave Symposium Digest. San Francisco. pp. 1763–1766. Koziel, S., & Bandler, J. W. (2007a). Microwave device modeling using space-mapping and radial basis functions. IEEE MTT-S International Microwave Symposium Digest. Honolulu. pp. 799–802. Koziel, S., & Bandler, J. W. (2007b). A space-mapping approach to microwave device modeling exploiting fuzzy systems. IEEE Transactions on Microwave Theory and Techniques, 55(12), 2539–2547. Koziel, S., & Bandler, J. W. (2007c) Coarse and surrogate model assessment for engineering design optimization with space mapping. IEEE MTT-S International Microwave Symposium Digest. Honolulu. pp. 107–110. Koziel, S., & Bandler, J. W. (2007d). Space-mapping optimization with adaptive surrogate model. IEEE Transactions on Microwave Theory and Techniques, 55(3), 541–547. Koziel, S., & Bandler, J. W. (2012). Accurate modeling of microwave devices using krigingcorrected space mapping. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 25(1), 1–4. Koziel, S., & Bandler, J. W. (2015). Rapid yield estimation and optimization of microwave structures exploiting feature-based statistical analysis. IEEE Transactions on Microwave Theory and Techniques, 63(1), 107–114. Koziel, S., & Bekasiewicz, A. (2015). Fast simulation-driven feature-based design optimization of compact dual-band microstrip branch-line coupler. International Journal of RF and Microwave Computer-Aided Engineering, 26(1), 13–20. Koziel, S., & Bekasiewicz, A. (2016a). Rapid microwave design optimization in frequency domain using adaptive response scaling. IEEE Transactions on Microwave Theory and Techniques, 64 (9), 2749–2757. Koziel, S., & Bekasiewicz, A. (2016b). Multi-objective design of antennas using surrogate models. Singapore: World Scientific. 
Koziel, S., & Bekasiewicz, A. (2017a). Computationally feasible narrow-band antenna modeling using response features. International Journal of RF and Microwave Computer-Aided Engineering, 27(4), e21077. Koziel, S., & Bekasiewicz, A. (2017b). Comprehensive comparison of compact UWB antenna performance by means of multiobjective optimization. IEEE Transactions on Antennas and Propagation, 65(7), 3427–3436. Koziel, S., & Bekasiewicz, A. (2018). Simulation-driven size-reduction-oriented design of multiband antennas by means of response features. IET Microwaves, Antennas & Propagation, 12(7), 1093–1098. Koziel, S., & Kurgan, P. (2015). Rapid design of miniaturized branch-line couplers through concurrent cell optimization and surrogate-assisted fine-tuning. IET Microwaves, Antennas and Propagation, 9(9), 957–963. Koziel, S., & Leifsson, L. (2012). Generalized shape-preserving response prediction for accurate modeling of microwave structures. IET Microwaves, Antennas and Propagation, 6, 1332–1339. Koziel, S., & Leifsson, L. (Eds.). (2013a). Surrogate-based modeling and optimization. Applications in engineering. New York: Springer. Koziel, S., & Leifsson, L. (2013b). Multi-level airfoil shape optimization with automated low-fidelity model selection. International Conference on Computer Science. Barcelona. Koziel, S., & Leifsson, L. (2016). Simulation-driven design by knowledge-based response correction techniques. Cham: Springer. Koziel, S., & Ogurtsov, S. (2011). Simulation-driven design in microwave engineering: application case studies. In X. S. Yang & S. Koziel (Eds.), Computational optimization and applications in engineering and industry (Series: Studies in Computational Intelligence). Berlin: Springer. Koziel, S., & Ogurtsov, S. (2012). Model management for cost-efficient surrogate-based optimization of antennas using variable-fidelity electromagnetic simulations. IET Microwaves, Antennas and Propagation, 6, 1643–1650.
Koziel, S., & Ogurtsov, S. (2013). Multi-level microwave design optimization with automated model fidelity adjustment. International Journal of RF and Microwave Computer-Aided Engineering, 24(3), 281–288. Koziel, S., & Ogurtsov, S. (2014). Antenna design by simulation-driven optimization. Berlin: Springer. Koziel, S., & Ogurtsov, S. (2019). Simulation-based optimization of antenna arrays. London: World Scientific. Koziel, S., & Szczepanski, S. (2011). Accurate modeling of microwave structures using shapepreserving response prediction. IET Microwaves, Antennas & Propagation, 5(9), 1116–1122. Koziel, S., Bandler, J. W., & Madsen, K. (2006a). A space mapping framework for engineering optimization: Theory and implementation. IEEE Transactions on Microwave Theory and Techniques, 54(10), 3721–3730. Koziel, S., Bandler, J. W., & Madsen, K. (2006b). Theoretical justification of space-mapping-based modeling utilizing a data base and on-demand parameter extraction. IEEE Transactions on Microwave Theory and Techniques, 54(12), 4316–4322. Koziel, S., Cheng, Q. S., & Bandler, J. W. (2008). Space mapping. IEEE Microwave Magazine, 9 (6), 105–122. Koziel, S., Bandler, J. W., & Madsen, K. (2008). Quality assessment of coarse models and surrogates for space mapping optimization. Optical Engineering, 9, 375–391. Koziel, S., Cheng, Q. S., & Bandler, J. W. (2010). Implicit space mapping with adaptive selection of preassigned parameters. IET Microwaves, Antennas and Propagation, 4, 361–373. Koziel, S., Bandler, J. W., & Cheng, Q. S. (2010a). Robust trust-region space-mapping algorithms for microwave design optimization. IEEE Transactions on Microwave Theory and Techniques, 58(8), 2166–2174. Koziel, S., Bandler, J. W., & Cheng, Q. S. (2010b). Adaptively constrained parameter extraction for robust space mapping optimization of microwave circuits. IEEE MTT-S International Microwave Symposium Digest. Anaheim. pp. 205–208. Koziel, S., Ciaurri, D. E., & Leifsson, L. (2011). 
Surrogate-based methods. In S. Koziel & X. S. Yang (Eds.), Computational optimization, methods and algorithms (Studies in Computational Intelligence) (Vol. 356, pp. 33–59). Berlin/Heidelberg: Springer. Koziel, S., Yang, X. S., & Zhang, Q. J. (Eds.). (2013). Simulation-driven design optimization and modeling for microwave engineering. London: Imperial College Press. Koziel, S., Bekasiewicz, A., & Kurgan, P. (2014). Rapid EM-driven design of compact RF circuits by means of nested space mapping. IEEE Microwave and Wireless Components Letters, 24(4), 364–366. Koziel, S., Ogurtsov, S., Zieniutycz, W., & Sorokosz, L. (2014). Simulation-driven design of microstrip antenna subarrays. IEEE Transactions on Antennas and Propagation, 62(7), 3584–3591. Koziel, S., Bekasiewicz, A., & Leifsson, L. (2016). Cost-efficient modeling of input characteristics of narrow-band antennas using response features. 10th European Conference on Antennas and Propagation (EuCAP). Davos. pp. 1–4. Koziel, S., Bekasiewicz, A., Kurgan, P., & Bandler, J. W. (2016). Rapid multi-objective design optimisation of compact microwave couplers by means of physics-based surrogates. IET Microwaves Antennas & Propagation, 10(5), 479–486. Koziel, S., & Leifsson, L. (2013b). Multi-level airfoil shape optimization with automated lowfidelity model selection. International Conference on Computer Science. Barcelona. Lee, J. R., Cho, J. H., & Yun, S. W. (2000). New compact bandpass filter using microstrip λ/4 resonators with open stub inverter. IEEE Microwave and Guided Wave Letters, 10(12), 526–527. Leifsson, L., & Koziel, S. (2015a). Variable-resolution shape optimization: Low-fidelity model selection and scalability. International Journal of Mathematical Modelling and Numerical Optimisation, 6, 1–21. Leifsson, L., & Koziel, S. (2015b). Simulation-driven aerodynamic design using variable-fidelity models. London: Imperial College Press. Leifsson, L., & Koziel, S. (2016). 
Surrogate modelling and optimization using shape-preserving response prediction: A review. Engineering Optimization, 48(3), 476–496.
Leifsson, L., Koziel, S., & Ogurtsov, S. (2012). Low-fidelity model mesh density and the performance of variable-resolution shape optimization algorithms. Procedia Computer Science, 9, 842–851.
Leifsson, L., Koziel, S., Zhang, Y., & Hosder, S. (2013). Low-cost robust airfoil optimization by variable-fidelity models and stochastic expansions, 51st AIAA Aerospace Sciences Meeting incl. New Horizons Forum Aerospace Exp., Grapevine.
Leifsson, L., Koziel, S., & Kurgan, P. (2014a). Automated low-fidelity model setup for surrogate-based aerodynamic optimization. In S. Koziel, L. Leifsson, & X. S. Yang (Eds.), Solving computationally expensive engineering problems: Methods and applications (pp. 87–112). New York: Springer.
Leifsson, L., Koziel, S., & Hosder, S. (2014b). Aerodynamic design optimization: Physics-based surrogate approaches for airfoil and wing design, 52nd Aerospace Sciences Meeting AIAA SciTech Forum, AIAA 2014–0572.
Liu, G., & Gedney, S. D. (2000). Perfectly matched layer media for an unconditionally stable three-dimensional ADI-FDTD method. IEEE Microwave and Guided Wave Letters, 10(7), 261–263.
Lophaven, S. N., Nielsen, H. B., & Søndergaard, J. (2002). DACE: A Matlab kriging toolbox. Lyngby: Technical University of Denmark.
Lund, T. S., Wu, X., & Squires, K. D. (1998). Generation of turbulent inflow data for spatially-developing boundary layer simulations. Journal of Computational Physics, 140(2), 233–258.
Manshari, S., Koziel, S., & Leifsson, L. (2019). A wideband corrugated ridged horn antenna with enhanced gain and stable phase center for X- and Ku-band applications. IEEE Antennas and Wireless Propagation Letters, 18(5), 1031–1035.
Marheineke, N., Pinnau, R., & Reséndiz, E. (2012). Space mapping-focused control techniques for particle dispersions in fluids. Optimization and Engineering, 13(1), 101–120.
Nosrati, M., & Tavassolian, N. (2017). Miniaturized circularly polarized square slot antenna with enhanced axial-ratio bandwidth using an antipodal Y-strip. IEEE Antennas and Wireless Propagation Letters, 16, 817–820.
Obaidat, M. S., Ören, T., & De Floriano, R. (Eds.) (2019). Simulation and modeling methodologies, technologies and applications. 7th International Conference on SIMULTECH 2017, (Advances in Intelligent Systems and Computing). Madrid: Springer.
Ogurtsov, S., & Koziel, S. (2011). Design of microstrip to substrate integrated waveguide transitions with enhanced bandwidth using protruding vias and EM-driven optimization. In: Proc. International Review of Progress in Applied Computational Electromagnetics, ACES, 91–96.
Perdikaris, P., Raissi, M., Damianou, A., Lawrence, N. D., & Karniadakis, G. E. (2017). Nonlinear information fusion algorithms for data-efficient multi-fidelity modelling. Proceedings of the Royal Society A, 473(20160751), 1–16.
Petosa, A. (2007). Dielectric resonator antenna handbook. Norwood: Artech House.
Petrides, S., & Demkowicz, L. F. (2017). An adaptive DPG method for high frequency time-harmonic wave propagation problems. Computers & Mathematics with Applications, 74(8), 1999–2017.
Pozar, D. M. (2012). Microwave engineering (4th ed.). Hoboken: Wiley.
Priess, M., Koziel, S., & Slawig, T. (2011). Surrogate-based optimization of climate model parameters using response correction. Journal of Computational Science, 2(4), 335–344.
Rayas-Sanchez, J. E. (2016). Power in simplicity with ASM: Tracing the aggressive space mapping algorithm over two decades of development and engineering applications. IEEE Microwave Magazine, 17(4), 64–76.
Rayas-Sanchez, J. E., Chávez-Hurtado, J. L., & Brito-Brito, Z. (2017). Optimization of full-wave EM models by low-order low-dimension polynomial surrogate functionals. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 30(3–4), e2094.
Redhe, M., & Nilsson, L. (2004). Optimization of the new Saab 9-3 exposed to impact load using a space mapping technique. Structural and Multidisciplinary Optimization, 27, 411–420.
Robinson, T. D., Eldred, M. S., Willcox, K. E., & Haimes, R. (2008). Surrogate-based optimization using multifidelity models with variable parameterization and corrected space mapping. AIAA Journal, 46(11), 2814–2822.
Salleh, M. K. M., Prigent, G., Pigaglio, O., & Crampagne, R. (2008). Quarter-wavelength side-coupled ring resonator for bandpass filters. IEEE Transactions on Microwave Theory and Techniques, 56(1), 156–162.
Sans, M., Selga, J., Rodriguez, A., Bonache, J., Boria, V. E., & Martin, F. (2014). Design of planar wideband bandpass filters from specifications using a two-step aggressive space mapping (ASM) optimization algorithm. IEEE Transactions on Microwave Theory and Techniques, 62(12), 3341–3350.
Sarkar, T. K., Chen, H., Palma, M. S., & Zhu, M. (2019). Lessons learned using a physics based macro model for analysis of radio wave propagation in wireless transmission. IEEE Transactions on Antennas and Propagation, 67(4), 2150–2157.
Schmidthausler, D., & Clemens, M. (2012). Low-order electroquasistatic field simulations based on proper orthogonal decomposition. IEEE Transactions on Magnetics, 48(2), 567–570.
Scotti, G., Tommasino, P., & Trifiletti, A. (2005). MMIC yield optimization by design centering and off-chip controllers. IET Proceedings – Circuits Devices and Systems, 152(1), 54–60.
Shah, H., Hosder, S., Leifsson, L., Koziel, S., & Tesfahunegn, Y. (2015). Multi-fidelity robust aerodynamic design optimization under mixed uncertainty. Aerospace Science and Technology, 45, 17–29.
Siegler, J., Ren, J., Leifsson, L., Koziel, S., & Bekasiewicz, A. (2016). Supersonic airfoil shape optimization by variable-fidelity models and manifold mapping. Procedia Computer Science, 80, 1103–1113.
Sim, C. Y. D., Chang, M. H., & Chen, B. Y. (2014). Microstrip-fed ring slot antenna design with wideband harmonic suppression. IEEE Transactions on Antennas and Propagation, 62(9), 4828–4832.
SMA Edge Mount P.C. Board Receptacles. (2013). Catalog. New Haven: Applied Engineering Products.
Smith, R. C. (2014). Uncertainty quantification: Theory, implementation, and applications. New York: Society for Industrial & Applied Mathematics.
Styblinski, M. A., & Opalski, L. J. (1986). Algorithms and software tools for IC yield optimization based on fundamental fabrication parameters. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 5(1), 79–89.
Sumant, P. S., Wu, H., Cangellaris, A. C., & Aluru, N. R. (2010). A sparse grid based collocation method for model order reduction of finite element approximations of passive electromagnetic devices under uncertainty. IEEE MTT-S International Microwave Symposium Digest, pp. 1652–1655.
Sumant, P. S., Wu, H., Cangellaris, A. C., & Aluru, N. R. (2012). Reduced-order models of finite element approximations of electromagnetic devices exhibiting statistical variability. IEEE Transactions on Antennas and Propagation, 60(1), 301–309.
Swidzinski, J. F., & Chang, K. (2000). Nonlinear statistical modeling and yield estimation technique for use in Monte Carlo simulations. IEEE Transactions on Microwave Theory and Techniques, 48(12), 2316–2324.
Tannehill, J. A., Anderson, D. A., & Pletcher, R. H. (1997). Computational fluid mechanics and heat transfer (2nd ed.). Thames: Taylor & Francis.
Tesfahunegn, Y. A., Koziel, S., & Leifsson, L. (2015). Surrogate-based airfoil design with multilevel optimization and adjoint sensitivities, 53rd AIAA Aerospace Sciences Meeting, Science and Technology Forum. Kissimmee.
Tu, S., Cheng, Q. S., Zhang, Y., Bandler, J. W., & Nikolova, N. K. (2013). Space mapping optimization of handset antennas exploiting thin-wire models. IEEE Transactions on Antennas and Propagation, 61(7), 3797–3807.
Zhang, Y., Hosder, S., Leifsson, L., & Koziel, S. (2012). Robust airfoil optimization under inherent and model-form uncertainties using stochastic expansions, AIAA-Paper 2012-0056. 50th AIAA Aerospace Sciences Meeting including the New Horizon Forum and Aerospace Exposition. Nashville. p. 212.
Zhang, C., Feng, F., Zhang, Q., & Bandler, J. W. (2018). Enhanced cognition-driven formulation of space mapping for equal-ripple optimisation of microwave filters. IET Microwaves, Antennas and Propagation, 12(1), 82–91.
Zhu, J., Bandler, J. W., Nikolova, N. K., & Koziel, S. (2007). Antenna optimization through space mapping. IEEE Transactions on Antennas and Propagation, 55(3), 651–658.
Chapter 4
Design-Oriented Modeling of High-Frequency Structures
Surrogate modeling, as demonstrated in this book so far, offers a practical way of handling computationally expensive simulation models. It is especially convenient when massive evaluations thereof are required, for example, for the purpose of design optimization or uncertainty quantification (Bandler et al. 2008; Koziel and Bandler 2015). Chapters 2 and 3 outlined a number of modeling approaches concerning both data-driven and physics-based surrogates. Their major advantages include low evaluation cost and versatility (approximation models) as well as good generalization (physics-based models). Yet, as mentioned on various occasions, surrogate modeling exhibits some fundamental issues that significantly limit its applicability. These include the curse of dimensionality, difficulties in constructing the models over wide ranges of parameters, and, in the context of physics-based surrogates, potential problems in finding and setting up low-fidelity models. In this book, we are mostly interested in modeling of high-frequency structures, which routinely feature highly nonlinear and vector-valued responses, the handling of which poses additional challenges. Several methods of alleviating the difficulties pertinent to conventional modeling have been discussed, e.g., high-dimensional model representation (HDMR; Foo and Karniadakis 2010), various model order reduction (MOR) methods (Baur et al. 2014), or orthogonal matching pursuit (OMP; Tropp 2004), but these techniques are typically designed for handling particular classes of problems (e.g., underdetermined regression tasks in the case of OMP). Here, the main focus is on constructing reusable surrogates that are valid for wide ranges of parameters and that can be utilized for a variety of purposes, including optimization (also multi-objective) and robust design.
In order to achieve this goal, a specific approach is taken, the main component of which is identification of a region of the parameter space containing the designs that are “good” in a particular sense. The latter is determined by the set of figures of interest, which might be the operating conditions of the structure at hand, its material parameters, or other factors that are of interest for the designer. The rationale behind it is that good-quality designs normally occupy a very small portion of the parameter space. Identification of such a subset and restricting the modeling process to it allow for a significant reduction of the number of training data samples required to set up a reliable surrogate without formally reducing the parameter ranges. Clearly, the fundamental question is how the aforementioned “promising” region can be found. In this book, this is achieved by considering a set of reference designs pre-optimized with respect to the performance figures that are of interest in a given context. This chapter discusses several rather straightforward techniques following the concept outlined above. The remaining chapters of this book describe more systematic approaches to performance-driven modeling as well as their design applications.
© Springer Nature Switzerland AG 2020 S. Koziel, A. Pietrenko-Dabrowska, Performance-Driven Surrogate Modeling of High-Frequency Structures, https://doi.org/10.1007/978-3-030-38926-0_4
4.1 Data-Driven Modeling by Constrained Sampling
Here, a simple approach to surrogate modeling via constrained sampling is presented and illustrated using two examples of antenna structures.
4.1.1 Uniform Versus Constrained Sampling
In general, data-driven models are most often constructed based on uniform sampling of the design space (Couckuyt et al. 2010), the latter being an interval delimited by the lower and upper bounds on the parameters. On the other hand, in most practical cases, the parameter sets corresponding to the designs of sufficient quality with respect to the typical performance specifications (such as good matching at specific frequency bands in the case of antennas) exhibit a relatively high level of correlation. In other words, usable designs are normally allocated along a specific path or surface (or, more generally, along a manifold) within the design space. Consequently, uniform (and unconstrained) sampling wastes the majority of the samples because they correspond to designs that are of poor quality. Here, for the sake of reducing the cost of training data acquisition, the aim is to focus the modeling process only on those parts of the design space that correspond to potentially useful designs. A particular approach to defining the interesting part of the space is described below (Koziel 2017). Let us denote by $x^{(1)}, \ldots, x^{(K)}$ the set of reference designs of the system at hand. In this section, for illustration purposes, the focus is on high-frequency structures designed for various operating frequencies, which is a quite typical situation in microwave and antenna engineering. In this case, the reference designs correspond to the system optimized for particular values of the operating frequency within the range of interest. Let $d = [d_1 \ldots d_n]^T$ be the deviation vector. The samples are only allocated in the part $X_s$ of the original design space $X$ that is within the distance $d$ (component-wise) from the piecewise linear path connecting the reference designs, i.e.,
Fig. 4.1 (a) Reference designs of the DRA of Fig. 4.3 (Sect. 4.1.3) optimized for 4 GHz (), 5.5 GHz (- - -), and 7 GHz (—); (b) geometry parameter values for selected dimensions (—) with the region of interest marked using the dashed line (Koziel 2017)
$$v^{(k)} = \alpha x^{(k)} + (1 - \alpha) x^{(k+1)}, \qquad 0 \le \alpha \le 1, \tag{4.1}$$
$k = 1, \ldots, K - 1$. As an example, $K = 3$ is assumed. This choice is not critical, but a lower $K$ is computationally cheaper because it requires a smaller number of optimizations of the system of interest. Figure 4.1 shows the responses of a dielectric resonator antenna (DRA), considered as one of the illustration examples (Sect. 4.1.3), optimized for three operating frequencies of 4 GHz, 5.5 GHz, and 7 GHz, as well as selected geometry parameters versus frequency. Figure 4.2 shows uniform versus constrained sampling for four selected two-dimensional projections. The constrained design space is considerably smaller than the original space, and the benefit grows with the problem dimensionality. For $n = 7$ (which is the case for both illustration examples considered in Sect. 4.1.3), the volume-wise reduction of the design space is about three orders of magnitude.
Fig. 4.2 Uniform versus constrained sampling for the DRA of Fig. 4.3 shown for four selected projections onto (a) $a_x$–$a_y$ plane, (b) $a_y$–$a_z$ plane, (c) $a_z$–$u_s$ plane, and (d) $a_c$–$u_s$ plane (see Sect. 4.1.3 for symbol explanation) (Koziel 2017)
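The membership test implied by (4.1) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the cited work. A point y belongs to Xs if, for at least one segment of the path, some α in [0, 1] brings the path point within d of y in every component; each component yields an interval of admissible α, and the test succeeds when these intervals have a common point. The function names, the three-parameter reference designs, and the rejection sampler that draws from the bounding box of the path are assumptions made for illustration only; in practice, candidates would also be restricted to the original space X.

```python
import numpy as np

def in_constrained_domain(y, refs, d, tol=1e-12):
    """True if y is within component-wise distance d of the path (4.1)."""
    y, refs, d = (np.asarray(a, dtype=float) for a in (y, refs, d))
    for xk, xk1 in zip(refs[:-1], refs[1:]):
        # v(alpha) = alpha*xk + (1 - alpha)*xk1 = xk1 + alpha*(xk - xk1)
        step, resid = xk - xk1, y - xk1
        moving = np.abs(step) > tol
        # Components that are constant along the segment must already be within d.
        if np.any(np.abs(resid[~moving]) > d[~moving]):
            continue
        # |resid_i - alpha*step_i| <= d_i gives an interval of alpha per component.
        a1 = (resid[moving] - d[moving]) / step[moving]
        a2 = (resid[moving] + d[moving]) / step[moving]
        lo = np.max(np.minimum(a1, a2), initial=0.0)
        hi = np.min(np.maximum(a1, a2), initial=1.0)
        if lo <= hi:  # a common alpha in [0, 1] exists for all components
            return True
    return False

def sample_constrained(refs, d, n_samples, rng, max_draws=1_000_000):
    """Rejection sampling: uniform draws from the bounding box of the path +/- d."""
    refs, d = np.asarray(refs, dtype=float), np.asarray(d, dtype=float)
    lower, upper = refs.min(axis=0) - d, refs.max(axis=0) + d
    accepted = []
    for _ in range(max_draws):
        y = rng.uniform(lower, upper)
        if in_constrained_domain(y, refs, d):
            accepted.append(y)
            if len(accepted) == n_samples:
                break
    return np.array(accepted)

# Hypothetical reference designs (K = 3) in a three-parameter space.
refs = np.array([[0.0, 0.0, 0.0], [4.0, 2.0, 1.0], [8.0, 3.0, 4.0]])
d = np.array([0.5, 0.5, 0.5])
samples = sample_constrained(refs, d, n_samples=20, rng=np.random.default_rng(0))
print(samples.shape)
```

Rejection sampling becomes wasteful as the dimensionality grows, because the acceptance rate shrinks together with the relative volume of Xs; it is used here only because it keeps the sketch short.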
4.1.2 Modeling Procedure
The overall modeling flow follows the typical data-driven surrogate modeling procedures (cf. Chap. 2) except for the particular definition of the model domain outlined in Sect. 4.1.1. It can be summarized as follows:
1. Obtain $K$ reference designs as described in Sect. 4.1.1 (surrogate-based optimization methods are used for the sake of computational efficiency; Koziel and Ogurtsov 2014).
2. Define the constrained design space $X_s$ (cf. Sect. 4.1.1).
3. Sample $X_s$ and acquire EM simulation data.
4. Identify the surrogate model $s$ within $X_s$.
In the examples of Sect. 4.1.3, the surrogate model is constructed using kriging interpolation (Queipo et al. 2005).
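Steps 3 and 4 can be illustrated with a small NumPy sketch of ordinary kriging with a Gaussian correlation function. Several simplifications are assumed here and are not taken from the source: the correlation parameter theta is fixed rather than estimated by maximum likelihood, a cheap analytic function (`toy_em_model`, hypothetical) stands in for the EM solver, uniform samples in the unit square stand in for samples of Xs, and the relative RMS error is taken as the norm of the response error over frequency divided by the norm of the true response, averaged over the test designs. Each column of Y is one frequency point, so the shared correlation matrix yields one interpolant per frequency.

```python
import numpy as np

def gauss_corr(A, B, theta):
    """Gaussian correlation exp(-sum_i theta_i (a_i - b_i)^2) between rows of A and B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(theta * diff**2, axis=2))

def fit_kriging(X, Y, theta, nugget=1e-8):
    """Ordinary kriging with fixed theta; X is (N, n), Y is (N,) or (N, m)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float).reshape(len(X), -1)
    theta = np.broadcast_to(np.asarray(theta, dtype=float), (X.shape[1],))
    R = gauss_corr(X, X, theta) + nugget * np.eye(len(X))
    ones = np.ones((len(X), 1))
    Ri1 = np.linalg.solve(R, ones)
    mu = (Ri1.T @ Y) / (ones.T @ Ri1)       # generalized least-squares mean, one per column
    weights = np.linalg.solve(R, Y - mu)    # R^{-1} (Y - 1*mu)
    return {"X": X, "theta": theta, "mu": mu, "weights": weights}

def predict(model, X_new):
    """Kriging predictor mu + r(x)^T R^{-1} (Y - 1*mu) at the rows of X_new."""
    r = gauss_corr(np.asarray(X_new, dtype=float), model["X"], model["theta"])
    return model["mu"] + r @ model["weights"]

def relative_rms_error(Y_true, Y_pred):
    """Assumed error measure: ||s(x) - f(x)|| / ||f(x)||, averaged over test designs."""
    num = np.linalg.norm(Y_pred - Y_true, axis=1)
    den = np.linalg.norm(Y_true, axis=1)
    return float(np.mean(num / den))

def toy_em_model(X, freqs):
    """Hypothetical stand-in for an EM solver: a resonance whose frequency depends on x."""
    f_res = 4.0 + X[:, [0]] + 0.5 * X[:, [1]]
    return -20.0 * np.exp(-((freqs[None, :] - f_res) / 0.5) ** 2)

rng = np.random.default_rng(1)
freqs = np.linspace(3.0, 8.0, 51)
X_train, X_test = rng.uniform(0, 1, (40, 2)), rng.uniform(0, 1, (100, 2))
Y_train, Y_test = toy_em_model(X_train, freqs), toy_em_model(X_test, freqs)
model = fit_kriging(X_train, Y_train, theta=5.0)
Y_pred = predict(model, X_test)
err = relative_rms_error(Y_test, Y_pred)
print(f"relative RMS error: {100 * err:.1f}%")
```

A dedicated kriging toolbox would normally be preferred in practice, since it also tunes the correlation parameters and handles ill-conditioned correlation matrices more carefully.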
Fig. 4.3 DRA: (a) 3D view with its housing, (b) top view, and (c) front view (Koziel and Bekasiewicz 2015)
4.1.3 Illustration Examples
Two verification examples are presented, a DRA (Koziel and Bekasiewicz 2015) and a planar inverted-F antenna (PIFA) (Volakis 2007). Both antennas are narrowband, are described by seven geometry parameters each, and exhibit highly nonlinear responses. Furthermore, we are interested in setting up surrogate models that cover a wide range of antenna operating frequencies. All of these factors make the modeling problem challenging for conventional data-driven methods.
4.1.3.1 Dielectric Resonator Antenna
The first example is a dielectric resonator antenna (DRA) shown in Fig. 4.3 (Koziel and Bekasiewicz 2015). The structure consists of a dielectric resonator ($\varepsilon_r = 10$ and $\tan\delta = 0.0001$) situated on the ground plane. The resonator is fed through a ground plane slot by a microstrip line. The substrate material is 0.5-mm-thick Rogers RO4003 ($\varepsilon_r = 3.3$). The DRA is covered by a polycarbonate housing ($\varepsilon_r = 2.8$). The design variables are $x = [a_x\ a_y\ a_z\ a_c\ u_s\ w_s\ y_s]^T$ mm. The fixed parameters are $d_x = d_y = d_z = 1$ mm and $w_0 = 1.15$ mm. The EM model $f(x)$ of the antenna is implemented in CST (~420,000 cells, simulation time 19 min).
The antenna of Fig. 4.3 has been modeled using kriging with constrained sampling. The lower and upper bounds for the design variables are $l = [9\ 12\ 4\ 0\ 0.5\ 7.5\ 6]^T$ and $u = [14\ 20\ 12\ 4\ 6\ 12.5\ 9]^T$, all in mm. The design space defined by these bounds contains the three DRA designs optimized for the operating frequencies 4 GHz, 5.5 GHz, and 7 GHz that form the path for the constrained sampling plan. The reference designs have been obtained using feature-based optimization (Koziel 2015), and the average computational cost of finding each design corresponds to 40 EM simulations of the antenna structure (~760 min of CPU time). The deviation vector is $d = [1\ 1\ 1\ 1\ 1\ 1\ 1]^T$. The volume-wise reduction of the design space is almost three orders of magnitude. The number of training samples is 500. Table 4.1 shows the modeling accuracy, which is compared to conventional sampling in the entire space (the error measure is the relative RMS error averaged over 100 random test points). The minimum number of samples required to achieve comparable accuracy (within the constrained space) to that of the conventional model has been found to be only 80 (the cost reduction factor is over 6). Figure 4.4 shows the responses of the DRA at the selected test designs. Figure 4.5 shows the responses of the antenna optimized (using the surrogate) for the operating frequencies 5.1 GHz, 5.8 GHz, and 6.5 GHz. The optimization routine was Matlab’s fmincon (Matlab Optimization Toolbox, R2016a), and the optimization time is negligible (a few seconds). The quality of the optimized designs is excellent, with no further design tuning necessary. Figure 4.6 shows the constrained model domain obtained using four reference designs. The modeling error for this case is 4.1% (very similar to that in Table 4.1 for the same number of training samples), which indicates robustness of the approach.
Table 4.1 Modeling results of DRA
Surrogate modeling technique | Relative RMS error
Kriging interpolation (uniform sampling in the original space, 500 samples) | 12.2%
Kriging interpolation (sampling in the constrained space, 500 samples) | 4.3%
Kriging interpolation (sampling in the constrained space, 80 samples) | 12.5%
Fig. 4.4 DRA responses at the selected test designs: high-fidelity EM model (—), proposed surrogate model (o) (Koziel 2017)
Fig. 4.5 EM-simulated DRA responses at the designs obtained by optimizing the proposed surrogate model for operating frequencies 5.1 GHz (), 5.8 GHz (- - -), and 6.5 GHz (—)
Fig. 4.6 An alternative constrained surrogate model domain setup for the DRA example (here using four instead of three reference designs). The average modeling error for this case is 4.1% (similar to that in Table 4.1, thus indicating robustness of the proposed modeling scheme)
The recommended size of the deviation vector $d$ is a fraction of the range of the design variables spanned by the reference designs, say 10–20% of these ranges. Here, for simplicity, the vector of ones has been utilized. To check the effect of the vector $d$ setup, the example has been reworked using deviation vectors of $1.5d$ and $d/1.5$. The average modeling errors obtained for these two cases, 5.7% and 3.9%, differ only moderately from the 4.3% obtained for the original vector $d$.
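The rule of thumb for the deviation vector is simple enough to state in code. The sketch below assumes hypothetical reference designs and a fraction of 15%; neither is taken from the source. It also prints how the volume of the box [−d, d] scales when d is multiplied by 1.5 in seven dimensions, which is only a rough indicator of how quickly the sampled region grows with d, since the true domain is a tube around the path rather than a box.

```python
import numpy as np

def deviation_vector(refs, fraction=0.15):
    """d as a fraction of the parameter ranges spanned by the reference designs."""
    refs = np.asarray(refs, dtype=float)
    return fraction * (refs.max(axis=0) - refs.min(axis=0))

def box_volume_ratio(scale, n):
    """Factor by which the volume of the box [-d, d] changes when d is scaled."""
    return scale ** n

# Hypothetical reference designs (rows) for a three-parameter structure, in mm.
refs = np.array([[9.5, 13.0, 5.0],
                 [11.0, 16.0, 7.5],
                 [13.5, 19.0, 11.0]])
d = deviation_vector(refs, fraction=0.15)
print(d)                          # per-parameter deviations
print(box_volume_ratio(1.5, 7))   # growth of the box volume for 1.5*d with n = 7
```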
4.1.3.2 Planar Inverted-F Antenna
The second example is a planar inverted-F antenna (PIFA) (Volakis 2007) shown in Fig. 4.7. The design variables are $x = [v_0\ v_1\ v_2\ v_3\ v_4\ v_5\ v_6]^T$. The fixed parameters are $[u_0\ u_1\ u_2\ u_3\ u_4\ u_5\ u_6\ u_7\ w_0\ r_0]^T$ = [6.15 –50.0 –15.0 10.5 29.35 11.65 5.0 1.0 0.5]$^T$ mm; the substrate is 0.508 mm thick, and the boxes are of Rogers TMM4 and TMM6. The EM model $f$ is evaluated in CST (~650,000 mesh cells, simulation time 13 min). The lower and upper bounds defining the initial design space are $l$ = [–5 2 4.5 2 6 –30 –40]$^T$ and $u$ = [0 10 7 8 15 –8 –30]$^T$, all in mm. As in the first example, three reference designs are used, optimized for the operating frequencies of 1.5 GHz, 2.5 GHz, and 3.5 GHz. The reference designs have been obtained using the method of Koziel (2015); the average computational cost of finding each design corresponds to 40 EM simulations of the antenna structure (~500 min of CPU time). The deviation vector is $d$ = [0.5 0.5 0.5 1.0 1.5 1.5 1.0]$^T$. The volume-wise reduction of the design space is over three orders of magnitude. The number of training samples is 800. Table 4.2 shows the modeling accuracy of the proposed scheme, which is also compared to conventional sampling in the entire space. The minimum number of samples required to achieve accuracy comparable (within the constrained space) to that of the conventional model is only 100 (cf. Table 4.2). Figure 4.8 shows the responses of the PIFA at the selected test designs. As an application example, the antenna was optimized for the operating frequencies of 1.85 GHz, 2.45 GHz, and 2.85 GHz (300 MHz bandwidth required in all cases). The EM-simulated antenna responses at the optimized designs are shown in Fig. 4.9.
Fig. 4.7 PIFA geometry: (a) top and side view, substrate shown transparent; (b) perspective view (Koziel 2017)
Table 4.2 Modeling results of PIFA
Surrogate modeling technique | Relative RMS error
Kriging interpolation (uniform sampling in the original space, 800 samples) | 11.2%
Kriging interpolation (sampling in the constrained space, 800 samples) | 3.6%
Kriging interpolation (sampling in the constrained space, 100 samples) | 10.9%
Fig. 4.8 PIFA responses at the selected test designs: high-fidelity EM model (—), proposed surrogate model (o) (Koziel 2017)
Fig. 4.9 EM-simulated PIFA responses at the designs obtained by optimizing the proposed surrogate model for operating frequencies of 2.85 GHz (), 2.45 GHz (- - -), and 1.85 GHz (—) (150 MHz bandwidth required in each case) (Koziel 2017)
4.2 Design-Oriented Constrained Modeling for Operating Frequency and Substrate Parameters
In this section, constrained modeling of antenna structures with respect to both operating conditions (center frequency) and material parameters (relative permittivity of the dielectric substrate) is considered (Koziel and Bekasiewicz 2017). This is a slight generalization of the concept introduced in Sect. 4.1, which will be further generalized in the subsequent chapters of the book.
4.2.1 Modeling Procedure
For the purpose of presentation, the operating frequency $f$ and the relative dielectric permittivity $\varepsilon_r$ of the substrate are considered as the operating condition and material parameter of interest, respectively. The surrogate model is to be reliable for the range of operating frequencies $f_{\min} \le f \le f_{\max}$ and the range of permittivity $\varepsilon_{\min} \le \varepsilon_r \le \varepsilon_{\max}$. Let $f(x)$ represent a response of an EM-simulated antenna model, where $x$ is a vector of antenna parameters (in general, both geometry and material). The symbol $x^*(f, \varepsilon_r)$ will denote the design optimized for the operating frequency $f$ and the substrate dielectric permittivity $\varepsilon_r$. The region of the surrogate model validity is defined as a vicinity of the manifold spanned by nine reference designs covering the aforementioned ranges of the operating frequency and permittivity. These are $x^*(f^{\#}, \varepsilon_r^{\#})$ for all combinations of $f^{\#} \in \{f_{\min}, f_0, f_{\max}\}$ and $\varepsilon_r^{\#} \in \{\varepsilon_{\min}, \varepsilon_{r0}, \varepsilon_{\max}\}$, cf. Fig. 4.10.
Fig. 4.10 Reference designs: (a) distribution on the $f$/$\varepsilon$ plane and (b) designs allocated in a three-dimensional space. The shaded area is a manifold that determines the region of interest for surrogate model construction (Koziel and Bekasiewicz 2017)
Fig. 4.11 Auxiliary components of the region of validity of the surrogate model: (a) the manifold of Fig. 4.10b with the spanning vectors $v_k$ marked, with $v_1 = x^*(f_{\min}, \varepsilon_{\min}) - x^*(f_0, \varepsilon_{r0})$, $v_2 = x^*(f_{\min}, \varepsilon_{r0}) - x^*(f_0, \varepsilon_{r0})$, $v_3 = x^*(f_{\min}, \varepsilon_{\max}) - x^*(f_0, \varepsilon_{r0})$, . . . , $v_8 = x^*(f_0, \varepsilon_{\min}) - x^*(f_0, \varepsilon_{r0})$; (b) manifold $M_k$ with its spanning vectors and a point $z$ and its projection $P_k(z)$ onto the hyperplane containing $M_k$ (Koziel and Bekasiewicz 2017)
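Let us define vectors $v_1 = x^*(f_{\min}, \varepsilon_{\min}) - x^*(f_0, \varepsilon_{r0})$, $v_2 = x^*(f_{\min}, \varepsilon_{r0}) - x^*(f_0, \varepsilon_{r0})$, $v_3 = x^*(f_{\min}, \varepsilon_{\max}) - x^*(f_0, \varepsilon_{r0})$, $v_4 = x^*(f_0, \varepsilon_{\max}) - x^*(f_0, \varepsilon_{r0})$, $v_5 = x^*(f_{\max}, \varepsilon_{\max}) - x^*(f_0, \varepsilon_{r0})$, $v_6 = x^*(f_{\max}, \varepsilon_{r0}) - x^*(f_0, \varepsilon_{r0})$, $v_7 = x^*(f_{\max}, \varepsilon_{\min}) - x^*(f_0, \varepsilon_{r0})$, and $v_8 = x^*(f_0, \varepsilon_{\min}) - x^*(f_0, \varepsilon_{r0})$ (see also Fig. 4.11a). In addition, let us define a manifold $M$, which is spanned by eight pairs of vectors $[v_1, v_2], [v_2, v_3], \ldots, [v_8, v_1]$, as
$$M = \bigcup_{k=1}^{8} M_k = \bigcup_{k=1}^{8} \left\{ y = x^*(f_0, \varepsilon_{r0}) + \alpha v_k + \beta v_{k+1} : \alpha, \beta \ge 0,\ \alpha + \beta \le 1 \right\}. \tag{4.2}$$
For consistency of notation, let us also define $v_9 = v_1$. Figure 4.11b shows a point $z$ and its projection $P_k(z)$ onto the hyperplane containing $M_k$. $P_k$ is defined in a conventional sense (i.e., as the point on the hyperplane that is the closest to $z$). It corresponds to the expansion coefficients w.r.t. $v_k$ and $v_{k+1}$:
$$\arg\min_{\bar{\alpha}, \bar{\beta}} \left\| z - \left[ x^*(f_0, \varepsilon_{r0}) + \bar{\alpha} v_k + \bar{\beta} v_{k+1}^{\#} \right] \right\|^2, \tag{4.3}$$
where $v_{k+1}^{\#} = v_{k+1} - p_k v_k$ with $p_k = v_k^T v_{k+1} / (v_k^T v_k)$. Thus, $v_{k+1}^{\#}$ is the component of $v_{k+1}$ that is orthogonal to $v_k$. Let us consider
$$\left[ v_k \ \ v_{k+1}^{\#} \right] \left[ \bar{\alpha} \ \ \bar{\beta} \right]^T = z - x^*(f_0, \varepsilon_{r0}). \tag{4.4}$$
The least squares solution to (4.4) (equivalent to (4.3)) is given as
$$\left[ \bar{\alpha} \ \ \bar{\beta} \right]^T = \left( V_k^T V_k \right)^{-1} V_k^T \left( z - x^*(f_0, \varepsilon_{r0}) \right), \tag{4.5}$$
where $V_k = [v_k \ \ v_{k+1}^{\#}]$. For practical reasons, more convenient are the expansion coefficients with respect to $v_k$ and $v_{k+1}$, which are given as
$$\alpha = \bar{\alpha} - p_k \bar{\beta}, \qquad \beta = \bar{\beta}. \tag{4.6}$$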
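Because $v_{k+1}^{\#}$ is orthogonal to $v_k$ by construction, the matrix $V_k^T V_k$ in (4.5) is diagonal, and the two barred coefficients decouple. The short derivation below is added for illustration; the shorthand $\Delta z = z - x^*(f_0, \varepsilon_{r0})$ is introduced here and is not part of the original notation.

```latex
\begin{aligned}
v_k^T v_{k+1}^{\#} &= v_k^T v_{k+1} - p_k\, v_k^T v_k = 0, \\
V_k^T V_k &= \operatorname{diag}\!\left( v_k^T v_k,\; v_{k+1}^{\#T} v_{k+1}^{\#} \right), \\
\bar{\alpha} &= \frac{v_k^T \Delta z}{v_k^T v_k}, \qquad
\bar{\beta} = \frac{v_{k+1}^{\#T} \Delta z}{v_{k+1}^{\#T} v_{k+1}^{\#}} .
\end{aligned}
```

Substituting these into (4.6) gives $\alpha$ and $\beta$ without forming or inverting a matrix, provided $v_k$ and $v_{k+1}$ are not collinear.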
Note that $P_k(z) \in M_k$ if and only if $\alpha \ge 0$, $\beta \ge 0$, and $\alpha + \beta \le 1$. Let us define $x_{\max} = \max\{x^*(f_0, \varepsilon_{r0}) + v_1, \ldots, x^*(f_0, \varepsilon_{r0}) + v_8\}$ and $x_{\min} = \min\{x^*(f_0, \varepsilon_{r0}) + v_1, \ldots, x^*(f_0, \varepsilon_{r0}) + v_8\}$, with the maximum and minimum taken component-wise. The vector $d_x = x_{\max} - x_{\min}$ is the range of variation of antenna geometry parameters within the manifold $M$. The surrogate model domain $X_s$ is defined as follows: a vector $y \in X_s$ if and only if:
1. The set $K(y) = \{k \in \{1, \ldots, 8\} : P_k(y) \in M_k\}$ is not empty;
2. $\min\{\|(y - P_k(y)) \oslash d_x\| : k \in K(y)\} \le d_{\max}$, where $\oslash$ denotes component-wise division ($d_{\max}$ is a user-defined parameter).
The first condition ensures that $y$ is sufficiently close to $M$ in a “horizontal” sense. In the second condition, the user-defined $d_{\max}$ is compared to the normalized distance between $y$ and its projection onto that $M_k$ to which the distance is the shortest. Due to normalization w.r.t. the parameter ranges $d_x$, $d_{\max}$ determines the “perpendicular” size of the surrogate model domain (as compared to the “tangential” size given by $d_x$). Therefore, a typical value of $d_{\max}$ would be 0.2 or so. By definition, all the reference designs and the manifold $M$ belong to $X_s$. The size of $X_s$ is dramatically smaller (volume-wise) than the size of the hypercube containing the reference designs (i.e., $x$ such that $x_{\min} \le x \le x_{\max}$). It should be mentioned that because the number of reference designs grows very quickly (exponentially) with the number of operating conditions considered (in this case, two), a practical application of the proposed approach is limited to a few operating conditions. The surrogate model is constructed using kriging interpolation of the EM model response $f$ based on the training data sampled within $X_s$ (Queipo et al. 2005). A separate kriging model is constructed for each frequency in the frequency spectrum at which the response $f$ is evaluated. The design of experiment technique is random sampling within the interval $[x_{\min}, x_{\max}]$ assuming uniform probability distribution. The samples allocated outside $X_s$ are rejected.
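The two membership conditions and the rejection sampling step can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the Euclidean norm is assumed in condition 2, the reference designs are generated from a hypothetical affine map of (f, εr) into a three-parameter space (so that the manifold is flat and the result is easy to check by hand), and all function names are invented for the example.

```python
import numpy as np

def projection_coeffs(z, x0, vk, vk1):
    """Coefficients (alpha, beta) of P_k(z) w.r.t. v_k and v_{k+1}, cf. (4.3)-(4.6)."""
    pk = (vk @ vk1) / (vk @ vk)
    vk1_perp = vk1 - pk * vk                       # part of v_{k+1} orthogonal to v_k
    Vk = np.column_stack([vk, vk1_perp])
    (a_bar, b_bar), *_ = np.linalg.lstsq(Vk, z - x0, rcond=None)   # (4.5)
    return a_bar - pk * b_bar, b_bar               # (4.6)

def in_domain(y, x0, V, dx, d_max, tol=1e-9):
    """Check conditions 1 and 2 for y; V holds v_1..v_8 (v_9 = v_1 is implied)."""
    best = np.inf
    for k in range(len(V)):
        vk, vk1 = V[k], V[(k + 1) % len(V)]
        a, b = projection_coeffs(y, x0, vk, vk1)
        if a >= -tol and b >= -tol and a + b <= 1 + tol:    # P_k(y) lies in M_k
            Pk = x0 + a * vk + b * vk1
            best = min(best, np.linalg.norm((y - Pk) / dx))  # normalized distance
    return best <= d_max                            # False if K(y) is empty

def sample_domain(n, x0, V, dx, d_max, x_min, x_max, rng, max_draws=100_000):
    """Uniform draws in [x_min, x_max]; points outside X_s are rejected."""
    out = []
    for _ in range(max_draws):
        y = rng.uniform(x_min, x_max)
        if in_domain(y, x0, V, dx, d_max):
            out.append(y)
            if len(out) == n:
                break
    return np.array(out)

# Hypothetical affine map from (f, eps_r) offsets to a three-parameter design.
x0 = np.array([10.0, 5.0, 3.0])                     # plays the role of x*(f0, eps_r0)
a_f, a_e = np.array([1.0, 0.5, 0.0]), np.array([0.0, 0.5, 1.0])
offsets = [(-2, -1.5), (-2, 0), (-2, 1.5), (0, 1.5),
           (2, 1.5), (2, 0), (2, -1.5), (0, -1.5)]  # ordering of v_1..v_8
V = [df * a_f + de * a_e for df, de in offsets]
designs = x0 + np.array(V)
x_min, x_max = designs.min(axis=0), designs.max(axis=0)
dx = x_max - x_min
normal = np.cross(a_f, a_e)                         # direction perpendicular to M
p_on = x0 + 0.3 * V[0] + 0.3 * V[1]                 # a point on M_1
print(in_domain(p_on + 0.3 * normal, x0, V, dx, 0.2),
      in_domain(p_on + 1.0 * normal, x0, V, dx, 0.2))
samples = sample_domain(10, x0, V, dx, 0.2, x_min, x_max, np.random.default_rng(0))
```

In this flat example the normalized distances of the two probe points are roughly 0.11 and 0.35, so with dmax = 0.2 the first is accepted and the second is rejected.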
4.2.2 Case Study: Ring Slot Antenna
The modeling technique is demonstrated using a ring slot antenna shown in Fig. 4.12 (Sim et al. 2014). The structure comprises a microstrip line that feeds a circular ground plane slot with a defected ground structure (DGS). The low-pass properties of the DGS allow for suppression of the antenna harmonic frequencies. The thickness and loss tangent of the substrate are 0.762 mm and 0.0018, respectively. The parameter set is $x = [l_f\ l_d\ w_d\ r\ s\ s_d\ o\ g\ \varepsilon_r]^T$; $\varepsilon_r$ represents the relative permittivity of the substrate. The feed line width $w_f$ is computed for each $\varepsilon_r$ to ensure 50 ohm input impedance. The computational model of the antenna is implemented in CST (~300,000 cells, simulation time 90 s).
The modeling problem is already difficult due to a large number of parameters. To make it even more challenging, a wide range of operating frequencies of interest, $f_{\min} = 2.5$ GHz to $f_{\max} = 6.5$ GHz, and a wide range of substrate permittivity, $\varepsilon_{\min} = 2.0$ to $\varepsilon_{\max} = 5.0$, were assumed. The reference designs have been obtained by optimizing the structure of Fig. 4.12 for all combinations of $f \in \{2.5, 4.5, 6.5\}$ GHz and $\varepsilon_r \in \{2.0, 3.5, 5.0\}$ using feature-based optimization (FBO) (Koziel 2015). Optimization is understood as minimizing the antenna reflection at the requested operating frequency. In general, in case of possible nonuniqueness of the optimization result, regularization can be used, e.g., by introducing a penalty factor that enforces extension of the antenna bandwidth. The cost of FBO was 40–50 antenna simulations (per design). Antenna responses at all nine reference designs are shown in Fig. 4.13.
The modeling approach has been verified for $d_{\max} = 0.2$ by setting up the kriging surrogate with 100, 200, 500, and 1000 random samples. The test set contained 100 random points. For benchmarking, the kriging model was also constructed using 1000 training points allocated in a conventional (unconstrained) manner. Table 4.3 shows the average RMS errors for all considered models. Selected two-dimensional projections of the training sets for uniform and constrained sampling are shown in Fig. 4.14. Constrained sampling improves the predictive power of the surrogate about 3.5-fold for the same number of training samples (2.1% versus 7.3% for $N = 1000$, cf. Table 4.3). At the same time, comparable modeling error is achieved with a tenfold reduction of the number of training samples (7.8% for $N = 100$). Figure 4.15 shows the surrogate and EM model responses at the selected test designs.
Fig. 4.12 Geometry of the ring slot antenna with a microstrip feed (dashed line) (Sim et al. 2014)
Fig. 4.13 Reflection responses of the antenna of Fig. 4.12 for nine reference designs: (a) $\varepsilon_r = 2.0$, (b) $\varepsilon_r = 3.5$, (c) $\varepsilon_r = 5.0$; () $f_0 = 2.5$ GHz, (- - -) $f_0 = 4.5$ GHz, (—) $f_0 = 6.5$ GHz
Table 4.3 Ring slot antenna: modeling results
Design space sampling and surrogate modeling technique (a) | Average relative RMS error
Uniform sampling in the original space, N = 1000 | 7.3%
Constrained sampling, N = 100 | 7.8%
Constrained sampling, N = 200 | 5.5%
Constrained sampling, N = 500 | 3.3%
Constrained sampling, N = 1000 | 2.1%
(a) In all cases, the surrogate model is constructed using kriging (cf. Sect. 4.1.2)
Fig. 4.14 Uniform versus constrained sampling for selected two-dimensional projections onto (a) $l_f$–$w_d$ plane, (b) $l_f$–$s_d$ plane, (c) $s$–$o$ plane, and (d) $o$–$\varepsilon_r$ plane (Koziel and Bekasiewicz 2017)
Fig. 4.15 Responses of the ring slot antenna of Fig. 4.12 at the selected test designs for $N = 1000$: high-fidelity EM model (—), proposed surrogate model (o) (Koziel and Bekasiewicz 2017)
4.3 Constrained Feature-Based Modeling of Compact Microwave Structures
4.2.3
143
Application Examples and Experimental Validation
To demonstrate practical application of the proposed modeling approach, the antenna of Fig. 4.12 was designed—by optimizing the devised surrogate—for various substrate permittivity and operating frequencies (see Table 4.4). Figure 4.16 shows the optimization results for the designs of Table 4.4. An excellent agreement between the surrogate and EM model can be observed. Also, the antenna responses are well centered at the requested operating frequencies. The antenna designs with εr ¼ 2.2 and f0 ¼ 3.4 GHz, as well as with εr ¼ 3.5 and f0 ¼ 5.8 GHz, have been fabricated and measured. Figure 4.17 shows the photographs of the manufactured prototypes. The agreement between the simulated and measured characteristics is very good as indicated in Fig. 4.18. Slight discrepancies between the responses are due to electrically large measurement setup which was not accounted for in the EM simulation model.
4.3
Constrained Feature-Based Modeling of Compact Microwave Structures
In this section, a technique of Sect. 4.1 is combined with response features as well as nonuniform sampling approach in order to construct a low-cost design-oriented surrogate of a miniaturized microwave coupler (Koziel and Bekasiewicz 2016).
4.3.1
Case Study. RRC and Response Features
Table 4.4 Ring slot antenna: optimized antenna designs
εr    f0 [GHz]   Antenna dimensions
                 lf      ld     wd     r       s      sd     o      g
2.2   3.4        23.55   5.70   1.08   12.85   5.21   3.33   5.47   0.96
2.6   4.8        22.04   5.04   0.23   9.93    3.34   4.75   5.80   1.20
4.3   3.75       24.06   5.33   0.60   10.97   3.91   4.12   5.42   0.87
4.1   5.3        22.31   4.58   0.23   8.58    3.53   5.04   5.30   2.00
2.6   5.8        22.02   4.39   0.51   7.66    3.31   4.39   5.18   1.93
3.5   5.8        22.14   4.19   0.40   7.66    3.06   4.39   4.78   2.03

The modeling concept will be explained and demonstrated using the example of a compact equal-split rat-race coupler (RRC) composed of two vertical and four horizontal slow-wave resonant structures (Bekasiewicz et al. 2015), as shown in Fig. 4.19. The RRC is implemented on a Taconic RF-35 dielectric substrate (h = 0.762 mm, εr = 3.5, tanδ = 0.0018). The design variables are
Fig. 4.16 Surrogate (dashed line) and EM-simulated responses (solid line) of the ring slot antenna of Fig. 4.12 at the designs obtained by optimizing the proposed surrogate model for (a) εr = 2.2 and f0 = 3.4 GHz, (b) εr = 2.6 and f0 = 4.8 GHz, (c) εr = 4.3 and f0 = 3.75 GHz, (d) εr = 4.1 and f0 = 5.3 GHz, (e) εr = 2.6 and f0 = 5.8 GHz, and (f) εr = 3.5 and f0 = 5.8 GHz. Requested operating frequencies marked using vertical lines (Koziel and Bekasiewicz 2017)
Fig. 4.17 Antenna prototypes: εr = 2.2, f0 = 3.4 GHz (left) and εr = 2.6, f0 = 5.8 GHz (right) (Koziel and Bekasiewicz 2017)
Fig. 4.18 Simulated (gray) and measured (black) characteristics of the antennas of Fig. 4.17. Solid and dashed lines in radiation pattern plots represent H- and E-plane responses, respectively (Koziel and Bekasiewicz 2017)
x = [l1 l2 l3 w1 l4 l5 l6]T, whereas the dimension w0 = 1.7 is kept constant in order to ensure 50 ohm input impedance. The dependent variables are w2 = w1, w3 = 20w1 + 19l1, and w4 = 6w2 + 7l5. All dimensions are in mm. The RRC is implemented in CST Microwave Studio (CST 2018) and simulated using its frequency domain solver with ~800,000 mesh cells. The simulation time is about 75 min. Thus, the considered coupler is a representative example of a topologically complex structure with an expensive computational model. A typical response of the RRC of Fig. 4.19 is shown in Fig. 4.20. The responses correspond to the design optimized for bandwidth at an operating frequency of around 0.8 GHz. It can be observed that the S-parameters are highly nonlinear functions of frequency and therefore difficult to model. At the same time, only a few
Fig. 4.19 Microstrip rat-race coupler constructed of slow-wave resonant structures: geometry (Bekasiewicz et al. 2015)
Fig. 4.20 Compact RRC: typical responses of the structure tuned to around 0.8 GHz operating frequency and the characteristic points as described in the text
points extracted from the response are necessary for design purposes, specifically to determine the figures of interest such as the operating frequency of the circuit, the power split error, or the 20 dB bandwidth. These points include the frequencies and levels of the minima of |S11|, |S31|, and |S41|, the maximum of |S21|, as well as the points corresponding to the −20 dB level of |S11| and |S41|, all shown in Fig. 4.20. As indicated in Fig. 4.21, the dependence of these characteristic point coordinates on the geometry parameters of the coupler is only slightly nonlinear and therefore easier to model. Thus, we choose to model only the characteristic points rather than the entire responses, as long as the figures of interest the coupler is designed for can be restored from these points.
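The extraction of characteristic points from a sampled response can be sketched as follows. This is only an illustration: the function names, the synthetic parabolic response, and the linear interpolation of the −20 dB crossings are assumptions made here, not the procedure of the cited work.

```python
import numpy as np

def _cross(freq, s_db, a, b, level):
    """Linearly interpolate the frequency at which the response crosses `level`
    between samples a and b (assumes s_db[a] != s_db[b])."""
    t = (level - s_db[a]) / (s_db[b] - s_db[a])
    return float(freq[a] + t * (freq[b] - freq[a]))

def characteristic_points(freq, s_db, level=-20.0):
    """Return (f_min, s_min, f_lo, f_hi): location and level of the minimum of a
    dB response, and the two `level` crossings that bracket this minimum.
    The crossings are None if the minimum does not reach `level`."""
    i_min = int(np.argmin(s_db))
    f_min, s_min = float(freq[i_min]), float(s_db[i_min])
    if s_min > level:
        return f_min, s_min, None, None
    i = i_min                      # walk left while the response stays below level
    while i > 0 and s_db[i - 1] <= level:
        i -= 1
    f_lo = _cross(freq, s_db, i - 1, i, level) if i > 0 else float(freq[0])
    j = i_min                      # walk right while the response stays below level
    while j < len(freq) - 1 and s_db[j + 1] <= level:
        j += 1
    f_hi = _cross(freq, s_db, j, j + 1, level) if j < len(freq) - 1 else float(freq[-1])
    return f_min, s_min, f_lo, f_hi

# Synthetic |S11|-like response (hypothetical): a notch of -40 dB at 0.8 GHz
f = np.linspace(0.5, 1.5, 1001)
s11 = np.minimum(0.0, -40.0 + 1000.0 * (f - 0.8) ** 2)
print(characteristic_points(f, s11))
```

Each modeled response then contributes a handful of numbers (a frequency, a level, and two band edges) instead of a full frequency sweep, which is what makes the subsequent modeling problem nearly linear in the geometry parameters.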
4.3.2
Modeling Methodology
In the case of high-frequency structures, including compact microwave circuits, the design variables and their changes have to be well correlated in order to produce designs that are acceptable with respect to typical performance specifications
Fig. 4.21 Compact RRC: dependence of the selected characteristic points on geometry parameters l1 and l2 of the structure: (a) frequency of |S11| minimum, (b) level of |S21| maximum (corresponding to the circuit operating frequency) (Koziel and Bekasiewicz 2016)
Fig. 4.22 Dependence of geometry parameters (here, l1, l2, and l5) on the operating frequency f0 of the RRC (here, for bandwidth-optimized designs) (Koziel and Bekasiewicz 2016)
(cf. Fig. 4.22). This means that uniform sampling across an interval-type domain (Koziel et al. 2018; Koziel 2017) leads to a situation where the majority of the samples are useless and the resources used to acquire the corresponding EM data are essentially wasted. Similarly, sequential sampling schemes (Crombecq et al. 2011; Liu et al. 2016) are not of much help either, because the infill criteria are normally based on model accuracy rather than on the quality of the samples from the design requirement standpoint. Here, a modeling procedure similar to that of Sect. 4.1 is adopted, enhanced by nonlinear domain scaling and the use of response features (Koziel and Bekasiewicz 2016). Let x(1), ..., x(K) be the set of reference (e.g., bandwidth-optimized) designs of the structure at hand that correspond to a range of operating frequencies of interest for modeling purposes. Furthermore, let d = (u − l)/M be a deviation vector, where l = min{x(1), ..., x(K)} and u = max{x(1), ..., x(K)} are the lower and upper bounds for the design variables (the minimum and maximum are taken component-wise).
Fig. 4.23 Design space sampling: (a) uniform sampling in the entire space, (b) uniform sampling restricted to the region Xs, and (c) nonuniform sampling in Xs. The solid line denotes the piecewise linear path connecting the reference designs (Koziel and Bekasiewicz 2016)
The samples are only allocated in the part of the space, Xs, that is within the distance d from the piecewise linear path connecting the reference designs, i.e., v(k) = αx(k) + (1 − α)x(k+1), 0 ≤ α ≤ 1, k = 1, ..., K − 1. In this case, K = 3 and M = 5 have been assumed (neither is critical, but a lower K reduces the cost of reference design acquisition, cf. Sect. 4.1).

Because of the nonlinear relationship between the circuit's operating frequency and its dimensions (cf. Fig. 4.22), uniform sampling in the constrained domain, as described in the previous paragraph, would result in the majority of samples corresponding to lower operating frequencies. From the point of view of design-oriented modeling, we are more interested in obtaining a training set in which all operating frequencies are represented in a relatively uniform fashion. Toward this end, the following nonuniform probability distribution is employed:

\[ x_{\mathrm{tmp}} = l + (u - l) \circ r^{\rho} \tag{4.7} \]

where xtmp is a candidate sample (accepted if it is in Xs and rejected otherwise), r is a vector of random numbers uniformly distributed in the interval [0,1], and ρ = [ρ1 ... ρn] is a scaling vector with ρk > 1 (typically between 1.2 and 3). Here, the multiplication ∘ and the power r^ρ are understood as component-wise operations. Setting ρk > 1 implies that designs corresponding to higher values of the operating frequency will be better represented in the training pool. The specific values of the coefficients can be estimated based on the reference designs (details are omitted here for the sake of brevity).

Figure 4.23 shows examples of uniform sampling in the entire design space, uniform sampling restricted to Xs, and nonuniform sampling as described above. The plots are for the RRC of Fig. 4.19 and correspond to the 2D projections onto the l1–l2 plane. Overall, both space restriction and nonuniform sampling lead to a considerable reduction of the number of samples required for the surrogate model setup, as demonstrated in Sect. 4.3.3.
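The sampling procedure of this section can be sketched in a few lines. The sketch below is an illustration under stated assumptions: the membership rule (a candidate is accepted if some point v of the path satisfies |x − v| ≤ d component-wise) is one possible reading of "within the distance d", and the two-parameter reference designs and scaling coefficients are hypothetical values chosen only to make the example runnable.

```python
import numpy as np

def in_constrained_domain(x, refs, d):
    """True if some point v on the piecewise linear path through `refs`
    satisfies |x - v| <= d component-wise (assumed membership rule)."""
    for a, b in zip(refs[:-1], refs[1:]):
        lo, hi = 0.0, 1.0                  # admissible range of the path parameter t
        c = b - a                          # segment direction: v(t) = a + t * c
        for xi, ai, ci, di in zip(x, a, c, d):
            if ci == 0.0:
                if abs(xi - ai) > di:
                    lo, hi = 1.0, 0.0      # this segment cannot contain a match
                    break
                continue
            t1, t2 = (xi - ai - di) / ci, (xi - ai + di) / ci
            lo, hi = max(lo, min(t1, t2)), min(hi, max(t1, t2))
            if lo > hi:
                break
        if lo <= hi:
            return True
    return False

def sample_nonuniform(refs, M, rho, n_samples, rng):
    """Rejection sampling of Xs using candidates generated according to Eq. (4.7)."""
    refs = np.asarray(refs, dtype=float)
    l, u = refs.min(axis=0), refs.max(axis=0)
    d = (u - l) / M
    accepted = []
    while len(accepted) < n_samples:
        r = rng.random(l.size)
        x_tmp = l + (u - l) * r ** rho     # Eq. (4.7), component-wise
        if in_constrained_domain(x_tmp, refs, d):
            accepted.append(x_tmp)
    return np.array(accepted), l, u, d

# Hypothetical reference designs (K = 3) in a two-parameter space
refs = np.array([[0.60, 4.2], [0.30, 3.0], [0.11, 2.4]])
X, l, u, d = sample_nonuniform(refs, M=5, rho=np.array([1.4, 1.2]),
                               n_samples=50, rng=np.random.default_rng(0))
print(X.shape, X.mean(axis=0))
```

Since the expected value of r^ρ for a uniform r is 1/(1 + ρ), any ρk > 1 shifts the candidates toward the lower bound lk, i.e., toward smaller dimensions and hence higher operating frequencies.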
Table 4.5 Modeling results of compact microstrip rat-race coupler
Surrogate modeling technique                                          Relative error
Kriging interpolation (uniform sampling in original space)            38.0%
Kriging interpolation (nonuniform sampling in constrained space)      17.6%
Feature-based modeling (nonuniform sampling in constrained space)     2.1%
The surrogate model is constructed using kriging interpolation (Queipo et al. 2005). As mentioned before, only the characteristic points (as described in Sect. 4.3.1) are being modeled. This means that the surrogate cannot be used to restore the entire coupler response upon evaluating the models. However, it contains sufficient information to carry out the design optimization process (in particular, information about the center frequency, corresponding levels of matching and isolation, bandwidth, as well as power split).
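To illustrate how design-relevant quantities may be restored from the modeled characteristic points alone, consider the following sketch. The dictionary keys, the definition of the bandwidth as the overlap of the −20 dB bands of |S11| and |S41|, and the numerical values are all hypothetical choices made for this example; they are not taken from the cited work.

```python
def figures_of_interest(p):
    """Restore figures of interest from characteristic points.

    `p` is a dict of characteristic points (frequencies in GHz, levels in dB).
    All key names are hypothetical and used only for this illustration.
    """
    # Common -20 dB band of matching |S11| and isolation |S41|
    f_lo = max(p["f_lo_s11"], p["f_lo_s41"])
    f_hi = min(p["f_hi_s11"], p["f_hi_s41"])
    bandwidth = max(0.0, f_hi - f_lo)
    # Power split error: difference of the transmission levels (in dB)
    split_error = abs(p["s21_level"] - p["s31_level"])
    # Worst of the two minima, to be compared against the -20 dB requirement
    worst_min = max(p["s11_min_level"], p["s41_min_level"])
    return {"bandwidth": bandwidth, "split_error": split_error, "worst_min": worst_min}

features = {
    "f_lo_s11": 0.84, "f_hi_s11": 0.97,   # -20 dB crossings of |S11|
    "f_lo_s41": 0.83, "f_hi_s41": 0.96,   # -20 dB crossings of |S41|
    "s21_level": -3.1, "s31_level": -3.0, # transmission levels at the operating frequency
    "s11_min_level": -32.0, "s41_min_level": -28.5,
}
print(figures_of_interest(features))
```

An optimizer working with the surrogate would evaluate such quantities directly from the predicted characteristic points, without reconstructing the full S-parameter curves.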
4.3.3
Numerical Verification and Application Case Studies
The training set for constructing the surrogate model contains 1000 samples allocated according to the procedure of Sects. 3.2 and 3.3 with l = [0.11 2.4 2.0 0.18 0.25 3.8 3.8]T and u = [0.6 4.2 3.4 0.28 0.68 11.5 8.3]T. The resulting region contains the coupler designs corresponding to a wide range of operating frequencies, from 0.5 to 2.0 GHz. The scaling vector ρ determining the nonuniformity of data sampling (cf. Sect. 4.3.2) is ρ = [1.4 1.2 1.6 1.1 2.2 1.3 1.5]T. Table 4.5 shows the relative least squares error ||s(x) − f(x)||/||f(x)|| (averaged over 100 random test designs; s stands for the surrogate model) for the surrogate model described in this section and for the standard kriging interpolation of the S-parameter responses. It can be observed that the sampling scheme used here has a major impact on the model accuracy and that the proposed model outperforms the conventional one in terms of predictive power.

As explained earlier, the main purpose of the surrogate model considered here is to expedite the design process. For the sake of illustration, the surrogate model was used to design the RRC of Fig. 4.19 for two different scenarios: (i) widening the 20 dB bandwidth for matching |S11| and isolation |S41| and (ii) reducing the coupler size while maintaining the minimum levels of |S11| and |S41| below −20 dB at the operating frequency. In both cases, equal power split (i.e., |S21| = |S31|) at the operating frequency is also requested. Figures 4.24, 4.25, and 4.26 show the designs obtained by optimizing the surrogate model under the above scenarios for three different operating frequencies of 0.9 GHz, 1.1 GHz, and 1.5 GHz, respectively. Note that all the designs are obtained by direct optimization of the surrogate model with no further corrections necessary.
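The error measure reported in Table 4.5 can be computed as sketched below. The function name and the array layout (one row per test design) are assumptions of this example, and the two-design data set is synthetic.

```python
import numpy as np

def mean_relative_error(surrogate_out, em_out):
    """Average of ||s(x) - f(x)|| / ||f(x)|| over the test designs.

    Both inputs have shape (n_test, n_outputs); each row holds the response
    (or the vector of characteristic points) at one test design.
    """
    s = np.asarray(surrogate_out)
    f = np.asarray(em_out)
    err = np.linalg.norm(s - f, axis=1) / np.linalg.norm(f, axis=1)
    return float(err.mean())

# Synthetic data: the first design is predicted exactly, the second with 50% error
f_em = np.array([[3.0, 4.0], [0.0, 2.0]])
s_sur = np.array([[3.0, 4.0], [0.0, 1.0]])
print(mean_relative_error(s_sur, f_em))  # 0.25
```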
Fig. 4.24 RRC designs obtained using the feature-based surrogate model for the operating frequency 0.9 GHz: (a) bandwidth enhancement (BW = 137 MHz, size 520 mm²) and (b) size minimization (BW = 67 MHz, size 482 mm²). |S11|, |S21|, |S31|, and |S41| are marked using (–––), (– –), (– ∙ –), and (∙ ∙ ∙), respectively (Koziel and Bekasiewicz 2016)
Fig. 4.25 RRC designs obtained using the feature-based surrogate model for the operating frequency 1.1 GHz: (a) bandwidth enhancement (BW = 231 MHz, size 407 mm²) and (b) size minimization (BW = 75 MHz, size 378 mm²). |S11|, |S21|, |S31|, and |S41| are marked using (–––), (– –), (– ∙ –), and (∙ ∙ ∙), respectively (Koziel and Bekasiewicz 2016)
Fig. 4.26 RRC designs obtained using the feature-based surrogate model for the operating frequency 1.5 GHz: (a) bandwidth enhancement (BW = 275 MHz, size 299 mm²) and (b) size minimization (BW = 136 MHz, size 267 mm²). |S11|, |S21|, |S31|, and |S41| are marked using (–––), (– –), (– ∙ –), and (∙ ∙ ∙), respectively (Koziel and Bekasiewicz 2016)
References
Bandler, J. W., Koziel, S., & Madsen, K. (2008). Editorial—Surrogate modeling and space mapping for engineering optimization. Optimization and Engineering, 9(4), 307–310.
Baur, U., Benner, P., & Feng, L. (2014). Model order reduction for linear and nonlinear systems: A system-theoretic perspective. Archives of Computational Methods in Engineering, 21(4), 331–358.
Bekasiewicz, A., Koziel, S., & Pankiewicz, B. (2015). Accelerated simulation-driven design optimization of compact couplers by means of two-level space mapping. IET Microwaves, Antennas and Propagation, 9(7), 618–626.
Couckuyt, I., Declercq, F., Dhaene, T., Rogier, H., & Knockaert, L. (2010). Surrogate-based infill optimization applied to electromagnetic problems. International Journal of RF and Microwave Computer-Aided Engineering, 20(5), 492–501.
Crombecq, K., Laermans, E., & Dhaene, T. (2011). Efficient space-filling and non-collapsing sequential design strategies for simulation-based modeling. European Journal of Operational Research, 214(3), 683–696.
CST Microwave Studio. (2018). CST AG, Bad Nauheimer Str. 19, D-64289 Darmstadt, Germany.
Foo, J., & Karniadakis, G. E. (2010). Multi-element probabilistic collocation method in high dimensions. Journal of Computational Physics, 229(5), 1536–1557.
Koziel, S. (2015). Fast simulation-driven antenna design using response-feature surrogates. International Journal of RF and Microwave Computer-Aided Engineering, 25(5), 394–402.
Koziel, S. (2017). Low-cost data-driven surrogate modeling of antenna structures by constrained sampling. IEEE Antennas and Wireless Propagation Letters, 16, 461–464.
Koziel, S., & Bandler, J. W. (2015). Rapid yield estimation and optimization of microwave structures exploiting feature-based statistical analysis. IEEE Transactions on Microwave Theory and Techniques, 63(1), 107–114.
Koziel, S., & Bekasiewicz, A. (2015). Fast EM-driven size reduction of antenna structures by means of adjoint sensitivities and trust regions. IEEE Antennas and Wireless Propagation Letters, 14, 1681–1684.
Koziel, S., & Bekasiewicz, A. (2016). Accurate design-oriented simulation-driven modeling of miniaturized microwave structures. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 29(6), 1028–1035.
Koziel, S., & Bekasiewicz, A. (2017). On reduced-cost design-oriented constrained surrogate modeling of antenna structures. IEEE Antennas and Wireless Propagation Letters, 16, 1618–1621.
Koziel, S., & Ogurtsov, S. (2014). Antenna design by simulation-driven optimization. Berlin: Springer.
Koziel, S., Sigurðsson, A. T., & Szczepanski, S. (2018). Uniform sampling in constrained domains for low-cost surrogate modeling of antenna input characteristics. IEEE Antennas and Wireless Propagation Letters, 17(1), 164–167.
Liu, Z., Yang, M., & Li, W. (2016). A sequential Latin hypercube sampling method for metamodeling. In L. Zhang, X. Song, & Y. Wu (Eds.), Theory, methodology, tools and applications for modeling and simulation of complex systems (AsiaSim 2016, Communication in Computer and Information Science) (Vol. 643, pp. 176–185). New York: Springer.
Queipo, N. V., Haftka, R. T., Shyy, W., Goel, T., Vaidyanathan, R., & Tucker, P. K. (2005). Surrogate-based analysis and optimization. Progress in Aerospace Sciences, 41(1), 1–28.
Sim, C. Y. D., Chang, M. H., & Chen, B. Y. (2014). Microstrip-fed ring slot antenna design with wideband harmonic suppression. IEEE Transactions on Antennas and Propagation, 62(9), 4828–4832.
Tropp, J. A. (2004). Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10), 2231–2242.
Volakis, J. L. (Ed.). (2007). Antenna engineering handbook. New York: McGraw-Hill.
Chapter 5
Triangulation-Based Constrained Modeling
Design of high-frequency structures, as mentioned on various occasions in this book, is heavily based on full-wave electromagnetic (EM) simulation tools. They provide accuracy but are CPU-intensive. Surrogate modeling offers a way to reduce the computational cost of EM-driven design procedures. Unfortunately, standard modeling techniques are unable to ensure sufficient predictive power for real-world high-frequency structures featuring multiple parameters, wide parameter ranges, and highly nonlinear responses. Chapter 4 introduced the concept of performance-driven modeling, where the process of constructing the surrogate is restricted to a particular region of the parameter space. More specifically, we were interested in a subset containing the designs that were of sufficient quality from the point of view of the considered figures of interest, such as operating frequencies or bandwidth, as well as material parameters, e.g., substrate permittivity. Because the parameter sets corresponding to such quality designs are highly correlated, the volume of the resulting model domain is typically dramatically smaller than that of the original (interval-like) set. Consequently, considerable savings can be achieved in terms of the number of training data samples necessary to set up a reliable surrogate. Chapter 4 described several basic ways of implementing the above concept for one and two figures of interest. In this chapter, a more systematic approach is discussed, also following the constrained modeling concept, but allowing arbitrary allocation of the reference designs as well as, in principle, an arbitrary number of performance figures (Koziel and Sigurðsson 2018a; Koziel et al. 2018). Here, the surrogate model domain is spanned by the simplexes obtained by triangulating the reference designs, further extended into their orthogonal complements.
As before, restricting the model domain permits dramatic reduction of the number of training data samples necessary to build a reliable model, in comparison to the conventional approach. The discussed modeling framework is demonstrated using several examples. Comprehensive benchmarking and application studies and experimental verification are also provided.
© Springer Nature Switzerland AG 2020 S. Koziel, A. Pietrenko-Dabrowska, Performance-Driven Surrogate Modeling of High-Frequency Structures, https://doi.org/10.1007/978-3-030-38926-0_5
5.1
Reference Designs
Let us use the symbols Fk, k = 1, ..., N, to denote the figures of interest that are to be considered in the design process. Typical examples include the operating frequency of the structure, the 10 dB bandwidth, the relative permittivity or height of the substrate material, etc. The surrogate model will be constructed in the region spanned by the reference designs x(j), j = 1, ..., p, which are optimized for selected values of the figures of interest F(j) = [F1(j) ... FN(j)]. The reference designs can be obtained specifically for the purpose of building the model or be available beforehand (e.g., from previous design cases of the structure at hand). The latter is of particular interest because it allows us to reuse already existing designs. Also, it should be noted that the reference designs do not have to be exactly optimal because the surrogate model domain is spanned by these designs but is not restricted to them (as explained later, they are allocated in the domain interior).

Given the reference designs, a set of simplexes is created as elementary cells used in the model domain definition. Assignment of the reference designs to the simplexes is realized using Delaunay triangulation (Borouchaki et al. 1996), which keeps the angles of the simplexes as large as possible. The set of vertices of the simplex S(k), k = 1, ..., NS, is denoted as S(k) = {x(k.1), ..., x(k.N+1)}, in which x(k.j) ∈ {x(1), ..., x(p)}, j = 1, ..., N + 1, are individual vertices. In other words, the vertices x(k.j) of the kth simplex are certain reference designs. A particular selection is an outcome of the triangulation process. The concept of reference designs, triangulation, and forming the simplexes is illustrated in Fig. 5.1. In the example given, simplex S(1) is composed of the reference designs x(1), x(2), and x(4), i.e., we have x(1.1) = x(1), x(1.2) = x(2), and x(1.3) = x(4).
5.2
Surrogate Model Domain Definition
In this section, a mathematical formalism is introduced that allows us to define the surrogate model domain XS using the reference designs described in Sect. 5.1. The domain is determined as a vicinity of the manifold M being the union of the simplexes S(k), i.e. (Koziel and Sigurðsson 2018a),

\[ M = \bigcup_{k} \left\{ y = \sum_{j=1}^{N+1} \alpha_j x^{(k.j)} \; : \; 0 \le \alpha_j \le 1, \ \sum_{j=1}^{N+1} \alpha_j = 1 \right\} \tag{5.1} \]
The vicinity is determined by the distance from M in the orthogonal complements of the subspaces (or hyperplanes) containing the simplexes S(k). In particular, in order to determine whether a given point z is within the model domain or not, one needs to find the distance between z and the manifold M in the sense highlighted above. In the remaining part of this section, a rigorous definition of the domain XS is given.
Fig. 5.1 Conceptual illustration of the reference designs and their triangulation. Reference designs in an example three-dimensional design space (axes x1, x2, x3) shown in the left panel; figures of interest vectors corresponding to the reference designs plotted in the two-dimensional feature space (axes F1, F2), as well as their triangulation, shown in the right panel. In the considered case, there are nine simplexes formed by the reference designs: S(1) = [x(1) x(2) x(4)], S(2) = [x(1) x(3) x(4)], S(3) = [x(2) x(5) x(6)], S(4) = [x(2) x(5) x(7)], S(5) = [x(2) x(4) x(6)], S(6) = [x(4) x(6) x(7)], S(7) = [x(5) x(6) x(8)], S(8) = [x(6) x(7) x(8)], S(9) = [x(5) x(8) x(9)]. Referring to the notation used in Sect. 5.1, here, we have N = 2 (two figures of interest), p = 9 (nine reference designs), and NS = 9 (nine simplexes) (Koziel and Sigurðsson 2018a)
Fig. 5.2 Example simplex S(k) with its anchor x(0) and the spanning vectors v(1) and v(2) as well as a point z and its projection Pk(z) onto the hyperplane Hk containing S(k) (Koziel and Sigurðsson 2018a)
As a first step, let us consider a projection Pk(z) of a point z onto the hyperplane Hk containing the simplex S(k). The projection is defined as the point of the hyperplane that is closest to z. For convenience of notation, let us define the simplex anchor x(0) = x(k.1) and the spanning vectors v(j) = x(k.j+1) − x(0), j = 1, ..., N. These terms are illustrated in Fig. 5.2. The projection corresponds to the following expansion coefficients with respect to the vectors $\bar{v}^{(j)}$ (Koziel and Sigurðsson 2018a):

\[ \arg\min_{[\bar{\alpha}^{(1)}, \ldots, \bar{\alpha}^{(N)}]} \left\| z - \left[ x^{(0)} + \sum_{j=1}^{N} \bar{\alpha}^{(j)} \bar{v}^{(j)} \right] \right\|^{2} \tag{5.2} \]

where the vectors $\bar{v}^{(j)}$ are obtained from v(j) by orthogonalization (i.e., $\bar{v}^{(1)} = v^{(1)}$, $\bar{v}^{(2)} = v^{(2)} - a_{12}\bar{v}^{(1)}$, where $a_{12} = v^{(1)T}v^{(2)} / (v^{(1)T}v^{(1)})$, etc.). In general, we have

\[ \bar{V} = \left[ \bar{v}^{(1)} \ \bar{v}^{(2)} \ \ldots \ \bar{v}^{(N)} \right] = \left[ v^{(1)} \ v^{(2)} \ \ldots \ v^{(N)} \right] A \tag{5.3} \]

where A is an upper-triangular matrix of coefficients obtained as a result of the above orthogonalization procedure. The problem (5.2) is equivalent to

\[ \left[ \bar{v}^{(1)} \ \bar{v}^{(2)} \ \ldots \ \bar{v}^{(N)} \right] \begin{bmatrix} \bar{\alpha}^{(1)} \\ \vdots \\ \bar{\alpha}^{(N)} \end{bmatrix} = z - x^{(0)} \tag{5.4} \]

Because the dimension of the simplex is normally lower than the dimension of the design space, the system (5.4) is overdetermined, and the expansion coefficients are found in the least-squares sense as

\[ \begin{bmatrix} \bar{\alpha}^{(1)} \\ \vdots \\ \bar{\alpha}^{(N)} \end{bmatrix} = \left( \bar{V}^{T} \bar{V} \right)^{-1} \bar{V}^{T} \left( z - x^{(0)} \right) \tag{5.5} \]

In order to determine whether Pk(z) is within the convex hull of the simplex S(k), one needs the expansion coefficients α(j) of the projection with respect to the original vectors v(j). These are given as

\[ \begin{bmatrix} \alpha^{(1)} \\ \vdots \\ \alpha^{(N)} \end{bmatrix} = A \begin{bmatrix} \bar{\alpha}^{(1)} \\ \vdots \\ \bar{\alpha}^{(N)} \end{bmatrix} \tag{5.6} \]

The projection Pk(z) ∈ S(k) if and only if it is a convex combination of the simplex vertices, i.e., if the following two conditions on the coefficients α(j) are satisfied:
1. α(j) ≥ 0 for j = 1, ..., N.
2. α(1) + ... + α(N) ≤ 1.

In the next step, we define xmax = max{x(k), k = 1, ..., p} and xmin = min{x(k), k = 1, ..., p} (both taken component-wise). The vector dx = xmax − xmin determines the range of variation of geometry parameters within M. The domain XS of the surrogate is defined by the following two conditions: a vector y ∈ XS if and only if

1. The set K(y) = {k ∈ {1, ..., NS} : Pk(y) ∈ S(k)} is not empty.
2. min{||(y − Pk(y))//dx|| : k ∈ K(y)} ≤ D, where // denotes component-wise division, and D is a user-defined parameter.

The above conditions describe the following situation. First, the point y has to be sufficiently close to at least one of the simplexes in the "tangential" sense. Second, the point y has to be sufficiently close to at least one of the simplexes in the "orthogonal" sense. The distance here is measured as a fraction of the vector dx. Changing the value of D allows for convenient control of the (volume-wise) size of the surrogate model domain as compared to the size of the unconstrained design space. Particular values of D used in numerical experiments are given in Sect. 5.4. A graphical illustration of the meaning of the parameter D is shown in Fig. 5.3.

Fig. 5.3 Graphical illustration of the meaning of the thickness parameter D for a three-dimensional design space (axes x1, x2, x3). Reference designs are marked with black squares; simplexes are marked using solid lines. There are two surrogate model domains shown, corresponding to the smaller (dashed line) and larger (dotted line) values of D. Small number of reference designs shown for picture clarity (Koziel and Sigurðsson 2018a)

A remark should be made on the number of reference designs needed. In general, more designs permit a more precise definition of the surrogate model domain (in the sense of selecting only the relevant part of the design space). On the other hand, the computational cost of identifying these designs (unless they are already available) calls for reducing their number. A rule of thumb would be to have at least as many designs as required to detect the model domain "curvature." Roughly speaking, this would correspond to a star distribution supplemented by the corners of the region defined by the ranges of the figures of interest considered (so, 3 designs for 1 figure of interest, around 9 designs for 2 figures, and 15 for 3 figures). Still, any additional reference design gives extra information about the appropriate allocation of the training samples.

By definition, we have M ⊂ XS. Furthermore, given typical values of D of 0.1 to 0.2, the (volume-wise) size of XS is significantly smaller than that of the hypercube defined by the vectors xmin and xmax (which would be used for training data sampling in conventional modeling). This is of fundamental importance because it allows for a considerable reduction of the number of samples necessary for surrogate model construction. At the same time, the set XS contains the optimum designs for the given sets of figures of interest, and, assuming sufficient regularity of the system responses w.r.t. its geometry parameters, optimum designs for all combinations of the same figures of interest within the convex hull of F(j), j = 1, ..., p.
This means that using a fraction of samples (required by the conventional model), it is possible to build a surrogate over a wide range of geometry/material parameters of the structure of interest as demonstrated in Sect. 5.4.
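The membership test for XS can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the projection coefficients are obtained directly with a least-squares solver applied to the original spanning vectors (which yields the same projection as the orthogonalization route of (5.2) to (5.6) when these vectors are linearly independent), and the three reference designs and the list of simplexes are hypothetical.

```python
import numpy as np

def project(z, verts):
    """Project z onto the hyperplane of a simplex with vertices `verts`
    (shape (N+1, n)). Returns the projection P and the coefficients alpha
    with respect to the spanning vectors v(j) = verts[j] - verts[0]."""
    x0 = verts[0]
    V = (verts[1:] - x0).T                        # n x N matrix of spanning vectors
    alpha, *_ = np.linalg.lstsq(V, z - x0, rcond=None)
    return x0 + V @ alpha, alpha

def in_domain(y, ref_designs, simplices, D, tol=1e-12):
    """Check the two conditions defining XS for the point y."""
    ref = np.asarray(ref_designs, dtype=float)
    dx = ref.max(axis=0) - ref.min(axis=0)        # assumed nonzero in every component
    dists = []
    for idx in simplices:
        P, alpha = project(np.asarray(y, dtype=float), ref[list(idx)])
        # P lies in the simplex iff alpha >= 0 and sum(alpha) <= 1
        if np.all(alpha >= -tol) and alpha.sum() <= 1.0 + tol:
            dists.append(np.linalg.norm((y - P) / dx))
    return bool(dists) and bool(min(dists) <= D)

# Hypothetical example: N = 1 (simplexes are intervals) in a 3-D design space
refs = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 1.0, 1.0]])
simplices = [(0, 1), (1, 2)]
print(in_domain(np.array([0.5, 0.5, 0.05]), refs, simplices, D=0.1))  # close to S(1)
print(in_domain(np.array([0.5, 0.5, 0.50]), refs, simplices, D=0.1))  # too far away
```

The tolerance `tol` is a practical addition that keeps points lying exactly on a simplex boundary from being rejected because of rounding errors.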
5.3
Surrogate Model Construction
Definition of the surrogate domain is the most important component of the modeling process. Having defined the domain, the training data is sampled within XS, and the surrogate is constructed using kriging interpolation (Kleijnen 2009; see also Sect. 2.3.3). In the case of complex responses (e.g., S-parameters of antenna or microwave components), the real and imaginary parts of the respective coefficients are modeled independently. The training data set is assigned using a simple design of experiments strategy that iteratively generates random samples within the interval [xmin, xmax] and accepts the ones that are within the model domain. The iterations are continued until the required number of samples has been found. It should be mentioned that this sampling technique also allows for estimating the ratio of the volumes of the original (unconstrained) parameter space and the constrained surrogate model domain. For the examples considered in Sect. 5.4, the ratio is a few orders of magnitude (from 10^4 to over 10^6). As mentioned before, this ratio primarily depends on D, and the value of the latter should be adjusted to keep the ratio at the aforementioned level; D = 0.1 seems to be a reasonable starting value.
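The design of experiments strategy described above, together with the volume ratio estimate, can be sketched as follows. The function name, the trial limit, and the toy two-dimensional domain (a band around the diagonal of the unit square, standing in for XS) are assumptions made for this illustration only.

```python
import numpy as np

def sample_domain(is_inside, x_min, x_max, n_samples, rng, max_trials=10**6):
    """Draw uniform random points in [x_min, x_max] and keep those accepted by
    `is_inside`. Returns the accepted samples and the estimated ratio of the
    box volume to the domain volume (trials per accepted sample)."""
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    accepted, trials = [], 0
    while len(accepted) < n_samples and trials < max_trials:
        x = x_min + (x_max - x_min) * rng.random(x_min.size)
        trials += 1
        if is_inside(x):
            accepted.append(x)
    ratio = trials / max(len(accepted), 1)
    return np.array(accepted), ratio

# Toy stand-in for the constrained domain: |x2 - x1| <= 0.1 in the unit square,
# whose area is 1 - 0.9**2 = 0.19, so the expected ratio is about 5.3
band = lambda x: abs(x[1] - x[0]) <= 0.1
X, ratio = sample_domain(band, [0.0, 0.0], [1.0, 1.0], 200, np.random.default_rng(1))
print(X.shape, round(ratio, 2))
```

For volume ratios of several orders of magnitude, this plain rejection scheme needs correspondingly many trials per accepted sample; each trial involves only a geometric membership test and no EM simulation.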
5.4
Demonstration Case Studies
In this section, we discuss several examples illustrating the operation and performance of the triangulation-based modeling methodology. Three structures are considered: a UWB monopole, a uniplanar dual-band dipole, and a miniaturized microstrip coupler.
5.4.1
UWB Monopole Antenna
Our first example is an ultra-wideband monopole antenna shown in Fig. 5.4 (Koziel and Sigurðsson 2018a). The structure consists of a rectangular radiator with elliptical corner cuts and a stepped-impedance feed line. The antenna is implemented on a 0.76-mm-thick substrate. The design parameters are x = [Lg L1 L2 W1 Lp Wp a b]T (all dimensions in mm). The feeding line width W0 is adjusted for a given substrate permittivity to ensure 50 ohm input impedance. The EM antenna model R is implemented in CST Microwave Studio (CST 2018) (~900,000 mesh cells, simulation time 2 minutes). The model includes the SMA connector to ensure reliability of antenna evaluation.

Fig. 5.4 Geometry of the planar UWB antenna: (a) top view and (b) 3D view. The ground plane marked with light gray shade (Koziel and Sigurðsson 2018a)
Fig. 5.5 Constrained sampling for selected two-dimensional projections onto Lg-b plane, L2-a plane, Lp-Wp plane, and Wp-a plane (Koziel and Sigurðsson 2018a)

The objective is to construct a surrogate model of the antenna input characteristic assuming various dielectric permittivities εr of the substrate. The reference designs are optimized for minimum in-band reflection at εr = 1.8, 3.0, 4.5, and 6.0. We have x(1) = [9.86 4.17 6.46 2.08 21.1 29.7 0.52 0.44]T, x(2) = [9.45 4.02 6.33 1.41 19.9 26.7 0.57 0.41]T, x(3) = [9.17 3.54 6.55 1.26 19.6 26.8 0.58 0.39]T, and x(4) = [9.91 5.04 5.91 1.05 18.5 24.9 0.62 0.38]T. Based on these designs, the lower and upper bounds for design variables are established as l = [8.5 3.0 5.5 1.0 18.0 24.0 0.5 0.35]T and u = [10.0 5.5 7.0 2.5 22.0 30.0 0.65 0.45]T. In this case, the triangulation of the reference designs yields three simplexes (intervals): {x(1), x(2)}, {x(2), x(3)}, and {x(3), x(4)}. For the sake of computational efficiency, the reference designs have been optimized using trust-region gradient search with variable-fidelity EM simulation models (Koziel and Sigurðsson 2018a). The optimization criteria were (i) the allocation of the antenna resonances at the required operating frequencies and (ii) maximization of the fractional bandwidths. The constrained sampling technique has been verified for D = 0.15 by setting up the kriging interpolation surrogate model with 100, 200, 400, 800, and 1600 random samples. Figure 5.5 shows selected two-dimensional projections of the constrained sampling.
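For a single figure of interest, the triangulation reduces to sorting the reference designs and pairing the neighbors, which can be sketched as below. The reference designs and permittivity values are those listed above; the function names and the linear interpolation along an interval (used here only to produce a point of the manifold M, not an optimized design) are assumptions of this example.

```python
# Reference designs of the UWB monopole and the permittivities they were optimized for
eps_r = [1.8, 3.0, 4.5, 6.0]
x_ref = [
    [9.86, 4.17, 6.46, 2.08, 21.1, 29.7, 0.52, 0.44],
    [9.45, 4.02, 6.33, 1.41, 19.9, 26.7, 0.57, 0.41],
    [9.17, 3.54, 6.55, 1.26, 19.6, 26.8, 0.58, 0.39],
    [9.91, 5.04, 5.91, 1.05, 18.5, 24.9, 0.62, 0.38],
]

def intervals_1d(values):
    """One-dimensional 'triangulation': pair consecutive reference designs
    after sorting them by the figure of interest (zero-based indices)."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    return [(order[i], order[i + 1]) for i in range(len(order) - 1)]

def point_on_manifold(target, values, designs):
    """Return the point of M obtained by linear interpolation along the
    interval whose end points bracket the target figure of interest."""
    for a, b in intervals_1d(values):
        if values[a] <= target <= values[b]:
            t = (target - values[a]) / (values[b] - values[a])
            return [(1 - t) * xa + t * xb for xa, xb in zip(designs[a], designs[b])]
    raise ValueError("target outside the range covered by the reference designs")

print(intervals_1d(eps_r))                    # [(0, 1), (1, 2), (2, 3)]
print(point_on_manifold(2.4, eps_r, x_ref))   # midpoint of x(1) and x(2)
```

The three index pairs correspond to the intervals {x(1), x(2)}, {x(2), x(3)}, and {x(3), x(4)} quoted in the text (with zero-based indexing).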
5 Triangulation-Based Constrained Modeling
For the sake of verification, 100 independent test points were allocated in the model domain, and the average relative RMS errors were calculated. For comparison, a conventional kriging model was also constructed using training sets of the same sizes, allocated in the unconstrained domain determined by the aforementioned bounds l and u. The errors are reported in Table 5.1. Note that the error level for the conventional models (unconstrained sampling) is very high, about 68 percent regardless of the training set size, i.e., it is not possible to obtain acceptable predictive power due to the design space dimensionality and the parameter ranges. On the other hand, the constrained model ensures good accuracy, with the error at or below 10 percent for 400 or more training samples. Figure 5.6 shows the surrogate and EM model responses at the selected test designs. Figure 5.7 shows a comparison of the conventional kriging surrogate (in the unconstrained space) and EM model responses for the selected test points. It is clear that the quality of the conventional model is indeed very poor.

Table 5.1 Modeling results for UWB antenna
Number of training samples    Relative RMS error (b)
                              Unconstrained sampling (%)    Constrained sampling (%)
100                           68.3                          15.1
200                           68.9                          11.6
400                           68.5                          10.0
800                           67.8                          9.9
1600                          68.2                          8.5
(a) The cost of finding the reference designs for constrained modeling is about 150 evaluations of the EM antenna model
(b) In all cases, the surrogate model was constructed using kriging interpolation

Fig. 5.6 Responses of the antenna of Fig. 5.4 at the selected test designs for N = 1600: high-fidelity EM model (solid line) and triangulation-based surrogate model (circles)

Fig. 5.7 Responses of the antenna of Fig. 5.4 at the selected test designs for N = 1600: high-fidelity EM model (solid line) and conventional kriging surrogate (circles)

As an application, the antenna of Fig. 5.4 has been optimized for various values of substrate permittivity (2.2, 3.5, 4.3, and 5.6) using the surrogate established in the constrained domain. A comparison of the surrogate and EM-simulated responses of the optimized antenna is shown in Fig. 5.8. Antenna designs for two of the considered verification cases (εr = 2.2 and 3.5) have been fabricated and measured. For the first case, the TLP-5 substrate was used (tanδ = 0.0018, h = 0.76 mm); for the second case, the antenna has been implemented on Taconic RF-35 (εr = 3.5, tanδ = 0.0018, h = 0.762 mm). Figure 5.9 shows the photographs of the antenna prototypes. The agreement between simulation and measurements is good, as shown in Fig. 5.10.

Fig. 5.8 Surrogate (circles) and EM-simulated responses (solid line) of the antenna of Fig. 5.4 at the designs obtained by optimizing the constrained surrogate model for (a) εr = 2.2, (b) εr = 3.5, (c) εr = 4.3, and (d) εr = 5.6; −10 dB level for the UWB frequency range marked using a horizontal line (Koziel and Sigurðsson 2018a)

Fig. 5.9 Photographs of the optimized antennas fabricated on (a) TLP-5 substrate (εr = 2.2), and (b) RF-35 (εr = 3.5) (Koziel and Sigurðsson 2018a)

Fig. 5.10 Reflection responses of the antennas of Fig. 5.9: (a) εr = 2.2 and (b) εr = 3.5; simulation results (dashed line) and measurements (solid line) (Koziel and Sigurðsson 2018a)

Additional tests have been conducted in order to assess the effect of the model domain "thickness" controlled by the parameter D. Reducing the value of D is expected to improve the model predictive power because the domain is reduced considerably. On the other hand, excessive reduction may make the domain unable to capture the optimum antenna designs for those values of the figure(s) of interest (here, the substrate permittivity) that lie between the values corresponding to the reference designs. Table 5.2 shows the modeling errors obtained for the constrained model and three values of D: 0.2, 0.15 (used above), and 0.1. The error is defined in the relative sense, $\|R(x) - R_{kr}(x)\| / \|R(x)\|$, where $R_{kr}(x)$ refers to the response of the kriging surrogate model. As expected, reducing the domain results in better predictive power. Now, in order to test the model applicability for design purposes, the verification designs have been optimized for the same values of permittivity as those listed in Fig. 5.8 (2.2, 3.5, 4.3, and 5.6). Figure 5.11 shows the comparison of the selected designs for two values of D: 0.1 and 0.2. It can be observed that lower D results, as expected (cf. Table 5.2), in better agreement with EM simulation. On the other hand, the maximum in-band reflection of the optimized antenna is slightly better for higher D (because the model is valid over a larger region). However, the difference is minor (typically, a fraction of dB).

Table 5.2 Modeling results for UWB antenna for various values of D
Number of training samples    Relative RMS error (a)
                              D = 0.2 (%)    D = 0.15 (%)    D = 0.1 (%)
100                           18.9           15.1            15.0
200                           14.4           11.6            10.1
400                           13.3           10.0            7.8
800                           11.4           9.9             6.5
1600                          9.7            8.5             5.8
(a) In all cases, the surrogate model was constructed using kriging interpolation

Fig. 5.11 Antenna of Fig. 5.4 at the designs obtained by optimizing the constrained surrogate model for (a) εr = 3.5 and (b) εr = 5.6; surrogate model response (solid line) versus EM model response. Left- and right-hand side panels are for D = 0.1 and 0.2, respectively; −10 dB level for the UWB frequency range marked using a horizontal line (Koziel and Sigurðsson 2018a)
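The relative error measure used in Tables 5.1 and 5.2 can be sketched in a few lines of Python. The snippet below is a minimal illustration, assuming that each response is stored as a sequence of samples over frequency and that the norm is the Euclidean one; the function names relative_error and average_relative_error are introduced here for illustration only.

```python
import math


def relative_error(r_em, r_sur):
    """Return ||R(x) - R_kr(x)|| / ||R(x)|| for a single test design.

    r_em and r_sur hold the EM and surrogate responses sampled at the
    same frequencies (real or complex values); Euclidean norm assumed.
    """
    diff = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(r_em, r_sur)))
    ref = math.sqrt(sum(abs(a) ** 2 for a in r_em))
    return diff / ref


def average_relative_error(em_responses, surrogate_responses):
    """Average the relative error over all test designs, in percent."""
    errors = [
        relative_error(r, s)
        for r, s in zip(em_responses, surrogate_responses)
    ]
    return 100.0 * sum(errors) / len(errors)


# Toy data (not taken from the book): two test designs, two frequency points.
em = [[3.0, 4.0], [3.0, 4.0]]
sur = [[3.0, 3.0], [3.0, 4.0]]
print(average_relative_error(em, sur))  # relative errors 0.2 and 0.0
```

In the toy data, the first design has a relative error of 0.2 and the second a relative error of zero, so the reported average corresponds to 10 percent. The entries of Tables 5.1 and 5.2 are averages of this kind taken over the 100 test points.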
5.4.2 Uniplanar Dipole Antenna
The second example is a dual-band uniplanar dipole antenna shown in Fig. 5.12 (Chen et al. 2006). The antenna is implemented on RF-35 substrate (h = 0.762 mm, εr = 3.5, tanδ = 0.0018). The structure consists of two narrow ground plane slits interconnected through a thick slot. It is fed by a 50 ohm coplanar waveguide (CPW). The variables are x = [l1 l2 l3 w1 w2 w3]^T, whereas l0 = 30, w0 = 3, s0 = 0.15, and o = 5 are fixed (all dimensions in mm). The EM antenna model R (~100,000 cells; 60 s simulation on a dual Xeon E5540 machine) is implemented in CST Microwave Studio (CST 2018).

We are interested in modeling the antenna for the following ranges of operating frequencies: 2.0 GHz ≤ f1 ≤ 4.0 GHz (lower band) and 4.5 GHz ≤ f2 ≤ 6.5 GHz (upper band). There are 12 reference designs selected, corresponding to the antenna optimized for the pairs of operating frequencies shown in Fig. 5.13. The lower and upper bounds for the design variables were set using the reference designs as l = [25.0 6.0 14.0 0.2 1.6 0.5]^T and u = [35.0 15.0 21.0 0.55 4.0 2.0]^T. For the sake of computational efficiency, the reference designs have been obtained using a feature-based optimization framework with variable-fidelity EM simulation models. The optimization criteria were (i) the allocation of the antenna resonances at the required operating frequencies and (ii) maximization of the fractional bandwidths.

The constrained sampling technique has been verified for D = 0.05 by setting up the kriging interpolation surrogate model with 100, 200, 400, 800, and 1600 random samples. Figure 5.14 shows selected two-dimensional projections of the constrained sampling. The model accuracy has been verified using 100 independent test points. The average RMS errors for the constrained sampling and conventional kriging models are reported in Table 5.3. The error is defined as in Sect. 5.4.1. It can be observed that the constrained model exhibits much better accuracy than the conventional one (e.g., 2.3 percent versus 5.7 percent for 1600 training samples). Figure 5.15 shows the surrogate and EM model responses at the selected test designs. Figure 5.16 shows a similar comparison for the conventional surrogate and the EM model, indicating inferior performance in comparison to the constrained surrogate. To demonstrate practical applications, the antenna of Fig. 5.12 has been optimized for several pairs of operating frequencies as indicated in Fig. 5.17. The same figure shows a comparison between the surrogate and EM model responses at the optimized designs.

Fig. 5.12 Geometry of a dual-band uniplanar dipole antenna (Chen et al. 2006)

Fig. 5.13 Frequency allocation of the reference designs for the antenna of Fig. 5.12 and their triangulation (Koziel and Sigurðsson 2018a)

Fig. 5.14 Constrained sampling: selected two-dimensional projections onto the l1-l2 plane, l1-w1 plane, l1-w3 plane, and l3-w2 plane (Koziel and Sigurðsson 2018a)

Table 5.3 Modeling results for dipole antenna
Number of training samples    Relative RMS error (b)
                              Unconstrained sampling (%)    Constrained sampling (%)
100                           17.2                          4.6
200                           12.7                          3.5
400                           9.3                           2.8
800                           6.9                           2.6
1600                          5.7                           2.3
(a) The cost of finding the reference designs for constrained modeling is about 400 evaluations of the EM antenna model
(b) In all cases, the surrogate model was constructed using kriging interpolation
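The data of Table 5.3 can also be read in terms of sample efficiency, i.e., how many training samples each approach needs to reach a given accuracy. The short Python sketch below is only an illustration of that reading; the error values are transcribed from Table 5.3, and the helper name smallest_set_reaching is introduced here.

```python
# Relative RMS errors (%) transcribed from Table 5.3.
UNCONSTRAINED = {100: 17.2, 200: 12.7, 400: 9.3, 800: 6.9, 1600: 5.7}
CONSTRAINED = {100: 4.6, 200: 3.5, 400: 2.8, 800: 2.6, 1600: 2.3}


def smallest_set_reaching(errors, target):
    """Return the smallest tabulated training set size with error <= target.

    Returns None if no tabulated size reaches the target.
    """
    sizes = [n for n, err in sorted(errors.items()) if err <= target]
    return sizes[0] if sizes else None


# Can the conventional model match the constrained model trained on 100 samples?
print(smallest_set_reaching(UNCONSTRAINED, CONSTRAINED[100]))   # None
# How many samples does the constrained model need to match the best conventional one?
print(smallest_set_reaching(CONSTRAINED, UNCONSTRAINED[1600]))  # 100
```

Within the tabulated range, the conventional model never reaches the accuracy that the constrained model achieves with 100 samples, while the constrained model already matches the best conventional result with its smallest training set. Note that this comparison does not account for the cost of finding the reference designs (about 400 EM evaluations according to the table footnote).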
Fig. 5.15 Responses of the antenna of Fig. 5.12 at the selected test designs for N = 1600: high-fidelity EM model (solid line) and triangulation-based surrogate model (circles)

Fig. 5.16 Responses of the antenna of Fig. 5.12 at the selected test designs for N = 1600: high-fidelity EM model (solid line) and conventional kriging surrogate (circles)
All the verification designs have been fabricated and measured. Figure 5.18 shows the photographs of the antenna prototypes. Reflection responses shown in Fig. 5.19 indicate good agreement between simulation and measurements. Slight frequency shifts are mostly due to not including SMA connectors in the EM antenna model.
5.4.3 Miniaturized Microstrip Coupler
As the last example, let us consider the compact rat-race coupler (RRC) shown in Fig. 5.20 (Koziel et al. 2015). The structure is also implemented on RF-35 substrate (cf. antennas of Sects. 5.4.1 and 5.4.2). The designable parameters are given by x = [l1 l2 l3 d w w1]^T, with relative variable d1 = d + |w − w1| and dimensions d = 1.0, w0 = 1.7, l0 = 15 fixed (all in mm). The goal is to model the RRC within the region covering optimum designs corresponding to operating frequencies f0 from 1 to 2 GHz and power split ratios K from −6 to 0 dB (equal power split). There are 12 reference designs optimized for the following pairs of {f0, K}: {1.0, 0}, {1.0, −2}, {1.0, −6}, {1.2, −4}, {1.3, 0}, {1.5, −2}, {1.5, −5}, {1.7, 0}, {1.7, −6}, {1.8, −3}, {2.0, 0}, and {2.0, −6}. Figure 5.21 shows the triangulation of the reference designs. In the construction of the model, D = 0.1 is used. The surrogate is set up using 100, 200, 400, and 800 random samples. Selected two-dimensional projections of the constrained sampling are shown in Fig. 5.22.

Table 5.4 shows the RMS error of the constrained surrogate as well as that of the conventional kriging model constructed over the unconstrained domain (interval [xmin, xmax]). The error has been averaged over 100 random test points. It should be noted that the constrained model exhibits very good predictive power (error below 5%) even if the number of training points is as low as 100. In general, it improves the accuracy with respect to the standard approach by a factor of about 3. A comparison of the responses of the constrained surrogate and the EM model for the selected test designs is shown in Fig. 5.23.

Fig. 5.17 Surrogate (circles) and EM-simulated responses (solid line) of the antenna of Fig. 5.12 at the designs obtained by optimizing the constrained surrogate model for (a) f1 = 2.9 GHz, f2 = 5.8 GHz, (b) f1 = 3.9 GHz, f2 = 6.3 GHz, (c) f1 = 3.2 GHz, f2 = 4.8 GHz, and (d) f1 = 2.8 GHz, f2 = 6.2 GHz. Required operating frequencies are marked using vertical lines

Fig. 5.18 Photographs of the optimized antennas: (a) f1 = 2.9 GHz, f2 = 5.8 GHz, (b) f1 = 3.9 GHz, f2 = 6.3 GHz, (c) f1 = 3.2 GHz, f2 = 4.8 GHz, and (d) f1 = 2.8 GHz, f2 = 6.2 GHz (Koziel and Sigurðsson 2018a)

Fig. 5.19 Reflection responses of the antennas of Fig. 5.18: (a) f1 = 2.9 GHz, f2 = 5.8 GHz, (b) f1 = 3.9 GHz, f2 = 6.3 GHz, (c) f1 = 3.2 GHz, f2 = 4.8 GHz, and (d) f1 = 2.8 GHz, f2 = 6.2 GHz; simulation results (dashed line) and measurements (solid line) (Koziel et al. 2018)

Fig. 5.20 Layout of the compact folded RRC (Koziel et al. 2018)

Fig. 5.21 Triangulation of the reference designs for folded RRC of Fig. 5.20 (Koziel et al. 2018)

Fig. 5.22 Constrained sampling: selected two-dimensional projections onto the l1-w1 plane, l2-l3 plane, l3-d plane, and d-w1 plane (Koziel et al. 2018)

Table 5.4 Modeling results for miniaturized RRC
Number of training samples    Relative RMS error (a)
                              Unconstrained sampling (%)    Constrained sampling (%)
100                           15.1                          4.7
200                           11.7                          4.2
400                           9.2                           3.0
800                           7.1                           2.6
(a) In all cases, the surrogate model was constructed using kriging interpolation
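The statement that constrained modeling improves the accuracy by a factor of about 3 can be checked directly against Table 5.4. The following Python sketch is a simple arithmetic illustration; the values are transcribed from the table, and the variable names are chosen here for convenience.

```python
# Relative RMS errors (%) transcribed from Table 5.4.
SIZES = [100, 200, 400, 800]
UNCONSTRAINED = [15.1, 11.7, 9.2, 7.1]
CONSTRAINED = [4.7, 4.2, 3.0, 2.6]

# Ratio of conventional to constrained error for each training set size.
factors = [u / c for u, c in zip(UNCONSTRAINED, CONSTRAINED)]
for n, f in zip(SIZES, factors):
    print(f"N = {n:4d}: improvement factor {f:.2f}")

mean_factor = sum(factors) / len(factors)
print(f"mean improvement factor {mean_factor:.2f}")
```

The individual ratios fall between roughly 2.7 and 3.2 for all four training set sizes, and their mean is close to 3, which is consistent with the figure quoted in the text.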
Fig. 5.23 Responses of the RRC of Fig. 5.20 at the selected test designs for N = 800: high-fidelity EM model (solid line) and triangulation-based surrogate model (circles)

Fig. 5.24 Surrogate (circles) and EM-simulated responses (solid line) of the RRC of Fig. 5.20 at the designs obtained by optimizing the constrained surrogate model for (a) f0 = 1.2 GHz and K = −1.5 dB, (b) f0 = 1.4 GHz and K = −4.2 dB, (c) f0 = 1.6 GHz and K = −2.5 dB, and (d) f0 = 1.9 GHz and K = −3.5 dB (Koziel et al. 2018)

Fig. 5.25 Photographs of the fabricated coupler prototypes: (a) f0 = 1.2 GHz and K = −1.5 dB, (b) f0 = 1.4 GHz and K = −4.2 dB, (c) f0 = 1.6 GHz and K = −2.5 dB, and (d) f0 = 1.9 GHz and K = −3.5 dB
As an application, the RRC of Fig. 5.20 has been optimized for various pairs of f0 and K. A comparison of the surrogate and EM-simulated responses of the optimized RRC is shown in Fig. 5.24. The responses of the optimized couplers are well centered at the required operating frequencies; the power split ratios are −1.45, −4.23, −2.52, and −3.54 dB versus required values of −1.5, −4.2, −2.5, and −3.5 dB, respectively (power split errors