The Sequential Quadratic Hamiltonian Method: Solving Optimal Control Problems




The Sequential Quadratic Hamiltonian Method

The sequential quadratic Hamiltonian (SQH) method is a novel numerical optimisation procedure for solving optimal control problems governed by differential models. It is based on the characterisation of optimal controls in the framework of the Pontryagin maximum principle (PMP). The SQH method is a powerful computational methodology that is capable of development in many directions.

The Sequential Quadratic Hamiltonian Method: Solving Optimal Control Problems discusses its analysis and use in solving nonsmooth ODE control problems, relaxed ODE control problems, stochastic control problems, mixed-integer control problems, PDE control problems, inverse PDE problems, differential Nash game problems, and problems related to residual neural networks. This book may serve as a textbook for undergraduate and graduate students, and as an introduction for researchers in sciences and engineering who intend to further develop the SQH method or wish to use it as a numerical tool for solving challenging optimal control problems and for investigating the Pontryagin maximum principle on new optimisation problems.

Features
• Provides insight into mathematical and computational issues concerning optimal control problems, while discussing many differential models of interest in different disciplines.
• Suitable for undergraduate and graduate students and as an introduction for researchers in sciences and engineering.
• Accompanied by codes which allow the reader to apply the SQH method to solve many different optimal control and optimisation problems.

Alfio Borzì, born 1965 in Catania (Italy), is Professor and Chair of Scientific Computing at the Institute for Mathematics of the University of Würzburg, Germany. He studied Mathematics and Physics in Catania and Trieste, where he received his PhD in Mathematics from the Scuola Internazionale Superiore di Studi Avanzati (SISSA). He served as Research Officer at the University of Oxford (UK) and as Assistant Professor at the University of Graz (Austria), where he completed his Habilitation and was appointed Associate Professor. Since 2011 he has been Professor of Scientific Computing at the University of Würzburg. Alfio Borzì is the author of four mathematics books and numerous articles in scientific journals. The main topics of his research and teaching activities are modelling and numerical analysis, optimal control, optimisation, and scientific computing. He is a member of the editorial board of SIAM Review.

Numerical Analysis and Scientific Computing Series
Series Editors: Frederic Magoules, Choi-Hong Lai

About the Series
This series, comprising a diverse collection of textbooks, references, and handbooks, brings together a wide range of topics across numerical analysis and scientific computing. The books contained in this series will appeal to an academic audience, both in mathematics and computer science, and naturally find applications in engineering and the physical sciences.

Computational Fluid Dynamics
Frederic Magoules

Mathematics at the Meridian: The History of Mathematics at Greenwich
Raymond Gerard Flood, Tony Mann, Mary Croarken

Modelling with Ordinary Differential Equations: A Comprehensive Approach
Alfio Borzì

Numerical Methods for Unsteady Compressible Flow Problems
Philipp Birken

A Gentle Introduction to Scientific Computing
Dan Stanescu, Long Lee

Introduction to Computational Engineering with MATLAB
Timothy Bower

An Introduction to Numerical Methods: A MATLAB® Approach, Fifth Edition
Abdelwahab Kharab, Ronald Guenther

The Sequential Quadratic Hamiltonian Method: Solving Optimal Control Problems
Alfio Borzì

For more information about this series please visit: https://www.crcpress.com/Chapman--HallCRC-Numerical-Analysis-and-Scientific-Computing-Series/book-series/CHNUANSCCOM

The Sequential Quadratic Hamiltonian Method Solving Optimal Control Problems

Alfio Borzì

University of Würzburg, Germany

MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

First edition published 2023 by CRC Press, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742, and by CRC Press, 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN. CRC Press is an imprint of Taylor & Francis Group, LLC.

© 2023 Alfio Borzì

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.

ISBN: 978-0-367-71552-6 (hbk)
ISBN: 978-0-367-71560-1 (pbk)
ISBN: 978-1-003-15262-0 (ebk)

DOI: 10.1201/9781003152620

Typeset in LM Roman by KnowledgeWorks Global Ltd.

Publisher's note: This book has been prepared from camera-ready copy provided by the authors.

Access the Support Material: www.Routledge.com/9780367715526

Dedicated with love to Mila and to our children Zoe, Lilli, Arthur, and Anastasia

In nova fert animus mutatas dicere formas corpora; di, coeptis (nam vos mutastis et illas) adspirate meis primaque ab origine mundi ad mea perpetuum deducite tempora carmen.

(Metamorphoses, Publius Ovidius Naso (P. Ovidii Nasonis), Liber I, 1–4)

My soul is wrought to sing of forms transformed to bodies new and strange! Immortal Gods inspire my heart, for ye have changed yourselves and all things you have changed! Oh lead my song in smooth and measured strains, from olden days when earth began to this completed time!

(translation by Brookes More)

Contents

Preface xi

Author xv

1 Optimal Control Problems with ODEs 1
  1.1 Formulation of ODE Optimal Control Problems 1
  1.2 The Controlled ODE Model 3
  1.3 Existence of Optimal Controls 7
  1.4 Optimality Conditions 15
  1.5 The Pontryagin Maximum Principle 17
  1.6 The PMP and Path Constraints 29
  1.7 Sufficient Conditions for Optimality 32
  1.8 Analytical Solutions via PMP 37

2 The Sequential Quadratic Hamiltonian Method 45
  2.1 Successive Approximations Schemes 45
  2.2 The Sequential Quadratic Hamiltonian Method 50
  2.3 Mixed Control and State Constraints 62
  2.4 Time-Optimal Control Problems 65
  2.5 Analysis of the SQH Method 72

3 Optimal Relaxed Controls 83
  3.1 Young Measures and Optimal Relaxed Controls 83
  3.2 The Sequential Quadratic Hamiltonian Method 85
  3.3 The SQH Minimising Property 91
  3.4 An Application with Two Relaxed Controls 97

4 Differential Nash Games 101
  4.1 Introduction 101
  4.2 PMP Characterisation of Nash Games 102
  4.3 The SQH Method for Solving Nash Games 104
  4.4 Numerical Experiments 108

5 Deep Learning in Residual Neural Networks 113
  5.1 Introduction 113
  5.2 Supervised Learning and Optimal Control 114
  5.3 The Discrete Maximum Principle 118
  5.4 The Sequential Quadratic Hamiltonian Method 120
  5.5 Wellposedness and Convergence Results 124
  5.6 Numerical Experiments 127

6 Control of Stochastic Models 131
  6.1 Introduction 131
  6.2 Formulation of Ensemble Optimal Control Problems 133
  6.3 The PMP Characterisation of Optimal Controls 136
  6.4 The Hamilton-Jacobi-Bellman Equation 140
  6.5 Two SQH Methods 143
  6.6 Numerical Experiments 147

7 PDE Optimal Control Problems 153
  7.1 Introduction 153
  7.2 Elliptic Optimal Control Problems 155
  7.3 The Sequential Quadratic Hamiltonian Method 160
  7.4 Linear Elliptic Optimal Control Problems 164
  7.5 A Problem with Discontinuous Control Costs 167
  7.6 Bilinear Elliptic Optimal Control Problems 169
  7.7 Nonlinear Elliptic Optimal Control Problems 171
  7.8 A Problem with State Constraints 174
  7.9 A Nonsmooth Problem with L1 Tracking Term 177
  7.10 Parabolic Optimal Control Problems 181
  7.11 Hyperbolic Optimal Control Problems 185

8 Identification of a Diffusion Coefficient 197
  8.1 Introduction 197
  8.2 An Inverse Diffusion Coefficient Problem 198
  8.3 The SQH Method 200
  8.4 Finite Element Approximation 204
  8.5 Numerical Experiments 208

A Results of Analysis 213
  A.1 Some Function Spaces 213
    A.1.1 Spaces of Continuous Functions 213
    A.1.2 Spaces of Integrable Functions 215
    A.1.3 Sobolev Spaces 215
  A.2 The Grönwall Inequality 218
  A.3 Derivatives in Banach Spaces 219
  A.4 The Implicit Function Theorem 221
  A.5 L∞ Estimates 222

Bibliography 227

Index 249

Preface

The sequential quadratic Hamiltonian (SQH) method is a novel numerical optimisation procedure for solving optimal control problems governed by differential models. It is based on the characterisation of optimal controls in the framework of the Pontryagin maximum principle (PMP).

The SQH method represents the most recent development of the so-called successive approximations schemes that were proposed soon after the formulation of the PMP theory in the sixties. Thus, some time has passed since the appearance of these schemes, but until recently they have usually been neglected in favour of other methods, especially gradient-based techniques. In fact, the name successive approximation(s) also appears in other, unrelated contexts, e.g. the Picard iteration; a sign that the successive approximations strategy has played a marginal role in the solution of optimal control problems with differential models. On the other hand, especially in the field of optimal control with ordinary differential equations (ODEs), the PMP theory is often advocated in order to analyse optimal control problems in many situations where other techniques present difficulties. This means that there is a certain discrepancy between the theory and practice of the PMP framework, which becomes more evident when considering large-size ODE control problems and optimal control problems governed by partial differential equations (PDEs). Moreover, in the latter case, the theoretical investigation of the maximum principle has received much less attention, and this state of the art could be explained by the lack of an efficient numerical methodology that implements the maximum principle for PDE control problems, especially in those cases where the PMP appears to be the most appropriate choice, such as nonsmooth and nonconvex control problems, mixed-integer control problems, etc. Specifically, we refer to those problems where the admissible set of control functions is not convex or the set of admissible values of the controls is a discrete set of isolated points.

The purpose of this book is to contribute to the investigation and application of the Pontryagin maximum principle in optimal control problems by presenting the iterative SQH method as the numerical tool that implements this principle in an efficient and robust way. For this purpose, the analysis and application of the SQH method to different classes of optimal control problems, differential Nash games, and inverse problems governed by ODE and PDE models are discussed. Furthermore, we illustrate theoretical results concerning the existence of optimal or quasioptimal controls, the proof of the PMP in some cases, and the wellposedness and convergence of the SQH method. In this context, we notice that the maximum principle has a local character that appears clearly in its proof by 'needle variations', and this character is also a specific feature of the SQH method.

The ancestor of the SQH scheme is the successive approximations method proposed, in different variants and simultaneously, by I.A. Krylov and F.L. Chernous'ko, and by H.J. Kelley, R.E. Kopp and H.G. Moyer in the sixties. In fact, all these authors were inspired by the work of L.I. Rozonoèr, who first realised the potential of a numerical realisation of the maximum principle and provided a fundamental result for this purpose. The successive approximations methods, whose formulation is based on the control Hamiltonian function introduced in the PMP theory, appeared efficient but not robust with respect to the numerical and optimisation parameters. However, twenty years later an improvement in robustness was achieved by Y. Sakawa and Y. Shindo by introducing a quadratic penalty of the control updates that resulted in an augmented Hamiltonian function. In this latter formulation, the need for frequent updates of the state variable of the controlled system limited the application of the resulting method to small-size control problems. This limitation was resolved in the SQH approach, where a sequential pointwise optimisation of the augmented Hamiltonian function with respect to the control variable is performed while the state function is updated after this step. Moreover, in this new approach, the augmentation parameter is chosen adaptively in such a way as to guarantee the construction of a minimising sequence for the objective functional. With these improvements, one obtains an efficient and robust iterative procedure that is also able to solve discontinuous optimal control problems and problems on discrete sets of admissible control values.

The SQH method is a powerful computational methodology that is capable of development in many directions. This book represents an attempt to encourage its use and further development by demonstrating its capability in solving ODE control problems, relaxed ODE control problems, stochastic control problems, PDE control problems, inverse problems, differential game problems and problems related to neural networks. Also for this purpose, a related suite of freely available codes accompanies this book.

The first chapter of this book is devoted to the formulation and analysis of optimal control problems governed by ODE models. It starts with a discussion on proving the existence of optimal controls based on some specific assumptions on the structure of the problem. Further results address optimal control problems with more general structure, and in this context Ekeland's variational principle is mentioned, which leads to the notion of quasioptimal controls. The second part of this chapter provides an introduction to the characterisation of optimal controls by optimality systems in the framework of the Pontryagin maximum principle.

Chapter 2 starts with an illustration of the successive approximations method as originally proposed by Krylov and Chernous'ko. This discussion serves as the preparation for the formulation of the SQH method that follows. The algorithms that implement these methods are presented and applied to different control problems. One section is devoted to problems with mixed control and state path constraints, and another addresses time-optimal control problems. The last part of this chapter is devoted to a theoretical analysis of the convergence properties of the SQH method.

Chapter 3 is devoted to the formulation of optimal relaxed control problems and their solution by a SQH method with a special augmentation term. Also in this case, the wellposedness of the SQH algorithm is analysed, and the effectiveness of this scheme is successfully validated on control problems with bang-bang, oscillation, and concentration effects. This chapter is concluded with an application concerning a bioreactor.

In Chapter 4, the SQH method is further developed in order to solve differential Nash games, and also in this framework a theoretical justification for the proposed algorithm is given. In the case of linear quadratic games, the Nash equilibrium solution obtained with the SQH method is successfully compared with that obtained by solving Riccati equations. Further experiments are discussed that involve nonsmooth problems.

Chapter 5 presents an extension of the SQH scheme as a learning method for residual neural networks. From the mathematical point of view, it represents the implementation of a discrete maximum principle for the optimisation of a given loss function subject to the constraint given by a finite difference model. In particular, the discussion focuses on Runge-Kutta neural networks applied to approximation and regression problems.

Chapter 6 is devoted to the formulation and analysis of ensemble optimal control problems with stochastic drift-diffusion models. These problems are formulated as optimal control problems governed by the parabolic Fokker-Planck equation modelling the evolution of the probability density function of the underlying stochastic process. This approach, combined with the PMP characterisation of optimality, allows one to determine open- and closed-loop controls within a unique formalism that includes the Hamilton-Jacobi-Bellman equation, also in the case of deterministic models. Based on the peculiarities of open- and closed-loop control mechanisms, two SQH algorithms are developed and applied to problems that require steering stochastic trajectories in a desired way.

Chapter 7 presents the analysis and implementation of the SQH method for solving different optimal control problems governed by PDEs. The main focus is on elliptic control problems, whereas one of the last two sections is devoted to parabolic control problems, and the last section is devoted to optimal distributed and boundary control problems with the wave equation. Elliptic optimal control problems with distributed linear and bilinear control mechanisms, and with control and state constraints, are discussed. Furthermore, the cases of discontinuous control costs, mixed-integer control problems, and a problem with a nonsmooth tracking term are presented.

Chapter 8 is devoted to the inverse problem of identifying the diffusion coefficient of an elliptic model based on measurements of the state configuration. The SQH scheme for solving this class of problems is presented and theoretically investigated subject to favourable assumptions. In this case, the elliptic model and its optimisation adjoint are approximated by finite elements, and the resulting algorithm is successfully applied to identify a diffusion coefficient.

I would like to remark that, in this book, methods and problems are presented with enough detail that the book may serve as a textbook for undergraduate and graduate students, and as an introduction for researchers in sciences and engineering who intend to further develop and use the SQH method, or wish to use it as a numerical tool for investigating the Pontryagin maximum principle on new optimisation problems. Hopefully, I have at least partially succeeded in showing the great potential of the SQH method and related strategies.

In many cases, I have relied on recent results obtained in collaboration with my students Tim Breitenbach, Francesca Calà-Campana, Sebastian Hofmann, Nico Nees, Max Steinlein and Andreas Seufert, and through collaboration with Mario Annunziato and Souvik Roy, who are all gratefully acknowledged. In particular, I express my sincere gratitude to Tim Breitenbach for all our early passionate efforts in the genesis of the SQH method and its investigation. I would also like to gratefully acknowledge the continued support of John Burns, Kurt Chudej, Gabriele Ciaramella, Andrei V. Dmitruk, Francesco Fanelli, Andrei V. Fursikov, Matthias Gerdts, Omar Ghattas, Abdou Habbal, Eldad Haber, Rolf Krause, Karl Kunisch, Kees Oosterlee, Hans-Josef Pesch, Francesco Petitta, Georg Propst, Arnd Rösch, Souvik Roy, Ekkehard Sachs, Volker Schulz, Endre Süli, Mikhail I. Sumin, Fredi Tröltzsch, Nadja Vater, Marco Verani and Daniel Wachsmuth. Heartiest thanks to Mikhail I. Sumin, Souvik Roy and Daniel Wachsmuth for reading early drafts of this book and for their extremely useful comments and suggestions.

I am very grateful to Choi-Hong Lai, who encouraged me to publish with CRC Press. I would also like to thank Callum Fraser and Mansi Kabra, and the production team of CRC Press and Taylor & Francis Group, for their kind and very professional assistance in publishing this work. I owe my thanks also to Meeta Singh and her team at KGL for their support on this project.

Alfio Borzì
Würzburg, 2023

Author

Alfio Borzì, born 1965 in Catania (Italy), is Professor and Chair of Scientific Computing at the Institute for Mathematics of the University of Würzburg, Germany. He studied Mathematics and Physics in Catania and Trieste where he received his PhD in Mathematics from Scuola Internazionale Superiore di Studi Avanzati (SISSA). He served as Research Officer at the University of Oxford (UK) and as Assistant Professor at the University of Graz (Austria) where he completed his Habilitation and was appointed as Associate Professor. Since 2011 he has been Professor of Scientific Computing at the University of Würzburg. Alfio Borzì is the author of four mathematics books and numerous articles in scientific journals. The main topics of his research and teaching activities are modelling and numerical analysis, optimal control, optimisation, and scientific computing. He is a member of the editorial board for SIAM Review.


Chapter 1 Optimal Control Problems with ODEs

1.1 Formulation of ODE Optimal Control Problems 1
1.2 The Controlled ODE Model 3
1.3 Existence of Optimal Controls 7
1.4 Optimality Conditions 15
1.5 The Pontryagin Maximum Principle 17
1.6 The PMP and Path Constraints 29
1.7 Sufficient Conditions for Optimality 32
1.8 Analytical Solutions via PMP 37

This chapter provides an introduction to optimal control problems governed by ordinary differential equations. It illustrates the approach to proving existence of solutions to different optimal control problems, and focuses on the characterisation of these solutions in the framework of the Pontryagin maximum principle. Further insight is given into sufficient conditions for optimality and the analytical solution of some representative control problems.

1.1 Formulation of ODE Optimal Control Problems

Optimal control problems appear in the design of control mechanisms in differential systems for the purpose of achieving some prescribed goals of the states of these systems. A basic optimal control problem governed by a system of ordinary differential equations is formulated as follows:

min J(y, u) := ∫_{t_0}^{T} ℓ(t, y(t), u(t)) dt + γ(y(T))
s.t. y'(t) = f(t, y(t), u(t)),  y(t_0) = y_0,  u ∈ U_ad,    (1.1)

where 's.t.' stands for 'subject to'. In this formulation, f : [t_0, T] × R^n × R^m → R^n determines the dynamics of the governing system, whose configuration is represented by the state variable y : [t_0, T] → R^n driven by the control function u : [t_0, T] → R^m chosen in an admissible functional set U_ad. In particular, in applications one frequently considers control-affine systems having the following structure

y'(t) = f̃(t, y(t)) + f̂(t, y(t)) u(t).    (1.2)

Further, the optimal control may be sought in a subset of L^2(t_0, T) given by

U_ad = { u ∈ L^2(t_0, T; R^m) : u(t) ∈ K_ad a.e. },    (1.3)

where K_ad is a compact subset of R^m, and a.e. means for almost all t in [t_0, T] in the sense of the Lebesgue measure.

The purpose of the action of the control is modelled by the functional J, which is called the cost (or objective) functional, where ℓ represents the running cost and γ the terminal observation (or endpoint cost). Usually, the scalar function ℓ : [t_0, T] × R^n × R^m → R has a composite structure as follows:

ℓ(t, y, u) = h(t, y) + g(t, u),    (1.4)

where h models the purpose of the action of the control for the state of the system and g represents the cost of this action. A typical choice of these functions is the following

h(t, y) = |y − y_d(t)|^2,  g(t, u) = (ν/2) |u|^2,  γ(y) = (α/2) |y − y_T|^2.

In this case, y_d ∈ L^2(t_0, T; R^n) represents a desired (but not necessarily attainable) trajectory that the system should follow; thus h(t, y(t)) measures the tracking error at the instant t. Similarly, y_T ∈ R^n represents a desired target configuration at final time, and γ(y(T)) measures the discrepancy of y(T) to this target; we choose α ≥ 0. Further, the g(t, u(t)) above provides a quadratic cost at this instant, and ν ≥ 0 denotes the weight of this cost. With |·| we denote the Euclidean norm.

In the formulation where J contains both the ℓ and γ terms, we say that (1.1) is in the Bolza form. If only the running cost appears, then the optimal control problem is said to be in the Lagrange form. If only the terminal term appears, then (1.1) is said to be in the Mayer form. One can transform an optimal control problem in Bolza form to the Mayer form by introducing an additional scalar variable z such that

z'(t) = ℓ(t, y(t), u(t)),  z(t_0) = 0.

Then it holds

∫_{t_0}^{T} ℓ(t, y(t), u(t)) dt = z(T).

Therefore, the Bolza problem (1.1) takes the following Mayer form

min J(y, z, u) = z(T) + γ(y(T))
s.t. y'(t) = f(t, y(t), u(t)),  y(t_0) = y_0,
     z'(t) = ℓ(t, y(t), u(t)),  z(t_0) = 0.    (1.5)
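The Bolza-to-Mayer reduction is easy to verify numerically. The following minimal Python sketch (independent of the book's accompanying codes; the dynamics f, running cost ℓ and the control u are illustrative choices) integrates the augmented pair (y, z) for a fixed control and checks that z(T) reproduces the integral cost.

```python
import numpy as np
from scipy.integrate import solve_ivp, quad

# Illustrative data: scalar control-affine dynamics and a tracking-type running cost
f = lambda t, y, u: -y + u                      # dynamics f(t, y, u)
ell = lambda t, y, u: (y - 1.0)**2 + 0.5*u**2   # running cost ell(t, y, u)
u = lambda t: np.sin(t)                         # a fixed admissible control
t0, T, y0 = 0.0, 2.0, 0.5

def augmented(t, w):
    # w = (y, z); the extra state z accumulates the running cost, cf. (1.5)
    y, z = w
    return [f(t, y, u(t)), ell(t, y, u(t))]

sol = solve_ivp(augmented, (t0, T), [y0, 0.0], rtol=1e-10, atol=1e-12, dense_output=True)
zT = sol.y[1, -1]   # Mayer term z(T)

# Reference value: quadrature of ell along the same trajectory
y_of_t = lambda t: sol.sol(t)[0]
ref, _ = quad(lambda t: ell(t, y_of_t(t), u(t)), t0, T)
print(f"z(T) = {zT:.8f}, direct quadrature = {ref:.8f}")
```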

On the other hand, starting from a problem in Mayer form (i.e. with ℓ = 0), one can construct a problem in Lagrange form. For this purpose, consider the following calculation

γ(y(T)) = γ(y_0) + ∫_{t_0}^{T} (d/dt) γ(y(t)) dt
        = γ(y_0) + ∫_{t_0}^{T} (∂_y γ(y(t)) · y'(t)) dt
        = γ(y_0) + ∫_{t_0}^{T} (∂_y γ(y(t)) · f(t, y(t), u(t))) dt.

(With '·' we denote the Euclidean scalar product.) Thus, we obtain ℓ(t, y(t), u(t)) = ∂_y γ(y(t)) · f(t, y(t), u(t)), and γ(y_0) is a constant that for the purpose of the optimisation can be ignored. In the same way, we can transform an optimal control problem in Bolza form to its equivalent Lagrange form. Notice that, in the calculation above, we have assumed Fréchet differentiability of γ, and ∂_y γ denotes its Fréchet derivative with respect to y. We may write γ_y in place of ∂_y γ for a shorter notation.

In general, both the initial and final time instants may be fixed or not. Then we have a fixed or free 'time' problem, respectively. We can also have the condition that y(T) ∈ E, where E represents a prescribed subset of R^n. In the case that E coincides with R^n, we have a free endpoint problem; if E is a single point of R^n, then we have a target point problem. However, while we discuss primarily fixed times with free endpoint at t = T, these specifications constitute only a small set of all possible conditions that can be found in the vast literature on optimal control problems; see, e.g., [17, 59, 85, 111, 153, 217, 252].

It is common to call the function pair (y, u) on the interval [t_0, T] an admissible process of the optimal control problem (1.1) if u and y satisfy all the given constraints. A process (u*, y*) is called a strong local minimiser provided that, for some ε > 0, for any other process (u, y) satisfying the constraints as well as ‖y − y*‖_{L^∞(t_0,T)} ≤ ε, we have J(y*, u*) ≤ J(y, u). In this case, we say that u* is a locally optimal control and y* is the corresponding optimal trajectory, and (u*, y*) is a locally optimal process. Clearly, we say that (u*, y*) is a globally optimal process if J(y*, u*) ≤ J(y, u) holds for all admissible processes (u, y).

1.2 The Controlled ODE Model

We assume that the control function u is Lebesgue measurable, and u(t) ∈ K_ad a.e., where K_ad ⊂ R^m is compact. Then, for a given u, we can define the function F(t, y) = f(t, y, u(t)) and write our Cauchy problem as follows:

y'(t) = F(t, y(t)),  y(t_0) = y_0,  (t_0, y_0) ∈ D̃,    (1.6)

where D̃ is the domain of definition of F. A generic compact subset of D̃ is denoted with

D = { (t, y) ∈ R^{1+n} : t ∈ I ⊆ R, |y| ≤ M } ⊂ D̃,

where I is an interval and M > 0. We assume that [t_0, T] ⊂ I.

Our purpose is to state existence and uniqueness of solutions to (1.6) in the sense established by C. Carathéodory. For this purpose, we require that F satisfies the following Carathéodory's conditions [113]:

Assumption 1.1 (Carathéodory's conditions) In the domain D of the (t, y) space:

a) the function F(t, y) be defined and continuous in y for almost all t;
b) the function F(t, y) be measurable in t for each y;
c) |F(t, y)| ≤ m(t), where m is a Lebesgue-integrable function.

Now, we can report the statement of Carathéodory's theorem as given in, e.g., A.F. Filippov's book [113]:

Theorem 1.1 Let the function F satisfy the Carathéodory conditions and assume that (t_0, y_0) ∈ D. Then there exists an absolutely continuous function y that solves (1.6) on a closed interval [t_0, t_0 + α], α > 0, in the sense that it satisfies

y(t) = y_0 + ∫_{t_0}^{t} F(s, y(s)) ds,  t ∈ [t_0, t_0 + α].    (1.7)

This means that

y'(t) = F(t, y(t)),  (t, y(t)) ∈ D,  t ∈ [t_0, t_0 + α],    (1.8)

except on a set of Lebesgue measure zero.

Details on the space of absolutely continuous (AC) functions are given in Appendix A. By assuming a Lipschitz condition on F, one can prove that the solution established by Carathéodory's theorem is unique. We have

Theorem 1.2 Let the assumptions of Theorem 1.1 hold and further assume that

|F(t, y_1) − F(t, y_2)| ≤ L(t) |y_1 − y_2|,    (1.9)

for all (t, y_1), (t, y_2) ∈ D, where L is a Lebesgue-integrable function. Then the solution established in Theorem 1.1 is unique.

As remarked in [113], the uniqueness property given in this theorem remains valid if one replaces the global Lipschitz condition (1.9) with the following one-sided condition

(F(t, y_1) − F(t, y_2)) · (y_1 − y_2) ≤ L(t) |y_1 − y_2|^2,    (1.10)

for all t ≥ t_0, (t, y_1), (t, y_2) ∈ D. See [82] for recent results and further references on generalised Lipschitz conditions. Further, subject to Assumption 1.1 and the properties (1.9) or (1.10), the solution to (1.6) can be uniquely extended up to the boundary of D. Notice that Assumption 1.1 c) and (1.10) could be further weakened, for example, by considering a local formulation; see, e.g., [92] for a survey and further references. However, care must be taken to guarantee that no blow-up of solutions in the time interval of interest occurs.

In view of these results and of the definition of F, we need to specify under which conditions on f, with u as given above, the function F satisfies Assumption 1.1 and (1.9). These conditions are given in, e.g., [83] and summarised below.

Assumption 1.2 (Clarke's conditions) In the space D × K_ad:

a) the function f is measurable in t and continuous in u; alternatively, one requires that every coordinate function of f is upper semicontinuous in (t, u);
b) there is a Lebesgue-integrable function λ such that it holds

|f(t, y_1, v) − f(t, y_2, v)| ≤ λ(t) |y_1 − y_2|,

for all (t, y_1), (t, y_2) ∈ D and v ∈ K_ad.

As mentioned in [83], since f is locally Lipschitz continuous in y, we can introduce the generalised Jacobian of the map y ↦ f(t, y, u(t)), and denote it with ∂_y f(t, y, u(t)), because it coincides with the usual Jacobian if f is a C^1 function of the y argument.

Now, we suppose that Assumptions 1.1 and 1.2 hold, so that our initial value problem

y'(t) = f(t, y(t), u(t)),  y(t_0) = y_0,  t ∈ (t_0, T],    (1.11)

has a unique solution for any choice of u ∈ U_ad ⊂ U, where U denotes a convenient control space (in the present discussion, we can take U = L^1(t_0, T; R^m)). Thus (1.11) defines a well-posed function

S : U_ad → AC([t_0, T]; R^n),  u ↦ y = S(u),    (1.12)

which is called the control-to-state map. Notice that this map is defined with a given fixed initial condition.
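As an illustration of the control-to-state map, the following Python sketch (illustrative data, not from the book's code suite) evaluates y = S(u) for a piecewise-constant, hence merely measurable, control; the resulting state is absolutely continuous, in agreement with Carathéodory's theorem.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative dynamics f(t, y, u) and a discontinuous (bang-bang style) control
f = lambda t, y, u: -2.0*y + u
u = lambda t: 1.0 if t < 0.5 else -1.0   # measurable control with values in K_ad = {-1, 1}
t0, T, y0 = 0.0, 1.0, 0.0

def S(u, t_eval):
    """Evaluate the control-to-state map y = S(u) on a time grid."""
    sol = solve_ivp(lambda t, y: f(t, y, u(t)), (t0, T), [y0],
                    t_eval=t_eval, max_step=1e-3)  # small steps resolve the control jump
    return sol.y[0]

t = np.linspace(t0, T, 201)
y = S(u, t)
print(y[:3], y[-3:])  # y is absolutely continuous although u jumps at t = 0.5
```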

Clearly, the properties of the map (1.12) are determined by the structure of f. In particular, if some growth conditions for f can be verified, then uniform boundedness of solutions to (1.11), and thus of the map S, can be proved based on known inequalities; see, e.g., [194]. The simplest assumption that is consistent with the global Lipschitz condition above is that of linear growth, which is usually expressed as follows:

|y · f(t, y, u(t))| ≤ c_0(t) + c_1(t) |y|^2,

where c_0, c_1 are nonnegative Lebesgue-integrable functions. Thus, taking the scalar product of y'(t) = f(t, y(t), u(t)) with y(t) and using the condition above, one obtains

v'(t) ≤ 2 c_0(t) + 2 c_1(t) v(t),  v(t) := |y(t)|^2.

Now, we can apply T.H. Grönwall's inequality and obtain the bound

|y(t)|^2 ≤ e^{B(t)} ( |y_0|^2 + 2 ∫_{t_0}^{t} c_0(s) e^{−B(s)} ds ),

where B(t) = 2 ∫_{t_0}^{t} c_1(s) ds. For the proof of this result see Appendix A.

Next, assume that

|f(t, y_1, u_1) − f(t, y_2, u_2)| ≤ λ(t) |y_1 − y_2| + µ(t) |u_1 − u_2|,    (1.13)

where (t, y_1), (t, y_2) ∈ D, u_1, u_2 ∈ K_ad, and λ and µ are Lebesgue-integrable functions that are uniformly bounded in [t_0, T] by some constant C > 0. Then, using (1.13) with u_1, u_2 ∈ U_ad, and y_1(t) = S(u_1)(t), y_2(t) = S(u_2)(t), we obtain

(d/dt) |y_1(t) − y_2(t)| ≤ λ(t) |y_1(t) − y_2(t)| + µ(t) |u_1(t) − u_2(t)|.

Further, by applying Grönwall's inequality, we obtain

|S(u_1)(t) − S(u_2)(t)| ≤ C e^{C (T − t_0)} ‖u_1 − u_2‖_{L^1(t_0,T;R^m)},

from which the Lipschitz continuity of the map S follows.

In optimisation, it is customary to introduce the map

(y, u) ↦ c(y, u) := y' − f(·, y, u),

such that the differential constraint is formulated as c(y, u) = 0, which includes the given initial condition. Then the construction of S requires that the equation c(y, u) = 0 can be solved for y for a given u. Equivalently, this means that c is invertible with respect to y. Supposing that f ∈ C^2, one can show that the Fréchet derivative of the differential constraint is given by

∂c(y, u)(δy, δu) = δy' − (∂_y f)_{(y,u)} δy − (∂_u f)_{(y,u)} δu.

Thus, we have the linearised constraint ∂c(y, u)(δy, δu) = 0, which results in

δy' = ∂_y f δy + ∂_u f δu,  δy(t_0) = 0.    (1.14)

Hence, requiring that at (y, u) and with given δu we can uniquely solve this problem, we have that the Fréchet derivative ∂c(y, u) is bijective, and we can apply the implicit function theorem (Theorem A.3 in Appendix A) to state that y = S(u) and c(y, u) = 0 are equivalent. Further, it follows that S is Fréchet differentiable, and its derivative is given by

∂_u S(u) = − (∂_y c(S(u), u))^{−1} ∂_u c(S(u), u).

Therefore we have δy = ∂_u S(u) δu, where δy is the solution to the initial value problem (1.14) with the given δu.

Therefore we have δy = ∂u S(u) δu, where δy is the solution to the initial value problem (1.14) with the given δu.

1.3

Existence of Optimal Controls

In this section, we discuss existence of locally optimal processes for our optimal control problem (1.1). This is an essential issue in working with these problems and it is the subject of a vast scientific literature. Thus, our discussion can only be limited to the purpose of illustrating some important results on this issue and to provide some references. For our purpose, we recall the optimal control problem Z T min J(y, u) := `(t, y(t), u(t)) dt + γ(y(T )) t0

s.t. y 0 (t) = f (t, y(t), u(t)), u ∈ Uad .

y(t0 ) = y0 ,

(1.15)

Next, we use the control-to-state map to define the following reduced cost functional ˆ J(u) := J(S(u), u). (1.16) Then, it should be clear that (1.15) is equivalent to the following ‘unconstrained’ optimisation problem ˆ min J(u).

u∈Uad

(1.17)

The equivalence of (1.17) to (1.15) is an essential step that links optimal control problems to optimisation problems, in general, and to the calculus of variation in particular. This means that many results and techniques known in these fields apply to optimal control problems. After this preparation, we can start considering the so called linearquadratic control problem where a direct approach to proving existence of

8

The Sequential Quadratic Hamiltonian Method

optimal controls is possible. For this case, consider the following linear ODE problem y 0 (t) = A y(t) + B u(t), n

m

where y(t) ∈ R , u(t) ∈ R , A ∈ R problem is given by

n×n

y(t0 ) = y0 , n×m

and B ∈ R

y(t) = e(t−t0 ) A y0 + et A

Z

(1.18)

. The solution to this

t

e−s A B u(s) ds.

(1.19)

t0

Because sup_{s ∈ [t_0,T]} ‖e^{−sA} B‖ < ∞, and assuming u ∈ L^1(t_0, T; R^m), the Cauchy problem (1.18) admits a unique solution y ∈ AC([t_0, T]; R^n). Thus (1.19) defines a control-to-state map S : L^1(t_0, T; R^m) → AC([t_0, T]; R^n). It is clear that this map is affine and differentiable. Next, we choose the following quadratic functional

J(y, u) = (1/2) ∫_{t_0}^{T} |y(t)|^2 dt + (ν/2) ∫_{t_0}^{T} |u(t)|^2 dt,    (1.20)

where ν > 0. Notice that, with respect to the state and control variables, this functional is strictly convex and bounded from below.

Now, consider an optimal control problem that requires minimising (1.20) with the constraint given by (1.18). In this case, since S is affine, one can easily verify that the reduced cost functional

Ĵ(u) = (1/2) ∫_{t_0}^{T} |S(u)(t)|^2 dt + (ν/2) ∫_{t_0}^{T} |u(t)|^2 dt

is strictly convex with respect to u. Hence, assuming that U_ad is a closed and convex subset of U (it can be U itself), it is a standard optimisation result that the reduced problem min_{u ∈ U_ad} Ĵ(u) possesses a unique solution. Notice that, by the choice of the control cost, we have u ∈ L^2(t_0, T; R^m) ⊂ L^1(t_0, T; R^m).
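For a concrete feel for the variation-of-constants formula (1.19), here is a small Python sketch (the matrices A, B and the control u are illustrative choices, not from the book) that evaluates (1.19) by quadrature and compares it with a direct ODE solve of (1.18).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, quad_vec

# Illustrative linear system data
A = np.array([[0.0, 1.0], [-2.0, -0.3]])
B = np.array([[0.0], [1.0]])
u = lambda t: np.array([np.sin(3*t)])
t0, T = 0.0, 2.0
y0 = np.array([1.0, 0.0])

# Formula (1.19): y(T) = e^{(T-t0)A} y0 + e^{TA} * integral of e^{-sA} B u(s) ds
integral, _ = quad_vec(lambda s: expm(-s*A) @ B @ u(s), t0, T)
y_formula = expm((T - t0)*A) @ y0 + expm(T*A) @ integral

# Direct integration of (1.18)
sol = solve_ivp(lambda t, y: A @ y + B @ u(t), (t0, T), y0, rtol=1e-10, atol=1e-12)
print(y_formula, sol.y[:, -1])   # the two results agree to solver tolerance
```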

In a more general setting, proving existence of an optimal control is more involved. In the following, we illustrate a main strategy for achieving this goal that can be traced back to the work of L. Tonelli [266] concerning the proof of existence of minimum points of variational problems. This strategy is closely related to the sequential compactness theorem of B. Bolzano and K. Weierstrass; see, e.g., [167]. Now, for clarity, we consider the optimal control problem (1.15) with a more detailed cost functional as follows:

J(y, u) = ∫_{t_0}^{T} ε(y)(s) ds + (ν/2) ∫_{t_0}^{T} |u(s)|^2 ds + γ(y(T)),    (1.21)

where ε and γ are convex functions. However, we could also directly refer to (1.15) with ℓ(t, y, u) being continuous in t, convex in y and strictly convex in u. We denote with J̃(y) the first integral in (1.21). Furthermore, we restrict ourselves to the scalar case with n = 1 and m = 1, and consider the Hilbert spaces H = H^1(t_0, T) and U = L^2(t_0, T). We make the following assumptions:

Assumption 1.3

a) The functional J is convex, Fréchet differentiable, and bounded from below; ν ≥ 0.

b) If ν = 0, then the optimal control is sought in a closed, convex and bounded subset of U given by

U_ad = { u ∈ U : u(t) ∈ K_ad a.e. },

where K_ad is a compact and convex subset of R. If ν > 0, we can choose any closed and convex set U_ad ⊆ U.

c) The function f satisfies Assumption 1.2.

d) For the given f, there exists a continuous function φ such that the following estimate holds

‖y'‖_{L^2(t_0,T)} ≤ φ(|y_0|, ‖u‖_U).

e) The structure of the function f is such that the weak limit f(·, y_k, u_k) ⇀ f(·, y, u) in L^2(t_0, T) is well defined, assuming that (u_k) ⊂ U, u_k ⇀ u in U, and (y_k) ⊂ H, y_k = S(u_k), y_k → y strongly in C([t_0, T]).

With Assumption 1.3 c), we can invoke Carathéodory's theorem (Theorems 1.1 and 1.2) and state existence of a unique solution y ∈ AC([t_0, T]) for a given u ∈ U. For this solution, it holds that y' ∈ L^1(t_0, T). Then, the stability Assumption 1.3 d) allows us to improve this result so that y ∈ H. With this result, we have wellposedness of the control-to-state map

S : U → H,  u ↦ y = S(u).

Further, we need to prove that this map is weakly sequentially continuous, in the sense that, for any sequence of control functions (u_k) ⊂ U that converges weakly to a u ∈ U, u_k ⇀ u, the corresponding sequence of states (y_k) ⊂ H, y_k = S(u_k), converges weakly in H to y = S(u). For this purpose, notice that any minimising sequence (u_k) is bounded, and so also (y_k) is bounded in H because of Assumption 1.3 d). Further, by the Eberlein-Šmulian theorem [81], from any bounded sequence in a Hilbert space we can extract a weakly convergent subsequence. Therefore there is u_{k_m} ⇀ u in U, and correspondingly we obtain y_{k_j} ⇀ y in H. Moreover, because of the compact embedding H ⊂⊂ C([t_0, T]), we have strong convergence y_{k_j} → y in C([t_0, T]), and this limit is unique; this result is essential to accommodate the term γ(y(T)) (if γ is not zero) in our cost functional.

Now, we need to prove that the two limits, u and y, satisfy y = S(u). For this purpose, let v ∈ H be any test function. By construction of y_{k_j} = S(u_{k_j}), we have

∫_{t_0}^{T} ( y'_{k_j}(t) − f(t, y_{k_j}(t), u_{k_j}(t)) ) v(t) dt = 0,  v ∈ H,

for all k_j ∈ N of the convergent sequences. In this integral, thanks to Assumption 1.2 and Assumption 1.3 e), we can take the weak limit and obtain

∫_{t_0}^{T} ( y'(t) − f(t, y(t), u(t)) ) v(t) dt = 0,

for all v ∈ H. Hence, we have y = S(u).

The discussion that follows involves also the Assumptions 1.3 a)–e). In particular, concerning Assumption 1.3 a), for the cost functional (1.21), our considerations on the existence of optimal controls hold true if we add additional convex (nonsmooth) control costs as, e.g., β ‖u‖_{L^1(t_0,T)}, β ≥ 0.

Next, we illustrate Tonelli's approach to proving existence of a minimiser for the reduced cost functional by means of minimising sequences.

Definition 1.1 Consider a nonempty subset V of a Banach space U and a functional Ĵ : V → R. A sequence (v_k) ⊂ V is said to be a minimising sequence of Ĵ in V if lim_{k→∞} Ĵ(v_k) = inf_{v ∈ V} Ĵ(v).

It is by definition of the infimum that existence (and the construction) of minimising sequences is guaranteed; see, e.g., [5]. The next lemma gives a condition on Ĵ implying that a minimising sequence is bounded.

Lemma 1.1 Consider a nonempty subset V of a Banach space U and a functional Ĵ : V → R. If Ĵ is coercive, in the sense that lim_{‖v‖→∞} Ĵ(v) = ∞, then any minimising sequence for Ĵ is bounded.

We remark that to our functional in (1.21), with U_ad ⊆ U and ν > 0, corresponds a reduced functional Ĵ that is coercive, and thus any minimising sequence is bounded. If ν = 0, then boundedness of a minimising sequence can only be achieved by requiring boundedness of V = U_ad ⊂ U. For this reason, our next discussion concerns the admissible set U_ad. We recall the following Banach-Saks-Mazur theorem.

Theorem 1.3 Let U be a real normed vector space. Let V be a nonempty, convex, and closed subset of U, and let (v_k) be a sequence of points v_k ∈ V that weakly converges to v ∈ U as k → ∞. Then the weak limit v belongs to V.

v∈V

It is by definition of the infimum that existence (and the construction) of minimising sequences is guaranteed; see, e.g., [5]. The next lemma gives a condition on Jˆ implying that a minimising sequence is bounded. Lemma 1.1 Consider a nonempty subset V of a Banach space U and a funcˆ = ∞, then tional Jˆ : V → R. If Jˆ is coercive, in the sense that limkvk→∞ J(v) any minimising sequence for Jˆ is bounded. We remark that to our functional in (1.21), with Uad ⊆ U and ν > 0, corresponds a reduced functional Jˆ that is coercive, and thus any minimising sequence is bounded. If ν = 0, then boundedness of a minimising sequence can only be achieved by requiring boundedness of V = Uad ⊂ U . For this reason, our next discussion concerns the admissible set Uad . We recall the following Banach-Saks-Mazur theorem. Theorem 1.3 Let U be a real normed vector space. Let V be a nonempty, convex, and closed subset of U , and let (vk ) be a sequence of points vk ∈ V that weakly converges to v ∈ U as k → ∞. Then the weak limit v belongs to V.

Optimal Control Problems with ODEs

11

A consequence of this theorem is that V is weakly sequentially closed, that is, every weakly convergent sequence has the weak limit in V . This result and the Eberlein-Šmulian theorem prove the following Theorem 1.4 Let U be a reflexive Banach space. Let V be a nonempty, convex, closed and bounded subset of U . Then V is weakly sequentially compact, that is, every sequence contains a subsequence that weakly converges to some v ∈V. We can summarise the above discussion considering Uad as V , and noticing that U = L2 (t0 , T ) is a Hilbert space. Thus, if Uad ⊆ U is convex and closed, and the functional Jˆ is weakly coercive (ν > 0), then a minimising sequence is bounded, and so it has a subsequence that weakly converges to an element of Uad . On the other hand, if Uad ⊂ U is convex, closed, and bounded, then it is weakly sequentially compact, and so any minimising sequence contains a subsequence that weakly converges to some u ∈ Uad . In this latter case, we can have ν = 0. With this preparation, we can prove the following theorem that states existence of a globally optimal process for (1.15). Theorem 1.5 Let the Assumptions 1.2 and 1.3 hold. Then the ODE optimal control problem (1.15) with (1.21) has a solution on Uad . Proof. Consider the problem (1.15) with (1.21) with the Assumptions 1.3 a)–e). Since J is bounded from below, there exists minimising sequences (yk , uk ) ⊂ H × Uad , yk = S(uk ), such that lim J(yk , uk ) = inf J(S(v), v).

k→∞

v∈Uad

Let ν > 0, then it holds ν ν 2 2 ˆ ˜ J(u) = J(S(u), u) = J(S(u)) + kukU + γ(S(u)(T )) ≥ kukU . 2 2 This shows coercivity of Jˆ with respect to u in the U norm. Therefore a minimising sequence (uk ) is bounded in Uad ⊆ U , and since U is a Hilbert space, we can extract a weakly convergent subsequence (ukm ), with ukm * u ∈ Uad . On the other hand, if ν = 0, and Uad is convex, closed, and bounded, then Uad is weakly sequentially compact and therefore from any minimising sequence we can extract a weakly convergent subsequence (ukm ), with ukm * u ∈ Uad . In both cases, the corresponding function sequence (ykm ), ykm = S(ukm ), is bounded in the Hilbert space H by Assumption 1.3 d). Thus, we can extract a weakly convergent subsequence (ykj ), with ykj * y ∈ H, and because H ⊂⊂ C([t0 , T ]), we have ykj → y in C([t0 , T ]). Clearly, ukj * u, and we have that y = S(u).

12

The Sequential Quadratic Hamiltonian Method

Now, notice that J˜ and γ, as well as (denote kj with n)

ν 2

2

kukU are convex. Hence, we have

˜ + ν kuk2 + γ(y(T )) J(y, u) = J(y) U 2 

 ˜ n ) + ν kun k2 ≤ lim γ(yn (T )) + lim inf J(y U n→∞ n→∞ 2 = lim inf J(yn , un ) = inf J(S(v), v). n→∞

v∈Uad

That is, we have obtained J(S(u), u) ≤ inf v∈Uad J(S(v), v), which means that (y, u) is the optimal process sought. We remark that, for the purpose of illustration, we have made a convenient choice of a cost functional that is convex (strictly in u) and differentiable in both its arguments. However, existence of optimal solutions to (1.15) can be proved subject to weaker conditions; see, e.g., L. Cesari’s book [69]. In particular, existence can be proved subject to the following conditions on `. Assumption 1.4 (Cesari’s conditions) In the space D × Kad : a) `(t, y, u) is locally bounded, measurable in t, and convex in u. b) `(t, y, u) is locally Lipschitz continuous in (y, u) uniformly in t. c) there is a scalar function Φ(ζ), 0 ≤ ζ < +∞, bounded below, such that Φ(ζ)/ζ → +∞, and `(t, y, u) ≥ Φ(|u|) for all (t, y) ∈ D and u ∈ Kad . d) γ is lower semicontinuous. We see that in the direct variational approach for proving existence of an optimal process, one relies on the fact that Uad is weakly sequentially compact and Jˆ is weakly lower semicontinuous. These two properties are tightly related to the convexity of the closed set Uad , i.e. Kad , and to the convexity and growth condition of the function `(t, y, u) with respect to the u variable. However, notice that our discussion outlines sufficient conditions for the existence of optimal processes that may be far away from the necessity. In fact, it is possible to formulate control problems that admit optimal processes without any lower semicontinuity and convexity property. We refer to [69, 196] for detailed discussion and further references. Further, in view of our use of the Pontryagin maximum principle, we remark that conditions must be met assuring that optimal controls are essentially bounded [243]. This is the case under the standard hypotheses of coercivity of the existence theory, and implies Lipschitz regularity of optimal trajectories [243, 267]. Clearly, if the properties mentioned above are not available, then existence of an optimal process as defined above may fail. In this case, subject to the

Optimal Control Problems with ODEs

13

remaining conditions in Assumption 1.4, two approaches are available that, in more general settings, allow to prove existence of a suitably defined optimal control. These approaches are known under the name of relaxation. In the first one, the integrand `(t, y, u) is replaced by its quasiconvex envelope with respect to u; see, e.g., [96]. The second approach consists in enlarging the space where the control is sought and to ensure the required compactness properties in a weaker topology. This approach was initiated by L.C. Young [285] with the introduction of the concept of generalised curves in the calculus of variation that resulted in the notion of relaxed controls [35, 69, 111, 121, 279]. We discuss this framework in Chapter 3. Although the approach of relaxed controls is quite general, it does not apply to problems where, e.g., ` is discontinuous in u. On the other hand, as mentioned in [85], we expect that every function that is bounded from below ˆ has or nearly attains a minimum. In particular, assuming that J(u) is bounded from below and lower semicontinuous, then a generalisation of the Weierstrass ˆ theorem [167] states that the problem minu∈Uad J(u) has a solution if Uad is compact in U . However, this compactness property may be too restrictive in applications. For example, the admissible set in Assumption 1.3 b) is not compact in the topology induced by the L2 norm. (In general, the closed unit ball in infinite-dimensional Hilbert spaces is not compact.) Clearly, although the attempt of solving an optimal control problem without having proved existence of an optimal process may be considered naive [285], it can be nevertheless reasonable to search for approximate solutions. In mathematical terms, this means that for any  > 0, by definition of infimum we can find a u ¯ ∈ Uad such that ˆ u ) ≤ inf J(u) ˆ + . J(¯ u∈Uad

(1.22)

In this sense, we call u ¯ a quasioptimal control. The proof of existence of quasioptimal solutions represents a fundamental achievement in optimisation theory that is known as Ekeland’s variational principle. In fact, it was I. Ekeland [105] who investigated problem (1.22), thus providing a new powerful tool in the analysis of variational problems and optimal control problems. In the following, we report the main result in [105] by slightly changing the notation to make it closer to our purpose; see [190] for further details and references. Theorem 1.6 Let V be a complete metric space, the distance of two elements u, v ∈ V being denoted by d(u, v). Let Jˆ : V → R ∪ {+∞} be a lower semicontinuous function, not identically +∞. That is, +∞ is allowed as a value for ˆ but at some point u0 , Jˆ is finite. Further, suppose that Jˆ is bounded from J, below, Jˆ > −∞.

14

The Sequential Quadratic Hamiltonian Method

ˆ  ) ≤ inf u∈V J(u) ˆ + Then, given any  > 0, for every u ∈ V satisfying J(u and every λ > 0, there exists a v ∈ V such that ˆ  ) ≤ J(u ˆ  ), J(v d(v, u ) ≤ λ, ˆ  ) < J(u) ˆ + J(v

(1.23)  λ

d(v , u),

u ∈ V, u 6= v .

ˆ in this reference we find λ = √. As in [111], we can call u an -minimum of J; This theorem is referred to as the ‘strong form’ of Ekeland’s variational principle. It states that, for any chosen λ, , and u an -approximate solution to our optimisation problem, there exists a new function v that possibly improves on u and belongs to a neighbourhood of u , and v satisfies the ˆ third  inequality in (1.23). This relation says that v minimises globally J(·) +  ˆ d(v , ·), which represents a Lipschitz perturbation of J.  λ Notice that Ekeland’s theorem applies to the broad class of lower semicontinuous functionals (including some discontinuous cases) and it does not require any compactness property. We remark that already in [105], this theorem is applied to a class of optimal control problems in Mayer form with free end points. In this framework, let U be the space of Lebesgue measurable controls u : [t0 , T ] → Kad , where Kad ⊂ Rm is compact, not necessarily convex. On this space, we have the metric d(u, v) = meas{t ∈ [t0 , T ] : u(t) 6= v(t)}. Then (U, d) is a complete metric space [105], and the functional Jˆ : U → R is continuous with respect to the metric d. As pointed out in [132], existence of quasioptimal controls should be understood in the sense of a minimising sequence property as follows. Theorem 1.7 Let (αn )n=1,2,... ⊂ R be a sequence of positive numbers with limn→∞ αn = 0, and (un )n=1,2,... ⊂ U a minimising sequence for the optimal control problem with ˆ + αn . ˆ n ) ≤ inf J(u) J(u (1.24) u∈U

Then there exists a second minimising sequence (vn )n=1,2,... ⊂ U with ˆ n ) + αn d(vn , un ) ≤ J(u ˆ n ), J(v ˆ n ) < J(u) ˆ + αn d(vn , u), J(v

u ∈ U, u 6= vn .

We remark that in [212] the proximity condition in the variational principle is used to rewrite the condition for the second sequence as a condition on the sequence $(u_n)$ itself. This result appears advantageous for constructing minimising sequences compared to the earlier result in [105]. Further, we have the following theorem [105, 132].


Theorem 1.8 Let $(U, \|\cdot\|)$ be a Banach space, and suppose that $\hat J$ satisfies the assumptions in Theorem 1.6 and is Gâteaux-differentiable at all points of a closed and convex set $M \subset U$. Let $(\alpha_n)_{n=1,2,\ldots} \subset \mathbb{R}$ be a sequence of positive numbers with $\lim_{n\to\infty} \alpha_n = 0$, and $(u_n)_{n=1,2,\ldots} \subset U$ a minimising sequence for $\hat J$ with $\hat J(u_n) \le \inf_{u \in M} \hat J(u) + \alpha_n$. Then there exists a second minimising sequence $(v_n)_{n=1,2,\ldots} \subset U$ with
$$ \hat J(v_n) + \alpha_n \|v_n - u_n\| \le \hat J(u_n), \qquad \hat J'(v_n)(v - v_n) + \alpha_n \|v - v_n\| \ge 0, \quad v \in M,\ n = 1, 2, \ldots, $$
where $\hat J'$ denotes the Gâteaux derivative (see Appendix A).

We remark that if $u^*$ is a minimum point for $\hat J$ on the set $M$, i.e. $\hat J(u^*) = \inf_{u \in M} \hat J(u)$, then it also follows from Ekeland's variational principle that
$$ \hat J'(u^*)(u - u^*) \ge 0, \quad u \in M, $$
and $\hat J'(u^*) = 0$ if $u^*$ is in the interior of $M$.

1.4 Optimality Conditions

We have seen that, with a well-defined control-to-state map that encodes the solution of the differential constraint, an optimal control problem becomes a problem of the calculus of variations with the reduced cost functional as follows:
$$ \min_{u \in U_{ad}} \hat J(u). \tag{1.25} $$

Hence, if $\hat J$ is Fréchet differentiable, the optimality conditions for (1.25) can be formulated in terms of functional derivatives; see Appendix A for more details. In particular, in terms of the gradient of $\hat J$ with respect to the control function, if $u$ is an optimal control, it must satisfy
$$ \left( \nabla \hat J(u),\ v - u \right) \ge 0, \quad v \in U_{ad}. \tag{1.26} $$

Now, since $\hat J(u) = J(S(u), u)$, differentiability of $\hat J$ requires differentiability of $S(\cdot)$ and of $J(\cdot, \cdot)$. Notice that, if $J$ is defined as in (1.15), the conditions $\ell \in C^1$ and $\gamma \in C^1$ are sufficient for guaranteeing the differentiability of $J$; clearly, these conditions are satisfied in the linear-quadratic optimal control case. Now, in order to determine the gradient above, in the following lemma we introduce the adjoint (or costate) equation for the variable $p$, representing the Lagrange multiplier of our constrained optimisation problem.


Lemma 1.2 Let $f, \ell, \gamma \in C^1$ and let Assumption 1.3 hold. Then the differential equation
$$ -p'(t) = \left( \partial_y f(t, y(t), u(t)) \right)^\top p(t) - \partial_y \ell(t, y(t), u(t)), \tag{1.27} $$
with terminal condition $p(T) = -\partial_y \gamma(y(T))$ and $y \in H$, has a unique solution $p \in H$.

Proof. Notice that, with the transformation $\hat t := (t_0 + T) - t$, this terminal value problem, where $p$ is solved 'backwards', is transformed into a problem with the initial condition at $\hat t = t_0$. By our assumptions, we have $y \in AC([t_0, T])$ and $u \in U_{ad}$. Let us refer to the right-hand side of (1.27) as $\tilde F(t, p, u)$ and notice that $-p' = \tilde F(t, p, u(t))$ is affine in $p$ and nonhomogeneous. Further, since $f$ is continuously differentiable and Lipschitz-continuous in $y$, $\partial_y f$ is bounded, and thus $\tilde F(t, p, u(t))$ is Lipschitz-continuous in $p$. Therefore $\tilde F$ is continuous in $p$ for fixed $t$ and, by the assumptions, it is measurable in $t$ for each fixed $p$. Moreover, as required in Theorem 1.1, we can construct a nonnegative Lebesgue-integrable function $m$ such that $|\tilde F(t, p, u(t))| \le m(t)$ for $(t, p)$ in an appropriate compact rectangular set $R \subset \mathbb{R}^2$, centred at $(T, -\partial_y \gamma(y(T)))$. Hence, all assumptions in Theorems 1.1 and 1.2 are satisfied, and existence and uniqueness of the solution to (1.27) with the given terminal condition is proved.

With this preparation, we can now compute the derivative of $\hat J$, assuming that $f \in C^2$ and $\ell, \gamma \in C^1$, which are sufficient conditions for $\hat J$ to be Fréchet differentiable. For this reason, one can focus on the Gâteaux derivative of $\hat J$ in $u$ in the direction $\delta u$, and take the Hilbert space $U = L^2(t_0, T; \mathbb{R}^m)$. Therefore, one can identify this derivative with the reduced gradient $\nabla \hat J(u)$ in $U$; see Appendix A. We have
$$ \nabla \hat J(u)(t) = -\left( \partial_u f(t, y(t), u(t)) \right)^\top p(t) + \partial_u \ell(t, y(t), u(t)). $$

A detailed derivation of this formula is given below.
$$ (\nabla \hat J(u), \delta u) = \lim_{\alpha \to 0^+} \frac{1}{\alpha} \left( \hat J(u + \alpha\, \delta u) - \hat J(u) \right) = \int_{t_0}^{T} \left( \partial_y \ell\, \delta y + \partial_u \ell\, \delta u \right) dt + \partial_y \gamma\, \delta y(T), $$

using Taylor expansion and the fact that $S(u + \alpha\, \delta u) = S(u) + \alpha\, \partial_u S(u)\, \delta u$. Now, we replace $\partial_y \ell$ in the integral with $p' + (\partial_y f)^\top p$ using (1.27). We obtain
$$ (\nabla \hat J(u), \delta u) = \int_{t_0}^{T} \left( p' + (\partial_y f)^\top p \right) \cdot \delta y \, dt + \int_{t_0}^{T} \partial_u \ell\, \delta u \, dt + \partial_y \gamma\, \delta y(T) $$
$$ = \int_{t_0}^{T} \left( -\delta y' + \partial_y f\, \delta y \right) \cdot p \, dt + p \cdot \delta y \Big|_{t_0}^{T} + \int_{t_0}^{T} \partial_u \ell\, \delta u \, dt + \partial_y \gamma\, \delta y(T). $$


Next, we use the linearised constraint problem, that is, $-\delta y' + \partial_y f\, \delta y = -\partial_u f\, \delta u$ with $\delta y(t_0) = 0$, and obtain
$$ (\nabla \hat J(u), \delta u) = \int_{t_0}^{T} \left( -(\partial_u f)^\top p + \partial_u \ell \right) \delta u \, dt + (p + \partial_y \gamma)\big|_{t=T}\, \delta y(T). $$

Notice that the second term is zero because of the terminal condition on $p(T)$; so the $L^2$ reduced gradient of $\hat J(u)$ has been obtained. We summarise the results above in the following optimality system.
$$ y' = f(t, y, u), \qquad y(t_0) = y_0 $$
$$ -p' = \left( \partial_y f(t, y, u) \right)^\top p - \partial_y \ell(t, y, u), \qquad p(T) = -\partial_y \gamma(y(T)) \tag{1.28} $$
$$ \left( -\left( \partial_u f(\cdot, y, u) \right)^\top p + \partial_u \ell(\cdot, y, u),\ v - u \right) \ge 0, \quad v \in U_{ad}. $$

Notice that the last inequality in this system can be written as $\left( \nabla \hat J(u),\ v - u \right) \ge 0$ for all $v \in U_{ad}$. In particular, if $U_{ad} = U$, then it becomes the equation $\nabla \hat J(u) = 0$. In order to ease notation, in the remaining part of the book and whenever possible, we omit the transposition operation for the gradients $\partial_y \ell$, $\partial_u \ell$ and $\partial_y \gamma$.
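For readers who wish to experiment, the following is a minimal Python sketch — not part of the codes accompanying this book — that evaluates the reduced gradient in (1.28) by one forward and one backward Euler sweep for an illustrative linear-quadratic problem; all model data (the coefficients $a$, $b$, $\nu$, the grid, and the target $y_d$) are assumed for the example. The two printed numbers should agree up to discretisation error.

```python
# A minimal sketch (illustrative, not the book's accompanying codes): the
# reduced gradient of J = int( (y - y_d)^2/2 + nu*u^2/2 ) dt subject to
# y' = a*y + b*u, assembled via the optimality system (1.28) and checked
# against a finite-difference quotient.
import numpy as np

a, b, nu = -1.0, 1.0, 0.1              # assumed model and cost parameters
t0, T, N = 0.0, 1.0, 1000
t, dt = np.linspace(t0, T, N + 1), (T - t0) / N
y0, yd = 1.0, 0.5

def solve_state(u):
    y = np.empty_like(t); y[0] = y0
    for k in range(N):                 # explicit Euler, forward in time
        y[k + 1] = y[k] + dt * (a * y[k] + b * u[k])
    return y

def solve_adjoint(y):
    p = np.empty_like(t); p[-1] = 0.0  # p(T) = -grad gamma = 0 here
    for k in range(N, 0, -1):          # backward sweep: -p' = a p - (y - yd)
        p[k - 1] = p[k] + dt * (a * p[k] - (y[k] - yd))
    return p

def cost(u):
    y = solve_state(u)
    return dt * np.sum(0.5 * (y - yd) ** 2 + 0.5 * nu * u ** 2)

u = np.zeros_like(t)
y = solve_state(u); p = solve_adjoint(y)
grad = -b * p + nu * u                 # reduced gradient formula from (1.28)

du = np.random.default_rng(0).standard_normal(t.size); eps = 1e-6
fd = (cost(u + eps * du) - cost(u)) / eps
print(fd, dt * np.sum(grad * du))      # should agree up to O(dt)
```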

1.5 The Pontryagin Maximum Principle

A formal way to derive the optimality system (1.28) corresponding to (1.15) is by introducing the following Lagrange functional
$$ L(y, u, p) = J(y, u) + \int_{t_0}^{T} \left( y'(t) - f(t, y(t), u(t)) \right) p(t)\, dt. \tag{1.29} $$

Then the optimality system (1.28) is obtained as the differential condition for extremality of $L$ with respect to its arguments. Specifically, assuming $U_{ad} = U$, we have
$$ \nabla_p L(y, u, p) = 0, \qquad \nabla_y L(y, u, p) = 0, \qquad \nabla_u L(y, u, p) = 0. \tag{1.30} $$

For this reason, we can refer to this framework as the Lagrange approach. However, an alternative approach is possible, whose starting point is the following Hamilton-Pontryagin (HP) function
$$ H(t, y, u, p) = p \cdot f(t, y, u) - \ell(t, y, u). \tag{1.31} $$


With this function, we can write
$$ L(y, u, p) = \gamma(y(T)) + \int_{t_0}^{T} \left( p(t) \cdot y'(t) - H(t, y(t), u(t), p(t)) \right) dt. $$

Now, our purpose is to derive optimality conditions based on the HP function. For this purpose, let $u^*$ be an optimal control and global minimum of $\hat J(u)$ in $U$. We formulate variations of $u^*$ as follows: $u = u^* + \alpha\, \delta u$, where $\alpha > 0$. Corresponding to this variation of the optimal control, we obtain a state $y$ of the controlled model that can be expressed by $y = y^* + \alpha\, \delta y$, where $y^* = S(u^*)$ is the state corresponding to the optimal control $u^*$, and $\delta y = \partial_u S(u^*)\, \delta u$ solves the linearised constraint problem
$$ \delta y' = \partial_y f(t, y^*, u^*)\, \delta y + \partial_u f(t, y^*, u^*)\, \delta u, \qquad \delta y(t_0) = 0. $$

Next, we consider the variation of $L$ in $(y^*, u^*, p)$ along the differential constraint. We have
$$ L(S(u), u, p) - L(S(u^*), u^*, p) = \alpha\, \partial_y \gamma(y^*(T))\, \delta y(T) + \alpha \int_{t_0}^{T} \left( \delta y'\, p - \partial_y H(t, y^*, u^*, p)\, \delta y - \partial_u H(t, y^*, u^*, p)\, \delta u \right) dt, $$
where higher-order terms in $\alpha$ are neglected. From this result and using integration by parts with $\delta y(t_0) = 0$, we obtain
$$ L(S(u), u, p) - L(S(u^*), u^*, p) = \alpha \left( \partial_y \gamma(y^*(T)) + p(T) \right) \delta y(T) + \alpha \int_{t_0}^{T} \left( -\left( p' + \partial_y H(t, y^*, u^*, p) \right) \delta y - \partial_u H(t, y^*, u^*, p)\, \delta u \right) dt. $$

Notice that, since $u^*$ is optimal, the variation of $L$ should be zero. Therefore we have
$$ p'(t) = -\partial_y H(t, y^*(t), u^*(t), p(t)), \qquad p(T) = -\partial_y \gamma(y^*(T)). \tag{1.32} $$

We denote with $p^*$ the solution to this differential problem and assume that $p^* \in H^1(t_0, T)$. One can also verify, based on the definition of $H$, that $y^*$ is the solution to the following Cauchy problem
$$ y'(t) = \partial_p H(t, y(t), u^*(t), p^*(t)), \qquad y(t_0) = y_0. \tag{1.33} $$


Furthermore, we obtain
$$ \partial_u H(t, y^*, u^*, p^*) = 0. \tag{1.34} $$

Summarising, the conditions of extremality of $L$ given in (1.30) can be expressed as follows:
$$ y'(t) = \partial_p H(t, y(t), u(t), p(t)), \qquad y(t_0) = y_0 \tag{1.35} $$
$$ p'(t) = -\partial_y H(t, y(t), u(t), p(t)), \qquad p(T) = -\partial_y \gamma(y(T)) \tag{1.36} $$
$$ \partial_u H(t, y(t), u(t), p(t)) = 0. \tag{1.37} $$

We refer to this framework as the Hamilton approach. Compare with (1.28). In this system, the last equality implies that, at optimality, the function $H(t, y(t), u(t), p(t))$ has an extremum in $u$ for almost all $t \in [t_0, T]$. Moreover, recalling the discussion on existence of optimal controls given above, assuming $f$ to be linear in $u$ and $\ell$ convex in this variable, (1.37) characterises a maximum of $H(t, y(t), u(t), p(t))$ with respect to $u$ at each point of the optimal trajectory. That is,
$$ \partial^2_{uu} H(t, y(t), u(t), p(t)) \le 0, \tag{1.38} $$
which is the Legendre-Clebsch condition. More generally, if $U_{ad} \subset U$, the optimality condition (1.37) is replaced with
$$ (\partial_u H,\ v - u) \le 0, \quad v \in U_{ad}. \tag{1.39} $$

Notice that we could consider a different Lagrange multiplier $\tilde p = -p$, for which we take the Lagrange function
$$ L(y, u, \tilde p) = J(y, u) - \int_{t_0}^{T} \left( y'(t) - f(t, y(t), u(t)) \right) \tilde p(t)\, dt. $$
In this case, the HP function is defined as
$$ \tilde H(t, y, u, \tilde p) = \tilde p \cdot f(t, y, u) + \ell(t, y, u). $$
Thus, we have that $\tilde H = -H$, and at optimality the extremum of $\tilde H$ in $u$ is a minimum. In the following, we may omit to write the 'dot' denoting the scalar product.

We see that the Lagrange and Hamilton formulations cannot be applied straightforwardly if $H$ given in (1.31), and thus $f$ and/or $\ell$, are not differentiable with respect to $u$. Indeed, with additional effort, it is possible to extend the Lagrange and Hamilton formulations to the case of semismooth functions; see, e.g., [83, 211]. However, it is not possible to extend these frameworks to cases where $K_{ad}$ is not convex or represents a finite set of discrete values.

Notice that (1.38) and (1.39) represent the necessary conditions for a maximum of the HP function with respect to a bounded $u$ at each point of the optimal trajectory. This fact leads to the formulation of the optimal control


theory developed by L.S. Pontryagin and his research team [40, 217], where the characterisation of optimality of $(y, u, p)$ is expressed as follows:
$$ H(t, y(t), u(t), p(t)) \ge H(t, y(t), v, p(t)), \tag{1.40} $$
for all $v \in K_{ad}$ and almost all $t \in [t_0, T]$. Clearly, assuming that $H$ is differentiable with respect to $u$, this characterisation implies the condition (1.39) given above.

The condition (1.40) together with (1.32) and (1.33) represents an instance of the Pontryagin maximum principle (PMP). However, as discussed in detail in [259], the principle of maximum of the HP function becomes a theorem after proving it subject to certain assumptions on the components that enter the formulation of the specific optimal control problem. Thus, in general, in order to account for some special situations, the HP function should also involve an 'abnormal' multiplier $p_0 \in \mathbb{R}$ as follows:
$$ H(t, y, u, p) = p\, f(t, y, u) - p_0\, \ell(t, y, u), $$
which must be accompanied by a nontriviality condition stating that $|p(t)| + |p_0| > 0$ for almost all $t \in [t_0, T]$. However, if $y(T) \in \mathrm{int}\, E$, with the prescribed $E \subseteq \mathbb{R}^n$, then we can take $p_0 = 1$; see, e.g., [85].

Next, we recall a PMP theorem for the optimal control problem (1.15), in the scalar case, assuming that the set of admissible controls $U_{ad}$ consists of piecewise continuous functions on $[t_0, T]$ with values in $K_{ad}$. In this framework, a point $t \in [t_0, T]$ where $u$ is continuous is called a regular point of the control. Clearly, at all regular points the state variable $y$ is continuously differentiable. The proof of the following theorem is taken from [142]. Clearly, this result is limited in scope to our basic optimal control problem; however, it allows us to illustrate the construction of a so-called needle variation. A more general PMP is discussed after this theorem.

Theorem 1.9 Consider the optimal control problem (1.15) with $f, \ell, \gamma$ continuous and continuously differentiable in $(t, y)$. Let $w = ((y(t), u(t)) \mid t \in [t_0, T])$ be a strong minimum for the optimal control problem, where $u$ is a piecewise continuous solution of the optimal control problem, and $y$ is the corresponding continuous and piecewise continuously differentiable state, which is obtained by solving (1.33); further, let $p$ be the absolutely continuous function that satisfies (1.32) with the given $u$ and $y$. Then, in all regular points of the optimal control $u$, the triple $(y, u, p)$ satisfies the maximality condition
$$ H(t, y(t), u(t), p(t)) = \max_{v \in K_{ad}} H(t, y(t), v, p(t)). \tag{1.41} $$


Proof. Let $u$ be an optimal control and let $\tilde t \in [t_0, T]$ be a regular point. We define a new admissible control $u_\epsilon$ by a so-called needle variation as follows:
$$ u_\epsilon(t) = \begin{cases} u(t) & t \in [t_0, T] \setminus [\tilde t - \epsilon, \tilde t) \\ v & t \in [\tilde t - \epsilon, \tilde t), \end{cases} \tag{1.42} $$
where $\epsilon > 0$ is sufficiently small, and $v \in K_{ad}$. Corresponding to the admissible control $u_\epsilon$, we obtain the following perturbed trajectory
$$ y_\epsilon(t) = \begin{cases} y(t) & t \in (t_0, \tilde t - \epsilon) \\ y_v(t) & t \in [\tilde t - \epsilon, T), \end{cases} \tag{1.43} $$
where $y_v$ is the solution of the initial value problem
$$ y_v' = f(t, y_v, u_\epsilon), \qquad y_v(\tilde t - \epsilon) = y(\tilde t - \epsilon), $$
in the interval $(\tilde t - \epsilon, T)$. Now, consider the expansions
$$ y_\epsilon(\tilde t - \epsilon) = y_\epsilon(\tilde t) - y_\epsilon'(\tilde t)\, \epsilon + o(\epsilon), \qquad y(\tilde t - \epsilon) = y(\tilde t) - y'(\tilde t)\, \epsilon + o(\epsilon). $$
Notice that, by construction, $y_\epsilon(\tilde t - \epsilon) = y(\tilde t - \epsilon)$. With these two expansions, we make the following computation
$$ y_\epsilon(\tilde t) - y(\tilde t) = \left( y_\epsilon'(\tilde t) - y'(\tilde t) \right) \epsilon + o(\epsilon) = \left( f(\tilde t, y_\epsilon(\tilde t), v) - f(\tilde t, y(\tilde t), u(\tilde t)) \right) \epsilon + o(\epsilon) $$
$$ = \left( f(\tilde t, y(\tilde t), v) - f(\tilde t, y(\tilde t), u(\tilde t)) \right) \epsilon + o(\epsilon), $$
where the last equality has been obtained using the previous one, in the sense that $y_\epsilon(\tilde t) = y(\tilde t) + O(\epsilon)$, together with the fact that $f$ is continuously differentiable in $y$.

Next, notice that starting at $\tilde t$, the optimal state $y$ and the perturbed state $y_\epsilon$ are solutions to the same differential equation $y' = f(t, y, u(t))$, with the same optimal control $u$, but with different initial conditions, namely $y(\tilde t)$ and $y(\tilde t) + \epsilon\, \omega(\tilde t, v)$, respectively, where
$$ \omega(\tilde t, v) = f(\tilde t, y(\tilde t), v) - f(\tilde t, y(\tilde t), u(\tilde t)). $$
At this point we remark that, although we may take 'large' variations of the control $u$ through the choice of $v$, smallness of the variation of the state variable is controlled through the choice of a sufficiently small $\epsilon$. This fact appears clearly in the following step.

Recall well-known results on the dependence of solutions to Cauchy problems on the initial condition; see, e.g., [46]. Then, for $\epsilon$ sufficiently small, the difference between the optimal state $y$ and the perturbed state $y_\epsilon$ in the interval $(\tilde t, T)$, where they are both driven by the same dynamics (and control), is given by
$$ y_\epsilon(t) - y(t) = \Phi(t, \tilde t, y(\tilde t))\, \omega(\tilde t, v)\, \epsilon, \tag{1.44} $$


where the fundamental matrix $\Phi$ is the solution to the following matrix Cauchy problem
$$ \frac{d}{dt} \Phi = \partial_y f(t, y(t), u(t))\, \Phi, \qquad \Phi(\tilde t) = I_n. $$
Thus, we define $z(t) = \Phi(t, \tilde t, y(\tilde t))\, \omega(\tilde t, v)$, and it holds that $z' = \partial_y f(t, y, u)\, z$, with initial condition $z(\tilde t) = \omega(\tilde t, v)$. We have $y_\epsilon(t) - y(t) = z(t)\, \epsilon$.

Now, recall (1.32) and notice that
$$ \frac{d}{dt} \left( p(t)\, z(t) \right) = \left( -\partial_y f\, p + \partial_y \ell \right) z + p\, \partial_y f\, z = \partial_y \ell\, z. $$
Thus, integration from $\tilde t$ to $T$ gives
$$ \int_{\tilde t}^{T} \partial_y \ell\, z\, dt = p(T) z(T) - p(\tilde t) z(\tilde t) = -\partial_y \gamma(y(T))\, z(T) - p(\tilde t)\, z(\tilde t). \tag{1.45} $$

We are now ready to estimate $\hat J(u_\epsilon) - \hat J(u)$ and, for this purpose, we use the mean value theorem with $t_\epsilon \in [\tilde t - \epsilon, \tilde t]$. We have
$$ \hat J(u_\epsilon) - \hat J(u) = \int_{\tilde t - \epsilon}^{\tilde t} \left( \ell(t, y_\epsilon(t), v) - \ell(t, y(t), u(t)) \right) dt + \int_{\tilde t}^{T} \left( \ell(t, y_\epsilon(t), u(t)) - \ell(t, y(t), u(t)) \right) dt + \gamma(y_\epsilon(T)) - \gamma(y(T)) $$
$$ = \epsilon \left( \ell(t_\epsilon, y_\epsilon(t_\epsilon), v) - \ell(t_\epsilon, y(t_\epsilon), u(t_\epsilon)) \right) + \int_{\tilde t}^{T} \left( \partial_y \ell(t, y(t), u(t)) \left( y_\epsilon(t) - y(t) \right) + o(|y_\epsilon(t) - y(t)|) \right) dt + \partial_y \gamma(y(T)) \left( y_\epsilon(T) - y(T) \right) + o(|y_\epsilon(T) - y(T)|) $$
$$ = \epsilon \left( \ell(t_\epsilon, y_\epsilon(t_\epsilon), v) - \ell(t_\epsilon, y(t_\epsilon), u(t_\epsilon)) \right) + \epsilon \int_{\tilde t}^{T} \partial_y \ell(t, y(t), u(t))\, z(t)\, dt + \epsilon\, \partial_y \gamma(y(T))\, z(T) + o(\epsilon). $$

We proceed using (1.45) and obtain
$$ \hat J(u_\epsilon) - \hat J(u) = \epsilon \left( \ell(t_\epsilon, y_\epsilon(t_\epsilon), v) - \ell(t_\epsilon, y(t_\epsilon), u(t_\epsilon)) \right) - \epsilon\, p(\tilde t)\, \omega(\tilde t, v) + o(\epsilon). $$
Therefore we have
$$ \lim_{\epsilon \to 0} \frac{\hat J(u_\epsilon) - \hat J(u)}{\epsilon} = \ell(\tilde t, y(\tilde t), v) - \ell(\tilde t, y(\tilde t), u(\tilde t)) - p(\tilde t)\, \omega(\tilde t, v). $$
On the other hand, since $u$ is the minimiser of $\hat J$, we must have $\hat J(u_\epsilon) \ge \hat J(u)$; it follows that the limit above is nonnegative. Hence, we obtain
$$ \ell(\tilde t, y(\tilde t), v) - \ell(\tilde t, y(\tilde t), u(\tilde t)) - p(\tilde t)\, \omega(\tilde t, v) \ge 0. $$


Finally, recalling the definition of $\omega(\tilde t, v)$, we have
$$ p(\tilde t)\, f(\tilde t, y(\tilde t), u(\tilde t)) - \ell(\tilde t, y(\tilde t), u(\tilde t)) \ge p(\tilde t)\, f(\tilde t, y(\tilde t), v) - \ell(\tilde t, y(\tilde t), v), $$
and the theorem is proved.

Since its inception, the PMP framework has been analysed and extended to many classes of optimal control problems, among which (1.15) represents a basic one. In this development, one frequently encounters the formulation of optimal control problems in the so-called canonical Pontryagin form; see, e.g., [101]. In order to illustrate this formulation and discuss a more general optimal control setting, we assume that our governing model is well posed in the framework of Carathéodory, and recall a few transformations. Consider the following functional with free endpoints
$$ J(y, u) = \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma_1(t_0, y(t_0)) + \gamma_2(T, y(T)). \tag{1.46} $$

We introduce the variable $z$ as the solution to $z'(t) = \ell(t, y(t), u(t))$, $z(t_0) = 0$. With this transformation, the functional (1.46) is replaced by the following one
$$ J_0(t_0, y(t_0), T, y(T), z(T)) = \gamma_1(t_0, y(t_0)) + \gamma_2(T, y(T)) + z(T). $$
It appears convenient to extend the original differential model for $y$ to include also the variable $z$ with the corresponding evolution equation. Therefore we can write $J_0(t_0, y(t_0), T, y(T))$, where $y$ denotes both the state variable and the auxiliary variable $z$. In this setting, we denote with $y' = f(t, y, u)$ the differential constraint including the equation for $z$. Similarly, an additional equation can be included to make the differential constraint 'autonomous' by interpreting the variable $t$ as the state variable of the equation $x'(t) = 1$, $x(t_0) = t_0$. We assume that, after all these transformations, we have the (new) state $y(t) \in \mathbb{R}^n$ and require $u(t) \in K_{ad} \subset \mathbb{R}^m$.

An optimal control problem can be formulated that includes equality constraints at the endpoints as follows:
$$ K(t_0, y(t_0), T, y(T)) = 0. \tag{1.47} $$

This formulation may include initial and terminal conditions: $y(t_0) - y_0 = 0$ and $y(T) - y_T = 0$, respectively. More generally, it defines a manifold where the endpoints can be taken. We denote with $d(K)$ the number of equations in (1.47); in the scalar case, if only initial and terminal conditions are considered, then $d(K) = 2$. We can also have inequality constraints at the endpoints; for example, the requirement $y(T) \le c$. This type of constraint can be expressed as follows:
$$ I(t_0, y(t_0), T, y(T)) \le 0. \tag{1.48} $$


We denote with $d(I)$ the number of inequalities in (1.48). With the transformations above and the given constraints, a canonical Pontryagin optimal control problem is formulated as follows [100, 101]:
$$ \min\ J_0(t_0, y(t_0), T, y(T)) $$
$$ \text{s.t.}\quad y' = f(t, y(t), u(t)), \quad u(t) \in K_{ad}, \tag{1.49} $$
$$ K(t_0, y(t_0), T, y(T)) = 0, \qquad I(t_0, y(t_0), T, y(T)) \le 0. $$

We assume that $J_0, K, I \in C^1$ and that $f$ and its derivatives in $t$ and $y$ are continuous in all their arguments. The bounded set $K_{ad}$ is arbitrary. Corresponding to (1.49), we have the following HP function
$$ H(t, y, u, p) = p \cdot f(t, y, u), $$
where $p(t) \in \mathbb{R}^n$ is the Lagrange multiplier for the differential constraint. With this setting, the endpoint Lagrange functional is given by
$$ L(t_0, y(t_0), T, y(T)) = \left( \alpha_0 J_0 + \alpha^\top I + \beta^\top K \right)(t_0, y(t_0), T, y(T)), $$
where $\alpha_0 \in \mathbb{R}$, $\alpha \in \mathbb{R}^{d(I)}$, $\beta \in \mathbb{R}^{d(K)}$.

As already discussed at the beginning of this chapter, the pair $w = (y, u)$ and a segment $[t_0, T]$ define an admissible process if they satisfy all the given constraints. Further, the process $w^* = (y^*, u^*)$ with the interval $[t_0^*, T^*]$ provides a strong minimum if there exists an $\epsilon > 0$ such that $J(w) \ge J(w^*)$ for all admissible processes $w$ with $[t_0, T]$ that satisfy the following conditions
$$ |t_0 - t_0^*| < \epsilon, \qquad |T - T^*| < \epsilon, \qquad |y(t) - y^*(t)| < \epsilon, \quad t \in [t_0, T] \cap [t_0^*, T^*]. $$

We say that $w$ satisfies the maximum principle if there exist $\alpha_0 \in \mathbb{R}$, $\alpha \in \mathbb{R}^{d(I)}$ and $\beta \in \mathbb{R}^{d(K)}$, and two absolutely continuous functions $p : \mathbb{R} \to \mathbb{R}^n$ and $q : \mathbb{R} \to \mathbb{R}$, such that the following optimality system holds

(i) Nonnegativity: $\alpha_0 \ge 0$, $\alpha \ge 0$ (componentwise);

(ii) Nontriviality: $\alpha_0 + |\alpha| + |\beta| > 0$;

(iii) Complementarity: $\alpha^\top I(t_0, y(t_0), T, y(T)) = 0$ (the component $\alpha_i$ is zero or the $i$th inequality is an equality);

(iv) Adjoint equations: $-p' = \partial_y H(t, y, u, p)$ and $-q' = \partial_t H(t, y, u, p)$;

(v) Transversality: $p(t_0) = \partial_{y(t_0)} L(t_0, y(t_0), T, y(T))$, $p(T) = -\partial_{y(T)} L(t_0, y(t_0), T, y(T))$; $q(t_0) = \partial_{t_0} L(t_0, y(t_0), T, y(T))$, $q(T) = -\partial_T L(t_0, y(t_0), T, y(T))$;


(vi) $H(t, y(t), u(t), p(t)) + q(t) = 0$ for almost all $t \in [t_0, T]$;

(vii) $H(t, y(t), v, p(t)) + q(t) \le 0$ for all $t \in [t_0, T]$, $v \in K_{ad}$.

Notice that the function $q(t) = -H(t, y(t), u(t), p(t))$, which is defined along the optimal trajectory, results absolutely continuous in $[t_0, T]$, and if $H$ does not depend explicitly on $t$, then $H(y(t), u(t), p(t)) = \mathrm{const}$ along the optimal solution. The conditions (vi) and (vii) imply the maximality condition for the HP function as follows:
$$ H(t, y(t), u(t), p(t)) = \max_{v \in K_{ad}} H(t, y(t), v, p(t)), \tag{1.50} $$
for almost all $t \in [t_0, T]$, which gives the whole set of conditions (i)–(vii) the name PMP. As already stated, the PMP conditions above represent necessary optimality conditions. We have [101]

Theorem 1.10 If $w = ((y(t), u(t)) \mid t \in [t_0, T])$ is a strong minimum for (1.49), then it satisfies the maximum principle.

As remarked in [100] in reviewing the fundamental work [103], although the PMP is stated as a necessary condition for a strong minimum, this framework allows one to weaken the notion of minimum such that it occupies an intermediate position between the classical weak and strong minima. Notice that the canonical class of optimal control problems discussed above is quite general. In particular, it includes time-optimal control problems and optimal control problems with integral constraints, which can be conveniently accommodated using the transformations illustrated above.

Now, for further illustration of (i)–(vii), we discuss some special cases related to our basic problem (1.15). A simple case is one where $[t_0, T]$ and the initial conditions are fixed, and the system is autonomous in the sense that $y' = f(y, u)$, $\ell = \ell(y, u)$, and $\gamma = \gamma(y)$. Thus, we have $\alpha_0 = 1$, $\alpha = 0$, $\beta = 0$; the adjoint equation with terminal condition is as given in (1.32), and the following value condition holds: $H(y(t), u(t), p(t)) = 0$.

A possible constraint on the state variable (state constraint) could be that, at final time, the endpoint lies on a surface $K(y(T)) = (K_1(y(T)), \ldots, K_d(y(T))) = 0$, $d < n$. In this case, in addition to this terminal condition, we have the following transversality condition [153]
$$ p(T) = -\partial_y \gamma(y(T)) - \beta \cdot \partial_y K(y(T)), \tag{1.51} $$
where $\beta \cdot \partial_y K(y(T)) = \beta_1\, \partial_y K_1(y(T)) + \ldots + \beta_d\, \partial_y K_d(y(T))$. Notice that if the final state is fixed, i.e., $y(T) = y_T$ with $y_T$ given, then (1.51) is correspondingly dropped; see, e.g., [153].


Other cases that illustrate the optimality system above are given by problem (1.15) with $\gamma = \gamma(T, y(T))$ and free endtime $T$. In particular, the following value endtime condition holds
$$ \partial_T \gamma(T, y(T)) - H(T, y(T), u(T), p(T)) = 0. \tag{1.52} $$

Furthermore, the transversality condition $p(T) = -\partial_y \gamma(T, y(T))$ holds. However, if the final state is fixed with $y(T) = y_T$, we have no terminal condition for the adjoint variable.

We can also have the case where the endpoint inequality condition $I(T, y(T)) \le 0$ is required to hold. In this case, it is convenient to introduce the function
$$ \Gamma(T, y(T)) = \gamma(T, y(T)) + \alpha \cdot I(T, y(T)), \tag{1.53} $$
where for the $i$th component of $\alpha$ we have
$$ \alpha_i \begin{cases} \ge 0 & \text{if } I_i(T, y(T)) = 0 \\ = 0 & \text{if } I_i(T, y(T)) < 0. \end{cases} $$
In this setting, the terminal condition in (1.32) for the adjoint variable $p$ is given by the transversality condition
$$ p(T) = -\partial_y \Gamma(T, y(T)). \tag{1.54} $$

Further, we have the value (terminal/endtime) condition
$$ \partial_T \Gamma(T, y(T)) - H(T, y(T), u(T), p(T)) = 0. \tag{1.55} $$

Many time-optimal control problems where the final time $T$ is free are characterised by some endpoint equality constraint in the form of a moving surface $K(T, y(T)) = 0$. This case is clearly similar to the previous one with $\Gamma(T, y(T)) = \gamma(T, y(T)) + \beta \cdot K(T, y(T))$. Hence, formally, we have the same transversality and terminal conditions given by (1.54) and (1.55), respectively. See, e.g., [59, 153] for a derivation and detailed discussion of these conditions, where it also appears that the left-hand side of (1.55) represents the derivative of the reduced cost functional with respect to $T$.

Further, notice that the transversality condition is sometimes in the literature, e.g., [74, 104], reformulated in combination with (1.55). We illustrate this reformulation in the case of $K(T, y(T)) = 0$ being a scalar equation. In this case, we have
$$ \partial_T \Gamma(T, y(T)) = H(T, y(T), u(T), p(T)) = p(T) \cdot f(T, y(T), u(T)) - \ell(T, y(T), u(T)) = -\partial_y \Gamma(T, y(T)) \cdot f(T, y(T), u(T)) - \ell(T, y(T), u(T)). $$


Now, for simplicity of notation and similarly to [74], we define the linear operator $(\cdot)'$ so that $\Gamma' := \partial_T \Gamma + \partial_y \Gamma \cdot f$. Hence, the previous result gives $\gamma' + \beta\, K' = -\ell$. Therefore the (normal) Lagrange multiplier is given by $\beta = -(\gamma' + \ell)/K'$. Using this result, the terminal condition becomes
$$ p(T) = -\partial_y \gamma(T, y(T)) + \frac{\gamma' + \ell}{K'}\, \partial_y K(T, y(T)). \tag{1.56} $$

Another special class of optimal control problems where an endpoint equality constraint appears is that of optimal periodic processes [139, 188, 203, 249]. These processes are of great interest in, e.g., the engineering of chemical plants, where cyclic regimes may provide better yields than steady-state operations. If there exists a time-dependent periodic control showing better performance than the steady-state control, then the problem is called proper. In the class of periodic control problems, the equality constraint (1.47) becomes the condition of periodicity $y(T) = y(0)$ (we take $t_0 = 0$ for simplicity), and in applications the functional to be minimised is given as an average over the free period $T$ as follows:
$$ J(y, u) = \frac{1}{T} \int_0^T \ell(y(t), u(t))\, dt. $$
Moreover, it is assumed that the governing model is autonomous. Therefore we can formulate the following optimal periodic control problem
$$ \min\ J(y, u) := \frac{1}{T} \int_0^T \ell(y(t), u(t))\, dt \qquad \text{s.t.}\quad y'(t) = f(y(t), u(t)), \quad y(T) = y(0). \tag{1.57} $$

The optimal steady-state control problem corresponding to (1.57) is defined as follows:
$$ \min\ \bar J(\bar y, \bar u) := \ell(\bar y, \bar u), \qquad \text{s.t.}\quad f(\bar y, \bar u) = 0, $$
where $\bar y$ and $\bar u$ are constant vectors. In the time-dependent case, we assume that $u$ is sought in $L^\infty(0, T)$; however, the case $u(t) \in K_{ad}$ is also considered in the literature. While we refer to [38, 89, 133, 203, 249] for theoretical results concerning existence and characterisation of optimal periodic controls, in the following we discuss the PMP optimality conditions for (1.57) in the smooth setting chosen in [249]. For our purpose, it is convenient to consider the following Lagrange function
$$ L(y, u, p, \beta) = J(y, u) + \frac{1}{T} \int_0^T \left( y'(t) - f(y(t), u(t)) \right) p(t)\, dt + \beta \left( y(T) - y(0) \right). $$
Therefore the structure of our HP function remains unchanged:
$$ H(y, u, p) = p \cdot f(y, u) - \ell(y, u). $$


As in [249], we assume that $H$ is twice differentiable in $u$. Further, one obtains the following adjoint problem
$$ -p'(t) = \partial_y H(y(t), u(t), p(t)), \qquad p(T) = p(0). $$
Notice that the HP function must be constant along the optimal trajectory since our system is autonomous. We also have the following transversality condition
$$ H(y(T), u(T), p(T)) + J(y, u) = 0. $$
In this setting, the maximality condition for the HP function is given by
$$ H(y(t), u(t), p(t)) \ge H(y(t), v(t), p(t)), $$
for all $v \in L^\infty(0, T)$ and almost all $t \in [0, T]$. However, in the literature we also find the stronger conditions $\partial_u H = 0$ and $\partial^2_{uu} H < 0$ along the optimal trajectory; see, e.g., [249] for related second-order sufficient optimality conditions. On the other hand, notice that, in general, nonconcavity of $H$ in $(y, u)$ is necessary for proper optimal solution paths [133].

Clearly, if an optimal triple $(y, u, T)$ exists, it can be viewed as a cyclic solution in the appropriate space; this is shown in [139]. Therefore the transversality condition above holds at any point of the optimal orbit, which should be obvious since we are considering a time-invariant system.

It is clear that determining $y_0 = y(0)$ for the given time parametrisation is a challenging computational task; see, e.g., [139] for a specific approach. For this purpose, it is very useful to consider the map $u \mapsto y_0$, and this map can be easily written in the case of linear control systems as follows [188]. As in Section 1.3, consider the model
$$ y'(t) = A\, y(t) + B\, u(t), \qquad y(0) = y_0, $$
where $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$ are constant matrices, and assume that $A$ has no purely imaginary eigenvalues. The solution to this Cauchy problem is given by
$$ y(t) = e^{t A}\, y_0 + e^{t A} \int_0^t e^{-s A} B\, u(s)\, ds. $$
Hence, assuming that $T$ is known and requiring $y(T) = y(0)$, after some algebraic manipulation we obtain
$$ y_0 = \left( I - e^{T A} \right)^{-1} e^{T A} \int_0^T e^{-s A} B\, u(s)\, ds, $$
which defines the required map; see [188] for a detailed discussion of this function.
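As an illustration, the following Python sketch — not from the book's accompanying codes — evaluates this map for assumed example data $A$, $B$, $T$, and $u$, and then checks the periodicity of the resulting trajectory.

```python
# A minimal sketch (illustrative data) of the map u -> y0 for the linear
# periodic problem: y0 = (I - e^{TA})^{-1} e^{TA} \int_0^T e^{-sA} B u(s) ds,
# evaluated with matrix exponentials and the trapezoidal rule.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[-0.5, 1.0], [-1.0, -0.5]])   # no purely imaginary eigenvalues
B = np.array([[0.0], [1.0]])
T, N = 2.0 * np.pi, 400
s = np.linspace(0.0, T, N + 1)

def u(t):                                    # example periodic control
    return np.array([np.sin(2.0 * np.pi * t / T)])

# integrand e^{-sA} B u(s), sampled on the grid and integrated
F = np.stack([expm(-sk * A) @ B @ u(sk) for sk in s])
integral = np.trapz(F, s, axis=0)

eTA = expm(T * A)
y0 = np.linalg.solve(np.eye(2) - eTA, eTA @ integral)

# check: integrating y' = Ay + Bu from y0 should return to y0 at time T
sol = solve_ivp(lambda t, y: A @ y + B @ u(t), (0.0, T), y0,
                rtol=1e-9, atol=1e-12)
print(y0, sol.y[:, -1])                      # the two vectors should agree
```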

1.6 The PMP and Path Constraints

Another large class of optimal control problems is that of problems augmented with path constraints, that is, constraints on the values of the state and control functions along the trajectory; see, e.g., [100, 134] for surveys on this topic. A state constraint can be formulated as the requirement that the state variable $y$ satisfies an inequality $\phi(t, y(t)) \le 0$ a.e. in $[t_0, T]$. However, as already shown in [103], provided that the function $\phi(t, y(t))$ is continuous in $[t_0, T]$, the corresponding Lagrange multiplier is a measure, and this fact complicates considerably the analysis and numerical solution of such problems. For this reason, different approaches have been designed that relax the state constraint in a way that makes it more amenable. In particular, we have penalisation-based approaches; see [134] for references to earlier works, where, for example, a Courant-Beltrami penalty function, $k \left( \max\{0, \phi(t, y(t))\} \right)^2$, is added to the cost functional, which provides a growing penalisation for violating the constraints as $k > 0$ is increased. Similar techniques are mentioned in [161, 162] to enforce endpoint conditions. In Chapter 7, we discuss a penalty approach for state-constrained parabolic optimal control problems.

In applications, a class of optimal control problems of great interest is that with mixed control and state constraints given by equalities and/or inequalities. Specifically, we have the requirement that the state variable $y$ and the control $u$ satisfy the following
$$ \psi_j(t, y(t), u(t)) = 0, \quad j = 1, \ldots, d(\psi), \tag{1.58} $$
and
$$ \phi_j(t, y(t), u(t)) \le 0, \quad j = 1, \ldots, d(\phi). \tag{1.59} $$

We assume that $\psi = (\psi_1, \ldots, \psi_{d(\psi)})$ and $\phi = (\phi_1, \ldots, \phi_{d(\phi)})$ are continuously differentiable. In this setting, rank hypotheses are made in order to derive multiplier rules in a PMP context. However, when the constraints consist of a combination of equalities and inequalities, the rank hypotheses are usually replaced by the Mangasarian-Fromovitz constraint qualifications; see, e.g., [99]. We formulate these qualifications as follows.

Assumption 1.5 (Mangasarian-Fromovitz constraint qualifications) In the domain $D$ of the $(t, y)$ space (see Assumption 1.1), for every $(t, y, u) \in D \times \mathbb{R}^m$ denote with
$$ I(t, y, u) = \{ j : \phi_j(t, y(t), u(t)) = 0 \} $$
the set of active mixed control and state inequality constraints.


For every $(t, y, u) \in D \times \mathbb{R}^m$, there are no $a_j \ge 0$, $j \in I(t, y, u)$, and $b_j$, $j = 1, \ldots, d(\psi)$, not all zero, such that
$$ \sum_{j \in I(t, y, u)} a_j\, \partial_u \phi_j(t, y, u) + \sum_{j=1}^{d(\psi)} b_j\, \partial_u \psi_j(t, y, u) = 0. $$

Mixed constraints $\psi$ and $\phi$ satisfying this condition are said to be regular. In general, supposing that the functions $\psi(t, y(t), u(t))$ and $\phi(t, y(t), u(t))$ belong to $L^\infty([t_0, T])$, the corresponding Lagrange multipliers $\lambda$ and $\mu$ are measures. However, in the case of regular mixed constraints, they result essentially bounded functions in $[t_0, T]$, which simplifies the proof of the PMP for the solution to the canonical Pontryagin optimal control problem (1.49) augmented with the mixed control and state constraints given in (1.58) and (1.59); see [85, 100]. In particular, we have the complementarity conditions $\mu_j(t)\, \phi_j(t, y(t), u(t)) = 0$, $\mu_j(t) \le 0$, a.e. in $[t_0, T]$ and for all $j$, and the same transversality conditions as given above. However, in this case the adjoint equations are formulated in terms of the following extended HP function (for the canonical problem (1.49))
$$ \hat H(t, y, u, p, \mu, \lambda) = p \cdot f(t, y, u) + \mu \cdot \phi(t, y, u) + \lambda \cdot \psi(t, y, u). $$
We also have an additional stationarity condition given by
$$ \partial_u \hat H(t, y, u, p, \mu, \lambda) = 0. $$
On the other hand, the conditions (vi) and (vii), and thus the maximality condition (1.50), are expressed in terms of the original HP function $H(t, y, u, p)$ and hold in the following set of control values admissible for comparison with $u(t)$ at time $t$ for the optimal trajectory $y(t)$:
$$ C(t) = \{ v \in \mathbb{R}^m : \psi(t, y(t), v) = 0,\ \phi(t, y(t), v) \le 0 \}. $$

Now, for illustration, we return to our original formulation of an optimal control problem and include a mixed equality constraint, and afterwards a mixed inequality constraint, as discussed above; see, e.g., [252] for more details and examples. Consider the problem
$$ \min\ J(y, u) := \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma(y(T)) $$
$$ \text{s.t.}\quad y'(t) = f(t, y(t), u(t)), \qquad y(t_0) = y_0, \tag{1.60} $$
$$ \psi(t, y(t), u(t)) = 0, $$
where $y(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, and $\psi(t, y(t), u(t)) \in \mathbb{R}^r$, $r = d(\psi)$; that is, we have the $r$-dimensional equality constraint
$$ \psi(t, y(t), u(t)) = 0. \tag{1.61} $$


The extended HP function for (1.60) is given by
$$ \hat H(t, y, u, p, \lambda) = p \cdot f(t, y, u) - \ell(t, y, u) + \lambda \cdot \psi(t, y, u). $$
Thus, we have the adjoint equation
$$ -p'(t) = \left( \partial_y f(t, y, u) \right)^\top p(t) + \left( \partial_y \psi(t, y, u) \right)^\top \lambda(t) - \partial_y \ell(t, y, u), \tag{1.62} $$
with terminal condition $p(T) = -\partial_y \gamma(y(T))$. The stationarity condition results in
$$ -\left( \partial_u f(t, y(t), u(t)) \right)^\top p(t) - \left( \partial_u \psi(t, y(t), u(t)) \right)^\top \lambda(t) + \partial_u \ell(t, y(t), u(t)) = 0. \tag{1.63} $$

Now, consider the case $r = m$ with $\partial_u \psi$ nonsingular along any admissible trajectory. Then $u$ is uniquely specified by (1.61), and the Lagrange multiplier $\lambda$ is given by
$$ \lambda(t) = \left( \left( \partial_u \psi(t, y(t), u(t)) \right)^\top \right)^{-1} \left( -\left( \partial_u f(t, y(t), u(t)) \right)^\top p(t) + \partial_u \ell(t, y(t), u(t)) \right). $$
Clearly, in the case $r = m$ the control $u$ is obtained without regard for optimality. However, in the case $r < m$, we have $(m - r)$ components of $u$ that can be obtained from the stationarity condition (1.63), whereas (1.61) specifies the remaining components of the control. The $r$ components of the Lagrange multiplier $\lambda$ are given by
$$ \lambda(t) = \left( \left( \partial_u \bar\psi(t, y(t), u(t)) \right)^\top \right)^{-1} \left( -\left( \partial_u f(t, y(t), u(t)) \right)^\top p(t) + \partial_u \ell(t, y(t), u(t)) \right), $$
where $\partial_u \bar\psi$ denotes a nonsingular $r \times r$ partition of the $r \times m$ Jacobian $\partial_u \psi$. Alternatively, one could also use the left pseudoinverse of this Jacobian [252].
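For illustration, the following small Python sketch recovers the multiplier pointwise in the case $r = m$ by solving the linear system in the formula above; all matrices and vectors are hypothetical example values.

```python
# A small sketch (hypothetical data) of the multiplier recovery for r = m
# in (1.63): lambda(t) solves (d_u psi)^T lambda = -(d_u f)^T p + d_u ell,
# pointwise in t. For r < m one would restrict to a nonsingular partition
# or, as mentioned above, use a least-squares (pseudoinverse) solve.
import numpy as np

du_f   = np.array([[1.0, 0.0], [0.5, 2.0]])   # d_u f   (n x m), assumed
du_psi = np.array([[1.0, 0.3], [0.0, 1.0]])   # d_u psi (r x m), r = m = 2
du_ell = np.array([0.2, -0.1])                # d_u ell (m,)
p      = np.array([1.5, -0.7])                # adjoint value at time t

rhs = -du_f.T @ p + du_ell
lam = np.linalg.solve(du_psi.T, rhs)          # exact solve for r = m
# lam = np.linalg.lstsq(du_psi.T, rhs, rcond=None)[0]  # pseudoinverse variant
print(lam)
```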

Next, we discuss the following optimal control problem with a mixed control and state inequality constraint. We have
$$ \min\ J(y, u) := \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma(y(T)) $$
$$ \text{s.t.}\quad y'(t) = f(t, y(t), u(t)), \qquad y(t_0) = y_0, \tag{1.64} $$
$$ \phi(t, y(t), u(t)) \le 0, $$
where $y(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, and $\phi(t, y(t), u(t)) \in \mathbb{R}^r$, $r = d(\phi)$; that is, we have the $r$-dimensional inequality constraint
$$ \phi(t, y(t), u(t)) \le 0. \tag{1.65} $$

In this case, it is convenient to introduce $\bar\phi(t, y(t), u(t))$ as the vector of the $\bar r \le r$ components of $\phi$ that are active, i.e. effectively involved in the determination of the optimal control. On the other hand, if


$\phi(t, y(t), u(t)) < 0$, then the corresponding multiplier $\mu = 0$ and the calculation does not involve the inequality constraint at all. Therefore, if $\bar\phi(t, y(t), u(t)) = 0$, with corresponding multiplier $\mu(t) < 0$, then we have the extended HP function
$$ \hat H(t, y, u, p, \mu) = p \cdot f(t, y, u) - \ell(t, y, u) + \mu \cdot \bar\phi(t, y, u). $$
Consequently, the adjoint equation becomes
$$ -p'(t) = \left( \partial_y f(t, y, u) \right)^\top p(t) + \left( \partial_y \bar\phi(t, y, u) \right)^\top \mu(t) - \partial_y \ell(t, y, u), \tag{1.66} $$
and the stationarity condition results in
$$ -\left( \partial_u f(t, y(t), u(t)) \right)^\top p(t) - \left( \partial_u \bar\phi(t, y(t), u(t)) \right)^\top \mu(t) + \partial_u \ell(t, y(t), u(t)) = 0. \tag{1.67} $$
For further discussion of the PMP for optimal control problems with path constraints we refer to, e.g., [59, 85, 100, 252].

1.7 Sufficient Conditions for Optimality

In general, PMP theorems provide necessary optimality conditions for strong minima of optimal control problems. However, subject to appropriate convexity conditions, the PMP can provide a sufficient condition for $u$ to be an optimal control [134]. A simple set of sufficient conditions is given in the case where the HP function has the form
$$ H(t, y, u, p) = g(t, y, p) + c(t, y, p)^\top u + \frac{1}{2}\, u^\top R(t)\, u, $$
with $\partial_u H(t, y(t), u(t), p(t)) = 0$ and $\partial^2_{uu} H(t, y(t), u(t), p(t))$ negative definite for all $t \in [t_0, T]$. Then $u$ is a global optimum [153]. Notice that $\partial^2_{uu} H(t, y(t), u(t), p(t)) = R(t)$ and the control is given by
$$ u(t) = -R^{-1}(t)\, c(t, y(t), p(t)). $$
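A tiny numerical illustration of this statement — with an assumed negative definite $R$ and vector $c$ — is the following; it checks that $u = -R^{-1} c$ is indeed a global maximiser of the quadratic-in-$u$ HP function.

```python
# A quick check (made-up R and c) that u = -R^{-1} c maximises
# H(u) = g + c^T u + (1/2) u^T R u when R is negative definite; the term g
# is omitted since it does not depend on u.
import numpy as np

R = np.array([[-2.0, 0.3], [0.3, -1.0]])     # negative definite (assumed)
c = np.array([0.5, -1.2])

u_star = -np.linalg.solve(R, c)              # stationary point of H in u
H = lambda u: c @ u + 0.5 * u @ R @ u

rng = np.random.default_rng(1)
trials = rng.standard_normal((1000, 2))
print(all(H(u_star) >= H(u_star + 0.1 * d) for d in trials))  # expect True
```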

Optimal Control Problems with ODEs

33

satisfies the maximality condition (1.50). Consider the HP function H and suppose that the map Rn × Rm 3 (z, v) 7→ H(t, z, v, p(t)) ∈ R be concave for all t ∈ [t0 , T ], and the map Rn 3 z 7→ γ(z) ∈ R be convex. Then u is optimal. Proof. Recall that, for a differentiable concave function φ on a convex set C ⊂ Rk , the following holds φ(w) ≤ φ(w0 ) + ∇φ(w0 ) · (w − w0 ),

w, w0 ∈ C.

On the other hand, if φ is convex on C, then this result holds with the reversed inequality. Thus, by the assumptions on H, we have H(t, z, v, p(t)) ≤ H(t, y, u, p(t)) + ∂y H (z − y) + ∂u H (v − u).

(1.68)

Now, in this inequality, let us identify (z, v) ∈ Rn ×Rm with the values at t of an admissible process ((z(t), v(t)) | t ∈ [t0 , T ]) for (1.15). Similarly, the pair (y, u) ∈ Rn × Rm corresponds to the value at x of the process that satisfies the PMP. This implies that ∂u H(t, y(t), u(t), p(t)) (v(t) − u(t)) ≤ 0. Using this result and (1.32) in (1.68), we obtain H(t, z(t), v(t), p(t)) ≤ H(t, y(t), u(t), p(t)) − p0 (t) (z(t) − y(t))

(1.69)

Since z and y are trajectory associated to v and u, respectively, based on the last inequality, we obtain −`(t, z(t), v(t)) ≤ p(t) (f (t, y(t), u(t)) − f (t, z(t), v(t))) − p0 (t) (z(t) − y(t)) − `(t, y(t), u(t)) = −`(t, y(t), u(t)) + p(t) (y 0 (t) − z 0 (t)) + p0 (t) (y(t) − z(t)) d = −`(t, y(t), u(t)) + p(t) (y(t) − z(t)). dt In this inequality we change sign and integrate as follows: Z T Z T T `(t, z(t), v(t)) dt ≥ `(t, y(t), u(t)) dt − p(t) (y(t) − z(t)) t0 . (1.70) t0

t0

Now, since the two processes satisfy the same initial conditions and p is subject to a terminal condition, we obtain T −p(t) (y(t) − z(t)) t = −p(T ) (y(T ) − z(T )) = ∂y γ(y(T )) (y(T ) − z(T )) 0

≥ γ(y(T )) − γ(z(T )),


where the last inequality results from the fact that $\gamma$ is convex. Using this result in (1.70), we obtain
$$ \int_{t_0}^{T} \ell(t, z(t), v(t))\, dt + \gamma(z(T)) \ge \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma(y(T)). $$
Thus $u$ is optimal.

The next result, due to K.J. Arrow and others, is stronger than Mangasarian's, but its required conditions are more difficult to verify.

Theorem 1.12 Consider the optimal control problem (1.15) with $f, \ell, \gamma \in C^1$, and $K_{ad} \subset \mathbb{R}^m$. Let $p, y$ be the absolutely continuous functions on $[t_0, T]$ that satisfy (1.33) and (1.32), respectively, for $u \in U_{ad}$, and suppose that the triple $(y, u, p)$ satisfies the maximality condition (1.50). Suppose that the following maximised Hamiltonian function $H^* : [t_0, T] \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ exists
$$ H^*(t, z, q) = \max_{v \in K_{ad}} \{ q\, f(t, z, v) - \ell(t, z, v) \}. $$
Further, suppose that the map $\mathbb{R}^n \ni z \mapsto H^*(t, z, p(t)) \in \mathbb{R}$ is concave for all $t \in [t_0, T]$, and the map $\mathbb{R}^n \ni z \mapsto \gamma(z) \in \mathbb{R}$ is convex. Then $u$ is optimal.

Proof. The aim of this proof is to arrive at (1.69); then the proof follows the same arguments as in the proof of Theorem 1.11. By the assumptions and the definition of $H^*$, we have
$$ H^*(t, y(t), p(t)) = H(t, y(t), u(t), p(t)). $$
Furthermore, it holds that $H(t, z, v, p(t)) \le H^*(t, z, p(t))$ for every $z \in \mathbb{R}^n$ and $v \in K_{ad}$. Therefore we obtain
$$ H(t, z, v, p(t)) - H(t, y(t), u(t), p(t)) \le H^*(t, z, p(t)) - H^*(t, y(t), p(t)). \tag{1.71} $$

Now, let us recall a result in [227] stating that, for a concave (convex) function $\phi : \mathbb{R}^n \to \mathbb{R}$, the set of supergradients (subgradients) at any $w \in \mathbb{R}^n$ is non-empty. We say that $s \in \mathbb{R}^n$ is a supergradient (subgradient) at the point $w$ if it holds that $\phi(w') \le \phi(w) + s \cdot (w' - w)$ for all $w' \in \mathbb{R}^n$ (with reversed inequality in the convex case). Therefore, for any fixed $t \in [t_0, T]$, taking $\phi(z) = H^*(t, z, p(t))$, there exists a supergradient $s$ at $y(t)$ such that the following holds
$$ H^*(t, z, p(t)) \le H^*(t, y(t), p(t)) + s \cdot (z - y(t)), \quad z \in \mathbb{R}^n. \tag{1.72} $$


Hence, from (1.71) and (1.72), we obtain
$$ H(t, z, v, p(t)) - H(t, y(t), u(t), p(t)) \le s \cdot (z - y(t)). \tag{1.73} $$
In particular, choosing $v = u(t)$, we have
$$ H(t, z, u(t), p(t)) - H(t, y(t), u(t), p(t)) \le s \cdot (z - y(t)). \tag{1.74} $$

Now, define the function $G : \mathbb{R}^n \to \mathbb{R}$ as follows:
$$ G(z) = H(t, z, u(t), p(t)) - H(t, y(t), u(t), p(t)) - s \cdot (z - y(t)). $$
Because of (1.74), $G$ has a maximum at the point $y(t)$. Further, notice that $G$ is differentiable and it holds that
$$ 0 = \nabla G(y(t)) = \partial_y H(t, y(t), u(t), p(t)) - s. $$
Therefore, by the adjoint equation (1.32), we obtain $s = -p'(t)$. Using this result in (1.73) gives
$$ H(t, z(t), v(t), p(t)) \le H(t, y(t), u(t), p(t)) - p'(t)\, (z(t) - y(t)), $$
which coincides with (1.69). The second part of the proof of this theorem follows using the same arguments as in Theorem 1.11.

Now, in the context of the last two theorems given above, and as a preparation for the discussion in the next chapter, we present a result that directly relates differences of values of the cost functional in (1.15) to the corresponding differences of the related HP functions. For this purpose, we consider two admissible solution pairs of our Cauchy problem, denoted by $(y_1, u_1)$ and $(y_2, u_2)$, where $u_1, u_2 \in U_{ad}$. Further, we introduce an intermediate (or average) adjoint variable defined as the solution to the following problem
$$ -\tilde p'(t) = \left( \tilde f_y(t, y_1(t), y_2(t), u_2(t)) \right)^\top \tilde p(t) - \tilde \ell_y(t, y_1(t), y_2(t), u_2(t)), \tag{1.75} $$
with terminal condition $\tilde p(T) = -\tilde \gamma_y(y_1(T), y_2(T))$, and the following functions
$$ \tilde f_y(t, y_1, y_2, u) := \int_0^1 \partial_y f(t, y_1 + s\, (y_2 - y_1), u)\, ds, $$

and
$$ \tilde \ell_y(t, y_1, y_2, u) := \int_0^1 \partial_y \ell(t, y_1 + s\, (y_2 - y_1), u)\, ds, $$
and
$$ \tilde \gamma_y(y_1, y_2) := \int_0^1 \partial_y \gamma(y_1 + s\, (y_2 - y_1))\, ds. $$

Now, we prove the following lemma.


Lemma 1.3 Let $f$, $\ell$ and $\gamma$ be continuous in all their arguments and continuously differentiable with respect to $y$. Suppose that $(y_1, u_1)$ and $(y_2, u_2)$ are two admissible solutions of the governing model. Then the following equality holds
$$ J(y_2, u_2) - J(y_1, u_1) = -\int_{t_0}^{T} \left( H(t, y_1, u_2, \tilde p) - H(t, y_1, u_1, \tilde p) \right) dt. \tag{1.76} $$

Proof. We have
$$ J(y_2, u_2) - J(y_1, u_1) = \int_{t_0}^{T} \big( \ell(t, y_2, u_2) - \ell(t, y_1, u_1) \big)\, dt + \gamma(y_2(T)) - \gamma(y_1(T)) $$
$$ = \int_{t_0}^{T} \big( \ell(t, y_2, u_2) - \ell(t, y_1, u_2) \big)\, dt + \int_{t_0}^{T} \big( \ell(t, y_1, u_2) - \ell(t, y_1, u_1) \big)\, dt + \gamma(y_2(T)) - \gamma(y_1(T)) $$
$$ = \int_{t_0}^{T} \tilde\ell_y(t, y_1, y_2, u_2)\, (y_2 - y_1)\, dt + \int_{t_0}^{T} \big( \ell(t, y_1, u_2) - \ell(t, y_1, u_1) \big)\, dt + \gamma(y_2(T)) - \gamma(y_1(T)) $$
$$ = \int_{t_0}^{T} \Big( \tilde p' + \big( \tilde f_y(t, y_1, y_2, u_2) \big)^\top \tilde p \Big) (y_2 - y_1)\, dt + \int_{t_0}^{T} \big( \ell(t, y_1, u_2) - \ell(t, y_1, u_1) \big)\, dt + \gamma(y_2(T)) - \gamma(y_1(T)) $$
$$ = -\int_{t_0}^{T} \tilde p\, (y_2' - y_1')\, dt + \tilde p\, (y_2 - y_1) \Big|_{t_0}^{T} + \int_{t_0}^{T} \tilde p\, \tilde f_y(t, y_1, y_2, u_2)\, (y_2 - y_1)\, dt + \int_{t_0}^{T} \big( \ell(t, y_1, u_2) - \ell(t, y_1, u_1) \big)\, dt + \gamma(y_2(T)) - \gamma(y_1(T)) $$
$$ = -\int_{t_0}^{T} \tilde p\, \big( f(t, y_2, u_2) - f(t, y_1, u_1) \big)\, dt - \tilde\gamma_y(y_1(T), y_2(T))\, (y_2(T) - y_1(T)) + \int_{t_0}^{T} \tilde p\, \big( f(t, y_2, u_2) - f(t, y_1, u_2) \big)\, dt + \int_{t_0}^{T} \big( \ell(t, y_1, u_2) - \ell(t, y_1, u_1) \big)\, dt + \gamma(y_2(T)) - \gamma(y_1(T)) $$
$$ = -\int_{t_0}^{T} \Big( \big( \tilde p\, f(t, y_1, u_2) - \ell(t, y_1, u_2) \big) - \big( \tilde p\, f(t, y_1, u_1) - \ell(t, y_1, u_1) \big) \Big)\, dt. $$
Here we used that $y_1(t_0) = y_2(t_0) = y_0$, so the boundary term at $t_0$ vanishes; that $\tilde p(T)\, (y_2(T) - y_1(T)) = -\tilde\gamma_y(y_1(T), y_2(T))\, (y_2(T) - y_1(T)) = -(\gamma(y_2(T)) - \gamma(y_1(T)))$, so the $\gamma$ terms cancel; and that $\tilde f_y(t, y_1, y_2, u_2)\, (y_2 - y_1) = f(t, y_2, u_2) - f(t, y_1, u_2)$. The remaining integrands combine into the stated difference of HP functions, which is (1.76).

We refer to, e.g., [59, 84, 85, 100, 101, 111, 121, 142] for further discussion on the PMP framework and many references. We also refer to [184, 229, 252] for introductions to the PMP with numerous examples and discussion of application problems.

1.8 Analytical Solutions via PMP

Although the focus of this book is on the numerical solution of optimal control problems, we remark that the PMP is a powerful tool for constructing analytical solutions in many situations. The following example illustrates the solution of a bang-bang optimal control problem; see, e.g., [208] for more details on this kind of control.

Example 1.1 This is a control problem of production and consumption during a fixed time interval $[0, T]$, $T > 1$. Let $y(t)$ be the amount of the economy's output produced at time $t$ and $u(t)$ be the fraction of output that is reinvested at the same time. Suppose that the economy evolves as follows:
$$ y'(t) = u(t)\, y(t), \qquad y(0) = y_0 > 0. $$
Let $K_{ad} = [0, 1]$, $u(t) \in K_{ad}$. Assume that the purpose of the control is to maximise consumption in the sense that
$$ \min\ J(y, u) := -\int_0^T (1 - u(t))\, y(t)\, dt. $$
Correspondingly, the Hamilton-Pontryagin function is given by
$$ H(t, y, u, p) = y + u\, y\, (p - 1). $$
The adjoint equation is given by
$$ p' = -1 + u\, (1 - p), \qquad p(T) = 0. $$
According to the PMP, if $(y, u, p)$ is an optimal solution, then the following must hold
$$ H(t, y(t), u(t), p(t)) = \max_{0 \le v \le 1} \left\{ y(t) + v\, y(t)\, (p(t) - 1) \right\}, $$
for almost all $t \in [0, T]$. Since with our setting we have $y > 0$, we obtain
$$ u(t) = \begin{cases} 1 & p(t) > 1 \\ 0 & p(t) \le 1. \end{cases} $$

Now, consider the adjoint equation. Since $p(T) = 0$, by continuity of the Lagrange multiplier we have $p(t) \le 1$ for $t \in [\tilde t, T]$ for some $\tilde t$. Hence, $u(t) = 0$ in this interval and $p' = -1$. Thus, we obtain $p(t) = T - t$ in $[\tilde t, T]$. At $\tilde t$, we have $p(\tilde t) = 1$; therefore $\tilde t = T - 1$.


Next, consider $t \le T - 1$: we have $p(t) > 1$, and hence the control function switches to the value $u(t) = 1$. The adjoint problem becomes
$$ p' = -p, \qquad p(T - 1) = 1. $$
The solution to this problem is given by $p(t) = e^{(T-1)-t}$, $0 \le t \le T - 1$, and $u$ remains constant on this interval. Therefore we have
$$ u(t) = \begin{cases} 1 & 0 \le t < T - 1 \\ 0 & T - 1 \le t \le T. \end{cases} $$
That is, the optimal strategy is to reinvest all output until the switching time $T - 1$, and to consume all output afterwards.
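The following short Python check (illustrative only, not from the book's codes) integrates the model for the bang-bang control above and compares the resulting consumption with that of some constant controls.

```python
# A numeric check of Example 1.1: the bang-bang control u = 1 on [0, T-1]
# and u = 0 on [T-1, T] yields a larger consumption J = int (1 - u) y dt
# than constant controls. T, y0, and the grid are illustrative choices.
import numpy as np

T, y0, N = 3.0, 1.0, 3000
t, dt = np.linspace(0.0, T, N + 1), T / N

def consumption(u):                  # J for y' = u y, y(0) = y0
    y, J = y0, 0.0
    for k in range(N):
        J += dt * (1.0 - u(t[k])) * y
        y += dt * u(t[k]) * y        # explicit Euler
    return J

u_pmp = lambda s: 1.0 if s < T - 1.0 else 0.0
print(consumption(u_pmp))            # approx. e^{T-1} = e^2 ~ 7.389
for c in (0.0, 0.5, 1.0):
    print(consumption(lambda s, c=c: c))   # all smaller
```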

Example 1.2 Consider the following model of a point moving on a line, with position $y_1$ and velocity $y_2$ controlled by $u$:
$$ y_1'(t) = y_2(t), \qquad y_1(0) = 0, $$
$$ y_2'(t) = u(t), \qquad y_2(0) = 1. $$
The endtime $T$ is free, and the cost functional penalises the elapsed time, the control effort, and the final velocity as follows:
$$ J(y_1, y_2, u; T) = \int_0^T \left( 1 + \frac{\alpha}{2}\, u^2(t) \right) dt + \frac{1}{2}\, y_2(T)^2, \tag{1.77} $$
where $\alpha > 0$, $y = (y_1, y_2)$. For this problem, the HP function is given by
$$ H(y_1, y_2, u, p_1, p_2) = p_1\, y_2 + p_2\, u - 1 - \frac{\alpha}{2}\, u^2, $$
where $p = (p_1, p_2)$. Thus the adjoint problem is given by
$$ p_1' = -\partial_{y_1} H = 0, \qquad p_1(T) = 0, $$
$$ p_2' = -\partial_{y_2} H = -p_1, \qquad p_2(T) = -y_2(T). $$
The solution is $p_1(t) = 0$ and $p_2(t) = -y_2(T)$, $t \in [0, T]$.

The HP function is differentiable with respect to $u$ and concave. Thus the optimal control is characterised by $\partial_u H = p_2 - \alpha\, u = 0$. Therefore, along the optimal solution, the HP function is as follows:
$$ H(y_1, y_2, u, p_1, p_2) = \alpha\, u^2 - 1 - \frac{\alpha}{2}\, u^2 = -1 + \frac{\alpha}{2}\, u^2. \tag{1.78} $$

Now, since our system is autonomous, at optimality this function must be constant and equal to zero (free endpoint). Hence, we obtain $u = \pm\sqrt{2/\alpha}$. Clearly, in the positive case, the equation $y_2' = u$ makes the velocity increase, which is not our objective. Thus, we have
$$ u(t) = -\sqrt{\frac{2}{\alpha}}. $$
Correspondingly, the velocity at time $t$ is given by $y_2(t) = 1 - \sqrt{2/\alpha}\, t$. Furthermore, using these results, we can explicitly compute the value of the cost functional as follows:
$$ J(y_1, y_2, u; T) = 2\, T + \frac{1}{2} \left( 1 - \sqrt{\frac{2}{\alpha}}\, T \right)^2. \tag{1.79} $$
This is a convex function of $T$ whose minimum is achieved at
$$ T^* = \frac{\alpha}{2} \left( \sqrt{\frac{2}{\alpha}} - 2 \right), $$
if $0 < \alpha < 1/2$, or at $T^* = 0$ if $\alpha \ge 1/2$.

Notice that, using the optimality system and the fact that $H(y_1(t), y_2(t), u(t), p_1(t), p_2(t)) = 0$, we have obtained $u(t) = -\sqrt{2/\alpha}$ and $y_2(t) = 1 - \sqrt{2/\alpha}\, t$. Therefore, at $t = T$, we obtain
$$ H(y_1(T), y_2(T), u(T), p_1(T), p_2(T)) = -2 + \left( 1 - \sqrt{\frac{2}{\alpha}}\, T \right) \sqrt{\frac{2}{\alpha}}. $$
Hence, by direct comparison of the derivative with respect to $T$ of $J$ given in (1.79) with $H$ at $t = T$, we verify that $\frac{dJ}{dT} = -H$.
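This identity can also be checked numerically; in the following sketch (illustrative only) we compare a difference quotient of $J$ with $-H$ at $t = T$, and verify the formula for $T^*$, for the assumed value $\alpha = 1/8$.

```python
# A numeric check of Example 1.2: with u = -sqrt(2/alpha) we have
# J(T) = 2T + (1 - sqrt(2/alpha) T)^2 / 2; we verify dJ/dT = -H(T) and
# the optimal endtime T* = (alpha/2)(sqrt(2/alpha) - 2).
import numpy as np

alpha = 0.125
c = np.sqrt(2.0 / alpha)
J = lambda T: 2.0 * T + 0.5 * (1.0 - c * T) ** 2
H_at_T = lambda T: -2.0 + (1.0 - c * T) * c

T, h = 0.3, 1e-6
dJdT = (J(T + h) - J(T - h)) / (2.0 * h)
print(dJdT, -H_at_T(T))                               # should agree

T_star = 0.5 * alpha * (c - 2.0)
print(T_star, min(np.linspace(0, 1, 100001), key=J))  # both ~ 0.125
```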


Next, we use a setting similar to Example 1.2 to illustrate: 1) a case where the endtime and the final state are fixed; 2) a case where the endtime is free and the final state is required to lie on a surface.

Example 1.3 Our governing model is given by
$$ y_1'(t) = y_2(t), \qquad y_1(0) = 0, $$
$$ y_2'(t) = u(t), \qquad y_2(0) = 1. $$
We consider the following cost functional
$$ J(y_1, y_2, u) = \frac{1}{2} \int_0^T u^2(t)\, dt. $$
The HP function for this problem is given by
$$ H(y_1, y_2, u, p_1, p_2) = p_1\, y_2 + p_2\, u - \frac{1}{2}\, u^2. $$
The adjoint equations are given by
$$ p_1' = 0, \qquad p_2' = -p_1. $$
The general solution of this system is given by $p_1(t) = c_1$ and $p_2(t) = -c_1\, t + c_2$, where $c_1$ and $c_2$ are integration constants to be determined. Further, the optimal control is characterised by $\partial_u H = p_2 - u = 0$.

Now, using this last result and the second equation of our model, we have $y_2'(t) = p_2(t) = -c_1\, t + c_2$. Therefore we can integrate this equation and obtain
$$ y_2(t) = -\frac{1}{2}\, c_1\, t^2 + c_2\, t + c_3, $$
with $c_3 = 1$ because of the initial conditions. Next, we use this result in the first equation of our model, $y_1'(t) = y_2(t)$, and by integration we obtain
$$ y_1(t) = -\frac{1}{6}\, c_1\, t^3 + \frac{1}{2}\, c_2\, t^2 + c_3\, t + c_4, $$
where $c_4 = 0$ by imposing the initial conditions.

Our first control problem requires that, at the final time $T = 1$, the conditions $y_1(T) = 1$ and $y_2(T) = 0$ are satisfied. In this case, the constants $c_1$ and $c_2$ need to solve the linear system
$$ -\frac{1}{6}\, c_1 + \frac{1}{2}\, c_2 = 0, \qquad -\frac{1}{2}\, c_1 + c_2 = -1. $$
The solution is given by $c_1 = 6$ and $c_2 = 2$. Thus the optimal control is given by $u(t) = -6\, t + 2$.


Our second problem requires determining $T$ subject to the additional condition that $K(y(T)) := y_1(T) + y_2(T) - 1 = 0$. Therefore we have the transversality condition $p(T) = -\beta\, \partial_y K(y(T))$, that is,
$$ p(T) = -\beta \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$
Now, our purpose is to determine $c_1$, $c_2$, $\beta$ and $T$. For this reason, we take the value of $p(T)$ given above and combine it with the solution to the adjoint equations given above. We obtain
$$ c_1 = -\beta, \qquad -c_1\, T + c_2 = -\beta. $$
Next, from the condition that the HP function is zero along the optimal trajectory, and so at $t = 0$, we have
$$ c_1 + \frac{1}{2}\, c_2^2 = 0. $$
From this equation and the previous one it follows that $\beta = 2/(1 + T)^2$. Further, we take the functions $y_1(t)$ and $y_2(t)$ given above, with $t = T$, and analyse the equation $y_1(T) + y_2(T) = 1$. We obtain $\beta = 3/(T^2 + 3\, T + 3)$. Thus, comparing with the previous result for $\beta$, we obtain
$$ T = \sqrt{3}, $$
from which the values of $\beta$ and $c_1$, $c_2$ follow:
$$ \beta = \frac{2}{(1 + T)^2}, \qquad c_1 = -\beta, \qquad c_2 = -\beta\, (1 + T). $$
The optimal control is given by $u(t) = -\beta\, (T - t) - \beta$.
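The algebraic conditions above are easily solved numerically; the following sketch (illustrative only) recovers $T = \sqrt{3}$ with a root finder and verifies the endpoint condition.

```python
# A numeric check of the second problem of Example 1.3: find T from
# 2/(1+T)^2 = 3/(T^2 + 3T + 3), then check y1(T) + y2(T) = 1.
import numpy as np
from scipy.optimize import brentq

g = lambda T: 2.0 / (1.0 + T) ** 2 - 3.0 / (T ** 2 + 3.0 * T + 3.0)
T = brentq(g, 0.1, 10.0)                  # expect T = sqrt(3)
beta = 2.0 / (1.0 + T) ** 2
c1, c2 = -beta, -beta * (1.0 + T)

y2 = lambda t: -0.5 * c1 * t ** 2 + c2 * t + 1.0
y1 = lambda t: -c1 * t ** 3 / 6.0 + 0.5 * c2 * t ** 2 + t
print(T, np.sqrt(3.0))                    # T ~ 1.732
print(y1(T) + y2(T))                      # ~ 1, as required
```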

Next, we discuss by means of an example the notion of singular controls. A singular control is one for which the Legendre-Clebsch condition (1.38) is not satisfied with strict inequality anywhere along the extremal. Thus, we do not have any concavity property of the HP function. Equivalently, we can say that $u$ is singular if $\partial^2_{uu} H(t, y, u, p) = 0$ along the optimal trajectory. In particular, if $H$ is linear in one (or more) components of the control function, then the extremal is singular. In this case, a generalised form of the Legendre-Clebsch necessary conditions is required that provides a characterisation of a singular extremal; see [28, 44]. The example that follows is taken from [245].

Example 1.4 Consider the following optimal control problem
$$ \min\ J(y, u) := \int_0^1 y(t)\, u(t)\, dt $$
$$ \text{s.t.}\quad y'(t) = u(t), \qquad y(0) = 0, \tag{1.80} $$
$$ u \in U_{ad} := \{ u \in L^2(0, 1) : u(t) \in [-1, 1] \text{ a.e. in } (0, 1) \}. $$

The HP function for this problem is given by
$$ H(t, y, u, p) = p\, u - y\, u, $$

where $p$ solves the adjoint problem
$$ p'(t) = u(t), \qquad p(1) = 0. $$
We have
$$ \partial_u H(t, y, u, p) = p - y, \qquad \partial^2_{uu} H(t, y, u, p) = 0. $$
This result suggests that $H$ cannot have a local extremum in $u$, since it is linear in $u$. However, it may have an extremum at the boundary points of $K_{ad} = [-1, 1]$. At these points, we have
$$ H(t, y, -1, p) = y - p, \qquad H(t, y, 1, p) = p - y. $$
Then the PMP gives the following
$$ u(t) = \begin{cases} 1 & \text{if } p(t) - y(t) > 0 \\ -1 & \text{if } p(t) - y(t) < 0, \end{cases} \tag{1.81} $$
which implies that the control must be piecewise constant. However, it appears that $u = 0$, and thus $p(t) - y(t) = 0$, can be an extremal. In our case, the solution of the state and adjoint equations is immediate. We obtain
$$ y(t) = \int_0^t u(s)\, ds, \qquad p(t) = -\int_t^1 u(s)\, ds. \tag{1.82} $$
Thus, we have that the difference $p(t) - y(t) = -\int_0^1 u(s)\, ds$. These facts lead to the conclusion
$$ u(t) = \begin{cases} 1 & \text{if } \int_0^1 u(s)\, ds < 0 \\ -1 & \text{if } \int_0^1 u(s)\, ds > 0. \end{cases} $$
Clearly, this equation has no solution. Thus, the only possible extremal seems to be $u = 0$, but this is not true. The point is that, in the reasoning leading to (1.81), we have not required that the HP function is evaluated along the optimal solution. If this solution exists, then the HP function is given by
$$ H(t, y(t), u(t), p(t)) = (p(t) - y(t))\, u(t) = -u(t) \int_0^1 u(s)\, ds, $$
where we have used (1.82). Thus the PMP for the optimal $u$ is as follows:
$$ -u(t) \int_0^1 u(s)\, ds = \max_{v \in K_{ad}} \left( -v \int_0^1 u(s)\, ds \right). $$


We see that this equation is satisfied by all admissible control functions in the following set
$$ U_0 = \left\{ u \in U_{ad} : \int_0^1 u(s)\, ds = 0 \right\}. $$
Notice that the PMP condition is satisfied in the sense that both sides of the PMP equality vanish. In this case, the PMP is said to be degenerate. Now, we have the following
$$ \hat J(u) = \int_0^1 y(t)\, u(t)\, dt = \int_0^1 y(t)\, y'(t)\, dt = \frac{1}{2}\, y(1)^2 \ge 0. $$
Therefore, using (1.82), we have
$$ \hat J(u) = \frac{1}{2} \left( \int_0^1 u(s)\, ds \right)^2. $$
Hence, the reduced cost functional is nonnegative and it vanishes exactly at the controls that belong to $U_0$, that is, the singular controls.
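The following brief check (illustrative only) evaluates $\hat J$ on a grid for a few admissible controls and confirms the identity above: zero-mean controls yield a vanishing cost.

```python
# A numeric check of Example 1.4: on a grid, J(u) computed by integrating
# y' = u agrees (up to discretisation error) with (1/2)(int_0^1 u ds)^2,
# so any zero-mean admissible control is optimal (a singular control).
import numpy as np

N = 2000
t, dt = np.linspace(0.0, 1.0, N + 1), 1.0 / N

def J(u):
    y = np.concatenate(([0.0], np.cumsum(u[:-1]) * dt))  # y' = u, y(0) = 0
    return dt * np.sum(y * u)

for u in (np.sin(2 * np.pi * t),          # zero mean: J ~ 0
          np.sign(t - 0.5),               # zero mean: J ~ 0
          0.5 * np.ones_like(t)):         # mean 1/2:  J ~ 1/8
    print(J(u), 0.5 * (dt * np.sum(u)) ** 2)
```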

Now, we report an example from [85] concerning mixed control and state constraints. Notice that the problem below is linear-quadratic with a linear mixed constraint, and in this case the necessary optimality conditions discussed in the previous section are sufficient for optimality; see [85].

Example 1.5 Consider the following optimal control problem
$$ \min\ J(y, u) := \int_0^3 \left( y(t) + u^2(t)/2 \right) dt $$
$$ \text{s.t.}\quad y'(t) = u(t), \qquad y(0) = 0, \tag{1.83} $$
$$ y(t) - u(t) - 1/2 \le 0, $$
$$ u \in U_{ad} := \{ u \in L^2(0, 3) : u(t) \in [-1, 1] \text{ a.e. in } (0, 3) \}. $$

Without the mixed constraint, the optimal process is given by
$$ u(t) = \begin{cases} -1 & \text{if } 0 \le t < 2 \\ t - 3 & \text{if } 2 \le t \le 3, \end{cases} \qquad y(t) = \begin{cases} -t & \text{if } 0 \le t < 2 \\ (t - 3)^2/2 - 5/2 & \text{if } 2 \le t \le 3. \end{cases} $$
However, this process violates the constraint on the interval $[0, 1/2)$, where we have $y - u > 1/2$. Then, in order to keep as close as possible to this solution, we assume that initially the constraint holds as an equality, $y - u = 1/2$. Consequently, we have
$$ y(t) = (1 - e^t)/2, \qquad u(t) = -e^t/2. $$


However, this control is feasible only until $t = \log 2 > 1/2$, at which point we assume that the control switches to the function given above. That is, we have the following candidate for an optimal control
$$ u(t) = \begin{cases} -e^t/2 & \text{if } 0 \le t < \log 2 \\ -1 & \text{if } \log 2 \le t < 2 \\ t - 3 & \text{if } 2 \le t \le 3. \end{cases} $$
Correspondingly, we have the following state trajectory
$$ y(t) = \begin{cases} (1 - e^t)/2 & \text{if } 0 \le t < \log 2 \\ \log 2 - t - 1/2 & \text{if } \log 2 \le t < 2 \\ (t - 3)^2/2 - 3 + \log 2 & \text{if } 2 \le t \le 3. \end{cases} $$
In correspondence with these control and state functions, one can verify that $\phi(t, y, u) = y - u - 1/2$ is equal to zero along the trajectory in the interval $(0, \log 2)$, where the Lagrange multiplier $\mu$ must be nonpositive. On the other hand, in $(\log 2, 3)$ we have strict inequality and $\mu(t) = 0$ in this interval. The extended HP function is given by
$$ \hat H(t, y, u, p, \mu) = p\, u - y - u^2/2 + \mu\, (y - u - 1/2). $$
Therefore, on $[\log 2, 3]$ the adjoint equation is $p' = 1$ with terminal condition $p(3) = 0$; the solution to this problem is $p(t) = t - 3$. On the other hand, in $(0, \log 2)$, we have $p' = 1 - \mu$, and the stationarity condition gives $p - u - \mu = 0$. Hence, combining these two facts, we obtain $p' = -p + 1 + u$, which together with the given $u$ and continuity of $p$ at $t = \log 2$ gives
$$ p(t) = e^{-t}\, (2 \log 2 - 7) - e^t/4 + 1. $$
Therefore we obtain $\mu = 1 - p'$ as follows:
$$ \mu(t) = -e^{-t}\, (7 - 2 \log 2) + e^t/4 + 1, $$
which is nonpositive in $(0, \log 2)$, as required.
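These computations can be verified numerically; the following sketch (illustrative only) checks the continuity of $p$ at $t = \log 2$ and the nonpositivity of $\mu$ on $(0, \log 2)$.

```python
# A numeric sanity check for Example 1.5: continuity of the adjoint p at
# t = log 2 and nonpositivity of the multiplier mu on (0, log 2).
import numpy as np

tau = np.log(2.0)
p_arc = lambda t: np.exp(-t) * (2 * tau - 7) - np.exp(t) / 4 + 1  # on (0, log2)
p_out = lambda t: t - 3.0                                         # on [log2, 3]
mu = lambda t: -np.exp(-t) * (7 - 2 * tau) + np.exp(t) / 4 + 1

print(p_arc(tau), p_out(tau))              # both ~ log(2) - 3
t = np.linspace(1e-6, tau - 1e-6, 1000)
print(np.max(mu(t)) <= 0.0)                # expect True
```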

Chapter 2

The Sequential Quadratic Hamiltonian Method

2.1 Successive Approximations Schemes
2.2 The Sequential Quadratic Hamiltonian Method
2.3 Mixed Control and State Constraints
2.4 Time-Optimal Control Problems
2.5 Analysis of the SQH Method

This chapter is devoted to the formulation and analysis of the sequential quadratic Hamiltonian (SQH) method. This method represents the most recent development in the class of successive approximations schemes for solving optimal control problems governed by differential models. These schemes numerically implement the Pontryagin maximum principle. Different control settings are discussed to illustrate the application of the SQH method. Theoretical results on the wellposedness and convergence of this method are reviewed.

2.1 Successive Approximations Schemes

The successive approximations (SA) schemes are based on the characterisation of optimality in control problems by the Pontryagin maximum principle. These schemes were initially proposed in different variants by H.J. Kelley, R.E. Kopp and H.G. Moyer [152] and by I.A. Krylov and F.L. Chernous'ko [161, 162], motivated in particular by the work of L.I. Rozonoèr [234]. We refer to [74] for an early review. Notice that, sometimes in the literature, SA schemes are referred to as min-H methods [122, 125, 152], or forward-backward sweep methods [189]. The working principle of most SA schemes is the iterative pointwise maximisation of the Hamilton-Pontryagin function of the given optimal control problem. In order to illustrate this principle, it is convenient to recall our



optimal control problem

$$\min J(y,u) := \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma(y(T))$$
$$\text{s.t.} \quad y'(t) = f(t, y(t), u(t)), \qquad y(t_0) = y_0, \tag{2.1}$$
$$u \in U_{ad},$$

where the set of admissible controls is given by

$$U_{ad} = \{ u \in U \, : \, u(t) \in K_{ad} \text{ a.e.} \},$$

where $K_{ad}$ is a compact subset of $\mathbb{R}^m$. Furthermore, we recall the related HP function

$$H(t,y,u,p) = p\, f(t,y,u) - \ell(t,y,u). \tag{2.2}$$

Now, suppose that $u^0 \in U_{ad}$ is a given initial approximation (guess) of the control function sought. Using this control in the governing model, we can determine the corresponding state by solving the forward problem

$$y'(t) = f(t, y(t), u^0(t)), \qquad y(t_0) = y_0. \tag{2.3}$$

We denote the solution to this problem with $y^0$. Next, we solve the adjoint problem

$$p'(t) = -\big( \partial_y f(t, y^0(t), u^0(t)) \big)^\top p(t) + \partial_y \ell(t, y^0(t), u^0(t)), \tag{2.4}$$

with terminal condition $p(T) = -\partial_y \gamma(y^0(T))$. We denote the solution to this problem with $p^0$. Now, we can define a control update that tries to enforce the PMP condition of optimality as follows:

$$H(t, y^0(t), u^1(t), p^0(t)) = \max_{v \in K_{ad}} H(t, y^0(t), v, p^0(t)), \tag{2.5}$$

for almost all $t \in [t_0, T]$. This step constructs the new approximation $u^1$ to the control sought. Clearly, this procedure can be repeated iteratively until a convergence criterion is satisfied. In [161], the stopping criterion is that successive approximations do not differ from one another more than a given tolerance. We summarise the SA scheme of Krylov and Chernous'ko in Algorithm 2.1.

Algorithm 2.1 (SA method)
Input: initial approx. $u^0$, max. number of iterations $k_{max}$, tolerance $\kappa > 0$; set $\tau > \kappa$, $k := 0$.
while ($k < k_{max}$ && $\tau > \kappa$) do
1) Compute the solution $y^k$ to the forward problem
$$y'(t) = f(t, y(t), u^k(t)), \qquad y(t_0) = y_0;$$


2) Compute the solution $p^k$ to the adjoint problem
$$p'(t) = -\big( \partial_y f(t, y^k(t), u^k(t)) \big)^\top p(t) + \partial_y \ell(t, y^k(t), u^k(t)),$$
with terminal condition $p(T) = -\partial_y \gamma(y^k(T))$.
3) Set
$$u^{k+1}(t) = \operatorname{argmax}_{w \in K_{ad}} H\big(t, y^k(t), w, p^k(t)\big)$$
for all $t \in [t_0, T]$.
4) Compute $\tau := \| u^{k+1} - u^k \|^2_{L^2(t_0,T)}$.
5) Set $k := k+1$.
end while

This SA scheme is discussed in detail in [161, 162], including its extension to free endpoint problems and to problems with equality and integral constraints. For a review of variants of the SA scheme and related theoretical results see, e.g., [33, 74, 219]. Concerning the SA scheme for optimal control problems governed by integral equations see, e.g., [244].

Notice that in Step 3 of Algorithm 2.1 we tacitly assume that the resulting $u^{k+1}$ is Lebesgue measurable, and we start the iteration with a measurable $u^0$ in $U_{ad}$. This property is guaranteed to hold if the function $(t,w) \mapsto H(t, y^k(t), w, p^k(t))$ is Lebesgue measurable in $t$ for each $w \in K_{ad}$ and is continuous in $w$ for each $t \in [t_0,T]$; see [226].

A result that motivates Step 3 in Algorithm 2.1 is the following estimate given by Rozonoèr in [234]. Let $u, v \in U_{ad}$ with $K_{ad}$ compact and convex; then there exists a constant $C > 0$ such that the following holds

$$\hat J(v) - \hat J(u) = -\int_{t_0}^{T} \big( H(t, y(t), v(t), p(t)) - H(t, y(t), u(t), p(t)) \big)\, dt + R, \tag{2.6}$$

where $|R| \le C \big( \int_{t_0}^{T} |u(t) - v(t)|\, dt \big)^2$. The constant $C$ depends on the size of the interval and on the Lipschitz constants of $f$ and $\ell$ with respect to $y$. In (2.6), the functions $y$ and $p$ are the solutions to the state and adjoint problems corresponding to the given $u$. We see that with an update $v = u + \delta u$ that increases the value of a differentiable $H$ along the trajectory $(y(t), p(t))$, we obtain a reduction of the value of the functional of order $\|\delta u\|$, if this norm of the variation of the control is sufficiently small such that the first term on the right-hand side of (2.6) is larger (in absolute value) than the remainder term, which is bounded by $\|\delta u\|^2$. Notice that, in this context, a needle variation of $u$ is allowed; see Theorem 1.9. We remark that (2.6) resembles Weierstrass' formula [270]. We also remark the important fact that in Step 3, for each $t$, a finite-dimensional optimisation problem must be solved, which usually can be done by analytical means and/or by comparison of a finite number of alternatives.
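To make the forward-backward structure of Algorithm 2.1 concrete, we give a minimal Python sketch (an illustration, not the codes accompanying this book). The problem data f, df_dy, dl_dy, dgamma_dy and the HP function H, as in (2.2) and (2.4), are assumed to be user-supplied callables; the control is scalar and $K_{ad}$ is replaced by a finite sample of admissible values, so that Step 3 becomes a direct search:

```python
import numpy as np

def sa_method(f, df_dy, dl_dy, dgamma_dy, H, K_ad, y0, t, u, kmax=50, kappa=1e-8):
    # Algorithm 2.1 on a uniform grid t, with explicit Euler sweeps for brevity.
    dt = t[1] - t[0]
    for k in range(kmax):
        # Step 1: forward sweep for the state
        y = np.zeros((len(t), len(y0))); y[0] = y0
        for i in range(len(t) - 1):
            y[i + 1] = y[i] + dt * f(t[i], y[i], u[i])
        # Step 2: backward sweep for the adjoint, p(T) = -dgamma_dy(y(T))
        p = np.zeros_like(y); p[-1] = -dgamma_dy(y[-1])
        for i in range(len(t) - 1, 0, -1):
            p[i - 1] = p[i] + dt * (df_dy(t[i], y[i], u[i]).T @ p[i]
                                    - dl_dy(t[i], y[i], u[i]))
        # Step 3: pointwise maximisation of the HP function (direct search)
        u_new = np.array([max(K_ad, key=lambda w: H(t[i], y[i], w, p[i]))
                          for i in range(len(t))])
        # Steps 4-5: stopping criterion and update
        tau = np.sum((u_new - u) ** 2) * dt
        u = u_new
        if tau < kappa:
            break
    return u, y
```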


A special case often considered in the references above is given by a control problem in Mayer form with a linear $\gamma$, where the forward model has the linear composite structure $f(t,y,u) = A(t)\, y + B(t)\, u$. In this case, the adjoint problem depends neither on the state variable $y$ nor on the control variable $u$. Therefore Step 2 in Algorithm 2.1 needs to be performed only once, and the same is true for Step 3. In this case, the SA scheme converges in two iterations.

In the early work [161], the SA scheme is illustrated solving an optimal control problem with the following objective functional:

$$J(y,u) = \sum_{i=1}^{n} c_i\, y_i(T), \qquad \sum_{i=1}^{n} c_i^2 > 0. \tag{2.7}$$

Further, it is required that $u \in U_{ad} := \{ u \in U : u(t) \in K_{ad} \text{ a.e.} \}$, where $U = L^2(0,T;\mathbb{R}^m)$, and $K_{ad}$ is a compact subset of $\mathbb{R}^m$. Notice that in (2.7) no running cost appears. Thus, in the case of an autonomous system, the HP function has the following structure:

$$H(y,u,p) = \sum_{i=1}^{n} p_i\, f_i(y,u). \tag{2.8}$$

For such a system and the functional (2.7), the adjoint equation is given by

$$p_i'(t) = -\sum_{k=1}^{n} \frac{\partial f_k}{\partial y_i}(y,u)\, p_k. \tag{2.9}$$

If the time horizon of this optimal control problem is fixed, then the terminal condition for this equation is given by $p_i(T) = -c_i$, $i = 1, \ldots, n$. On the other hand, for a free endpoint problem where $T$ is variable and should be optimised, usually a terminal condition for the state at final time is considered. In particular, as in [161] one can choose the following condition on a component of the state variable

$$y_j(T) = A, \qquad j \in \{1, 2, \ldots, n\}.$$

Then the terminal condition for (2.9) can be derived from the fact that, at optimality, we have $H(y(T), u(T), p(T)) = 0$. Thus, we obtain

$$p_j(T) = -\frac{\sum_{i=1,\, i \neq j}^{n} p_i(T)\, f_i(y(T), u(T))}{f_j(y(T), u(T))},$$

and $p_i(T) = -c_i$ for $i \neq j$. Notice that this transversality condition is also obtained with (1.56).

In [161] it was pointed out that the SA method applied to this problem is not robust, that is, convergence is attained only by providing a sufficiently accurate initial guess of the optimal control, and assuming a certain range of values of the optimisation parameters. For this reason, different modifications


of the SA scheme were proposed in order to guarantee convergence; see [74]. Among these variants, the simplest one, presented in [162], consists in the following modification of Step 3 of Algorithm 2.1:

$$\tilde u(t) = \operatorname{argmax}_{w \in K_{ad}} H\big(t, y^k(t), w, p^k(t)\big), \qquad u^{k+1}(t) = (1-\alpha)\, u^k(t) + \alpha\, \tilde u(t), \tag{2.10}$$

where a convex $K_{ad}$ is assumed. That is, the control is updated by a convex combination of the newly computed control and the control of the preceding iteration. However, the parameter $\alpha \in [0,1]$ should be chosen such that $\hat J((1-\alpha)\, u^k + \alpha\, \tilde u) < \hat J(u^k)$ if $u^k$ is not optimal, which requires to perform a line-search procedure.

The similarity of this modified SA scheme with a projected steepest descent method was investigated in [33] with a linear-quadratic control problem, and in this case an a-priori estimate of $\alpha$ was obtained; but this similarity is limited to the linear-quadratic case. However, a few convergence proofs were presented for specific optimal control problems and special SA variants [74], while a convergence theory that covers the possibly large range of applicability of SA methods seems not available. In view of this discussion on the convergence of SA schemes, we present the following example taken from [245], with some additional remarks.

Example 2.1 Consider the following optimal control problem

$$\min J(y,u) := \frac{1}{2} \int_0^1 \big( y^2(t) + u^2(t) \big)\, dt$$
$$\text{s.t.} \quad y'(t) = u(t), \quad y(0) = 0, \quad u \in U_{ad}, \tag{2.11}$$

where the set of admissible controls is given by $U_{ad} = \{ u \in U : u(t) \in [-1,1] \text{ a.e.} \}$. Clearly, the control $u = 0$ is the unique solution to this optimal control problem. The related HP function is given by

$$H(t,y,u,p) = p\, u - \big( y^2 + u^2 \big)/2. \tag{2.12}$$

We obtain the adjoint problem

$$p'(t) = y(t), \qquad p(1) = 0. \tag{2.13}$$

Since $H$ is differentiable with respect to $u$, we consider the optimality conditions $(\partial_u H, v-u) \le 0$, for all $v \in U_{ad}$. We obtain

$$u(t) = \begin{cases} -1 & \text{if } p(t) < -1 \\ p(t) & \text{if } -1 \le p(t) \le 1 \\ 1 & \text{if } p(t) > 1. \end{cases} \tag{2.14}$$


Now, let us apply the SA scheme, starting with any $u^0 \in U_{ad}$. With this control approximation in the state equation and integration, we obtain

$$-t \le y^0(t) = \int_0^t u^0(s)\, ds \le t.$$

Now, replace this $y^0$ in the adjoint equation. Integration gives

$$-\frac{1}{2} \le \frac{t^2-1}{2} \le p^0(t) = -\int_t^1 y^0(s)\, ds \le \frac{1-t^2}{2} \le \frac{1}{2}.$$

Therefore from (2.14), we have $u^1(t) = p^0(t)$ and the bounds

$$-\frac{1}{2} \le u^1(t) \le \frac{1}{2}.$$

Now, if we perform the above calculation procedure several times, each time starting with the new control approximation, after $k$ iterations we obtain the following result:

$$-\frac{1}{2^k} \le u^k(t) \le \frac{1}{2^k},$$

which shows that the SA sequence $(u^k)$ converges uniformly to the zero function (the solution) in $[0,1]$ as $k \to \infty$.

Notice that, starting with $u^0 \in U_{ad}$, for all subsequent control approximations, the control constraints are not active, since $p^k(t) \in (-1,1)$. Further, notice that the reduced gradient is given by $\nabla \hat J(u^k)(t) = u^k(t) - p^k(t)$. Therefore, a steepest descent scheme with step size $\alpha$ is as follows:

$$u^{k+1}(t) = u^k(t) - \alpha\, \big( u^k(t) - p^k(t) \big) = (1-\alpha)\, u^k(t) + \alpha\, p^k(t) = (1-\alpha)\, u^k(t) + \alpha\, \tilde u(t),$$

where $\tilde u(t) = p^k(t)$.
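This contraction can also be observed numerically. The following short script (a sketch under the stated trapezoidal discretisation, not the codes accompanying this book) runs the SA iteration for (2.11) and prints $\max_t |u^k(t)|$ together with the bound $1/2^k$:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1001)
u = np.ones_like(t)  # any admissible initial guess u^0 in [-1, 1]
for k in range(6):
    # forward: y' = u, y(0) = 0 (cumulative trapezoidal quadrature)
    y = np.concatenate(([0.0], np.cumsum(0.5*(u[1:] + u[:-1])*np.diff(t))))
    # adjoint: p' = y, p(1) = 0, i.e. p(t) = -int_t^1 y(s) ds
    I = np.concatenate(([0.0], np.cumsum(0.5*(y[1:] + y[:-1])*np.diff(t))))
    p = I - I[-1]
    u = np.clip(p, -1.0, 1.0)  # PMP update (2.14)
    print(k + 1, np.abs(u).max(), 0.5**(k + 1))
```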

2.2 The Sequential Quadratic Hamiltonian Method

In its original formulation, the SA scheme appears efficient but not robust with respect to the numerical and optimisation parameters. Twenty years later, an improvement in robustness was achieved by Y. Sakawa and Y. Shindo, with the algorithm given in [241], by introducing a quadratic penalty of the control updates that results in an augmented HP function as follows:

$$H_\epsilon(t,y,u,v,p) := H(t,y,u,p) - \epsilon\, |u-v|^2. \tag{2.15}$$

The supporting rationale of adding the quadratic term $\epsilon\, |u-v|^2$, $\epsilon > 0$, is that this term penalises local control updates that differ too much from the current control value represented by $v$. As a source of inspiration for this addition, Sakawa and Shindo refer to the work of B. Järmark presented in [147], thus anticipating an approach similar to the proximal scheme proposed by R. T. Rockafellar in [225]. However, notice that in these early works the differentiability of $H$ with respect to $u$ (in a convex set) is required, which is not necessarily the case in the SA approach or any approach that relies on the original PMP formulation. On the other hand, assuming a differentiable and concave $H$, and convexity of $U_{ad}$, one can see that the quadratic augmentation term leads to an update that shows similarities to a steepest descent step with step-size $\alpha = 1/(2\epsilon)$, which in some cases corresponds to the SA modification (2.10) discussed in [162]. Notice that this idea can be further traced back to the works of R.V. Southwell on relaxation schemes in the field of numerical linear algebra [248], and it appears a natural choice in view of Rozonoèr's result (2.6).

Now, we discuss the method of Sakawa and Shindo [241] as follows.

Algorithm 2.2 (Sakawa-Shindo method)
Input: initial approx. $u^0$, max. number of iterations $k_{max}$, tolerance $\kappa > 0$, $\epsilon > 0$, $\sigma > 1$, and $\zeta \in (0,1)$; set $\tau > \kappa$, $k := 0$.
Compute the solution $y^0$ to the forward problem
$$y'(t) = f(t, y(t), u^0(t)), \qquad y(t_0) = y_0.$$
while ($k < k_{max}$ && $\tau > \kappa$) do
1) Compute the solution $p^k$ to the adjoint problem
$$p'(t) = -\big( \partial_y f(t, y^k(t), u^k(t)) \big)^\top p(t) + \partial_y \ell(t, y^k(t), u^k(t)),$$
with terminal condition $p(T) = -\partial_y \gamma(y^k(T))$.
2) Determine $u^{k+1}$ and $y^{k+1}$ such that the following constrained optimisation problem is satisfied
$$H_\epsilon\big(t, y^{k+1}(t), u^{k+1}(t), u^k(t), p^k(t)\big) = \max_{w \in K_{ad}} H_\epsilon\big(t, y^{k+1}(t), w, u^k(t), p^k(t)\big)$$
together with the forward problem
$$\frac{d}{dt}\, y^{k+1}(t) = f(t, y^{k+1}(t), u^{k+1}(t)), \qquad y^{k+1}(t_0) = y_0;$$
for all $t \in [t_0, T]$.
3) If $J(y^{k+1}, u^{k+1}) - J(y^k, u^k) > 0$ (no minimisation), then increase $\epsilon$ with $\epsilon = \sigma\, \epsilon$ and go to Step 2. Else if $J(y^{k+1}, u^{k+1}) - J(y^k, u^k) < 0$, then decrease $\epsilon$ with $\epsilon = \zeta\, \epsilon$ and continue.

4) Compute $\tau := \| u^{k+1} - u^k \|^2_{L^2(t_0,T)}$.
5) Set $k := k+1$.

end while

In this algorithm, we notice two peculiar features. On the one hand, in Step 2, we have to perform maximisation of $H_\epsilon$ by updating simultaneously the control and the state variable, for a given adjoint variable. This is possible in a numerical discretised setting as discussed in [241]. On the other hand, in Step 3, we have a procedure similar to a line-search that adaptively chooses $\epsilon$ to guarantee the construction of a minimising sequence. The fact that one can always find an $\epsilon$ such that, correspondingly, the value of $J$ is reduced, is a central issue in the theoretical discussion of the Sakawa-Shindo method and its variants. We postpone this discussion to the next section. However, it is clear that the need of coupled updates of the state and control variables represents a limitation of this method from a numerical and theoretical point of view. In fact, these updates are difficult to realise in the case of large-size systems, especially in the presence of nonlinear structures and of approximated infinite-dimensional problems, since they may require multiple evaluations of the forward problem within Step 2. Moreover, it is not clear how to analyse this coupling at a continuous level beyond the statement of Step 2. Further, we see that Step 3 can be improved by including the concept of sufficient reduction of the objective functional in a way similar to that introduced by L. Armijo and P. Wolfe.

In order to address these issues in the Sakawa-Shindo method, especially in view of the extension of this method for solving optimal control problems governed by partial differential equations, the sequential quadratic Hamiltonian (SQH) method was proposed in [52], and applied to ODE control problems in [54]. In this development, it was recognised that the quadratic penalisation of large control updates also prevents large changes of the state $y$, such that in Step 2 of Algorithm 2.2 the control update can be calculated using the state function of the governing model obtained at the previous iteration. This change greatly simplifies the algorithm and increases its efficiency, also by avoiding the need to recalculate the state function after a local control update. The other new feature of the SQH method is to include a condition of sufficient decrease expressed by

$$J(y^{k+1}, u^{k+1}) - J(y^k, u^k) \le -\eta\, \| u^{k+1} - u^k \|^2_{L^2(t_0,T)},$$

for some $\eta > 0$, which must be satisfied in order to accept the update $(y^{k+1}, u^{k+1})$. This condition is motivated by the estimate (2.6), in the sense that a successful update must induce a change of the value of the cost functional that is at least of the order of $\|\delta u\|^2$. The SQH method is implemented by the following algorithm.


Algorithm 2.3 (SQH method)
Input: initial approx. $u^0$, max. number of iterations $k_{max}$, tolerance $\kappa > 0$, $\epsilon > 0$, $\sigma > 1$, $\eta > 0$, and $\zeta \in (0,1)$; set $\tau > \kappa$, $k := 0$.
Compute the solution $y^0$ to the forward problem
$$y'(t) = f(t, y(t), u^0(t)), \qquad y(t_0) = y_0.$$
while ($k < k_{max}$ && $\tau > \kappa$) do
1) Compute the solution $p^k$ to the adjoint problem
$$p'(t) = -\big( \partial_y f(t, y^k(t), u^k(t)) \big)^\top p(t) + \partial_y \ell(t, y^k(t), u^k(t)),$$
with terminal condition $p(T) = -\partial_y \gamma(y^k(T))$.
2) Determine $u^{k+1}$ that solves the following optimisation problem
$$H_\epsilon\big(t, y^k(t), u^{k+1}(t), u^k(t), p^k(t)\big) = \max_{w \in K_{ad}} H_\epsilon\big(t, y^k(t), w, u^k(t), p^k(t)\big)$$
for almost all $t \in [t_0, T]$.
3) Compute the solution $y^{k+1}$ to the forward problem
$$y'(t) = f(t, y(t), u^{k+1}(t)), \qquad y(t_0) = y_0.$$

4) Compute $\tau := \| u^{k+1} - u^k \|^2_{L^2(t_0,T;\mathbb{R}^m)}$.
5) If $J(y^{k+1}, u^{k+1}) - J(y^k, u^k) > -\eta\, \tau$, then increase $\epsilon$ with $\epsilon = \sigma\, \epsilon$ and go to Step 2. Else if $J(y^{k+1}, u^{k+1}) - J(y^k, u^k) \le -\eta\, \tau$, then decrease $\epsilon$ with $\epsilon = \zeta\, \epsilon$ and continue.
6) Set $k := k+1$.
end while

In the SQH Algorithm 2.3, we see that if the resulting control $u^{k+1}$ and the corresponding $y^{k+1}$ do not decrease the value of the cost functional by at least $\eta\, \tau$ with respect to the former value $J(y^k, u^k)$, then the penalisation parameter $\epsilon$ is increased and the maximisation of the resulting augmented Hamiltonian $H_\epsilon$ is performed again. Otherwise, if we have sufficient decrease, the new control function as well as the corresponding state are accepted. In this case, the adjoint problem is solved again and the value of $\epsilon$ is reduced such that greater variations of the control value become more likely. If the convergence criterion $\tau < \kappa$ is not fulfilled, then in the SQH algorithm the maximisation procedure is repeated. If the convergence criterion is fulfilled, then the algorithm stops and returns the last calculated control. Notice that the adaptive choice of the value of the weight $\epsilon$ plays an essential role to attain convergence of the SQH method.
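For reference, we give a minimal Python sketch of Algorithm 2.3 for a scalar control on a uniform grid (an illustration, not the codes accompanying this book). The problem-specific routines forward, adjoint, J and argmax_H, the latter performing the pointwise maximisation of $H_\epsilon$ in Step 2 either analytically or by direct search over $K_{ad}$, are assumed to be user-supplied:

```python
import numpy as np

def sqh_method(forward, adjoint, J, argmax_H, t, u0,
               eps=1.0, sigma=1.2, zeta=0.8, eta=1e-9, kappa=1e-10, kmax=500):
    dt = t[1] - t[0]
    u = u0.copy()
    y = forward(u)
    for k in range(kmax):
        p = adjoint(y, u)                        # Step 1
        # the inner loop terminates for eps large enough (see Section 2.5)
        while True:
            u_new = argmax_H(t, y, u, p, eps)    # Step 2: pointwise max of H_eps
            y_new = forward(u_new)               # Step 3
            tau = np.sum((u_new - u) ** 2) * dt  # Step 4
            if J(y_new, u_new) - J(y, u) > -eta * tau:
                eps *= sigma                     # Step 5: no sufficient decrease
            else:
                eps *= zeta                      # sufficient decrease: accept
                break
        u, y = u_new, y_new                      # Step 6
        if tau < kappa:                          # convergence criterion
            break
    return u, y
```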

The wellposedness of the SQH method is discussed in the next section.

Our first application of the SQH method is with an optimal control problem related to nuclear magnetic resonance (NMR), as illustrated in the following example with a control-affine system.

Example 2.2 In 1946, F. Bloch proposed a semi-classical model that describes the time evolution of the bulk magnetisation of a noninteracting sample of atomic nuclei with spin $\frac12$; see [39]. This model is based on the fact that, in classical electrodynamics, the time derivative of the angular momentum is equal to the cross product of the magnetic moment with the external magnetic field. Scaling this equation by multiplication with the gyro-magnetic ratio $\gamma$, we obtain

$$m'(t) = m(t) \times \gamma\, B(t), \tag{2.16}$$

where $m = (m_x, m_y, m_z)^\top$, in a Cartesian $(x,y,z)$-coordinate system, denotes the nuclear magnetisation (magnetic moment), and $B$ is the external magnetic field. If this field is not collinear to the bulk magnetisation, it can be shown that $m(t)$ precesses around $B$ with the so-called Larmor frequency given by $\omega = -\gamma\, |B|$, where $|B|$ is the magnitude of the magnetic field $B$. This precession results in a time-varying magnetic field, which would induce a voltage signal in a nearby coil. However, in thermal equilibrium, the $m$ and $B$ fields are parallel in the average and no signal is produced. In NMR, one applies a short electromagnetic pulse (a control mechanism) that deflects the bulk magnetisation from its equilibrium state such that it induces the above-mentioned signal, which can be measured in laboratory experiments. One recognises that the strength of this signal decays, which can be explained by the fact that the individual spins interact with each other and with the environment. That is, the bulk magnetisation $m$ returns to its equilibrium state by releasing energy until $m$ and $B$ are parallel and the signal disappears. This is the so-called relaxation process that Bloch modelled by two relaxation time constants, $T_1$ and $T_2$. The first relaxation is called spin-lattice relaxation and is modelled by the following equation:

$$m_z'(t) = \frac{m_0 - m_z(t)}{T_1}. \tag{2.17}$$

Thus, $T_1$ is related to the regrowth of the longitudinal magnetisation $m_z$, and is called the spin-lattice relaxation time. In this equation, $m_0$ represents the nuclear magnetisation at equilibrium. The second mechanism, called spin-spin relaxation, models the decay of the transverse components $m_x$ and $m_y$ as follows:

$$m_x'(t) = -\frac{m_x(t)}{T_2}, \tag{2.18}$$
$$m_y'(t) = -\frac{m_y(t)}{T_2}. \tag{2.19}$$


Combining (2.16) with (2.17) – (2.19) gives the full Bloch equations

$$m_x'(t) = [m(t) \times \gamma B(t)]_x - \frac{m_x(t)}{T_2},$$
$$m_y'(t) = [m(t) \times \gamma B(t)]_y - \frac{m_y(t)}{T_2},$$
$$m_z'(t) = [m(t) \times \gamma B(t)]_z + \frac{m_0 - m_z(t)}{T_1}.$$

Now, we consider the following external magnetic field with time-dependent transverse components that represent our control fields

$$-\gamma B(t) := (a\, u_2(t),\ -a\, u_1(t),\ \omega),$$

where $a$ is a given constant. Thus, we can write our controlled Bloch model as follows:

$$m_x'(t) = -\frac{d}{2}\, m_x(t) - \omega\, m_y(t) - a\, u_1(t)\, m_z(t),$$
$$m_y'(t) = \omega\, m_x(t) - \frac{d}{2}\, m_y(t) - a\, u_2(t)\, m_z(t),$$
$$m_z'(t) = a\, u_1(t)\, m_x(t) + a\, u_2(t)\, m_y(t) - d\, (m_z(t) + 1),$$

where $d$ represents a damping constant, depending on $T_1$ and $T_2$, and an appropriate scaling of $m_0$. This is a model with a linear-affine control mechanism:

$$m'(t) = \big( A + u_1(t)\, B_1 + u_2(t)\, B_2 \big)\, m(t) + D, \tag{2.20}$$

where

$$A = \begin{pmatrix} -\frac{d}{2} & -\omega & 0 \\ \omega & -\frac{d}{2} & 0 \\ 0 & 0 & -d \end{pmatrix}, \quad B_1 = \begin{pmatrix} 0 & 0 & -a \\ 0 & 0 & 0 \\ a & 0 & 0 \end{pmatrix}, \quad B_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -a \\ 0 & a & 0 \end{pmatrix},$$

1 ν km(T ) − mT k22 + kuk2L2 (0,T ) + β kukL1 (0,T ) . 2 2

(2.21)

Therefore the purpose of the control u = (u1 , u2 ) is to drive the Bloch system from the initial configuration m0 to a desired target configuration mT at t = T , while minimising a combination of L2 and L1 costs of the control. We assume box constraints for both control components as follows:  uj ∈ Uad := v ∈ L2 (0, T ; R) : v(t) ∈ Kad a.e. , j = 1, 2, where Kad = [u, u].

56

The Sequential Quadratic Hamiltonian Method 1.5

1 0.8

1 0.6 0.5

0.4 0.2

y

u

0 0

-0.5 -0.2 -0.4

-1

-0.6 -1.5 -0.8 -2

-1 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

0.1

0.2

0.3

0.4

t

0.5

0.6

0.7

0.8

0.9

1

t

FIGURE 2.1: Optimal controls and trajectories of the Bloch system obtained with the SQH scheme in the first experiment.

In the first experiment, we choose $a = 1$, $\omega = 1$, $d = 0$, and $m_0 = (0,0,1)^\top$, that is, the equilibrium state, and $m_T = (1,0,0)^\top$. We take $\underline{u} = -2$ and $\overline{u} = 2$, and $T = 1$. The weights of the cost of the control are given by $\nu = 10^{-7}$ and $\beta = 10^{-7}$. The parameters of the SQH Algorithm 2.3 are as follows: we choose $\zeta = 0.8$, $\sigma = 1.2$, $\eta = 10^{-9}$, $\kappa = 10^{-10}$, the initial value $\epsilon = 100$, and zero is the initial guess for the control function, $u^0 = 0$. These values are chosen arbitrarily in their respective range. In this experiment, the SQH scheme converges after 40 iterations. The resulting optimal controls and trajectories are depicted in Figure 2.1. In the second experiment, we change part of the setting as follows: $T = 1/2$, $d = 10^{-3}$ and $\beta = 10^{-3}$. In this case, the SQH scheme converges after 50 iterations. The resulting optimal controls and trajectory are depicted in Figure 2.2. In the first experiment, we see that the box constraints on the controls are not active, and the Bloch system reaches the given target. However, in the second experiment, where the target must be reached in half the time, the controls are larger and the box constraints become active.


FIGURE 2.2: Optimal controls and trajectories of the Bloch system obtained with the SQH scheme in the second experiment.


FIGURE 2.3: Convergence history of the objective $J$ along the SQH iterations for the first experiment (left) and for the second one.

Now, we make some additional remarks on some general features of the SQH algorithm by using the above example for illustration. Associated to the optimal control problem of the Bloch system is the following HP function

$$H(t,m,u,p) = p^\top \big[ \big( A + u_1 B_1 + u_2 B_2 \big)\, m + D \big] - \frac{\nu}{2}\, (u_1^2 + u_2^2) - \beta\, (|u_1| + |u_2|). \tag{2.22}$$

Further, the related adjoint problem is as follows:

$$p'(t) = -\big( A + u_1(t)\, B_1 + u_2(t)\, B_2 \big)^\top p(t), \qquad p(T) = -(m(T) - m_T). \tag{2.23}$$

We see that the functional (2.21) is not Fréchet differentiable; however, it is sub-differentiable with respect to $u$. Therefore semi-smooth functional calculus [145, 272] can be used to define gradient-based methods to solve this optimal control problem; in particular, see [78] for analysis and implementation of a semi-smooth Newton scheme. On the other hand, in a PMP framework sub-differentiability is not required, but it can be used in Step 2 of Algorithm 2.3 in order to maximise

$$H_\epsilon\big(t, m^k, w, u^k, p^k\big) = H\big(t, m^k, w, p^k\big) - \epsilon\, (w_1 - u_1^k)^2 - \epsilon\, (w_2 - u_2^k)^2$$


with respect to $w = (w_1, w_2) \in K_{ad} \times K_{ad}$; see also [85]. However, in this case the optimisation problem is finite-dimensional and a case study allows to determine analytically the few extremal points among which the maximum of $H_\epsilon$ is attained. For this purpose, a simple approach to determine the subdifferential of the absolute-value term $|u|$ in (2.22) is to consider separately the two cases where $u$ is positive or negative. Then, for each $t$ fixed, we obtain the following two candidates for the $j$th component of the control sought. We have

$$u_j^1 = \max\Big\{ \min\Big\{ \overline{u},\ \frac{2\epsilon\, v_j + p^\top B_j\, m - \beta}{2\epsilon + \nu} \Big\},\ 0 \Big\},$$

and

$$u_j^2 = \max\Big\{ \min\Big\{ 0,\ \frac{2\epsilon\, v_j + p^\top B_j\, m + \beta}{2\epsilon + \nu} \Big\},\ \underline{u} \Big\},$$

where $v_j = u_j^k$, $j = 1,2$, $p = p^k$ and $m = m^k$. Thus, in the SQH scheme, we evaluate $H_\epsilon$ on the points $(u_1^1, u_2^1)$, $(u_1^1, u_2^2)$, $(u_1^2, u_2^1)$, and $(u_1^2, u_2^2)$, and choose $u^{k+1}(t)$ equal to the pair that maximises the augmented HP function; a sketch of this pointwise update is given below.

Clearly, in the most general case where no differentiability property can be exploited, a direct search of the maximum of $H_\epsilon$ may be required. In this case, one can use different methods as those discussed in [157]; see also [275]. In this framework, a particular class of problems where the SQH method is very effective is when the admissible set of values of the control includes a finite set of distinct values, which is of interest in mixed-integer optimisation and control problems; see, e.g., [206, 239]. For example, in the first experiment with an optimal control of the Bloch system, we can replace the convex set $K_{ad} = [-2,2]$ with the set $K_{ad} = \{-2 + 2j/10,\ j = 0, \ldots, 20\}$ and, in Step 2 of Algorithm 2.3, perform a direct search of the maximum of $H_\epsilon$ on $K_{ad} \times K_{ad}$. However, in this case an initial $\epsilon = 1$ is chosen. The resulting optimal controls and trajectories are depicted in Figure 2.4.
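A Python sketch of this pointwise Step 2 update is as follows, where v, p and m denote the values $u^k(t)$, $p^k(t)$ and $m^k(t)$ at the fixed time $t$, and u_lo, u_hi are the bounds of $K_{ad}$ (again an illustration, not the codes accompanying this book):

```python
import numpy as np
from itertools import product

def step2_update(v, p, m, A, B1, B2, D, eps, nu, beta, u_lo, u_hi):
    # Augmented HP function (2.22) with the penalty term -eps*|w - v|^2
    def H_eps(w):
        rhs = (A + w[0] * B1 + w[1] * B2) @ m + D
        return (p @ rhs - 0.5 * nu * (w[0]**2 + w[1]**2)
                - beta * (abs(w[0]) + abs(w[1]))
                - eps * ((w[0] - v[0])**2 + (w[1] - v[1])**2))
    cand = []
    for j, Bj in enumerate((B1, B2)):
        pBm = p @ Bj @ m
        u1 = max(min(u_hi, (2*eps*v[j] + pBm - beta) / (2*eps + nu)), 0.0)  # w_j >= 0
        u2 = max(min(0.0, (2*eps*v[j] + pBm + beta) / (2*eps + nu)), u_lo)  # w_j <= 0
        cand.append((u1, u2))
    # compare the four candidate pairs through the augmented HP function
    return max(product(cand[0], cand[1]), key=H_eps)
```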


FIGURE 2.4: Optimal controls and trajectories of the Bloch system obtained with the SQH scheme with a discrete admissible set of control values.


We remark that the Bloch model appearing in Example 2.2 is omnipresent in quantum control problems [47, 49], although a damping effect may appear only in the case of open quantum systems; see, e.g., [15, 47, 48]. More generally, the quantum description of a controlled spin system leads to the following equation [78, 79]

$$y'(t) = \Big( A + \sum_{j=1}^{N_C} u_j(t)\, B_j \Big)\, y(t), \qquad y(0) = y_0, \tag{2.24}$$

where $y(t) \in \mathbb{R}^n$ represents the quantum state of the system at the time $t$, $u_j(t) \in \mathbb{R}$ is the $j$th component of the control vector-function $u = (u_1, \ldots, u_{N_C})$ with $N_C$ controls, and the matrices $A, B_j \in \mathbb{R}^{n \times n}$, $j = 1, \ldots, N_C$, are constant and skew-symmetric. The matrix $A$ drives the dynamics of the uncontrolled system, and the $B_j$, modulated by the $u_j$, represent the control potentials, that is, a varying electromagnetic field as, e.g., a laser pulse. However, in the laboratory, there are bounds on the possible range of values that the controls can attain; therefore the optimal control function $u$ is sought in the following set of admissible controls

$$U_{ad} := \big\{ u \in L^2\big(0,T;\mathbb{R}^{N_C}\big) \, : \, u(t) \in K_{ad} \text{ a.e.} \big\}, \tag{2.25}$$

where $K_{ad}$ is a compact and convex set in $\mathbb{R}^{N_C}$. We refer to [47] for a proof of existence and uniqueness of a solution $y \in H^1(0,T;\mathbb{R}^n)$ to the initial-value problem (2.24), for any given control $u \in U_{ad}$.

Typically, in quantum control problems, the aim of the control is to steer the system as close as possible to a given target state $y_T$ at a final time $T$. Moreover, one requires that the cost of the control is kept small with respect to some (energy) norm. These modelling requirements lead to the following cost functional

$$J(y,u) := \frac{1}{2}\, \| y(T) - y_T \|_2^2 + \sum_{j=1}^{N_C} \int_0^T g(u_j(t))\, dt, \tag{2.26}$$

where $g$ defines the class of control costs. As in the previous example, we can take $g(u) = \frac{\nu}{2} u^2 + \beta\, |u|$; this case is investigated in detail in [78]. However, we would like to demonstrate the ability of the SQH method to solve optimal control problems with a discontinuous cost of the control. For this reason, we choose $g$ as follows [54]:

$$g(u) := \frac{\nu}{2}\, u^2 + \begin{cases} \beta\, |u| & \text{if } |u| > s \\ 0 & \text{else,} \end{cases}$$

where $\nu, \beta > 0$ and $s > 0$; the last term in $g$ represents a discontinuous $L^1$-type cost functional that measures zero control costs if the control is below (componentwise) the given threshold $s$, whereas it corresponds to $L^1$-costs for any $u_j$ above the threshold.


In the following example, we consider a specific application of the SQH algorithm for solving a quantum control problem that requires to minimise the cost functional (2.26), with $g$ as given above, in the set (2.25).

Example 2.3 The state of a quantum system of two uncoupled spin-1/2 particles can be represented by the quantum density operator whose dynamics is governed by the Liouville–von Neumann master equation [47]. In a real matrix representation, this equation has the structure (2.24), where the matrices $A$ and $B$ are given by [16]

$$A = 2\pi K_A \begin{pmatrix} 0 & -1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \qquad B = 2\pi \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

In this formulation, we consider the case $N_C = 1$, and take $K_{ad} = [\underline{u}, \overline{u}]$, $\underline{u} < \overline{u}$. Notice that $n = 6$, and we take $K_A = 483$, as in [16].

The purpose of the control is to steer the system, starting at $t = 0$ with the initial configuration $y_0 = \frac{1}{\sqrt{2}}\, (0,0,1,0,0,1)^\top$, in order to reach a desired target configuration at $t = T$ given by $y_T = \frac{1}{\sqrt{2}}\, (0,0,-1,0,0,-1)^\top$. We take $\underline{u} = -100$ and $\overline{u} = 100$ and choose $T = 0.01$. For the parameters of the SQH method implemented in Algorithm 2.3, we choose $\zeta = 0.9$, $\sigma = 1.1$, $\eta = 10^{-5}$, $\kappa = 10^{-8}$, the initial guess $\epsilon = 1$, and zero is the initial guess for the control function, $u^0 = 0$.

In Figure 2.5, we plot the optimal controls obtained with the SQH scheme for different values of $\beta$ and $s$, with $\nu$ fixed. One can see that, as the value of $\beta$ increases, the size of the time subintervals where $|u|$ takes the value of the threshold $s$ increases. In Figure 2.6, we plot the quantum trajectories corresponding to the optimal control computed with $\nu = 10^{-4}$, $\beta = 0.1$ and $s = 20$. One can see that the orientation of both spins is reverted. We obtain $\| y(T) - y_T \|_2^2 = 6.9 \cdot 10^{-4}$. In Figure 2.7, we depict the convergence history of the objective functional $J$ and the values of $\epsilon$ along the SQH iterations. For this experiment, we report that the convergence criterion is achieved after 668 SQH iterations, whereas the number of successful updates is 353.

Although the cost functional of the quantum problem is discontinuous, it is still possible to analyse Step 2 of the SQH algorithm by a case study. In this case, we have to distinguish the case where $|u| \le s$ from the case $|u| > s$. In the former case, we have that $H_\epsilon$ achieves its maximum at

$$u^1 = \min\Big\{ \max\Big\{ -s,\ \frac{2\epsilon\, u^k + (p^k)^\top B\, y^k}{2\epsilon + \nu} \Big\},\ s \Big\}.$$


FIGURE 2.5: Optimal controls obtained with the SQH scheme for different values of $\beta$ and $s = 20$, $\nu = 10^{-4}$. From top-left to bottom-right: $\beta = 0.01$, $\beta = 0.1$, $\beta = 1$, $\beta = 10$.


In the cases $u < -s$ and $u > s$, we obtain the following additional two points where $H_\epsilon$ can attain a maximum:

$$u^2 = \min\Big\{ \max\Big\{ \underline{u},\ \frac{2\epsilon\, u^k + (p^k)^\top B\, y^k + \beta}{2\epsilon + \nu} \Big\},\ -s \Big\}$$


FIGURE 2.6: Quantum trajectories (all 6 components) corresponding to the optimal control computed with $\nu = 10^{-4}$, $\beta = 0.1$ and $s = 20$.


FIGURE 2.7: Convergence history of $J$ and values of $\epsilon$ along the SQH iterations.

and

$$u^3 = \min\Big\{ \max\Big\{ s,\ \frac{2\epsilon\, u^k + (p^k)^\top B\, y^k - \beta}{2\epsilon + \nu} \Big\},\ \overline{u} \Big\}.$$

Therefore, in Step 2 of the SQH algorithm, we implement

$$u^{k+1} = \operatorname{argmax}_{w \in \{u^1, u^2, u^3\}} H_\epsilon\big(t, y^k, w, u^k, p^k\big).$$
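In Python, this case study can be implemented pointwise as in the following sketch, where uk, p and y denote the values of $u^k$, $p^k$ and $y^k$ at the fixed time $t$ (an illustration, not the codes accompanying this book):

```python
import numpy as np

def step2_discontinuous(uk, p, y, A, B, eps, nu, beta, s, u_lo, u_hi):
    pBy = p @ B @ y
    u1 = min(max(-s, (2*eps*uk + pBy) / (2*eps + nu)), s)            # |u| <= s
    u2 = min(max(u_lo, (2*eps*uk + pBy + beta) / (2*eps + nu)), -s)  # u < -s
    u3 = min(max(s, (2*eps*uk + pBy - beta) / (2*eps + nu)), u_hi)   # u > s
    def H_eps(w):
        # HP function with discontinuous cost g and the penalty -eps*(w - uk)^2
        g = 0.5 * nu * w**2 + (beta * abs(w) if abs(w) > s else 0.0)
        return p @ ((A + w * B) @ y) - g - eps * (w - uk)**2
    return max((u1, u2, u3), key=H_eps)
```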

We remark that, in the case of smooth optimal control problems, the numerical accuracy of the computed optimal control can be quantified by a norm of the reduced gradient. However, this is not possible in the cases considered in the examples above. In these cases, in order to validate the accuracy of the SQH solution $(y,u,p)$, we define the following quantity

$$\Delta_H(t) := \max_{w \in K_{ad}} H(t, y, w, p) - H(t, y, u, p).$$

Correspondingly, we give the number $N_\%^l$ as the percentage of grid points where the inequality $0 \le \Delta_H \le 10^{-l}$, $l \in \mathbb{N}$, is fulfilled. In Example 2.3, we obtain $N_\%^2 = 86.61$, $N_\%^4 = 86.41$, $N_\%^6 = 86.01$, $N_\%^8 = 85.61$, $N_\%^{10} = 85.41$.
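The defect $\Delta_H$ and the numbers $N_\%^l$ can be computed on the numerical grid, for instance, as in the following sketch, where W denotes a finite sample of $K_{ad}$ used for the inner maximisation (an assumption made for illustration):

```python
import numpy as np

def pmp_defect(H, t, y, u, p, W):
    # Delta_H(t_i) = max_{w in W} H(t_i, y_i, w, p_i) - H(t_i, y_i, u_i, p_i) >= 0
    best = np.array([max(H(ti, yi, w, pi) for w in W)
                     for ti, yi, pi in zip(t, y, p)])
    attained = np.array([H(ti, yi, ui, pi) for ti, yi, ui, pi in zip(t, y, u, p)])
    return best - attained

def N_percent(dH, l):
    # percentage of grid points with 0 <= Delta_H <= 10^{-l}
    return 100.0 * np.mean((dH >= 0.0) & (dH <= 10.0 ** (-l)))
```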

2.3 Mixed Control and State Constraints

We devote this section to the development of a SQH algorithm for solving optimal control problems with control and state constraints. In general, for this purpose it is possible to combine the SQH method with different techniques, such as penalisation [112] (see Section 7.8, in the case of a state-constrained parabolic optimal control problem), the augmented Lagrangian approach [137, 220, 228],


and the regularisation approach proposed by M.M. Lavrentiev, which turns a state-constrained optimal control problem into a control problem with mixed control and state constraints; see [175, 191].

We focus on the method of 'penalty estimates' (PE) investigated in [215] in the case of equality constraints, and further discussed in [74] in the case of inequality constraints in the context of SA schemes. For illustration of the latter case, consider the optimal control problem (1.64). In the PE approach, the $r$-dimensional inequality $\phi(t, y(t), u(t)) \le 0$ does not appear explicitly as a separate constraint but is used to augment the cost functional with the additional term:

$$\pi(y,u) := \frac{\alpha}{2} \sum_{j=1}^{r} \int_{t_0}^{T} \big( \max\{0,\ \phi_j(t, y(t), u(t)) + c_j(t)/\alpha\} \big)^2\, dt,$$

where $\alpha > 0$, and $c(t) = (c_1(t), \ldots, c_r(t))$ is a vector function that plays the role of a Lagrange multiplier [215]. In fact, the method of penalty estimates represents an instance of the augmented Lagrangian approach. The combination of the PE and SQH methods results in an iterative procedure with an outer loop that updates the functions $c_j$ as follows:

$$c_j^{k+1}(t) = \max\{0,\ c_j^k(t) + \alpha\, \phi_j(t, y^k(t), u^k(t))\}, \qquad t \in [t_0, T], \quad j = 1, \ldots, r,$$

where $k = 0, 1, 2, \ldots$ denotes the outer loop iteration number. The inner loop corresponds to the SQH iterations to compute the solution $(u^k, y^k)$ to the optimal control problem

$$\min \tilde J(y,u) := \pi^k(y,u) + \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma(y(T))$$
$$\text{s.t.} \quad y'(t) = f(t, y(t), u(t)), \qquad y(t_0) = y_0, \tag{2.27}$$
$$u \in U_{ad},$$

where $\pi^k(y,u)$ denotes $\pi(y,u)$ with $c_j = c_j^k$. Concerning stopping criteria for augmented Lagrangian schemes, we refer to, e.g., [10] and references therein. In our SQH-PE algorithm, we stop the outer iteration when the following condition is satisfied

$$\| \min\{ -\phi(\cdot, y^k, u^k),\ c^k \} \| \le \varepsilon,$$

where $\| \cdot \|$ denotes an appropriate norm, and $\varepsilon > 0$ is the required tolerance. A sketch of this outer loop is given below.
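The following Python sketch illustrates the outer PE loop for a single constraint, $r = 1$; the routine sqh_solve, which runs the inner SQH iterations on (2.27) for a given multiplier function c, and the pointwise constraint evaluation phi are hypothetical placeholders (an illustration, not the codes accompanying this book):

```python
import numpy as np

def sqh_pe(sqh_solve, phi, t, alpha=1e-2, eps_tol=1e-10, kmax=20):
    c = np.zeros_like(t)                        # c^0 = 0
    for k in range(kmax):
        u, y = sqh_solve(c)                     # inner SQH iterations on (2.27)
        phi_k = phi(t, y, u)
        # stopping criterion: || min{-phi, c} || <= eps_tol (max norm here)
        if np.max(np.abs(np.minimum(-phi_k, c))) <= eps_tol:
            break
        c = np.maximum(0.0, c + alpha * phi_k)  # multiplier update
    return u, y, c
```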

Example 2.4 In this example, we consider an optimal control problem similar to Example 2.2 for comparison, and including an inequality constraint. In this case, we resort to our standard notation denoting with $y(t) \in \mathbb{R}^3$ and $u(t) \in \mathbb{R}^2$ the state and the control functions, respectively. The inequality path constraint is given by

$$\phi(t, y(t), u(t)) := y_3(t) - u_2(t) - 2 \le 0. \tag{2.28}$$


Therefore we consider the following augmentation term

$$\pi(y,u) = \frac{\alpha}{2} \int_0^T \big( \max\{0,\ \phi(t, y(t), u(t)) + c(t)/\alpha\} \big)^2\, dt,$$

where $\alpha > 0$ and $c \in L^\infty(0,T)$. The governing model is given by

$$y'(t) = \big( A + u_1(t)\, B_1 + u_2(t)\, B_2 \big)\, y(t), \qquad y(0) = y_0, \tag{2.29}$$

where

$$A = \begin{pmatrix} 0 & -\omega & 0 \\ \omega & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad B_1 = \begin{pmatrix} 0 & 0 & -a \\ 0 & 0 & 0 \\ a & 0 & 0 \end{pmatrix}, \qquad B_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -a \\ 0 & a & 0 \end{pmatrix}.$$

We take the cost functional as follows:

$$J(y,u) := \frac{1}{2}\, \| y(T) - y_T \|_2^2 + \frac{\nu}{2}\, \| u \|^2_{L^2(0,T)} + \beta\, \| u \|_{L^1(0,T)}. \tag{2.30}$$

Our optimal control problem requires to minimise $J$ subject to the differential constraint (2.29) and the inequality constraint (2.28), and the set of admissible values for each control component is the interval $K_{ad} = [\underline{u}, \overline{u}]$.

In the SQH-PE framework, we construct a sequence of optimal processes, where the $k$th optimal process $(u^k, y^k)$ is the solution to the optimal control problem $\min \tilde J(y,u) := J(y,u) + \pi^k(y,u)$ subject to (2.29), where $\pi^k(y,u)$ corresponds to the function $c^k$. In our calculation, we take $c^0 = 0$ and, after every PE step, we perform the update

$$c^{k+1}(t) = \max\{0,\ c^k(t) + \alpha\, \phi(t, y^k(t), u^k(t))\}, \qquad t \in [0,T].$$

Based on the results in [215], we can expect that the sequence of optimal processes $(u^k, y^k)$ converges to the solution of the original optimal control problem given above.

As in the first experiment of Example 2.2, we choose $a = 1$, $\omega = 1$, $y_0 = (0,0,1)^\top$, and $y_T = (1,0,0)^\top$. We also choose $\underline{u} = -2$ and $\overline{u} = 2$, and $T = 1$. The weights of the cost of the control are given by $\nu = 10^{-7}$ and $\beta = 10^{-7}$, and we take $\alpha = 10^{-2}$. The parameters of the SQH Algorithm 2.3 are as follows: we choose $\zeta = 0.8$, $\sigma = 1.2$, $\eta = 10^{-9}$, $\kappa = 10^{-10}$, $\varepsilon = \kappa$, the initial value $\epsilon = 1$, and zero is the initial guess for the control function, $u^0 = 0$.

With this setting, the PE outer iteration converges after 3 steps. In Figure 2.8, we depict the resulting optimal controls and trajectories. Comparing with Figure 2.1 and looking at the second component of the control function, one can see the enforcement of the inequality constraint. Nevertheless, the resulting control is able to drive the system to the desired target. In Figure 2.9, we plot the value of the constraint function $\phi$ along the optimal trajectory, and the multiplier $c^1$, which results from the first update. Notice that we obtain $c^3 = 0$.


FIGURE 2.8: Optimal controls and trajectories of the Bloch system with an inequality constraint.


FIGURE 2.9: The constraint function along the optimal trajectory (left), and the multiplier $c^1$.

2.4 Time-Optimal Control Problems

The computational challenge of time-optimal control problems is manifold, in the sense that the time domain where the problem is defined is also an optimisation variable and, consequently, so are the numerical grid and discretisation parameters. Moreover, as we have already discussed in Section 1.5, additional endtime conditions appear that need to be satisfied. In the case of a governing model with only a few state components, the preferred solution approach to related time-optimal control problems is to view the optimality system as a boundary value problem and to introduce a transformation of the time variable to map the problem onto a fixed interval. Concerning the time transformation [262], corresponding to $t \in [t_0, T]$ one introduces the new variable $s = (t - t_0)/(T - t_0)$, so that the transformed problem is defined on $s \in [0,1]$; see also [104] for recent developments of this technique.


With this transformation, we may have an optimality system as follows:

$$\bar y'(s) = (T - t_0)\, \partial_p H(s, \bar y(s), \bar u(s), \bar p(s)), \qquad \bar y(0) = \bar y_0,$$
$$\bar p'(s) = -(T - t_0)\, \partial_y H(s, \bar y(s), \bar u(s), \bar p(s)), \qquad \bar p(1) = -\partial_y \gamma(\bar y(1)),$$
$$H(1, \bar y(1), \bar u(1), \bar p(1)) = 0,$$
$$H(s, \bar y(s), \bar u(s), \bar p(s)) = \max_{v \in K_{ad}} H(s, \bar y(s), v, \bar p(s)),$$

where $\bar y(s) = y(s\, (T - t_0) + t_0) = y(t)$, and similarly for the other variables. By comparison with, e.g., (1.35) – (1.37), we notice that a value endtime condition appears that provides the additional equation for the unknown that is the endtime $T$. Now we illustrate these techniques by means of an example taken from [140]; see also Example 1.3 mentioned earlier.

Example 2.5 Our governing model is given by

$$y_1'(t) = y_2(t), \qquad y_1(0) = 10,$$
$$y_2'(t) = u(t), \qquad y_2(0) = 0.$$

We consider the following cost functional

$$J(y_1, y_2, u) = \frac{\mu}{2}\, T^2 + \frac{\nu}{2} \int_0^T u^2(t)\, dt.$$

We also require that, at the final time $T$, the conditions $y_1(T) = 0$ and $y_2(T) = 0$ are satisfied. The HP function for this problem is given by

$$H(t, y_1, y_2, u, p_1, p_2) = p_1\, y_2 + p_2\, u - \mu\, t - \frac{\nu}{2}\, u^2.$$

The adjoint equations are given by

$$p_1'(t) = 0, \qquad p_2'(t) = -p_1(t).$$

The general solution of this system is given by $p_1(t) = c_1$ and $p_2(t) = -c_1\, t + c_2$, where $c_1$ and $c_2$ are integration constants to be determined. Further, the optimal control is characterised by $\partial_u H = p_2 - \nu\, u = 0$. With this result and the final state condition, we obtain

$$H(T, y_1(T), y_2(T), u(T), p_1(T), p_2(T)) = \frac{\nu}{2}\, u^2(T) - \mu\, T = 0.$$

Now, we replace the control by $u = p_2/\nu$ in our optimality system, and introduce an additional state $v$ that corresponds to $T$ with the dynamics $v'(t) = 0$. Therefore we have the vector of unknowns $z = (y_1, y_2, p_1, p_2, v)$. Further, we make the time transformation as illustrated above, and obtain the following differential system:


$$y_1'(s) = v(s)\, y_2(s), \quad y_2'(s) = v(s)\, p_2(s)/\nu, \quad p_1'(s) = 0, \quad p_2'(s) = -v(s)\, p_1(s), \quad v'(s) = 0, \tag{2.31}$$

with the endpoint conditions

$$y_1(0) = 10, \quad y_2(0) = 0, \quad y_1(1) = 0, \quad y_2(1) = 0, \quad \frac{1}{2\nu}\, p_2^2(1) - \mu\, v(1) = 0. \tag{2.32}$$

FIGURE 2.10: Optimal control and endtime (left) and corresponding trajectories obtained solving (2.31)–(2.32).

The system (2.31) with the conditions (2.32) represents a well-defined time boundary-value problem. It can be solved analytically, and the resulting optimal endtime is given by $T = (1800\, \nu/\mu)^{1/5}$; see [140]. The numerical solution of this problem with $\nu = 0.1$ and $\mu = 0.01$ is depicted in Figure 2.10. The optimal time is $T = 7.0967$.
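For instance, this boundary-value problem can be solved numerically by collocation; the following sketch uses scipy.integrate.solve_bvp, where the initial guess is an assumption that may need tuning (again an illustration, not the codes accompanying this book):

```python
import numpy as np
from scipy.integrate import solve_bvp

nu, mu = 0.1, 0.01

def rhs(s, z):
    # z = (y1, y2, p1, p2, v); the unknown endtime T enters as the constant state v
    y1, y2, p1, p2, v = z
    return np.vstack([v * y2, v * p2 / nu, np.zeros_like(s), -v * p1,
                      np.zeros_like(s)])

def bc(za, zb):
    return np.array([za[0] - 10.0,                      # y1(0) = 10
                     za[1],                             # y2(0) = 0
                     zb[0],                             # y1(1) = 0
                     zb[1],                             # y2(1) = 0
                     zb[3]**2 / (2*nu) - mu * zb[4]])   # endtime condition (2.32)

s = np.linspace(0.0, 1.0, 50)
z0 = np.ones((5, s.size)); z0[4] = 5.0  # rough initial guess, v ~ T
sol = solve_bvp(rhs, bc, s, z0)
print("optimal endtime T =", sol.y[4, 0])  # compare with (1800*nu/mu)**(1/5)
```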

The solution of time-optimal control problems by the boundary-value approach presented above appears to be the method of choice in applications with small- to moderate-size problems. However, for large-size problems, and because the time transformation introduces a nonlinear dynamics that is not present in the original problem, this methodology may prove less robust. In these cases, an alternative approach, already discussed in early works on PMP-based methods [162], is an iterative bilevel scheme that combines an exterior loop for the optimisation of the endtime $T$ with an interior loop that solves appropriate optimal control problems on fixed time intervals; see, e.g., [98] for a recent contribution in this field.

Now, in order to illustrate one instance of the aforementioned bilevel approach, we consider a free endtime problem with the constraint that the state variable at final time lies on a surface $K(y(T)) = (K_1(y(T)), \ldots, K_d(y(T))) = 0$, $d < n$. In this case, the transversality condition for the adjoint variable is given by (1.51) and reported below for convenience:

$$p(T) = -\partial_y \gamma(y(T)) - \beta \cdot \partial_y K(y(T)), \tag{2.33}$$

where we assume that, in the objective functional, an endpoint cost $\gamma = \gamma(y(T))$ is included. In this setting, the value endtime condition (1.52) becomes

$$H(T, y(T), u(T), p(T)) = 0. \tag{2.34}$$

Now, in the process of finding the optimal $T$, we consider the penalty estimates/augmented Lagrangian approach, already introduced in the previous section [74, 137, 215, 228]. In this scheme, the equality $K(y(T)) = 0$ does not appear as a separate constraint but as an augmenting term of the cost functional. Specifically, if $J$ denotes the original functional to be minimised, the augmented cost functional is given by

$$\tilde J(y,u) = J(y,u) + \sum_{i=1}^{d} \Big( \lambda_i\, K_i(y(T)) + \frac{B}{2}\, K_i(y(T))^2 \Big),$$

where $B > 0$ is a given constant, and $\lambda = (\lambda_1, \ldots, \lambda_d)$ is a vector that eventually plays the role of the Lagrange multiplier $\beta$. In the outer loop of the optimisation process, this vector is updated with the following rule

$$\lambda_i^{k+1} = \lambda_i^k + B\, K_i(y(T)), \qquad i = 1, \ldots, d,$$

where $k = 0, 1, 2, \ldots$ denotes the outer loop iteration number; we take $\lambda_i^0 = 0$. At the $k$th step of the outer loop, we have a tentative endtime $T^k$ and, on the interval $[t_0, T^k]$, we solve the following optimal control problem

$$\min \tilde J(y,u) := \int_{t_0}^{T^k} \ell(t, y(t), u(t))\, dt + \gamma(y(T^k)) + \sum_{i=1}^{d} \Big( \lambda_i^k\, K_i(y(T^k)) + \frac{B}{2}\, K_i(y(T^k))^2 \Big)$$
$$\text{s.t.} \quad y'(t) = f(t, y(t), u(t)), \qquad y(t_0) = y_0, \qquad t \in [t_0, T^k], \tag{2.35}$$
$$u \in U_{ad}.$$

Notice that the terminal condition for the adjoint variable corresponding to this problem is given by

$$p(T^k) = -\partial_y \Big[ \gamma(y(T^k)) + \sum_{i=1}^{d} \Big( \lambda_i^k\, K_i(y(T^k)) + \frac{B}{2}\, K_i(y(T^k))^2 \Big) \Big]. \tag{2.36}$$


Once this problem is solved, we can compute the HP function at $T^k$, and if $|H(T^k, y(T^k), u(T^k), p(T^k))| < \epsilon$, where $\epsilon > 0$ is a given tolerance, then the algorithm is stopped. Otherwise, we update the endtime as follows:

$$T^{k+1} = T^k + \delta\, H(T^k, y(T^k), u(T^k), p(T^k)),$$

where $\delta > 0$ is a chosen step size. Notice that, at convergence, the terminal condition given in (2.36) converges to the one that can be derived using $H(T) = 0$; see formula (1.56).

Now, we use the algorithm proposed above to solve the time-optimal control of a glider discussed in [161]. Notice that this problem involves two control variables of different nature: one takes values in an interval, whereas the other can assume only two distinct values. This is an instance of a mixed-integer optimal control problem; see, e.g., [239].

Example 2.6 Consider a glider of mass $m$ in plane motion in a resisting medium of density $\rho$. We assume an $(x,y)$-coordinate system where $x = x(t)$ denotes the horizontal position and $y = y(t)$ the vertical position (aligned with the gravity vector pointing upwards) of the glider at time $t$. The equations of motion of this system are given by

$$x'(t) = v(t)\, \cos\theta(t), \qquad y'(t) = v(t)\, \sin\theta(t), \tag{2.37}$$
$$m\, v'(t) = -R - m\, g\, \sin\theta(t), \qquad m\, v(t)\, \theta'(t) = Y - m\, g\, \cos\theta(t), \tag{2.38}$$

where $v(t)$ and $\theta(t)$ represent the modulus and angle of inclination of the velocity to the $x$-axis. Further, we have the gravity acceleration $g$, and $R$ and $Y$ denote the resistance and the lifting force, respectively. We have

$$R = \frac{1}{2}\, \rho\, v^2\, S\, C_x, \qquad Y = \frac{1}{2}\, \rho\, v^2\, S\, C_y,$$

where $S$ is the characteristic area of the glider, and $C_x$ and $C_y$ are the aerodynamic coefficients. In [161], assuming small values of the angle of attack $\alpha$, and introducing the maximum quality $K = \max_\alpha C_y(\alpha)/C_x(\alpha)$, where $\alpha_0$ is the angle of attack at which it is attained, one obtains

$$C_x(\alpha) = 1 - \cos 2\alpha_0\, \cos 2\alpha, \qquad C_y(\alpha) = K\, \sin 2\alpha_0\, \sin 2\alpha, \tag{2.39}$$

which defines a control mechanism in terms of $\alpha$. Further, suppose that the glider can also change its characteristic area by some relay mechanism as follows:

$$S = S_0\, (1 + \eta\, b), \qquad S_0 > 0, \quad b > 0,$$

where $\eta$ can take the values 0 or 1. Therefore $\alpha$ and $\eta$ are the control functions, $(\alpha(t), \eta(t)) \in K_{ad} := [\alpha_1, \alpha_2] \times \{0,1\}$. The optimal control problem is formulated as follows. Starting with the initial conditions at $t = 0$ given by

$$x(0) = 0, \quad y(0) = 0, \quad v(0) = v_0 > 0, \quad \theta(0) = \theta_0 > 0,$$


find $(\alpha(t), \eta(t))$ and $T$ such that $x(T)$ will be maximum when the glider comes back to the ground, where $y(T) = 0$.

At this point, it is convenient to rewrite (2.37) – (2.38) by introducing dimensionless variables while keeping the same notation. We have

$$x'(t) = v(t)\, \cos\theta(t),$$
$$y'(t) = v(t)\, \sin\theta(t),$$
$$v'(t) = -\sigma\, v(t)^2\, C_x(\alpha(t))\, \big(1 + \eta(t)\, b\big) - \sin\theta(t),$$
$$\theta'(t) = \sigma\, v(t)\, C_y(\alpha(t))\, \big(1 + \eta(t)\, b\big) - \frac{1}{v(t)}\, \cos\theta(t).$$

We denote the set of dependent variables with $z = (x, y, v, \theta)$. The initial conditions at $t = 0$ are given by $x(0) = 0$, $y(0) = 0$, $v(0) = 1$, $\theta(0) = \theta_0 > 0$. The parameter $\sigma$, characterising the ratio of the aerodynamic force to the weight, is given by $\sigma = \rho S v_0^2 / 2mg$. Now, we can write our HP function as follows:

$$H(x, y, v, \theta, \alpha, \eta, p_x, p_y, p_v, p_\theta) = p_x\, v\, \cos\theta + p_y\, v\, \sin\theta - p_v\, \sigma\, v^2\, C_x(\alpha)\, (1 + \eta\, b) - p_v\, \sin\theta + p_\theta\, \sigma\, v\, C_y(\alpha)\, (1 + \eta\, b) - p_\theta\, \frac{1}{v}\, \cos\theta.$$

Notice that the objective functional to be minimised is given by $J(z, \alpha, \eta) = \gamma(z(T)) = -x(T)$, and we have the endpoint condition $K(z(T)) := y(T) = 0$. At optimality we have

$$p_x(T) = 1, \quad p_y(T) = -\cot\theta(T), \quad p_v(T) = 0, \quad p_\theta(T) = 0.$$

Now, since $H$ does not depend explicitly on $x$ and $y$, we obtain the adjoint equations $p_x'(t) = 0$ and $p_y'(t) = 0$. Further, the adjoint equations for $p_v$ and $p_\theta$ are given by

$$p_v'(t) = -p_x(t)\, \cos\theta(t) - p_y(t)\, \sin\theta(t) + 2\, \sigma\, p_v(t)\, v(t)\, C_x(\alpha(t))\, \big(1 + \eta(t)\, b\big) - \sigma\, p_\theta(t)\, C_y(\alpha(t))\, \big(1 + \eta(t)\, b\big) - p_\theta(t)\, \frac{\cos\theta(t)}{v(t)^2},$$

$$p_\theta'(t) = p_x(t)\, v(t)\, \sin\theta(t) - p_y(t)\, v(t)\, \cos\theta(t) + p_v(t)\, \cos\theta(t) - \frac{1}{v(t)}\, p_\theta(t)\, \sin\theta(t).$$



FIGURE 2.11: Optimal controls for the glider. $\alpha$ is the continuous line, $\eta$ is the dashed line.

Notice that, at optimality, where $p_x(T) = 1$ and $p_y(T) = -\cot\theta(T)$, one can verify that the value of $\alpha$ at which the HP function attains its maximum at $t$ for fixed $\eta$ is given by

$$\bar\alpha(t) = \frac{1}{2}\, \arctan\Big( \frac{p_\theta(t)\, K}{v(t)\, p_v(t)}\, \tan(2\alpha_0) \Big).$$

Thus, with $\alpha(t) = \bar\alpha(t)$, the choice $\eta(t) = \bar\eta(t) \in \{0,1\}$ is made corresponding to the largest value of the HP function.

Now, we implement our bilevel scheme with the following initialisation and choice of parameter values. We choose $T^0 = 0.5$ and $\lambda^0 = 0$ (since $d = 1$, we omit the index $i = 1$); further, we take $B = 10^3$ and $\delta = dt/10$, where $dt$ denotes the numerical time-step size; $\epsilon = 10^{-3}$. Thus, for fixed $T = T^k$, the terminal conditions for the adjoint equations are given by

$$p_x(T^k) = 1, \quad p_y(T^k) = -\lambda^k - B\, y(T^k), \quad p_v(T^k) = 0, \quad p_\theta(T^k) = 0.$$

Results of a numerical experiment are presented in Figures 2.11 and 2.12 for the choice $v_0 = 1$, $\theta_0 = 0.25$, $\sigma = 0.5$, $b = 0.2$, $\alpha_0 = 0.175$, $K = 0.1$. For numerical integration of the differential equations we use the midpoint formula, and $N_t = 1000$ subintervals of the initial interval $[0,T]$ with $T = 0.5$. We initialise with $\alpha^0 = 0.01$, $\eta^0 = 0.01$, and choose $\kappa = 10^{-12}$ and $k_{max} = 10$. With this setting, we obtain the final time horizon $T = 0.49522$, and the final point has horizontal coordinate $x = 0.47553$ and $y = -5.96 \cdot 10^{-5}$. The resulting optimal control functions are plotted in Figure 2.11; the corresponding optimal trajectory is depicted in Figure 2.12.
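The pointwise control selection of this example can be sketched in Python as follows, where H_glider is a hypothetical user-supplied routine that evaluates the HP function for given $(z, \alpha, \eta, p)$; notice that the formula for $\bar\alpha$ requires $p_v(t) \neq 0$:

```python
import numpy as np

def glider_controls(z, p, K, alpha0, alpha_bounds, H_glider):
    # z = (x, y, v, theta), p = (px, py, pv, ptheta) at a fixed time t
    v = z[2]
    pv, pth = p[2], p[3]
    # stationary point of H with respect to alpha (assumes pv != 0)
    a_bar = 0.5 * np.arctan(pth * K * np.tan(2.0 * alpha0) / (v * pv))
    a_bar = np.clip(a_bar, *alpha_bounds)
    # relay control: compare the HP function over the two admissible values
    eta_bar = max((0.0, 1.0), key=lambda e: H_glider(z, a_bar, e, p))
    return a_bar, eta_bar
```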


FIGURE 2.12: Optimal trajectories of the glider. Clockwise from top-left: the components $x$, $y$, $\theta$ and $v$.

2.5 Analysis of the SQH Method

In this section, we report theoretical results concerning the wellposedness and convergence properties of the SQH method applied to ODE optimal control problems. These results stem from the work [241] where, subject to some assumptions, it is proved that the sequence of approximations $(u^k)$ generated by the Sakawa-Shindo scheme represents a minimising sequence for the objective functional, and that, if there exists a subsequence of this sequence that is convergent a.e. to some admissible $u$, then this function satisfies the PMP optimality conditions. These results are discussed for specific control problems in [240, 242], and further extended in [246] to the case of problems with terminal equality constraints, showing that in the absence of control constraints the sequence of controls is locally convergent. Further, in [43] it is proved that, in the case of a convex objective, the sequence $(u^k)$ is weakly convergent to an optimal control. In all these works it is assumed that $f$, $\ell$ and $\gamma$ are $C^2$ functions and $K_{ad}$ is convex. These results are improved in [30, 250], where it is only required that the functions $f$, $\ell$ and $\gamma$ and their derivatives with respect to $y$ satisfy a Lipschitz condition in this argument. Further, it is required that $f$ and $\ell$ are Lipschitz continuous with respect to $u$. However, while in [250] it is still required that $K_{ad}$ be convex, this assumption is dropped in [30]. Below, we discuss in more detail some of the results in [30, 43, 241, 250] within the SQH framework.


The idea of uncoupling the updates of the state and control functions as done in Step 2 of the SQH scheme was mentioned in a remark in [31] for the first time. However, this work still focuses on the original Sakawa-Shindo scheme by putting it in a more abstract framework. Then, the first discussion of a SQH-type update was given in [30]. Twenty years later, independently of [30], this approach was proposed in [52] in the framework of optimal control with parabolic partial differential equations (PDEs) and further investigated in [51, 54, 53]. We remark that in [52] an explicit mechanism for an adaptive choice of $\epsilon$ is given corresponding to Step 5 of Algorithm 2.3, and in this reference the name sequential quadratic Hamiltonian method for the resulting optimisation scheme was coined.

In the following, we discuss some of the results mentioned above. For this purpose, we consider our optimal control problem given by

$$\min J(y,u) := \int_{t_0}^{T} \ell(t, y(t), u(t))\, dt + \gamma(y(T))$$
$$\text{s.t.} \quad y'(t) = f(t, y(t), u(t)), \qquad y(t_0) = y_0, \tag{2.40}$$
$$u \in U_{ad},$$

where the admissible set Uad corresponds to all Lebesgue measurable controls u in [t0 , T ] with u(t) ∈ Kad a.e., and Kad ⊂ Rm is compact. Similar to [30], we make the following assumption. Assumption 2.1 In any compact subset D included in the domain of definitions of f : R × Rn × Rm → Rn and ` : R × Rn × Rm → R: a) there exists a constant c0 > 0 such that the following growth condition is satisfied independently of t and u ∈ Kad : |y · f (t, y, u)| ≤ c0 (1 + |y|2 );

b) the functions f , ∂y f , `, ∂y ` are measurable in t and Lipschitz continuous in y and u in the sense that there exist constants c1 , c2 , c3 , c4 > 0 such that the following holds uniformly in t:   |f (t, y1 , u1 ) − f (t, y2 , u2 )| ≤ c1 |y1 − y2 | + |u1 − u2 |   |∂y f (t, y1 , u1 ) − ∂y f (t, y2 , u2 )| ≤ c2 |y1 − y2 | + |u1 − u2 |   |`(t, y1 , u1 ) − `(t, y2 , u2 )| ≤ c3 |y1 − y2 | + |u1 − u2 |   |∂y `(t, y1 , u1 ) − ∂y `(t, y2 , u2 )| ≤ c4 |y1 − y2 | + |u1 − u2 | further, suppose that γ and ∂y γ satisfy |γ(y1 ) − γ(y2 )| ≤ c5 |y1 − y2 | |∂y γ(y1 ) − ∂y γ(y2 )| ≤ c6 |y1 − y2 | for some constants c5 , c6 > 0.

74

The Sequential Quadratic Hamiltonian Method

Notice that Assumption 2.1 a) is made to guarantee boundedness of the solution of the governing model; see Section 1.2. With Assumption 2.1, one can prove uniform boundedness and Lipschitz continuity of the control-tostate map y = S(u), u ∈ Uad as follows: there exists a constant My > 0 depending on c0 , |y0 | and (T − t0 ) such that the following holds: |S(u)(t)| ≤ My ,

t ∈ [t0 , T ].

Therefore we can take D ⊆ [t0 , T ] × Y × Kad , where Y = {y ∈ Rn : |y| ≤ My }. Moreover, by using Grönwall inequality one can show that there exists a constant Ly > 0 such that the following holds [241] 1

Z

2

T

|u1 (t) − u2 (t)| dt,

|S(u )(t) − S(u )(t)| ≤ Ly

u1 , u2 ∈ Uad .

(2.41)

t0

Consequently, we also have ˜y |S(u )(t) − S(u )(t)| ≤ L 1

2

2

Z

T

|u1 (t) − u2 (t)|2 dt,

u1 , u2 ∈ Uad . (2.42)

t0

˜ y = (T − t0 ) L2y . where L Similarly and for later use, we introduce a control-to-adjoint map such that for a given u ∈ Uad , and the resulting y = S(u), it gives the solution to the adjoint problem >

p0 (t) = − (∂y f (t, y(t), u(t)))

p(t)+∂y `(t, y(t), u(t)),

p(T ) = −∂y γ(y(T )).

We denote this map with P . Then, subject to the assumptions above, there exists a constant Mp > 0 such that [30] |P (u)(t)| ≤ Mp ,

t ∈ [t0 , T ].

Moreover, similar estimates as (2.41) and (2.42) can be proved for this map; see [241]. With this preparation, we can now discuss the minimising properties of the SQH iterates. We denote y k = S(uk ) and pk = P (uk ), and similarly with the index k + 1. As already mentioned for the SA scheme, we require that the update to the control function obtained with Step 2 of Algorithm 2.3 is Lebesgue measurable, assuming that this holds for the previous iterate uk . Starting with a measurable u0 in Uad , one can prove that this  property holds true if the function (t, w) 7→ H t, y k (t) , w, uk (t) , pk (t) is Lebesgue measurable in t for each w ∈ Kad and is continuous in w for each t ∈ [t0 , T ]; see [226].

The Sequential Quadratic Hamiltonian Method

75

Next, we prove the following theorem stating that, if uk is not already optimal, it is possible to improve the value of the cost functional by an update uk+1 that is obtained in correspondence to a sufficiently large , thus ensuring that a successful update to the control can be found in a finite number of steps. This theorem represents an extension of the one presented in [30].   Theorem 2.1 Let y k , uk and y k+1 , uk+1 be generated by the SQH scheme, and uk , uk+1 be measurable; let the Assumptions 2.1 hold. Then, there exists a θ > 0 independent of , k, and uk such that for the  > 0 currently chosen by Algorithm 2.3, the following holds   (2.43) J y k+1 , uk+1 − J y k , uk ≤ − ( − θ) kuk+1 − uk k2L2 (t0 ,T ;Rm ) .   In particular, it holds J y k+1 , uk+1 − J y k , uk ≤ −η τ for  ≥ θ + η and τ = kuk+1 − uk k2L2 (t0 ,T ;Rm ) . Proof. Recall that in Algorithm 2.3, for a given uk , y k = S(uk ) and pk = P (uk ), we determine a new uk+1 and corresponding y k+1 = S(uk+1 ) with which we compute a new value of the cost functional. Thus, we have   k+1 k+1 k k J y

− J y ,u

,u

T

Z

`(t, y k+1 (t), uk+1 (t)) dt + γ(y k+1 (T )) −

= t0

Z

T

`(t, y k (t), uk (t)) dt − γ(y k (T ))

t0 T

Z =−





H(t, y k+1 (t), uk+1 (t), pk (t)) − H(t, y k (t), uk (t), pk (t)) dt

t0

Z

T

+

h



i

pk (t) f (t, y k+1 (t), uk+1 (t)) − f (t, y k (t), uk (t))

dt

t0

+ γ(y k+1 (T )) − γ(y k (T )).

Next, we add and subtract an integral term with H(t, y k (t), uk+1 (t), pk (t)) and use the forward equation to obtain   J y k+1 , uk+1 − J y k , uk Z T  =− H(t, y k+1 (t), uk+1 (t), pk (t)) − H(t, y k (t), uk+1 (t), pk (t)) dt t0

Z

T

− t0 T

Z +

t0

 h

 H(t, y k (t), uk+1 (t), pk (t)) − H(t, y k (t), uk (t), pk (t)) dt

pk (t)

i d  k+1 y (t) − y k (t) dt + γ(y k+1 (T )) − γ(y k (T )). dt

(2.44)

Now, we elaborate on the first term, and in order to ease notation, whenever possible we omit to write the time variable, which is considered fixed.

76

The Sequential Quadratic Hamiltonian Method

By Assumption 2.1 b), we have   − H(t, y k+1 , uk+1 , pk ) − H(t, y k , uk+1 , pk ) Z 1 =− ∂y H(t, y k + s (y k+1 − y k ), uk+1 , pk ) (y k+1 − y k ) ds 0

= −∂y H(t, y k , uk , pk ) (y k+1 − y k ) Z 1h i − ∂y H(t, y k + s (y k+1 − y k ), uk+1 , pk ) − ∂y H(t, y k , uk , pk ) (y k+1 − y k ) ds 0

d   pk y k+1 − y k ≤ dt Z 1 + |∂y `(t, y k + s (y k+1 − y k ), uk+1 ) − ∂y `(t, y k , uk )| |y k+1 − y k | ds 0

Z +

1

|∂y f (t, y k + s (y k+1 − y k ), uk+1 ) − ∂y f (t, y k , uk )| |pk | |y k+1 − y k | ds.

0

where for the last inequality we have used the adjoint equation p0 = −∂y H and the fact that H = pf − `. We continue the calculation above using the Lipschitz conditions in Assumption 2.1 b). We have   d   − H(t, y k+1 , uk+1 , pk ) − H(t, y k , uk+1 , pk ) ≤ pk y k+1 − y k dt Z 1   + c4 |s (y k+1 − y k )| + |uk+1 − uk | |y k+1 − y k | ds 0

Z +

1

  c2 |s (y k+1 − y k )| + |uk+1 − uk | |pk | |y k+1 − y k | ds.

0

d   1 pk y k+1 − y k + (c4 + c2 Mp ) |y k+1 − y k |2 ≤ dt 2 + (c4 + c2 Mp ) |uk+1 − uk | |y k+1 − y k | Z T d   1 ˜y ≤ pk y k+1 − y k + (c4 + c2 Mp ) L |uk+1 (t) − uk (t)|2 dt dt 2 t0 Z T k+1 k + (c4 + c2 Mp ) |u − u | Ly |uk+1 (t) − uk (t)| dt t0

The Sequential Quadratic Hamiltonian Method

77

By inserting this estimate in (2.44), we obtain   J y k+1 , uk+1 − J y k , uk Z T  d k  k+1 ≤ p (t) y (t) − y k (t) dt dt t0 Z Th i d  k+1 y (t) − y k (t) dt + γ(y k+1 (T )) − γ(y k (T )) + pk (t) dt t0 Z T Z T 1 ˜y + (c4 + c2 Mp ) L |uk+1 (t) − uk (t)|2 dt 2 t0 t0 Z T  + (c4 + c2 Mp ) |uk+1 (s) − uk (s)| Ly |uk+1 (t) − uk (t)| dt ds t0

Z

T



  H(t, y k (t), uk+1 (t), pk (t)) − H(t, y k (t), uk (t), pk (t)) dt

t0

Next, we perform integration by parts of the first term and combine with the second term. Moreover, we use the following equality at t = T : Z 1 γ(y k+1 ) − γ(y k ) = ∂y γ(y k + s (y k+1 − y k )) (y k+1 − y k ) ds, 0

and recall that pk (T ) = −∂y γ(y k (T )). Thus, with the Assumption 2.1 b), we obtain the intermediate estimate Z T  1 k+1 k 2 k k+1 k k+1 k p (T ) y

(T )−y (T ) +γ(y

(T ))−γ(y (T )) ≤

2

˜y c6 L

|u

(t)−u (t)| dt.

t0

Hence, we have   J y k+1 , uk+1 − J y k , uk Z T 1 ≤ c6 L2y (T − t0 ) |uk+1 (t) − uk (t)|2 dt 2 t0 1 Z T L2y (T − t0 )2 + Ly (T − t0 ) |uk+1 (t) − uk (t)|2 dt + (c4 + c2 Mp ) 2 t0 Z T  − H(t, y k (t), uk+1 (t), pk (t)) − H(t, y k (t), uk (t), pk (t)) dt. t0

Now, recall Step 2 in Algorithm 2.3, which guarantees that 2 −H(t, y k (t), uk+1 (t), pk (t)) + H(t, y k (t), uk (t), pk (t)) ≤ − uk+1 (t) − uk (t) .

78

The Sequential Quadratic Hamiltonian Method Therefore we obtain J y

k+1

k+1

,u



k

k

− J y ,u



≤ θ−



Z

T

|uk+1 (t) − uk (t)|2 dt,

t0

where θ=

1  1 c6 L2y (T − t0 ) + (c4 + c2 Mp ) L2y (T − t0 )2 + Ly (T − t0 ) . 2 2

Thus the theorem is proved. As stated in this theorem, we see that θ does not depend on the kth iterate being considered. As a consequence, we have the following theorem, where we assume that our SQH algorithm generates an infinite sequence if we drop the given stopping criteria. Theorem 2.2 Let the assumptions of Theorem 2.1 be fulfilled. If in Algorithm 2.3, at every kth iterate,  = θ + η is chosen, then the following holds: ˆ k ))k=0,1,2,... is monotonically decreasing and converges a) the sequence (J(u ∗ ˆ ˆ to some J ≥ inf u∈Uad J(u); b) it holds limk→∞ kuk+1 − uk kL2 (t0 ,T ;Rm ) = 0. Proof. The first statement follows from (2.43). To prove b), we rewrite (2.43) as follows: 1 h ˆ k  ˆ k+1  i kuk+1 − uk k2L2 (t0 ,T ;Rm ) ≤ J u −J u . η Therefore we have the partial sum K X k=0

kuk+1 − uk k2L2 (t0 ,T ;Rm ) ≤

1 h ˆ 0  ˆ K+1  i J u −J u , η

which shows that in the limit K → ∞ the series with positive elements kuk+1 − uk k2L2 is convergent, hence b) is proved. We remark that the last result guarantees the validity of the stopping criteria requiring τ ≤ κ. As emphasised in [30], these results are obtained without requiring convexity of the set Kad nor concavity with respect to u of the function H. However, these cases are difficult to be analysed in a general context. In fact, it is possible to formulate problems where Kad is not convex and H is concave, and vice versa problems where Kad and H are convex, such that no solution exists in the sense considered in this chapter; see, e.g., the example in [30]. However, assuming that Kad is convex, and supposing that in the SQH iterates  remains bounded, we can prove the following theorem that extends a result in [43, 241, 250] to our SQH setting.

The Sequential Quadratic Hamiltonian Method

79

Theorem 2.3 Let the assumptions of Theorem 2.2 be fulfilled, and Kad be compact and convex, and assume that f and ` are continuously differentiable with respect to u. Further, suppose that  ≤ ¯, for some ¯ > 0, for all SQH iterates. Then the sequence (uk )k=0,1,2,... , generated by the SQH scheme, asymptotically satisfies the first-order necessary optimality conditions for the maximisation of the HP function, in the sense that   lim kuk − ΠK uk + ∂u H(·, y k , uk , pk ) kL2 (t0 ,T ;Rm ) = 0, (2.45) k→∞

where ΠK (v) denotes the projection of the element v ∈ Rm on the convex set K satisfying |v − ΠK (v)| = dK (v) := inf w∈K |v − w|, and K := Kad . Proof. In Step 2 of the SQH Algorithm 2.3, if  uk+1 (t) = argmax H t, y k (t), w, uk (t), pk (t) , w∈Kad

then uk+1 (t) must necessarily satisfy the following equation   uk+1 (t) = ΠK uk+1 (t) + ∂u H t, y k (t), uk+1 (t), uk (t), pk (t) ,

k > 0.

Now, consider the function   E1 (t) = uk+1 (t) − ΠK uk+1 (t) + ∂u H t, y k (t), uk+1 (t), uk (t), pk (t)   = uk+1 (t) − ΠK uk+1 (t) + ∂u H t, y k (t), uk+1 (t), pk (t)  − 2 uk+1 (t) − uk (t) = 0. Next, we consider the L2 -norm of the difference of E1 with the following function   E2 (t) = uk+1 (t) − ΠK uk+1 (t) + ∂u H t, y k+1 (t), uk+1 (t), pk+1 (t) . Hence, using the fact that the projection operator is a contraction, we obtain kE1 − E2 kL2 (t0 ,T ;Rm ) ≤ 2 ¯ kuk+1 − uk kL2 (t0 ,T ;Rm )   + k∂u H ·, y k , uk+1 , pk − ∂u H ·, y k+1 , uk+1 , pk+1 kL2 (t0 ,T ;Rm ) ≤ 2 ¯ kuk+1 − uk kL2 (t0 ,T ;Rm )   + k∂u H ·, y k , uk+1 , pk − ∂u H ·, y k+1 , uk+1 , pk kL2 (t0 ,T ;Rm )   + k∂u H ·, y k+1 , uk+1 , pk − ∂u H ·, y k+1 , uk+1 , pk+1 kL2 (t0 ,T ;Rm ) . Therefore, since ∂u H is Lipschitz continuous with respect to the y and p arguments, by (2.42) and the similar estimate for P , and the fact that limk→∞ kuk+1 − uk kL2 (t0 ,T ;Rm ) = 0, we obtain lim kE1 − E2 kL2 (t0 ,T ;Rm ) = 0.

k→∞

80

The Sequential Quadratic Hamiltonian Method

This result and E1 (t) = 0 a.e. in [t0 , T ] prove the validity of (2.45). As remarked in [250], this result coincides with Ekeland’s variational principle [105] characterising controls that almost minimise the cost functional. However, in this case two situations are possible: on the one hand, in the case that an optimal control exists only in a generalised (relaxed) sense, the sequence (uk )k=0,1,2,... approaches inf Jˆ by a chattering control; on the other hand, if an optimal control exists and for some k we have that uk is sufficiently close to this control, then the sequence will converge to it. In fact, if the sequence (uk )k=0,1,2,... converges to some control in the sense that lim uk (t) = u ˆ(t),

k→∞

almost everywhere in [t0 , T ], then u ˆ and the corresponding yˆ = S(ˆ u) satisfy the PMP conditions in L2 (t0 , T ; Rm ) (with pˆ = P (ˆ u)). We illustrate the first situation with the following example taken from [30]. The development of a SQH method for computing relaxed controls is discussed in the next chapter. Example 2.7 Consider the following nonconvex optimal control problem Z  1 1 2 y (t) − u2 (t) dt min J(y, u) := 2 0 s.t. y 0 (t) = u(t), y(0) = 0, u(t) ∈ [−1, 1] This problem admits no optimal control; however, we have inf Jˆ = −1/2. We apply the SQH Algorithm 2.3 to this problem using the following setting: ζ = 0.8, σ = 1.01, η = 10−12 , κ = 10−12 , the initial value  = 0.1, and zero is the initial guess for the control function. The SQH scheme converges after 129 iterations, and the resulting control, the corresponding state (approximately equal to zero), and the minimisation history of J are depicted in the Figures 2.13 and 2.14. Related to the second situation mentioned above is the following result from [43], asserting that, in case of a convex objective, the sequence (uk ) is weakly convergent to an optimal control. Theorem 2.4 Let the assumptions of Theorem 2.3 be fulfilled, and suppose that Jˆ is a convex lower semi-continuous differentiable functional. Then there exists a subsequence of the sequence (uk )k=0,1,2,... , that converges weakly in L2 (t0 , T ; Rm ) to an optimal control. Proof. The sequence (uk )k=0,1,2,... ⊂ Uad is bounded, since Uad is a closed convex bounded set in the Hilbert space U = L2 (t0 , T ; Rm ). Therefore we can

1

1

0.8

0.8

0.6

0.6

0.4

0.4

0.2

0.2

0

y

u

The Sequential Quadratic Hamiltonian Method

0

-0.2

-0.2

-0.4

-0.4

-0.6

-0.6

-0.8

-0.8

-1

81

-1 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

0.1

0.2

0.3

0.4

t

0.5

0.6

0.7

0.8

0.9

1

t

FIGURE 2.13: The case of a chattering control (left), and the corresponding trajectory. extract a subsequence, also denoted with (uk )k=0,1,2,... , that converges weakly in U to some u ∈ Uad . Now, to prove that u is optimal, define the following   v k = ΠK uk + ∂u H(·, y k , uk , pk ) . Then it must hold  v k − uk − ∂u H(·, y k , uk , pk ), w − v k ≥ 0,

w ∈ Uad .

Next, we write this inequality in an expanded form and omit the arguments of ∂u H except uk . We have    v k −uk , w−v k + −∂u H(uk ), w−uk + −∂u H(uk ), uk −v k ≥ 0, w ∈ Uad ,  where we have added and subtracted − ∂u H(uk ), uk . As stated in Theorem 2.3, we have limk→∞ kuk − v k kL2 (t0 ,T ;Rm ) = 0, hence it follows that  lim inf − ∂u H(uk ), w − uk ≥ 0, w ∈ Uad . k→∞

0

-0.05

-0.1

-0.15

J

-0.2

-0.25

-0.3

-0.35

-0.4

-0.45

-0.5 0

20

40

60

80

100

120

140

SQH iterations

FIGURE 2.14: Convergence history of J in the case of a chattering control.

82

The Sequential Quadratic Hamiltonian Method Since Jˆ is convex, and taking the limit k → ∞, we have  ˆ ˆ k ) + − ∂u H(uk ), w − uk J(w) ≥ J(u h i ˆ k ) + − ∂u H(uk ), w − uk ≥ lim inf J(u k→∞

ˆ k ) ≥ J(u), ˆ ≥ lim inf J(u k→∞

where the last inequality follows from the convexity and lower semi-continuity ˆ implying that it is weakly lower semi-continuous. Therefore u property of J, is optimal because ˆ ˆ J(u) ≤ J(w),

w ∈ Uad .

We conclude this section mentioning the recent work [201] that focuses on the analysis of proximal gradient methods applied to nonsmooth problems. One can recognise that, subject to appropriate convexity and differentiability assumptions on the HP function, the analytical tools developed in this work can be extended to the SQH method in order to prove its pointwise a.e. and strong convergence to a global optimal control.

Chapter 3 Optimal Relaxed Controls

3.1 3.2 3.3 3.4

Young Measures and Optimal Relaxed Controls . . . . . . . . . . . . . . . . The Sequential Quadratic Hamiltonian Method . . . . . . . . . . . . . . . . . The SQH Minimising Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . An Application with Two Relaxed Controls . . . . . . . . . . . . . . . . . . . . .

83 85 91 97

In optimal control theory, relaxation of a given optimal control problem consists in defining a related problem having the same infimum and admitting a minimiser in a larger functional space. This chapter illustrates relaxation in the framework of Young measures and discusses a sequential quadratic hamiltonian method for computing optimal relaxed controls in this functional framework. Results of numerical experiments are presented that validate the ability of the SQH method to solve different optimal relaxed control problems.

3.1

Young Measures and Optimal Relaxed Controls

In the previous chapter, in Example 2.7 we have seen a case where the running-cost function ` is nonconvex in u and an optimal control in Uad does not exist, although inf J is finite. Furthermore, we have seen that in this case the application of the SQH method results in a so-called chattering control function that allows to achieve a value of the cost functional close to its infimum in accordance to Ekeland’s variational principle. In this chapter, we illustrate how the chattering control can be understood in a framework that enlarges the space of controls. This relaxation procedure, which originates in the realm of the calculus of variation, was pursued by L. C. Young to address situations similar to that illustrated above; see [285] and the references therein. A main result of Young’s pioneering work is that given a sequence of measurRT able functions zk : [t0 , T ] → R, k ∈ N, such that supk t0 g(|zk (t)|) dt < ∞, where g is a continuous, nondecreasing function with limt→∞ g(t) = ∞, there exists a subsequence (zm ) of (zk ) and a family of regular probability measures ¯ is given by the expected (νt ) on R, such that the weak limit φ(·, zm (·)) * φ(·) value Z ¯ = φ(t) φ(t, v) dνt (v), a.e. t ∈ [t0 , T ], (3.1) R

DOI: 10.1201/9781003152620-3

83

84

The Sequential Quadratic Hamiltonian Method

for any Carathéodory function φ : [t0 , T ] × R → R. The family of probability measures (νt ) is called the Young measure; see also [20, 68, 111, 160, 209, 230, 279, 285] (an incomplete list) for more general definitions and much more details. In view of Example 2.7 mentioned above and a comment in [20], one can realise that the Young measure can be thought of as giving the limiting probability distribution of the values of zm at almost all t as m → ∞. This process occurs clearly considering minimising sequences of controls for problems with cost functionals that are bounded from below but do not attain their minimum. Furthermore, one can see by construction that the functions zm become increasingly oscillatory as m → ∞; see [279, 285]. A way to connect certain Young measures to regular functions is to interpret a Young measure associated to a measurable function as the unique measure centred on its graph. In particular, we may identify a regular control u as the family of measures with νt (v) = δ(v − u(t)) on [t0 , T ], where δ denotes the Dirac delta. In general, notice that νt is a probability measure so that νt (v) ≥ 0, v ∈ Kad and νt (Kad ) = 1 almost everywhere in [t0 , T ]; see [111]. We refer to ν(·) as a relaxed control. In the framework of Young measures, a relaxed version of our standard optimal control problem (2.40) is formulated as follows [111]: T

Z

Z

min J(y, ν) :=

`(t, y(t), v) dνt (v) dt + γ(y(T )) t

s.t.

0

Z0

y (t) =

Kad

f (t, y(t), v) dνt (v),

(3.2) y(t0 ) = y0 .

Kad

In this framework, the control set Kad is the space of probability measures defined in the dual of C(Kad ), and the admissible control space Uad is the space of relaxed controls consisting of all probability measure valued ν(·) in L1 (t0 , T ; C(Kad ))∗ . In this context, we refer to [111] for a proof of the relaxation theorem stating that for ν(·) ∈ Uad such that the Cauchy problem in (3.2) has a solution yν in [t0 , T ], there exists a corresponding piecewise constant function u such that the resulting solution yu exists in [t0 , T ] and approximates pointwise yν to any desired accuracy  > 0, i.e. |yν (t) − yu (t)| < , t ∈ [t0 , T ]. While we refer to [111] for a more detailed discussion, we remark that an appropriate setting for proving existence of relaxed controls and for their characterisation in the PMP framework is given by Assumption 2.1. Then, we have that J(y, ν) is weakly lower semicontinuous, and correspondingly it is proved [111] that there exists an optimal relaxed control ν(·) ∈ Uad . This proof is based on Tonelli’s direct method of the calculus of variation n * ν(·) , and the fact that the space of that involves minimising sequences ν(·) relaxed controls is compact in the weak-* topology. In this new framework, the Pontryagin maximum principle can be proved for a large variety of control problems, starting with an extended HP function;

Optimal Relaxed Controls

85

see, e.g., [111, 230, 279]. In particular in our case, the relaxed HP function is given by Z Z H(t, y, ν, p) = p f (t, y, v) dν(v) − `(t, y, v) dν(v), Kad

Kad

where µ ∈ Kad . Correspondingly, the PMP optimality conditions for (3.2) are as follows: y 0 (t) = ∂p H (t, y(t), νt , p(t)) , y(t0 ) = y0 0 p (t) = −∂y H(t, y(t), νt , p(t)), p(T ) = −∂y γ(y(T )) H(t, y(t), νt , p(t)) = max H(t, y(t), µ, p(t)) a. e. [t0 , T ]. µ∈Kad

(3.3) (3.4) (3.5)

If (y ∗ , p∗ , ν ∗ ) is a solution of the relaxed optimal control problem, then this triple must satisfy these PMP conditions.

3.2

The Sequential Quadratic Hamiltonian Method

It is clear that the sequential quadratic Hamiltonian method introduced in the previous chapter cannot be directly applied to compute relaxed controls. For this purpose, in accordance with the PMP formulation in this framework, we discuss an extension of the SQH method that has been proposed in [13]. In this work, the assumption is made that the optimal relaxed control sought is absolutely continuous such that one can express the probability measure in terms of a probability density function (PDF). Therefore we can write dνt (v) = ν(t, v) dv. The next step is to construct a suitable augmented HP function having the following structure Z Z H (t, y, ν, µ, p) = p f (t, y, v) ν(v) dv− `(t, y, v) ν(v) dv− g (ν(·), µ(·)) . Kad

Kad

(3.6) A seemingly obvious extension of the previous SQH method to relaxed controls leads to the following choice Z g (ν(t, ·), µ(t, ·)) = |ν(t, v) − µ(t, v)|2 dv. (3.7) Kad

However, this choice is problematic because g given by (3.7) is not weak-* continuous. For this reason, in [13] a new augmentation term is proposed that corresponds to the Kullback-Leibler (KL) divergence as follows [163]: Z g (ν(t, ·), µ(t, ·)) := ν(t, v) log(ν(t, v)/µ(t, v)) dv. (3.8) Kad

86

The Sequential Quadratic Hamiltonian Method

With this setting, we have the following SQH method for solving relaxed optimal control problems. Algorithm 3.1 (SQH method for relaxed controls) Input: initial approx. ν 0 ∈ Uad , max. number of iterations kmax , tolerance κ > 0,  > 0, σ > 1, η > 0, and ζ ∈ (0, 1); set τ > κ, k := 0. Compute the solution y 0 to the forward problem Z y 0 (t) = f (t, y(t), v) ν k (t, v) dv, y(t0 ) = y0 . Kad

while (k < kmax && τ > κ ) do 1) Compute the solution pk to the adjoint problem Z  > 0 p (t) = − ∂y f (t, y k (t), v) ν k (t, v) dv p(t) Kad Z + ∂y `(t, y k (t), v) ν k (t, v) dv Kad

with terminal condition p(T ) = −∂y γ(y k (T )). 2) Determine ν k+1 ∈ Uad such that   H t, y k (t), ν k+1 (t), ν k (t), pk (t) = max H t, y k (t), µ, ν k (t), pk (t) µ∈Kad

for almost all t ∈ [t0 , T ]. 3) Compute the solution y k+1 to the forward problem Z 0 y (t) = f (t, y(t), v) ν k+1 (t, v) dv, y(t0 ) = y0 . Kad

4) Compute τ := kν k+1 − ν k k2L1 ([t0 ,T ]×Kad ) .   5) If J y k+1 , ν k+1 − J y k , ν k > −η τ , then increase  with  = σ  and go to Step 2.   Else if J y k+1 , ν k+1 − J y k , ν k ≤ −η τ , then decrease  with  = ζ  and continue. 6) Set k := k + 1. end while In this algorithm, we need to specify how to perform Step 2. For this purpose, we consider the gradient of H as follows: Z   (∇H (ν), δν) = H(v) −  log(ν(v)/µ(v)) + 1 δν(v) dv, (3.9) Kad

Optimal Relaxed Controls

87

where H(v) := H(t, y(t), v, p(t)) = p(t) f (t, y(t), v) − `(t, y(t), v), and we omit to write t, y(t), and p(t), which are assumed fixed. Further, for the Hessian we have Z δν(v)2 (∇2 H (ν) δν, δν) = − dv ≤ 0. (3.10) Kad ν(v) This result implies concavity of the augmenting term. With this setting, we obtain ν from the given µ by requiring that it solves the gradient equation ∇H (ν) = 0. Thus, we have H(v)/ − 1 = log(ν(v)/µ(v)). Consequently, the update in Step 2 of the SQH algorithm can be performed using the following formula  ν(t, v) = C(t) µ(t, v) exp H(t, y(t), v, p(t))/ , (3.11) where

h i C(t) = 1/Eµ exp H(t, y(t), ·, p(t))/ .

In this expression, we make use the following standard notation Z h i Eµ φ(t, ·) := φ(t, v) µ(t, v) dv,

(3.12)

Kad

which represents the expected value of φ with respect to the density µ and for a fixed t. We see that the resulting update of the Young measure is nonnegative, and the factor C(t) provides the normalisation. Next, we consider different optimal control problems and discuss results of the related experiments where we use the SQH Algorithm 3.1 given above. In this algorithm, the differential equations are approximated by an explicit Euler scheme on a uniform time grid with Nt = 200 subintervals, and integration on Kad is implemented with rectangular quadrature on a uniform mesh with Nv = 400 subintervals. We choose the initial ν 0 to be the uniform density, and we initialise  = 1. Further, we choose σ = 1.1, ζ = 0.9, η = 10−2 , and κ = 10−6 . We denote IT = [0, T ]. In the first experiment, we consider the relaxed version of the optimal control problem formulated in Example 2.7, where a chattering control function appears. Notice that control problems with oscillations are very frequently considered in the framework of relaxed controls; see, e.g., [136]. In Figure 3.1, we report results of this experiment showing the computed optimal relaxed control and its mean for all t, and the corresponding state function. We also see that the cost functional J quickly attains its minimum. Clearly, regular controls can be seen as a special case of relaxed controls and as such they can be computed with the SQH Algorithm 3.1. This fact is illustrated with the following tracking problem. Example 3.1 Consider f (t, y, u) = u y,

`(t, y, u) =

1 2 (y − yd ) + α u2 , 2

γ ≡ 0.

(3.13)

88

The Sequential Quadratic Hamiltonian Method 1 0.8 0.6 0.4

u mean

0.2 0 -0.2 -0.4 -0.6 -0.8 -1 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

t -0.15

1 0.8

-0.2

0.6 -0.25 0.4 -0.3

0

J

y

0.2

-0.2

-0.35

-0.4

-0.4 -0.45 -0.6 -0.5 -0.8 -0.55

-1 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

20

40

60

80

100

120

140

160

SQH iterations

t

FIGURE 3.1: Solution of the relaxed version of the control problem of Example 2.7: from top-left to bottom-right, the optimal relaxed control, the mean of the optimal relaxed control, the controlled state and the cost functional. We have α = 0.01, Kad = [−3, 3], IT = [0, 1] and y0 = 1. The desired trajectory is given by yd (t) = 1 − cos(2πt)/2. The solution to this problem, obtained with the SQH method, is depicted in Figure 3.2. Next, we consider a similar tracking problem, but with a nonconvex cost of the control as follows. Example 3.2 Consider f (t, y, u) = u y,

`(t, y, u) =

1 2 (y − yd ) + α (u2 − 1)2 , 2

γ ≡ 0. (3.14)

We have α = 10, Kad = [−3, 3], IT = [0, 1] and y0 = 1. The desired trajectory is given by yd (t) = 1 − cos(2πt)/2. The solution to this problem is depicted in Figure 3.3. Next, we consider the following problem with bang-bang control. Example 3.3 Consider f (t, y, u) = u y,

`(t, y, u) = −y (1 − u)/10,

γ ≡ 0.

(3.15)

Optimal Relaxed Controls

89

3

2

u mean

1

0

-1

-2

-3 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

t 0.3

1.1

1.08 0.25 1.06

0.2

y

J

1.04

1.02

0.15

1 0.1 0.98

0.05

0.96 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0

100

t

200

300

400

500

600

700

800

SQH iterations

FIGURE 3.2: Solution of the control problem of Example 3.1: from top-left to bottom-right, the optimal relaxed control, the mean of the optimal relaxed control, the controlled state and the cost functional. We take Kad = [0, 1], IT = [0, 2] and y0 = 0.5. In this case, the regular optimal control is the bang-bang control given by  1 0≤t 0 such that the following growth condition is satisfied independently of t and u : |y · f (t, y, u)| ≤ c0 (1 + |y|2 );

b) the functions f , ∂y f , `, ∂y ` are continuous in t and u, and Lipschitz continuous in y in the sense that there exist constants c1 , c2 , c3 , c4 > 0 such that the following holds |f (t, y1 , u) − f (t, y2 , u)| ≤ c1 |y1 − y2 | |∂y f (t, y1 , u) − ∂y f (t, y2 , u)| ≤ c2 |y1 − y2 | |`(t, y1 , u) − `(t, y2 , u)| ≤ c3 |y1 − y2 | |∂y `(t, y1 , u) − ∂y `(t, y2 , u)| ≤ c4 |y1 − y2 |, further, suppose that γ and ∂y γ satisfy |γ(y1 ) − γ(y2 )| ≤ c5 |y1 − y2 | |∂y γ(y1 ) − ∂y γ(y2 )| ≤ c6 |y1 − y2 | for some constants c5 , c6 > 0.

Optimal Relaxed Controls

93

We remark that these assumptions are similar to the Assumptions 2.1; however, a Lipschitz condition on the argument u is not required. Continuity with respect to all arguments guarantees boundedness on D of the functions involved. With these assumptions and (3.18), it is possible to prove in a similar way the same properties of the control-to-state map y = S(ν) already discussed in Section 2.5. In particular, we have uniform boundedness of the control-to-state map y = S(ν), ν ∈ Uad , that is, there exists a constant My > 0 depending on c0 , |y0 | and (T − t0 ) such that the following holds |S(ν)(t)| ≤ My ,

t ∈ [t0 , T ].

Further, by the Assumptions 3.1 and using Grönwall inequality, it follows that there exists a constant Ly > 0 such that the following holds |S(ν 1 )(t) − S(ν 2 )(t)| ≤ Ly kν 1 − ν 2 kL1 ([t0 ,T ]×Kad ) ,

ν 1 , ν 2 ∈ Uad . (3.20)

We also have a constant Mp > 0 that provides the following bound for the control-to-adjoint map |P (ν)(t)| ≤ Mp ,

t ∈ [t0 , T ].

Next, we prove the following theorem stating that, if ν k is not already optimal, it is possible to improve the value of the relaxed cost functional by determining a ν k+1 corresponding to a sufficiently large . We remark that the assumptions above are weaker than those in [13].   Theorem 3.1 Let y k , ν k and y k+1 , ν k+1 be generated by the SQH method, and suppose ν k and ν k+1 are PDFs; let the Assumptions 3.1 hold. Then, there exist positive constants ς and θ independent of , such that for the  > 0 currently chosen by Algorithm 3.1 the following holds   J y k+1 , ν k+1 − J y k , ν k ≤ −ς ( − θ) kν k+1 − ν k k2L1 ([t0 ,T ]×Kad ) . (3.21)   In particular, it holds J y k+1 , ν k+1 − J y k , ν k ≤ −η τ for  ≥ θ + η/ς and τ = kν k+1 − ν k k2L1 ([t0 ,T ]×Kad ) . Proof. In Algorithm 3.1, for a given ν k , and y k = S(ν k ) and pk = P (ν k ), we determine a new ν k+1 and y k+1 = S(ν k+1 ) with which we compute a new

94

The Sequential Quadratic Hamiltonian Method

value of the cost functional. We have   J y k+1 , ν k+1 − J y k , ν k Z Z T h i Eν k+1 `(t, y k+1 , ·) dt + γ(y k+1 (T )) − =

T

h i Eν k `(t, y k , ·) dt − γ(y k (T ))

t0

t0 T

Z =−



h Eν k+1

i h i H(t, y k+1 , ·, pk ) − Eν k H(t, y k , ·, pk ) dt

t0

Z

T

+

h h i ii h  dt pk Eν k+1 f (t, y k+1 , ·) − Eν k f (t, y k , ·)

t0

+ γ(y k+1 (T )) − γ(y k (T )). h i Next, we add and subtract an integral term with Eν k+1 H(t, y k , ·, pk ) , use the h i forward equation, and recall that Eν H(t, y, ·, p) = H(t, y, ν, p). We obtain   J y k+1 , ν k+1 − J y k , ν k Z T  =− H(t, y k+1 , ν k+1 , pk ) − H(t, y k , ν k+1 , pk ) dt t0

Z

T

− t0 T

Z +

t0

 h

 H(t, y k , ν k+1 , pk ) − H(t, y k , ν k , pk ) dt

pk

i d  k+1 y − y k dt + γ(y k+1 (T )) − γ(y k (T )). dt

(3.22)

This expression is equivalent to (2.44) in Theorem 2.1. In fact, we proceed with similar consideration as follows:   − H(t, y k+1 , ν k+1 , pk ) − H(t, y k , ν k+1 , pk ) Z 1 =− ∂y H(t, y k + s (y k+1 − y k ), ν k+1 , pk ) (y k+1 − y k ) ds 0

= −∂y H(t, y k , ν k , pk ) (y k+1 − y k ) Z 1h i − ∂y H(t, y k + s (y k+1 − y k ), ν k+1 , pk ) − ∂y H(t, y k , ν k , pk ) (y k+1 − y k ) ds 0

d   ≤ pk y k+1 − y k dt Z 1 h i h i + |Eν k+1 ∂y `(t, y k + s (y k+1 − y k ), ·) − Eν k ∂y `(t, y k , ·) | |y k+1 − y k | ds 0

Z + 0

1

h i h i |Eν k+1 ∂y f (t, y k + s (y k+1 − y k ), ·) − Eν k ∂y f (t, y k , ·) |pk | |y k+1 − y k | ds.

Optimal Relaxed Controls

95

Now, using the Lipschitz conditions in Assumption 3.1, the boundedness of ∂y ` and ∂y f , and (3.20), we have   d   − H(t, y k+1 , ν k+1 , pk ) − H(t, y k , ν k+1 , pk ) ≤ pk y k+1 − y k dt Z 1  + c4 |s (y k+1 − y k )| + c` kν k+1 − ν k k |y k+1 − y k | ds 0

Z +

1



 c2 |s (y k+1 − y k )| + cf kν k+1 − ν k k |pk | |y k+1 − y k | ds.

0

d   1 pk y k+1 − y k + (c4 + c2 Mp ) |y k+1 − y k |2 ≤ dt 2 + (c` + cf Mp ) kν k+1 − ν k k |y k+1 − y k | d    1 pk y k+1 − y k + (c4 + c2 Mp ) L2y + (c` + cf Mp ) Ly kν k+1 − ν k k2 . ≤ dt 2 By inserting this estimate in (3.22), we obtain Z T    d k  k+1 (t) − y k (t) dt J y k+1 , ν k+1 − J y k , ν k ≤ p (t) y dt t0 Z Th  i d + pk (t) y k+1 (t) − y k (t) dt + γ(y k+1 (T )) − γ(y k (T )) dt t0  1 (c4 + c2 Mp ) L2y + (c` + cf Mp ) Ly kν k+1 − ν k k2 + (T − t0 ) 2 Z T  − H(t, y k (t), ν k+1 (t), pk (t)) − H(t, y k (t), ν k (t), pk (t)) dt. t0

Next, we perform integration by parts of the first term and combine with the second term. Moreover, we use the following equality at t = T : Z 1 k+1 k γ(y ) − γ(y ) = ∂y γ(y k + s (y k+1 − y k )) (y k+1 − y k ) ds, 0 k

and recall that p (T ) = −∂y γ(y k (T )). Thus, with the Assumption 3.1, we obtain the intermediate estimate  1 pk (T ) y k+1 (T ) − y k (T ) + γ(y k+1 (T )) − γ(y k (T )) ≤ c6 L2y kν k+1 − ν k k2 . 2 Hence, we have   J y k+1 , ν k+1 − J y k , ν k 1 ≤ c6 L2y kν k+1 − ν k k2 2 1  + (T − t0 ) (c4 + c2 Mp ) L2y + (c` + cf Mp ) Ly kν k+1 − ν k k2 2 Z T  − H(t, y k (t), ν k+1 (t), pk (t)) − H(t, y k (t), ν k (t), pk (t)) dt. t0

96

The Sequential Quadratic Hamiltonian Method Now, recall Step 2 in Algorithm 3.1, which guarantees that

 −H(t, y k (t), ν k+1 (t), pk (t))+H(t, y k (t), ν k (t), pk (t)) ≤ − g ν k+1 (t, ·), ν k (t, ·) , where

Z g (ν, µ) :=

ν(v) log(ν(v)/µ(v)) dv. Kad

Now, we invoke the following Pinsker’s inequality [95, 264], which states that Z 2 2 g (ν, µ) ≥ |ν(v) − µ(v)| dv . Kad

Based on this inequality and using the Cauchy-Schwarz inequality, we obtain Z T Z T Z 2 1 |ν(t, v) − µ(t, v)| dv dt . g (ν(t, ·), µ(t, ·)) dt ≥ 2(T − t0 ) t0 Kad t0 With these results, we obtain   1 J y k+1 , ν k+1 − J y k , ν k ≤ c6 L2y kν k+1 − ν k k2 2  1 2 (c4 + c2 Mp ) Ly + (c` + cf Mp ) Ly kν k+1 − ν k k2 + (T − t0 ) 2  kν k+1 − ν k k2 . − 2(T − t0 ) Therefore we obtain    J y k+1 , ν k+1 − J y k , ν k ≤ ς θ −  kν k+1 − ν k k2 , where ς = 1/[2(T − t0 )] and 1 1  θ = [2(T − t0 )] c6 L2y + (T − t0 ) (c4 + c2 Mp ) L2y + (c` + cf Mp ) Ly . 2 2 Thus the theorem is proved. We see that a successful update by Step 2 of the SQH Algorithm 3.1 can be obtained in a finite number of steps. Furthermore, considering the analysis presented in the previous chapter and Theorem 3.1, one can easily prove the following theorem, which also guarantees the validity of the stopping criteria. Theorem 3.2 Let the assumptions of Theorem 3.1 be fulfilled. If in Algorithm 3.1, at every kth iterate,  = θ + η/ς is chosen, then the following holds ˆ k ))k=0,1,2,... is monotonically decreasing and converges a) the sequence (J(ν ∗ ˆ to some Jˆ ≥ inf ν∈Uad J(ν); b) it holds limk→∞ kν k+1 − ν k kL1 ([t0 ,T ]×Kad ) = 0.

Optimal Relaxed Controls

3.4

97

An Application with Two Relaxed Controls

In this section, we consider an optimal control problem governed by a nonlinear coupled differential system with two controls. This model provides a simple description of the dynamics of a bioreactor with ideal mixing, where a contaminant and bacteria that degrades this contaminant are present; see [177] for a related problem. Our bioreactor model is as follows: y10 (t) = G u1 (t) y1 (t) − D y12 (t), y20 (t) = −K y1 (t) y2 (t) + L u2 (t),

(3.23) (3.24)

where t ∈ [0, T ], and the state variables y1 and y2 represent the concentration of bacteria and contaminant, respectively. In this model, the controls u1 and u2 represent nutrient for bacteria and added contaminant, respectively. The constant G > 0 represents the maximum growth rate of the bacteria, which is modulated by u1 , while D > 0 is its death rate. Further, K > 0 denotes the degradation rate of the contaminant due to the bacteria metabolism, and L > 0 is a scaling parameter for the injection of contaminant. The purpose of the controls is to have the concentration of bacteria follow a given desired profile specified by the function y1d = y1d (t), while at the end of the time horizon the value of the contaminant concentration should be as close as possible to zero. These goals are modelled requiring the minimisation of the following objective functional 1 J(y1 , y2 , ν1 , ν2 ) = 2

Z Z

T

0 T

Z

+α Kad

0

Z

T

Z

+β 0

γ |y2 (T )|2 2   (|v (1 − v)| ν1 (t, v) dv dt

|y1 (t) − y1d (t)|2 dt +

  |v (1 − v)| ν2 (t, v) dv dt,

(3.25)

Kad

where α, β, γ > 0 are optimisation weights. We assume that the measures ν1 and ν2 , associated to u1 and u2 , are defined both on the same Kad . With this setting, theRstate of the system is computed with (3.23)–(3.24), with the means uj (t) := Kad v νj (t, v) dv, j = 1, 2, and we have the following adjoint equations p01 (t) = −G u1 (t) p1 (t) + 2D y1 (t) + K y2 (t) p2 (t) + (y1 (t) − y1d (t)), p02 (t) = K y1 (t) p2 (t),

(3.26) (3.27)

with terminal conditions p1 (T ) = 0 and p2 (T ) = −γ y2 (T ), respectively.

98

The Sequential Quadratic Hamiltonian Method

1.1

0.5

0.45 1 0.4

0.35 0.9

u2 mean

u1 mean

0.3

0.8

0.25

0.2 0.7 0.15

0.1 0.6 0.05

0.5

0 0

1

2

t

3

0

1

2

3

t

FIGURE 3.6: Optimal relaxed controls (top) and the corresponding means. For our problem, we consider the following augmented HP function H (t, y1 , y2 , ν1 , ν2 , µ1 , µ2 , p1 , p2 ) = H(t, y1 , y2 , ν1 , ν2 , p1 , p2 ) −  g(ν1 , µ1 ) −  g(ν2 , µ2 ). Further, notice that the HP function has a composite structure with respect to ν 1 and ν 2 as follows: H(t, y1 , y2 , ν1 , ν2 , p1 , p2 ) = H1 (t, y1 , y2 , ν1 , p1 , p2 ) + H2 (t, y1 , y2 , ν2 , p1 , p2 ). Therefore the update formula given in (3.11) can be applied separately to each density as follows:  νj (t, v) = Cj (t) µj (t, v) exp H j (t, y1 (t), y2 (t), v, p1 (t), p2 (t))/ , j = 1, 2. In the SQH Algorithm 3.1, we set τ := kν1k+1 − ν1k k2L1 ([t0 ,T ]×Kad ) + kν2k+1 − ν2k k2L1 ([t0 ,T ]×Kad ) .

Optimal Relaxed Controls 0.62

99

0.5 0.45

0.6 0.4 0.35 0.58

y2

y1

0.3 0.56

0.25 0.2

0.54 0.15 0.1 0.52 0.05 0.5

0 0

1

2

3

0

1

2

t

3

t

FIGURE 3.7: The evolution of concentration of bacteria and contaminant. 0.35

0.3

0.25

J

0.2

0.15

0.1

0.05

0 0

100

200

300

400

500

600

SQH iterations

FIGURE 3.8: The minimisation history of the objective functional. For the numerical approximation, we solve the differential equations using the explicit Euler scheme on a uniform time grid with Nt = 200 subintervals, and integration on Kad is implemented with rectangular quadrature on a uniform mesh with Nv = 400 subintervals. We choose the initial uniform densities, and we set  = 1, σ = 1.1, ζ = 0.9, η = 10−2 , and κ = 10−6 . With this setting, we apply the SQH method to our relaxed optimal control problem with the following choice of values of the parameters: G = 1, D = 1, K = 3, L = 1, α = 0.1, β = 0.1, γ = 0.01. We choose y1d (t) = 0.6, i.e. a constant function. Further, we choose Kad = [0, 2], T = 3, and the initial conditions for the state of the bioreactor are given by y1 (0) = 0.5 and y2 (0) = 0.5. The resulting optimal relaxed controls and the corresponding means are depicted in Figure 3.6. The time evolution of the controlled bioreactor is depicted in Figure 3.7. The minimisation history of J is reported in Figure 3.8.

Chapter 4 Differential Nash Games

4.1 4.2 4.3 4.4

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . PMP Characterisation of Nash Games . . . . . . . . . . . . . . . . . . . . . . . . . . The SQH Method for Solving Nash Games . . . . . . . . . . . . . . . . . . . . . . Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

101 102 104 108

In this chapter, a sequential quadratic Hamiltonian method for solving differential Nash games is discussed. The focus is on noncooperative open-loop games with different differential constraints and on the Nash equilibrium concept. In this framework, multiple Hamilton-Pontryagin functions are involved that correspond to different players, and successive approximations in the SQH method are obtained by solving finite-dimensional Nash games. Numerical experiments are performed with linear-quadratic Nash games and with some related nonsmooth games.

4.1

Introduction

A differential game is defined by a differential model that governs the state of the system and is subject to different mechanisms of action representing the strategies of the players in the game. In addition, different objective functionals and admissible sets of actions are associated to each player, and in the game the purpose of each player is to optimise (e.g., minimise) its own objective subject to the constraints given by the differential model and the admissible sets. In the case of noncooperative games, a suitable solution concept is the one proposed by J. F. Nash in [199, 200], where a Nash equilibrium (NE) for static games with complete information is defined, that is, a configuration where no player can benefit from unilaterally changing its own strategy. In this framework, differential Nash games where pioneered by R. P. Isaacs [143]. However, contemporary to Isaacs’ book, there are the works [216, 218] where differential games are discussed within the Pontryagin maximum principle. Along this line, in [4, 162] we find early attempts and comments towards the development of a SA scheme for differential games, but this line of research has received less attention until the recent work [61], where a sequential quadratic DOI: 10.1201/9781003152620-4

101

102

The Sequential Quadratic Hamiltonian Method

Hamiltonian method for solving open-loop non zero-sum two-players differential Nash games is presented. It is the purpose of this chapter to introduce open-loop differential Nash games and discuss theoretical and numerical aspects of the SQH method proposed in [61].

4.2

PMP Characterisation of Nash Games

In this section, we formulate differential Nash games and discuss the characterisation of their NE solution in the PMP framework. We discuss the case of two players, which can be readily extended to the case of N players, that are represented by their strategies u1 and u2 . We assume the following dynamics y 0 (t) = f (t, y(t), u1 (t), u2 (t)),

y(t0 ) = y0 ,

(4.1)

where t ∈ [t0 , T ], y(t) ∈ Rn , and u1 (t) ∈ Rm and u2 (t) ∈ Rm , m ≤ n. We assume f is such that for any choice of the initial condition y0 ∈ Rn , and any u1 , u2 ∈ L2 (t0 , T ; Rm ), the Cauchy problem (4.1) admits a unique solution in the sense of Carathéodory; see, e.g., [46]. Further, we assume that the map (u1 , u2 ) 7→ y = y(u1 , u2 ), where y(u1 , u2 ) represents the unique solution to (4.1) with fixed initial conditions, is continuous in (u1 , u2 ); see [62] for details. We focus on problems with fixed endtime T ; see [63] for the formulation of a time-optimal differential Nash game. We refer to u1 and u2 as the game strategies of the players P1 and P2 , respectively. The goal of P1 is to minimise the following cost (or objective) functional Z T J1 (y, u1 , u2 ) := `1 (t, y(t), u1 (t), u2 (t)) dt + γ1 (y(T )), (4.2) t0

whereas P2 aims at minimising its own cost given by Z

T

J2 (y, u1 , u2 ) :=

`2 (t, y(t), u1 (t), u2 (t)) dt + γ2 (y(T )).

(4.3)

t0

We consider the cases of unconstrained and constrained strategies. In the former case, we assume u1 , u2 ∈ L2 (t0 , T ; Rm ), whereas in the latter case we assume that u1 and u2 belong, respectively, to the following admissible sets (i)

(i)

Uad = {u ∈ L2 (t0 , T ; Rm ) : u(t) ∈ Kad a.e.}, (i)

i = 1, 2,

(4.4)

where Kad are compact and convex subsets of Rm . We denote with Uad = (1) (2) Uad × Uad and U = L2 (t0 , T ; Rm ) × L2 (t0 , T ; Rm ). Notice that we have a

Differential Nash Games

103

uniform bound on |y(u1 , u2 )(t)|, t ∈ [t0 , T ], that holds for any u ∈ Uad ; see [62]. By using the map (u1 , u2 ) 7→ y = y(u1 , u2 ), we can introduce the reduced objectives Jˆ1 (u1 , u2 ) := J1 (y(u1 , u2 ), u1 , u2 ) and Jˆ2 (u1 , u2 ) := J2 (y(u1 , u2 ), u1 , u2 ). In this framework, a Nash equilibrium is defined as follows: Definition 4.1 The functions (u∗1 , u∗2 ) ∈ Uad are said to form a Nash equi(1) (2) librium (NE) for the game (Jˆ1 , Jˆ2 ; Uad , Uad ), if it holds (1)

Jˆ1 (u∗1 , u∗2 ) ≤ Jˆ1 (u1 , u∗2 ),

u1 ∈ Uad ,

Jˆ2 (u∗1 , u∗2 ) ≤ Jˆ2 (u∗1 , u2 ),

u2 ∈ Uad .

(2)

(4.5)

(A similar Nash game is defined replacing Uad with U .) We remark that existence of a NE point can be proved subject to appropriate conditions on the structure of the differential game, including the choice of T . For our purpose, we assume existence of a Nash equilibrium (u∗1 , u∗2 ) ∈ Uad , and refer to [62] for a review and recent results in this field. We remark that, if u∗ = (u∗1 , u∗2 ) is a NE for the game, then it satisfies the following u∗1 = argmin Jˆ1 (u1 , u∗2 ), (1)

u1 ∈Uad

u∗2 = argmin Jˆ2 (u∗1 , u2 ).

(4.6)

(2)

u2 ∈Uad

This fact implies that the NE point u∗ = (u∗1 , u∗2 ) must fulfil the necessary optimality conditions given by the Pontryagin maximum principle applied to both optimisation problems stated in (4.5), alternatively (4.6). In order to discuss these conditions, we introduce the following HamiltonPontryagin functions Hi (t, y, u1 , u2 , p1 , p2 ) = pi · f (t, y, u1 , u2 ) − `i (t, y, u1 , u2 ),

i = 1, 2. (4.7)

In terms of these functions, the PMP condition for the NE point u∗ = states the existence of multiplier (adjoint) functions p1 , p2 : [t0 , T ] → R such that the following holds

(u∗1 , u∗2 ) n

max H1 (t, y ∗ (t), w1 , u∗2 (t), p∗1 (t), p∗2 (t)) = H1 (t, y ∗ (t), u∗1 (t), u∗2 (t), p∗1 (t), p∗2 (t)), (1)

w1 ∈Kad

max H2 (t, y ∗ (t), u∗1 (t), w2 , p∗1 (t), p∗2 (t)) = H2 (t, y ∗ (t), u∗1 (t), u∗2 (t), p∗1 (t), p∗2 (t)), (2)

w2 ∈Kad

(4.8) for almost all t ∈ [t0 , T ]. Notice that, at each t fixed, problem (4.8) corresponds to a finite-dimensional Nash game.

104

The Sequential Quadratic Hamiltonian Method

In (4.8), we have y ∗ = y(u∗1 , u∗2 ), and the adjoint variables p∗1 , p∗2 are the solutions to the following differential problems >

−p0i (t) = (∂y f (t, y(t), u1 (t), u2 (t))) pi (T ) = −∂y γi (y(T )),

pi (t) − ∂y `i (t, y(t), u1 (t), u2 (t)), (4.9) (4.10)

where i = 1, 2, and ∂y φ(y) represents the Jacobian of φ with respect to the vector of variables y. Similarly to (4.1), one can prove that (4.9)–(4.10) is uniquely solvable, and the solution can be uniformly bounded independently of u ∈ Uad . We conclude this section introducing the Nikaido-Isoda [202] function ψ : Uad × Uad → R, which we use for the realisation of the SQH algorithm. We have ψ(u, v) := Jˆ1 (u1 , u2 ) − Jˆ1 (v1 , u2 ) + Jˆ2 (u1 , u2 ) − Jˆ2 (u1 , v2 ),

(4.11)

where u = (u1 , u2 ) ∈ Uad and v = (v1 , v2 ) ∈ Uad . At the Nash equilibrium u∗ = (u∗1 , u∗2 ) it holds ψ(u∗ , v) ≤ 0, (4.12) for any v ∈ Uad and ψ(u∗ , u∗ ) = 0.

4.3

The SQH Method for Solving Nash Games

In the spirit of the SA methods, a procedure for solving our Nash game (1) (2) ˆ (J1 , Jˆ2 ; Uad , Uad ) consists of an iterative scheme, starting with an initial guess 0 0 (u1 , u2 ) ∈ Uad , and followed by the solution of our governing model (4.1) and of the adjoint problems (4.9) - (4.10) for i = 1, 2. Thereafter, a new approximation to the strategies u1 and u2 is obtained by solving, at each t fixed, the Nash game (4.8) and assigning the values of (u1 (t), u2 (t)) equal the solution of this game. This update step is well posed if this solution exists for t ∈ [t0 , T ] and the resulting functions u1 and u2 are measurable. Clearly, this issue requires to identify classes of problems for which we can guarantee existence and uniqueness (or the possibility of selection) of a NE point. In this respect, a large class can be identified based on the following result given in [55], which is proved by an application of the Kakutani’s fixed-point theorem; see, e.g., [46] for references. We have Theorem 4.1 Assume the following structure f (t, y, u1 , u2 ) = f0 (t, y) + M1 (t, y) u1 + M2 (t, y) u2 , and `i (t, y, u1 , u2 ) = `0i (t, y) + `1i (t, u1 ) + `2i (t, u2 ),

i = 1, 2.

Differential Nash Games (1)

105

(2)

Further, suppose that Kad and Kad are compact and convex, the function f0 , `0i and the matrix functions M1 and M2 are continuous in t and y, and the functions u1 → `11 (t, u1 ) and u2 → `22 (t, u2 ) are strictly convex for any choice of t ∈ [t0 , T ] and y ∈ Rn . Then, for any t ∈ [t0 , T ] and any y, p1 , p2 ∈ Rn , there exists a unique pair (1) (2) (˜ u1 , u ˜2 ) ∈ Kad × Kad such that  u ˜1 = argmax p1 · f (t, y, v, u ˜2 ) − `1 (t, y, v, u ˜2 ) , (1)

v∈Kad

 u ˜2 = argmax p2 · f (t, y, u ˜1 , w) − `2 (t, y, u ˜1 , w) . (2)

w∈Kad

With the setting of this theorem, the map (t, y, p1 , p2 ) 7→ (u∗1 , u∗2 ) is continuous [55]. Moreover, based on results given in [226], one can prove that the functions (u1 (t), u2 (t)) resulting from the SA update, starting from measurable (u01 (t), u02 (t)), are measurable. Therefore the proposed SA update is well posed and ∞it can be repeated in order to construct a sequence of functions (uk1 , uk2 ) k=0 . However, as already discussed in the case of optimal control problems, while it is difficult to find conditions that guarantee convergence of SA iterations, we can pursue the SQH approach considering the following augmented HP functions H(i) (t, y, u1 , u2 , v1 , v2 , p1 , p2 ) := Hi (t, y, u1 , u2 , p1 , p2 ) −  |u − v|2 ,

i = 1, 2, (4.13) where, in the iteration process, u = (u1 , u2 ) is subject to the update step, and v = (v1 , v2 ) corresponds to the previous strategy approximation; | · | denotes the Euclidean norm. The parameter  > 0 represents the augmentation weight that is chosen adaptively along the iteration as discussed below. Now, we can define the SQH step to pointwise update the game strategies. Suppose that the kth function approximation (uk1 , uk2 ) and the corresponding y k and pk1 , pk2 have been computed. For any fixed t ∈ [t0 , T ] and  > 0, consider the following finite-dimensional Nash game H(1) (t, y k , u ˜1 , u ˜2 , uk1 , uk2 , pk1 , pk2 ) = max H(1) (t, y k , u1 , u ˜2 , uk1 , uk2 , pk1 , pk2 ), (1)

u1 ∈Kad

H(2) (t, y k , u ˜1 , u ˜2 , uk1 , uk2 , pk1 , pk2 ) = max H(2) (t, y k , u ˜1 , u2 , uk1 , uk2 , pk1 , pk2 ), (2)

u2 ∈Kad

(4.14) where y k = y k (t), pk1 = pk1 (t), pk2 = pk2 (t), and (uk1 , uk2 ) = (uk1 (t), uk2 (t)). It is clear that, assuming the structure specified in Theorem 4.1, the Nash (1) (2) game (4.14) admits a unique NE point, (˜ u1 , u ˜2 ) ∈ Kad × Kad , and the sequence constructed recursively by the procedure: (uk1 (t), uk2 (t)) → (uk+1 (t), uk+1 (t)) := (˜ u1 , u ˜2 ) 1 2 is well defined.

106

The Sequential Quadratic Hamiltonian Method

Notice that, in this procedure, the solution to (4.14) depends on the value of . Therefore the issue arises whether, corresponding to the step k → k + 1, we can choose the value of this parameter such that the strategy function uk+1 = (uk+1 , uk+1 ) represents an improvement on uk = (uk1 , uk2 ), in the sense 1 2 that some convergence criteria towards the solution to our differential Nash problem are fulfilled. For this purpose, in [61] the following criterion is defined that is based on the Nikaido-Isoda function ψ(uk+1 , uk ) ≤ −ξ kuk+1 − uk k2L2 (0,T ;Rm ) , for some chosen ξ > 0. This is a consistency criterion in the sense that ψ must be nonpositive, and if (uk+1 , uk ) → (u∗ , u∗ ), then we must have limk→∞ ψ(uk+1 , uk ) = 0. Furthermore, it is required that the absolute value |ψ(uk+1 , uk )| monotonically decreases in the SQH iteration process. In this process, if the update meets the two requirements above, then the update is taken and the value of  is diminished by a factor ζ ∈ (0, 1). If not, the update is discarded and the value of  is increased by a factor σ > 1, and the procedure is repeated. Below, we show that a value of  can be found such that the update is successful and the SQH iteration proceeds until an appropriate stopping criterion is satisfied. The SQH method for differential Nash games is implemented as follows: Algorithm 4.1 (SQH method for differential Nash games) Input: initial approx. Ψ0 > 0, (u01 , u02 ), max. number of iterations kmax , tolerance κ > 0,  > 0, σ > 1, ζ ∈ (0, 1) and ξ ∈ (0, ∞); set τ > κ, k := 0. Compute the solution y 0 to the forward problem y 0 (t) = f (t, y(t), u01 (t), u02 (t)),

y(t0 ) = y0 .

while (k < kmax && τ > κ ) do 1) Compute the solutions pk1 , pk2 to the adjoint problems > p0i (t) = − ∂y f (t, y k (t), uk1 (t), uk2 (t)) pi (t) − ∂y `i (t, y k (t), uk1 (t), uk2 (t)), with terminal condition pi (T ) = −∂y γi (y k (T )), i = 1, 2. 2) Determine uk+1 , uk+1 that solve the following Nash game: 1 2 uk+1 = argmax H(1) (t, y k , u1 , uk+1 , uk1 , uk2 , pk1 , pk2 ) 1 2 (1)

u1 ∈Kad

and uk+1 = argmax H(2) (t, y k , uk+1 , u2 , uk1 , uk2 , pk1 , pk2 ) 2 1 (2)

u2 ∈Kad

for all t ∈ [t0 , T ].

Differential Nash Games

107

3) Compute the solution y k+1 to the forward problem (t)), (t), uk+1 y 0 (t) = f (t, y(t), uk+1 2 1

y(t0 ) = y0 .

− uk2 k2L2 (t0 ,T ;Rm ) . − uk1 k2L2 (t0 ,T ;Rm ) + kuk+1 4) Compute τ := kuk+1 2 1 5) Calculate the Nikaido-Isoda function ψ(uk+1 , uk ). 6) If ψ(uk+1 , uk ) ≥ −ξ τ or |ψ(uk+1 , uk )| ≥ Ψk , then increase  with  = σ  and go to Step 2. Else if ψ(uk+1 , uk ) ≤ −ξ τ and |ψ(uk+1 , uk )| ≤ Ψk , then set Ψk+1 := |ψ(uk+1 , uk )|, decrease  with  = ζ  and continue. 7) Set k := k + 1. end while Next, we discuss wellposedness of the Steps 2–7 of the SQH method. For this purpose, we consider the assumptions of Theorem 4.1 with additional simplifying hypothesis for ease of calculation. In particular, we show that it is possible to find an  in Algorithm 4.1 such that uk+1 determined in Step 2, satisfies the criterion required in Step 6. for a successful update. We have the following lemma. See [61] for a proof. Lemma 4.1 Let the assumptions of Theorem 4.1 hold, and suppose that f0 , γi , `i , i = 1, 2, are differentiable and are quadratic forms in u and y such that their Hessians are constant. Moreover, let M1 , M2 depend only on t. Let (y k+1 , uk+1 , uk+1 ), (y k , uk1 , uk2 ) be generated by Algorithm 4.1. Then, there 1 2 exists a θ > 0 independent of  such that, for  > 0 currently chosen in Step 2, the following inequality holds ψ(uk+1 , uk ) ≤ −( − θ) kuk+1 − uk k2L2 (0,T ) ,

(4.15)

where uk+1 = (uk+1 , uk+1 ) and uk = (uk1 , uk2 ). In particular, if  > θ then 1 2 k+1 k ψ(u , u ) ≤ 0. We remark that in Step 2. of the SQH algorithm, the NE solution uk+1 obtained in this step depends on  so that kuk+1 − uk k2L2 (0,T ) decreases as O(1/2 ). In order to illustrate this fact, consider the following optimisation problem ν max f (u) := b u − u2 −  (u − v)2 , 2 where ν,  > 0. Clearly, the function f is concave and its maximum is attained at u = (b + 2  v)/(ν + 2 ). Thus, we obtain |u − v| =

|b − ν v| . (ν + 2 )

108

The Sequential Quadratic Hamiltonian Method

Now, subject to the assumptions of Lemma 4.1 and using the estimates in its proof, we can state that there exists a constant C > 0 such that |ψ(uk+1 , uk )| ≤ C kuk+1 − uk k2L2 (0,T ) , where C increases linearly with . On the other hand, since the HP functions are concave, we have that kuk+1 − uk k2L2 (0,T ) decreases as O(1/2 ). Therefore, given the value Ψk in Step 5 of the SQH algorithm, it is always possible to choose  sufficiently large such that |ψ(uk+1 , uk )| ≤ Ψk . In Algorithm 4.1, we have that ψ(uk+1 , uk ) → 0 as k → ∞. Thus, since ψ(uk+1 , uk ) ≤ −ξ kuk+1 − uk k2L2 , it follows that limk kuk+1 − uk k2L2 = 0 and hence the convergence criterion can be satisfied. We remark that, subject to the assumptions of Lemma 4.1, if (uk1 , uk2 ) generated by Algorithm 4.1 satisfies the PMP conditions, then Algorithm 4.1 stops returning (uk1 , uk2 ).

4.4

Numerical Experiments

In this section, we present results of numerical experiments with linearquadratic (LQ) Nash games and variations thereof in order to validate the computational performance of the proposed SQH method. Linear-quadratic Nash games appear in the field of, e.g., economics and marketing [102, 148], and are very well investigated from the theoretical point of view; see, e.g., [55, 108, 119, 274]. Moreover, since the solution of unconstrained LQ Nash games can be readily obtained by solving coupled Riccati equations [46, 108], they provide a convenient benchmark for the SQH method. Further, these problems can be conveniently extended to define Nash games with tracking objectives, box constraints on the players’ actions, and actions’ costs that include L1 terms. However, in all these cases, the structure of the corresponding problems is such that in Step 2 of the SQH algorithm the update at any fixed t can be determined analytically. The first experiment exploits the possibility to compute open-loop NE solutions to LQ Nash games by solving a coupled system of Riccati equations [108]. Thus, we use this solution for comparison to the solution of the same Nash game obtained with the SQH method. Our linear-quadratic Nash game is formulated as follows: y 0 (t) = A y(t) + B1 u1 (t) + B2 u2 (t),

y(0) = y0 ,

(4.16)

where  A=

1 0

 0 , 2

 B1 =

 1 0 , 0 −1

B2 =

Therefore y(t) ∈ R2 and ui (t) ∈ R2 , t ∈ [0, T ].

  2 −1 , 0 2

y0 =

  2 . 1

Differential Nash Games 1

0

0.5

-5

0

-10

-0.5

-15

-1

-20

-1.5

109

-25 0

0.05

0.1

0.15

0.2

0.25

0

0.05

0.1

t

0.15

0.2

0.25

0.15

0.2

0.25

t

1

0

0.5

-5

0

-10

-0.5

-15

-1

-20

-1.5

-25 0

0.05

0.1

0.15

t

0.2

0.25

0

0.05

0.1

t

FIGURE 4.1: Strategies u1 (left) and u2 for the LQ Nash game obtained with the SQH method (top) and by solving the Riccati system.

The cost functionals are as follows: Z  1 T 1 Ji (y, u1 , u2 ) = y(s)> Li y(s) + ui (s)> Ni ui (s) ds + y(T )> Di y(T ), 2 0 2 (4.17) i = 1, 2, where the matrices Li , Di , Ni are given by L1 = α1 I2 , L2 = α2 I2 , N1 = ν1 I2 , N2 = ν2 I2 , and D1 = γ1 I2 , and D2 = γ2 I2 , where I2 is the identity matrix in R2 . In the following experiment, we choose α1 = 0.1, α2 = 1, ν1 = 0.01, ν2 = 0.01, γ1 = 0.01 and γ2 = 0.01. We consider the time interval [0, T ], with T = 0.25, subdivided into N = 2500 subintervals and, on this grid, the state and adjoint equations are solved numerically by a midpoint scheme [46]. The initial guess u01 , u02 for the SQH iteration are zero functions, and we choose  = 10, ζ = 0.95, σ = 1.05, ξ = 10−8 , Ψ0 = 10, and κ = 10−12 . With this setting, we obtain the Nash strategies (u1 , u2 ) depicted in Figure 4.1 (left), which are compared with the solution obtained by solving the Riccati

110

The Sequential Quadratic Hamiltonian Method

3

0

2

-0.5

1

-1

0

-1.5

-1

-2

-2

-2.5

-3

-3 0

0.05

0.1

0.15

0.2

t

0.25

0

0.05

0.1

0.15

0.2

0.25

t

FIGURE 4.2: Strategies u1 (left) and u2 for the LQ Nash game with constraints on u as obtained by the SQH method.

system as shown in Figure 4.1 (right). We can see that the two sets of solutions overlap. Next, we consider the same setting but require that the players’ strategies (i) are constrained by choosing Kad = [−3, 3]×[−3, 3], i = 1, 2. With this setting, we obtain the strategies depicted in Figure 4.2. Since the control constraints are active, we do not have a Riccati-type solution to compare with. In our third experiment, we consider a setting similar to the second experiment but add to the cost functionals a weighted L1 cost of the strategies. We have (written in a more compact form) Ji (y, u1 , u2 ) =

1 2

Z

T



 αi |y(s)|2 + νi |ui (s)|2 + 2βi |ui (s)| ds,

(4.18)

0

where i = 1, 2; the terms with Di , i = 1, 2, are omitted. We choose β1 = 0.01, β2 = 0.01; the other parameters are set as in the first experiment. Further, (i) we require that the players’ strategies are constrained by choosing Kad = [−3, 3] × [−3, 3], i = 1, 2, as above. The strategies obtained with this setting are depicted in Figure 4.3. Notice that the addition of L1 costs of the players’ actions tends to promote their sparsity. In the next experiment, we consider a tracking problem where the cost functionals have the following structure Z  1 T αi |y(s) − y¯i (s)|2 + νi |ui (s)|2 + 2βi |ui (s)| ds 2 0 γi + |y(T ) − y¯i (T )|2 , (4.19) 2

Ji (y, u1 , u2 ) =

Differential Nash Games 3

0

2

-0.5

1

-1

0

-1.5

-1

-2

-2

-2.5

-3

111

-3 0

0.05

0.1

0.15

t

0.2

0.25

0

0.05

0.1

0.15

0.2

0.25

t

FIGURE 4.3: Strategies u1 (left) and u2 for the Nash game with L2 and L1 costs and constraints on u as obtained by the SQH method.

where y¯i denotes the trajectory desired by the Player Pi , i = 1, 2. Specifically, we take     1 1 y¯1 (t) = sin (2πt) , y¯2 (t) = cos (2πt) . 1 1 Notice that these trajectories are orthogonal to each other, that is, the two players have very different purposes. For the initial state, we take y0 = (1/2, 1/2). In this fourth experiment, the values of the game parameters are given by α1 = 1, α2 = 10, ν1 = 10−7 , ν2 = 10−7 , β1 = 10−7 , β2 = 10−6 , and γ1 = 1 and γ2 = 1. Further, we require that the players’ strategies are constrained (i) by choosing Kad = [−3, 3] × [−3, 3], i = 1, 2. In this experiment, we take T = 1 and N = 104 subdivision of [0, T ] for the numerical approximation. The parameters of the SQH method remain unchanged. The results of this experiment are depicted in Figure 4.4. For this concluding experiment, we report that the convergence criterion is achieved after 923 SQH iterations, whereas the number of successful updates is 367. We see that ψ is always negative and its absolute value monotonically decreases, with ψ = −8.38 × 10−8 at convergence. On the other hand, we can see that the value of  is changed along the iteration, while the values of the players’ functionals reach the Nash equilibrium.

112

The Sequential Quadratic Hamiltonian Method

3

1.5

1 2 0.5

1

0

-0.5 0 -1

-1

-1.5

-2 -2 -2.5

-3

-3 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

0

1

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0.6

0.7

0.8

0.9

1

t

t 0.7 7 0.6 6

0.5

0.4 5 0.3 4 0.2

0.1

3

0 2 -0.1

1

-0.2

-0.3 0 0 -10 -8

0 50

100

150

200

250

300

350

0.1

0.2

0.3

0.4

0.5

t

400 10 5

-10 -7 10 4 -10 -6

-10 -5

10 3

-10 -4 10 2 -10 -3

-10 -2

10 1

-10 -1 10 0 -10 0

-10 1

10 -1 0

50

100

150

200

250

300

350

400

0

100

200

300

400

500

600

700

800

900

1000

FIGURE 4.4: Strategies u1 (top, left) and u2 (top, right) for the Nash game with tracking functional as obtained by the SQH method. In the middle figures the values of J1 and J2 along the SQH iterations (left) and the evolution of y corresponding to u1 and u2 (right). In the bottom figures the values of ψ (left) and of  along the SQH iterations.

Chapter 5 Deep Learning in Residual Neural Networks

5.1 5.2 5.3 5.4 5.5 5.6

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Supervised Learning and Optimal Control . . . . . . . . . . . . . . . . . . . . . . The Discrete Maximum Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Sequential Quadratic Hamiltonian Method . . . . . . . . . . . . . . . . . Wellposedness and Convergence Results . . . . . . . . . . . . . . . . . . . . . . . . . Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

113 114 118 120 124 127

In the framework of residual neural networks, learning problems can be interpreted as optimal control problems governed by difference equations, and solutions to these control problems can be characterised by a discrete version of the Pontryagin maximum principle. This fact provides the foundation of a sequential quadratic Hamiltonian learning method, which is discussed in this chapter. Applications of this method to supervised learning problems are presented.

5.1

Introduction

The connection between supervised learning in artificial neural networks (ANNs) and optimal control of differential systems is a central topic in the mathematics of machine learning. Earlier related works in this field can be found in [176, 280], and more recently in, e.g., [128, 178], where supervised learning is discussed as a time-discrete equivalent to an optimal control problem governed by a system of ODEs, and backpropagation is studied in the Lagrange framework. In this chapter, we consider Runge-Kutta (RK) neural networks as a special class of residual neural network (ResNet) architectures [135], and illustrate a sequential quadratic Hamiltonian learning procedure based on a discrete version of the PMP. In this context, a learning procedure for neural networks is formulated as the optimisation of a loss functional with penalisation and subject to the constraint given by a nonlinear discrete-time equation, which represents the RK neural network whose weights play the role of the DOI: 10.1201/9781003152620-5

113

114

The Sequential Quadratic Hamiltonian Method

control function. In this discrete setting, a solution to this optimisation problem can be characterised by a discrete version of the PMP discussed in [131]. A similar framework is considered in [178, 179, 181], where an extended MSA (EMSA) strategy is proposed using an augmenting term for the Hamiltonian that resembles augmented Lagrangian techniques. The SQH learning method for RK neural networks has been recently proposed in [138], where it has been successfully compared to a similar method introduced in [178].

5.2

Supervised Learning and Optimal Control

In supervised learning, one considers a set of data-label pairs as follows: D(n) = {xiinput , y i }ni=1 ⊂ X × Y. This is called a data set, where X ⊂ Rd is a collection of all inputs and hence termed input space, which can be assumed finite and discrete, whereas Y ⊆ Rp denotes the label space that contains all labels and possible predictions related to the elements of the input space. In practice, the data set is split into one ¯ = D(n). set for training and another one for testing, i.e. Dtrain (S) ∪ Dtest (S) In the following, S ∈ N represents the number of samples within the training ¯ set, and S¯ ∈ N is the number of elements within the test set, with n = S + S. The goal of the learning process is to determine a map M : X → Y, modelling the data-label relation of the pairs contained in D(n), based on the ¯ are utilised for validating elements of Dtrain (S). The tuples within Dtest (S) the generalisability of the trained model. In order to approximate M, we consider the following structure h ◦ Nu ◦ P : X → Y, where Nu : RN → RN represents a neural network architecture consisting of L ∈ N trainable layers with N ∈ N nodes plus one bias each. The map P : X → RN defines a projection that maps the underlying data onto the state space RN of the neural network. Further, we have the so-called hypothesis function h : RN → Y mapping the neural network output xiL (u) := Nu (xiinput ) onto the label-space. We refer to xl = (xl,1 , . . . , xl,N ) as the state of the lth layer, and to u = (ul )L−1 l=0 as the weight sequence with ul as the corresponding weights linking the layer l to the layer l + 1, where l = 0, . . . , L − 1. We consider the learning processes where u is sought in the following admissible set (N +1)N Uad := {u = (ul )L−1 , l = 0, . . . , L − 1}, l=0 | ul ∈ U ⊂ R

(5.1)

Deep Learning in Residual Neural Networks

115

such that for any i ∈ {1, . . . , S}, the value of h(xiL (u)) is appropriately close to the underlying label y i . For this purpose, one introduces sample individual loss functions:  Φi : RN → R, x 7→ φ h(x), y i , such that their sum over all samples is minimised if a suitable approximation to M is achieved. Moreover, we add to these loss functions a penalisation term: R : Uad → R. This term has the purpose of improving the stability of the forward propagation process [128] and prevents the model from overfitting. It is assumed that the penalisation term may improve the generalisability of the underlying model; see [2] for a discussion on these aspects of machine learning. Now, we can formulate the following supervised learning problem min

S hX

u∈Uad

i Φi (xiL (u)) + R(u) .

i=1

However, for simplicity we focus on the single sample case, that is, S = 1, omit the index i = 1, and define the following (reduced) objective functional ˆ J(u) := Φ(xL (u)) + R(u).

(5.2)

Concerning the components of the RK neural network, we have an affine map consisting of a linear transformation with a weighting matrix Wl ∈ RN ×N and a shift with the bias vector bl ∈ RN . The output of this map defines the input to a given activation function as follows:     γ(x1 ) x1     f : RN → RN ,  ...  7→  ...  . xN

γ(xN )

This function is a componentwise vector function where γ is discussed below. We make the following assumption. Assumption 5.1 a) Let Φ(x) := φ(h(x), y), where φ : Y × Y → R and h : RN → Y are twice continuously differentiable and φ is bounded from below by zero. Moreover, there exists a Lipschitz constant K > 0, such that |Φ(x) − Φ(˜ x)| + k∇x Φ(x) − ∇x Φ(˜ x)k ≤ K kx − x ˜k, holds for all x, x ˜ ∈ RN . b) The set U ⊂ R(N +1)N is convex and compact.

116

The Sequential Quadratic Hamiltonian Method

c) The regularising function R : Uad → R is defined as follows: R(u) := δ

L−1 X

`(ul ),

l=0

where ` : U → R is continuous, bounded from below by zero, convex, and δ ∈ (0, ∞). d) The function γ : R → R satisfies the following properties: (a) γ is continuously differentiable and has bounded derivatives. (b) γ is monotonically nondecreasing. Notice that k · k denotes the Euclidean norm. In the case of matrices, this symbol stands for the Frobenius norm. Similar to [128], we scale the activated states by introducing a parameter δ ∈ (0, ∞), which improves the stability of the forward propagation process that, in the case of a ResNet structure, results in the following recursion rule l = 0, . . . , L − 1,

xl+1 = xl + δ F (xl , ul ),

(5.3)

with   IN ⊗ xT , 1 u = f (W x + b) .

F (x, u) := f

(5.4)

The initial state of the recursion formula (5.3) is given by x0 = P(xinput ). We remark that, in line with the definition of the admissible set (5.1), we combine the individual layers weights to a single matrix and apply a vectorisation, giving ul := vec(Wl , bl ). In Figure 5.1, we provide a sketch of (5.3)–(5.4). Notice that the propagation scheme (5.3) resembles the structure of the forward Euler method for numerically solving the following ODE problem x(t) ˙ = F (x(t), u(t)), x(0) = P(xinput ),

xl

Wl

+

δF

t ∈ [0, T ],

+

(5.5)

xl+1

bl

FIGURE 5.1: Schematic propagation process within one layer.

Deep Learning in Residual Neural Networks

117

by assuming a uniform grid with mesh size δ, and T = L δ. This fact allows to interprete the propagation of intermediate states in our NN as an approximation to the evolution of x solving the initial value problem (5.5). For this reason in, e.g., [73], one refers to the ODE in (5.5) as a neural ODE. Based on the construction of F and assuming the weights function u to be measurable, a solution to (5.5) exists and is unique in the sense of Carathéodory; see, e.g., [46]. Moreover, the connection between ResNet and neural ODEs allows to investigate the stability and wellposedness of the forward propagation process based on the stability theory for ODEs. Notice that, in addition to new ResNet architectures [71], some deep NNs can be related to numerical schemes approximating differential models; see, e.g., [141, 173, 185, 286]. We remark that RK schemes can be designed to be structure preserving approximation schemes [130], which have well studied features in optimal control problems; see, e.g. [129]. Thus, NN architectures fitting theses schemes are well-suited for the application of optimal control inspired training methods. A RK NN architecture is defined as follows: DefinitionP 5.1 (Explicit Runge-Kutta neural networks) Let δ ∈ (0, ∞), s s ∈ N, and i=1 βi = 1 with βi ≥ 0, for all i ∈ {1 . . . s}. RK neural networks are neural networks implementing a forward propagation process represented by the scheme xl+1 = xl + δ F(xl , ul ),

l = 0, . . . , L − 1,

with F : RN × U → RN ,

(x, u) 7→ F(x, u) :=

s X

βi F (χi (x, u), u)

i=1

and χi (x, u) = x + δ

s X

αi,j F (χj (x, u), u),

j=1

where αi,j = 0,

i ≤ j.

One can prove by standard techniques that the states obtained by forward propagation with an explicit RK scheme are bounded. This proof is based on the Lipschitz continuity of the activation function, and Assumption 5.1 b), Definition 5.1 and a discrete Grönwall lemma [106]. Therefore we can state that for any chosen δ ∈ (0, ∞), there exist constants K1 , K2 > 0, such that the following holds kxl k ≤ eδK1 L (kx0 k + K2 ) , SL Next, we define the set X := l=1 Xl , with

l = 0, . . . , L.

Xl := {y ∈ RN | ∃u ∈ U s.t. y = x + δF(x, u), x ∈ Xl−1 }, and assume X0 = P(X ) ⊂ RN to be bounded.

118

The Sequential Quadratic Hamiltonian Method

Further, one can prove that the function F(x, u) satisfies the following Lipschitz conditions. Lemma 5.1 Let the Assumptions 5.1 b) and d) be satisfied. Then the forward propagation within every layer of an explicit RK neural network has the following properties: i) For any δ ∈ (0, ∞), there exists a constant K3 > 0, such that for any x, x ˜ ∈ X and all u ∈ U it holds kF(x, u) − F(˜ x, u)k + k∂x F(x, u) − ∂x F(˜ x, u)k ≤ K3 kx − x ˜k. ii) For any δ ∈ (0, ∞), there exists a constant K4 > 0, such that for any u, u ˜ ∈ U and all x ∈ X it holds kF(x, u) − F(x, u ˜)k + k∂x F(x, u) − ∂x F(x, u ˜)k ≤ K4 ku − u ˜k. Proof. Notice that by Assumption 5.1 d) and the boundedness of U (see Assumption 5.1 b)) and X , we have that the functions F and ∂x F are Lipschitz continuous on X × U. Hence, the property i) can be proved by the same approach used in [181]. For ii) a similar reasoning can be applied. Now, we can formulate our supervised learning problem as the following optimal control problem governed by the explicit RK scheme. We have min J(xL , u) := Φ(xL ) + R(u), s.t. xl+1 = xl + δ F(xl , ul ), x0 = P(xinput ), u ∈ Uad .

l = 0, . . . , L − 1,

(5.6)

We remark that this problem is equivalent to ˆ min J(u) := J(xL (u), u).

u∈Uad

(5.7)

In this setting, it is possible to prove existence of solutions to (5.7) by Tonelli’s approach of minimising sequences and by relying on the continuity of Jˆ : Uad → R, its boundedness from below as well as the convexity and closedness of U.

5.3

The Discrete Maximum Principle

In the case where J and F are differentiable with respect to u, the characterisation of optimal weights u∗ can be formulated in terms of the gradient of Jˆ

Deep Learning in Residual Neural Networks

119

with respect to u, and the learning problem can be solved with gradient-based techniques; see, e.g., [2, 205]. However, in the context of deep learning and especially if a large number of layers is involved, a gradient approach suffers the so-called vanishing gradient phenomenon with consequent slow-down. Moreover, semismooth calculus would be required in the case of semidifferentiable activation (e.g., ReLU) and L1 regularisation functions; see, e.g., [272]. On the other hand, these difficulties are bypassed in the PMP framework that allows to characterise optimal controls, also in the case of discrete-time systems; see [131, 179]. The PMP holds in our case subject to the Assumptions 5.1 b)–d). We report the following theorem that represents the discrete version of the PMP [131, 179]. As in the continuous case, the HP function is given by H(x, p, u) := p · F(x, u) − `(u),

(5.8)

where p is the adjoint variable specified below. We have [138] Theorem 5.1 (Discrete PMP) Let the Assumptions 5.1 a)-d) be satisfied. Let u∗ ∈ Uad be the optimal solution to (5.6), and x∗ = (x∗l )L l=0 be the corresponding state process of the explicit RK neural network. Then there exists an adjoint process p∗ = (p∗l )L l=0 , such that the following holds 1. x∗ and p∗ satisfy the following discrete Hamilton-Pontryagin system:  x∗l+1 = x∗l + δ ∇p H x∗l , p∗l+1 , u∗l (5.9) x∗0 = P(xinput )  p∗l = p∗l+1 + δ ∇x H x∗l , p∗l+1 , u∗l (5.10) p∗L = −∇x Φ (x) |x=x∗L , with l = 0, . . . , L − 1. 2. For all l ∈ {0, . . . , L − 1} it holds H(x∗l , p∗l+1 , u∗l ) = max H(x∗l , p∗l+1 , u). u∈U

(5.11)

By construction, the optimality system within the discrete PMP has a symplectic structure; see, e.g., [181], where the first equation (5.9) corresponds to the explicit RK neural network. Thus, the adjoint equation (5.10) is given by the following RK equation

pl = pl+1 + δ

s X

bi [∂x F (χi (xl , ul ), ul )]> ρi (pl , xl , ul ),

i=1

pL = −∇x Φ (x) |x=xL ,

(5.12)

120

The Sequential Quadratic Hamiltonian Method

with ρi (pl , xl , ul ) = pl − δ

s X

ai,j [∂x F (χj (xl , ul ), ul )]> ρj (pl , xl , ul ),

(5.13)

j=1

and bi = βi ,

ai,j = βj −

βj αj,i , βi

i, j ∈ {1, . . . , s}.

For more details on this structure see [129]. This result can be verified in the framework of ResNet, where we have α1,1 = 0, β1 = 1. In this case, the discrete HP system equals a symplectic partitioned Euler scheme given by xl+1 = xl + δ F (xl , ul ), pl = pl+1 +

δ WlT [∂x F

x0 = P(xin ) (xl , ul )]> pl+1 ,

pL = −∇x Φ (x) |x=xL .

Notice that for a given u, we have xl (u) uniquely determined, and this in turn results in a unique adjoint variable that we denote with pl (u), for l = 0, . . . , L. We refer to [129] for a detailed discussion of the fact that the discrete HP system (5.9)–(5.10) is equivalent to the discretisation of a continuous-time HP system using a partitioned symplectic RK scheme [130]. Transferring this knowledge to the framework of neural networks, we can refer to (5.9)–(5.10), as the forward-backward propagation process through a RK neural network. We refer to (5.9) - (5.10) and (5.11) as the PMP optimality system. Similar to the case of forward propagation, we can derive an upper bound for the adjoint variable, which is generated by the process of backpropagation modelled by (5.10). For this purpose, we utilise the boundedness of ∂x F, see Lemma 5.1, the boundedness of ∇Φ from Assumption 5.1 a), and use a discrete Grönwall inequality [106]. In this way, we obtain that for any chosen δ ∈ (0, ∞), there exists constants K5 , K6 > 0 such that kpl k ≤ K5 eδK6 L ,

l = 0, . . . , L.

This result, together with the symplectic structure of the RK forwardbackward system, justifies the applicability of PMP-based methods for solving (5.6).

5.4

The Sequential Quadratic Hamiltonian Method

The sequential quadratic Hamiltonian method for NN learning is an iterative process where, at each iteration, a forward sweep with (5.9) followed by a backward sweep with (5.10) is performed before an update of the weight sequence through a layerwise maximisation of the augmented HP function (5.8). Also in this discrete setting, the starting point of this procedure is Rozonoer’s

Deep Learning in Residual Neural Networks

121

result in [234], which is reported below adapted to the discrete framework [138] (see Lemma 5.2 and compare with (2.6)). We have Let u, w ∈ Uad , then there exists a constant C > 0 such that it holds ˆ ˆ J(w) − J(u) = −δ

L−1 X

H (xl (u), pl+1 (u), wl ) − H (xl (u), pl+1 (u), ul )



+ R,

l=0

(5.14) PL−1 where |R| ≤ C δ l=0 kul − wl k2 . The constant C depends on L and on the Lipschitz constants of F and Φ with respect to x. In (5.14) we see that with an update w, increasing the value of H along the trajectory (xl (u), pl+1 (u)), l = 0, . . . , L − 1, we obtain a reduction of the value of the loss functional if the variation in norm of the weights, defined by 2

∆u := |||u − w||| , with 2

|||u||| := δ

L−1 X

kul k2 ,

l=0

is sufficiently small. That is, the change within the function H needs to be larger than the variation (in norm) of the weight sequence, which results in the update w. Therefore, as in other settings discussed in the previous chapters, the result (5.14) implies that robustness in the maximisation process can be achieved if the HP function H is augmented with an additional quadratic term that controls the variation ∆u. This is the approach used in the SQH method based on the following augmented HP function H (x, p, u, w) := H(x, p, u) −  ku − wk2 ,

(5.15)

where the value of  is chosen adaptively such that the following sufficient decrease condition is satisfied 2 Jˆ (w) − Jˆ (u) ≤ −η |||w − u||| ,

(5.16)

where η ∈ (0, ∞) is some predefined parameter. In the SQH method, assuming that u = u(k) represents the weight sequence obtained in the k-th iteration, the sequence w holds the weights computed by maximising H with the given  along (x(u(k) ), p(u(k) )) within the (k + 1)-th step. If this approximation to the optimal u sought does not satisfy (5.16), it is rejected and the value of  is increased by a factor σ > 1. Otherwise, if the update satisfies (5.16), then it is accepted and the value of  is decreased by a factor ζ < 1. In the next section, it is shown that this procedure can always find, in a finite number of steps, a value of  such that the corresponding w satisfies (5.16), and u(k+1) = w is set.

122

The Sequential Quadratic Hamiltonian Method

The SQH learning scheme is implemented in the following algorithm: Algorithm 5.1 (SQH method for RK NN learning) Input: initial approx. u(0) ∈ Uad , xinput ∈ X , max. number of iterations kmax , tolerance κ > 0,  > 0, σ > 1, η > 0, and ζ ∈ (0, 1); set τ > κ, k := 0. Solve the forward problem:   (0) xl+1 = xl + δ F xl , ul ,

l = 0, ..., L − 1,

with initial condition x0 = P(xinput ). while (k < kmax && τ > κ ) do 1) Solve the adjoint problem   (0) pl = pl+1 + δ ∇x H xl , pl+1 , ul ,

l = L − 1, ..., 0,

with terminal condition pL = −∇x Φ (x) |x=xL . (k+1) L−1 )l=0

2) Set u(k+1) = (ul (k+1)

ul

with   (k) = argmax H xl , pl+1 , u, ul ,

l = 0, ..., L − 1.

u∈U

3) Solve the forward problem:   (k+1) , xl+1 = xl + δ F xl , ul

l = 0, ..., L − 1,

with initial condition x0 = P(xinput ). 2

4) Compute τ := |||u(k+1) − u(k) ||| .   5) If Jˆ u(k+1) − Jˆ u(k) > −η τ , then increase  with  = σ  and go to Step 2.   Else if Jˆ u(k+1) − Jˆ u(k) ≤ −η τ , then decrease  with  = ζ  and continue. 6) Set k := k + 1. end while We know that at the core of the SQH method is the solution of the optimisation problem   (k) max H xl , pl+1 , u, ul , l = 0, ..., L − 1. (5.17) u∈U

Deep Learning in Residual Neural Networks

123

If H is differentiable with respect to u, we can solve this problem for each layer l by any gradient-based technique. In particular, the limited-memory box-constrained BFGS (L-BFGS-B) algorithm [60] is the method of choice in [178]. On the other hand, notice that by (5.14) it is not necessary to determine the exact maximum of the Hamiltonian to achieve a decrease in the objective functional value. In fact, it is sufficient to determine a weight update u satisfying     (k) (k) (k) H xl , pl+1 , u, ul ≥ H xl , pl+1 , ul , ul , (5.18) with strict inequality for at least one layer. For this reason, in [138] a simpler update strategy is proposed that has larger applicability. This strategy is closely related to the Frank-Wolfe algorithm [118]. In this approach, one first solves a subproblem in order to find an (k) ascent direction of the augmented HP function, starting at ul . To construct this subproblem, one considers the linearisation of F(xl , ·) in the augmented (k) HP function (5.15), with respect to the weights at ul giving the following function of u: (k)

(k)

(k)

(k)

2 p> l+1 [F(xl , ul ) + (∂u F(xl , ul )) · (u − ul )] − `(u) − ku − ul k . (5.19)

This function is concave and closely resembles the augmented HP function in (k) a neighborhood of ul . As a result, a subproblem is given as the maximisation of (5.19) with respect to u. For illustration, let us consider the following penalisation term υ1 kuk22 + υ2 kuk1 , 2

`(u) =

where υ1 , υ2 ≥ 0, and υ1 + υ2 > 0, and k · k1 denotes the 1-norm. Further, we choose the following set with bilateral box constraints N (N +1)

U := [ a, b ]

,

a < 0 < b.

With this setting and assuming u(k) 6= 0, we can deduce that the weight maximising (5.19) must satisfy the following relation   1 (k) (k) 2ul + p> ∂ F(x, u ) − υ sgn(u) , u= u 2 l+1 l υ1 + 2 and each component uj of this weight, is either given by (k)

u− j

= max min

or

(k)

2ul,j + p> l+1 ∂uj F(x, ul,j ) + υ2 υ1 + 2 (k)

u+ j = min max

(k)

2ul,j + p> l+1 ∂uj F(x, ul,j ) − υ2 υ1 + 2

!

!

,0 ,a

!

!

,0 ,b ,

124

The Sequential Quadratic Hamiltonian Method

for all j = 1, . . . , N (N + 1) and any l = 0, . . . , L − 1. Henceforth, we refer to the weight maximising (5.19) and being the solution to the subproblem as u± . With this weight at hand and thanks to the convexity of U and approximation properties of (5.19), the procedure to determine some wl satisfying (5.18), can be restricted to the set (k)

U ∗ := {λul

+ (1 − λ)u± | λ ∈ [0, 1]}, (k)

containing all possible convex combination of the weights ul and u± . Thus, in place of (5.17) in the SQH algorithm, the following maximisation problem is considered: (k) max∗ H (xl , pl+1 , u, ul ). (5.20) u∈U

In this way, the update of the weights of any layer that satisfy (5.18) is performed considering only the finitely many values of λ on a grid defined on the interval [0, 1]. We refer to the optimisation procedure based on (5.20) as the H-max update step.

5.5

Wellposedness and Convergence Results

In this section, theoretically results concerning the wellposedness of the SQH method for training explicit RK neural networks are presented. As already mentioned, a starting point for the analysis of the SQH method is the following result [138, 234]. Lemma 5.2 Let the Assumptions 5.1 a)–d) be satisfied. Then, for any fixed δ ∈ (0, ∞), there exists a constant C > 0, such that for any two weight sequences w, u ∈ Uad the following estimate holds ˆ ˆ J(w) − J(u) ≤−δ

L−1 X

(H(xl (u), pl+1 (u), wl ) − H(xl (u), pl+1 (u), ul ))

l=0

+Cδ

L−1 X

kwl − ul k2 .

l=0

For convenience, we define the following: ∆H(w, u) := δ

L−1 X

(H(xl (u), pl+1 (u), wl ) − H(xl (u), pl+1 (u), ul )) .

l=0

Based on Lemma 5.2, one can prove the wellposedness of the SQH procedure with respect to the sufficient decrease condition (5.16). We have

Deep Learning in Residual Neural Networks

125

Lemma 5.3 Let the Assumptions 5.1 a)–d) be satisfied and u(k) , u(k+1) be generated by Algorithm 5.1. Then there exists a constant C > 0, independent of the parameter  > 0, currently chosen in Step 2, such that the following holds ˆ (k+1) ) − J(u ˆ (k) ) ≤ −( − C) |||u(k+1) − u(k) |||2 , J(u (k+1)

Proof. Let us denote wl = ul

. We have

(k)

(k+1)

H(xl (u(k) ), pl+1 (u(k) ), ul ) ≤ H(xl (u(k) ), pl+1 (u(k) ), ul

(k+1)

)− kul

(k)

−ul k2 ,

for all l ∈ {0, ..., L − 1}. By reordering this inequality, multiplying with δ ∈ (0, ∞) and summing both sides over all layers, we obtain 2

− ∆H(u(k+1) , u(k) ) ≤ − |||u(k+1) − u(k) ||| .

(5.21)

Now, we combine this inequality with the result of Lemma 5.2 to obtain ˆ (k+1) ) − J(u ˆ (k) ) ≤ − |||u(k+1) − u(k) |||2 + C |||u(k+1) − u(k) |||2 J(u 2

= −( − C) |||u(k+1) − u(k) ||| , where C > 0 is the same constant used in Lemma 5.2 and hence independent of the current choice of . Therefore if  > C is satisfied for a scaling parameter chosen by the SQH method, a decrease in the objective functional value is achieved. Moreover, as the constant C > 0 is independent of the currently chosen , the decrease of the functional value can be achieved after successively increasing  within a finite number of steps. Thus, there is a guarantee that, for any η > 0, the SQH method is able to generate a value of  in every iteration such that ( − C) > η holds and the sufficient decrease condition (5.16) is satisfied. With this result, the following theorem about the convergence behaviour of the SQH method in the framework of explicit RK neural networks can be stated. Theorem 5.2 Let the Assumptions 5.1 a)–d) be satisfied and (u(k) )k∈N be a sequence generated by the SQH method for training an explicit RK neural ˆ (k) ))k∈N of objective network with stepsize δ ∈ (0, ∞). Then the sequence (J(u functional values is monotonically decreasing with ˆ (k+1) ) − J(u ˆ (k) ) = 0 lim J(u

k→∞

and

2

lim |||u(k+1) − u(k) ||| = 0.

k→∞

(5.22)

(5.23)

Proof. Consider an iteration k ∈ N of the SQH method. By Lemma 5.3 and the iterative increase of the augmentation parameter  within one iteration of

126

The Sequential Quadratic Hamiltonian Method

the SQH method, the sufficient decrease condition (5.16) is satisfied after a finite number of steps. Hence, we can guarantee that the inequality     2 Jˆ u(k+1) − Jˆ u(k) ≤ −η |||u(k+1) − u(k) ||| ≤ 0, (5.24) with η ∈ (0, ∞), holds for all k ∈ N. Consequently, the sequence (J(u(k) ))k∈N is monotonically decreasing. In addition, since the objective functional is bounded from below, we have that the sequence of values of J is a Cauchy sequence and (5.22) is satisfied. By summing up both sides of (5.24) until the K-th iterate and using the boundedness of the discrete cost functional from below by zero, we obtain that the following inequality holds K−1 X

2

|||u(k+1) − u(k) ||| ≤

k=0

1 ˆ (0) J(u ), η

(5.25)

for any K ∈ N. Without loss of generality, we can assume J(u(0) ) < ∞. Hence, by taking the limit K → ∞ on both sides of the inequality (5.25), we have ∞ X

2

|||u(k+1) − u(k) ||| < ∞.

k=0

As a result, we have that (5.23) holds and the theorem is proved. By the previous theorem, we have that, for any positive constant κ, there exists an iteration step K ∈ N such that the stopping condition 2

|||u(k+1) − u(k) ||| < κ, holds, for all k > K. Hence, the SQH algorithm always terminates after a finite number of iterations as long as κ > 0 is assumed. Moreover, for the wellposedness of the stopping condition, we have to show that the SQH method is able to return a discrete-PMP consistent weight sequence if this is amongst the generated iterates. Lemma 5.4 Let the Assumptions 5.1 a)–d) be satisfied and (u(k) )k∈N be a sequence generated by the SQH method for training an explicit RK neural network with stepsize δ ∈ (0, ∞). If the iterate u(k) is optimal in the sense of the discrete PMP, the algorithm stops and returns these optimal weights. Proof. Let u(k) be generated by the SQH algorithm and assume this iterate satisfies the discrete PMP. Then for this iterate it holds (k)

max H(xl (u(k) ), pl+1 (u(k) ), u) = H(xl (u(k) ), pl+1 (u(k) ), ul ), u∈U

for all l ∈ {0, . . . , L − 1}. This result implies (k)

(k)

(k)

H (xl (u(k) ), pl+1 (u(k) ), ul , ul ) ≥ H (xl (u(k) ), pl+1 (u(k) ), u, ul ),

Deep Learning in Residual Neural Networks

127

for all l ∈ {0, . . . , L − 1} and all u ∈ U. To guarantee that the algorithm stays in its determined discrete PMP optimal weight sequence, we have to exclude that there exists a weight sequence w ∈ Uad and some layer m ∈ {0, . . . , L−1}, where (k) (k) H (xm (u(k) ), pm+1 (u(k) ), u(k) ), pm+1 (u(k) ), wm , u(k) m , um ) = H (xm (u m ), (5.26) holds, but kwm − um k2 > 0. Assume that the SQH method generates such a weight sequence w, within its k-th iteration, that is, after the discrete PMP optimal sequence u(k) has been determined. Together with (5.26) and the discrete PMP optimality of u(k) , we have 2 (k) kwm − u(k) ), pm+1 (u(k) ), wm ) − H(xm (u(k) ), pm+1 (u(k) ), u(k) m k = H(xm (u m ) (k) ≤ H(xm (u(k) ), pm+1 (u(k) ), u(k) ), pm+1 (u(k) ), u(k) m ) − H(xm (u m ) = 0. (k+1)

(k)

= wl for all As a result, we must have wm = um and therefore, ul (k+1) 2 l ∈ {0, . . . , L − 1}. Hence, it holds |||w − u ||| = 0 and the SQH algorithm terminates, returning the discrete PMP optimal weights u(k) .

5.6

Numerical Experiments

In this section, we discuss RK neural networks with a SQH learning procedure in solving an univariate approximation problem and a regression problem. For both experiments, we consider the following objective functional S L−1 Xυ 2 1X ˆ J(u) = h(xiu,L ) − y i + δ kul k2 , S i=1 2 l=1

with the weight υ > 0, and the hypothesis function is given by h(x) = h1N , xi2 . Further, we choose the hyperbolic tangent as activation function, which results in   F (x, u) = tanh IN ⊗ x> , 1 u = tanh (W x + b) . (5.27) We use the RK NN architecture given by xil+1 = xil + δF (xil , ul ).

(5.28)

We apply feature scaling to the data-label pairs in the form of min-max normalisation, to achieve a data set satisfying D(n) ⊆ [−1, 1]d × [−1, 1]p .

128

The Sequential Quadratic Hamiltonian Method

Notice that we use this transformed data solely for training the RK neural network. For evaluation on the test set, we apply an inverse transformation to the predictions made with the trained RK neural network in order to regain the original range of the labels. In these experiments the dimension of the labels is p = 1 and d resembles the number of features describing the dataset. For the (0) (0) initial guess of the value of the weights, that is, u(0) := (vec(Wl , bl ))L−1 l=0 , we have the following random initialisation (0)

Wl

(0)

∼ N (0, 1) ,

bl

l = 0, . . . , L − 1.

= 0,

Now, we consider the problem of approximating the following piecewise smooth function ϕ : [0, 1] → R,

x 7→ 3 s(x − 0.313) + s(x − 0.747) + 2 cos(4πx),

and

  x0

This function is proposed in [282] to test the performance of a neural network for function approximation; see Figure 5.2. In this experiment, we choose L = 20 trainable layers, with N = 5 nodes each and δ = 0.25. Prior to the feature-scaling, we generate the data points ¯ for training, given by {xi }Si=1 and for testing, given by {¯ xi }Si=1 . Each of these sets consists of grid points uniformly distributed over [0, 1] with the convention ¯ {xi }Si=1 ∩{¯ xi }Si=1 = ∅. In order to fit the dimension of the input layer of the RK neural network, we use a mapping P : R → R5 that concatenates each data point to a five-dimensional vector of identical entries. The SQH parameters’ values are given by  = 1 (initialisation), κ = 10−10 , σ = 1.1, ζ = 0.9 and η = 10−5 . The optimisation parameter in Jˆ is set to υ = 10−4 . 7

6

5

4

3

2

1

0

-1

-2

-3 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

FIGURE 5.2: The function ϕ (continuous blue line) approximated by the ResNet architecture (5.28) with weights determined by the SQH method (stardot red line).

Deep Learning in Residual Neural Networks

129

3.5

3

2.5

2

1.5

1

0.5 0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

FIGURE 5.3: Classification plot visualising the predicted class of Iris plant. Since the HP function is differentiable with respect to the weights, the maximisation step at each layer can be performed by a BFGS scheme. However, results of numerical experiments show that the H-max approach is much more efficient in maximising the augmented HP function at each layer. In this way, the minimisation of Jˆ results significantly faster with the stopping tolerance achieved with two orders of magnitude less CPU time. In Figure 5.2, we depict the function ϕ, whose values on a mesh of S = 5000 points serve as the training set. The test is performed on S¯ = 800 points, providing the output that is also plotted in the figure. Next, we consider a classification problem based on the “Iris” data set that stems from the work by R.A. Fisher in the field of discriminant analysis [114]. The data set contains 3 classes of 50 instances each, where each class refers to a type of Iris plant. One class is linearly separable from the other two; but the latter two are not linearly separable from each other. An entry of the data set is characterised by 4 features and the corresponding class. The 4 known features are sepal length, sepal width, petal length, and petal width (all in cm), whereas the predicted attribute is the class to which the Iris plant belongs: Iris Setosa; Iris Versicolor; Iris Virginica. Our data set contains 150 samples, and we split this data into S = 120 samples for the training set and S¯ = 30 samples for the test set with 10 samples for each class. To model the relation between the attributes we use the RK architecture (5.28) with δ = 0.25 and L = 40 layers with N = 8 nodes each. Thus, we use a mapping P : R4 → R8 that concatenates the 4 input features of each data point to a eight-dimensional vector of entries that are pairwise identical. In our loss functional, we choose υ = 10−4 and, although we are considering a classification problem, we use the hypothesis function h(x) = h1N , xi2 , which is more appropriate for a regression problem. The training is performed with the SQH Algorithm 5.1, and H-max update. The values of the SQH parameters are given by:  = 1 (initialisation), κ = 10−10 , σ = 1.1, ζ = 0.9 and η = 10−5 . In Figure 5.3, we depict the training data set and the satisfactory results of classification (in a regression form) of the trained RK neural network applied to the test data set.

Chapter 6 Control of Stochastic Models

6.1 6.2 6.3 6.4 6.5 6.6

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Formulation of Ensemble Optimal Control Problems . . . . . . . . . . . The PMP Characterisation of Optimal Controls . . . . . . . . . . . . . . . . The Hamilton-Jacobi-Bellman Equation . . . . . . . . . . . . . . . . . . . . . . . . . Two SQH Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

131 133 136 140 143 147

In this chapter, the problems of determining open- and closed-loop controls for stochastic drift-diffusion processes are formulated as ensemble optimal control problems governed by the corresponding Fokker-Planck equations, which model the evolution of the probability density functions of the processes. The characterisation of these controls in the framework of the Pontryagin maximum principle is discussed, and a connection to the related dynamic programming framework is illustrated. In correspondence to the two classes of control mechanisms, two sequential quadratic Hamiltonian algorithms are developed and validated with numerical experiments.

6.1

Introduction

We consider a n-dimensional process X driven by the following stochastic differential equation (SDE) with a given initial condition [93, 207]. We have dX(t) = b(X(t), u(X(t), t)) dt + µ(X(t)) dW (t), X(t0 ) = X0 ,

(6.1)

where t ∈ (t0 , T ]. Notice that the state variable X(t) ∈ Ω ⊆ Rn is subject to deterministic infinitesimal increments driven by the vector valued drift function b, and to random increments proportional to a multi-dimensional Wiener process dW (t) ∈ Rm , with stochastically independent components, and µ ∈ Rn×m is the dispersion coefficient matrix, which we suppose to be full rank. Concerning existence and uniqueness of solutions X(t) to (6.1), for a given realisation of W (t), global Lipschitz and growth conditions on b and µ with DOI: 10.1201/9781003152620-6

131

132

The Sequential Quadratic Hamiltonian Method

respect to X are required; see, e.g., [123, 149]. We consider processes that satisfy these conditions. Notice that there are infinitely many realisations of W (t); thus, we have infinitely many trajectories X(t) in some given time interval [t0 , T ]. We refer to the set of all trajectories as the ensemble. In the drift-diffusion process above, the state configuration of the stochastic process at t0 is given by X0 , which in turn can be specified through its statistical distribution. Further, we consider a control function u in the drift and suppose that u ∈ U, where U represents the set of Markovian controls containing all jointly measurable functions u with values in a compact set Kad ⊂ Rn . As in the deterministic case, the purpose of the control could be to drive the stochastic process to follow (in a given statistical sense) a desired trajectory or attain a required terminal configuration. However, since X(t) is a random variable, its direct insertion in a deterministic cost functional results into a random objective, which requires an averaging step in order to properly define an optimisation problem [116]. Therefore, in the framework of stochastic optimal control theory [32, 116, 120], given a stochastic process X(t) subject to a control function u, a control problem is defined based on an expected value cost functional that has the following structure Z

T

J(X, u) = E[

`(t, X(t), u(X(t), t)) dt + γ(X(T ))],

(6.2)

t0

where E[·] represents the expected value with respect to the probability measure induced by the process X(t). Notice that, with this setting, the construction of the optimal control function must be based on the knowledge of all possible realisations of the stochastic trajectories. For this purpose, we remark that the state of a stochastic process can be completely characterised by the shape of its statistical distribution, which is represented by the corresponding probability density function (PDF). Further, we have a fundamental result in statistical mechanics showing that the evolution of the PDF associated to X(t) is governed by the so-called FokkerPlanck (FP) equation [117, 210] or Kolmogorov forward equation [158], which is a time-dependent partial differential equation (PDE) of parabolic type; see, e.g., [224] for a derivation of the FP equation starting from a discrete probabilistic process and for many related references. Correspondingly, the initial data required for the FP problem is given by the initial PDF distribution of X0 . The FP problem corresponding to (6.1) is as follows: ∂t f (x, t) +

n X i=1

∂xi (bi (x, u) f (x, t)) −

n X

∂x2i xj (aij (x) f (x, t)) = 0,

i,j=1

(6.3)

f (x, t0 ) = f0 (x), where f denotes the PDF of the stochastic process, f0 represents the initial

Control of Stochastic Models

133

PDF distributionR of the initial state of the process X0 , and hence we require f0 (x) ≥ 0 with Ω f0 (x) dx = 1. The diffusion coefficient aij represents the ijth element of the matrix a = µ µ> /2. We can see that in passing from trajectories to PDFs, the space of the state X has become the space of the independent variable x with the same dimension. We consider both (6.1) and (6.3) in the time interval [t0 , T ] and, without further specifications on X, we have that (6.3) defines a Cauchy problem in Rn . However, it is also possible to include the presence of barriers for the process X that possibly define a convex domain Ω ⊂ Rn where the process lives in the sense that X(t) ∈ Ω. Consequently, Ω is the domain where f is defined. In this latter case, we assume that the boundary ∂Ω is Lipschitz, and denote with Q := Ω × (t0 , T ) the space-time cylinder. Clearly, the choice of the barriers that bound the value of X(t) in Ω translate to boundary conditions for the solution of the FP equation [93, 123]. Specifically, in the case of absorbing barriers, we have homogeneous Dirichlet boundary conditions for f on ∂Ω. On the other hand, reflecting barriers correspond to flux-zero boundary conditions. In fact, notice that the PDE in (6.3) can be written in the flux form ∂tP f = ∇ · F(f ), where the ith component of the flux F is given n by Fi (f ) = j=1 ∂xj (aij (x) f (x, t)) − bi (x, u) f (x, t); thus, reflecting barriers require F(f ) · ν = 0, where ν represents the outward normal to ∂Ω. We remark that, in the absence of random increments (µ = 0), the Cauchy problem (6.1) is deterministic. In this case, we can also consider the case where the initial condition X0 is given in terms of an initial PDF distribution, thus obtaining an ensemble of trajectories whose distribution can also be described by a density function. It was L. Boltzmann [41] who demonstrated that this density function is the solution to the Liouville equation (name coined by Boltzmann) given by ∂t f (x, t) + ∇ · (b(x, u) f (x, t)) = 0,

(6.4)

with initial condition f (x, t0 ) = f0 (x).

6.2

Formulation of Ensemble Optimal Control Problems

We notice that, with the knowledge of the PDF, one can rewrite the functional (6.2) as follows: Z TZ Z J(f, u) := `(t, x, u(x, t)) f (x, t) dt dx + γ(x)f (x, T ) dx, (6.5) t0





where f represents the state variable that describes the ensemble of trajectories. With the help of this explicit structure, we can reinterpret the meaning

134

The Sequential Quadratic Hamiltonian Method

of J and thus the choice of the terms ` and γ. As in Chapter 1, we consider an objective functional J modelling the purpose and the cost of the action of the control, and we assume that ` : [t0 , T ] × Rn × Rn → R has the following composite structure `(t, x, u) = h(t, x) + g(u). (6.6) In a deterministic context, a typical choice of these functions is the following h(t, x) = |x − Xd (t)|2 ,

g(u) =

ν 2 |u| , 2

γ(x) =

α |x − XT |2 , 2

(6.7)

where | · | denotes the Euclidean norm, Xd ∈ L2 (t0 , T ; Rn ) represents a desired deterministic trajectory, and XT ∈ Rn represents a desired target configuration at final time. In our case, where the purpose is to drive the entire ensemble of trajectories, the role of h and γ in (6.5) is to define attracting potentials, that is, well-centred at desired minimum points, such that the minus gradient of the potential is directed towards this minimum. Thus, the tracking term h and the terminal cost γ have the significance of ‘valleys’, where the PDF tends to concentrate in order to reach a minimum of the functional. Notice that, in the case of unbounded domains, the choice of h and γ as convex functions may be problematic because of integrability issues. However, other potentials can be chosen, and we take the following ones:   |x − Xd (t)|2 h(t, x) = −αh exp − , (6.8) 2µ2h and

  |x − XT |2 γ(x) = −αγ exp − , 2µ2γ

(6.9)

where the parameters µh , µγ and αh , αγ are positive. The description of the functional J is completed by specifying the cost of the control function denoted with g. In particular, as in (6.7), this function may correspond to L2 -costs of the control. At this point, let us comment on the fact that ensemble optimal control problems arise in correspondence to kinetic models, whose solution represents a material or probability density. In this field, we mention the works focusing on the Liouville equation [22, 23, 57, 251], on Fokker-Planck-type equations [231, 232, 263], and on kinetic models with Keilson-Storer collision terms [24, 25]. We remark that, in the framework of ensemble optimal control problems, there is a fundamental difference in our understanding of the control function in the stochastic process depending on whether or not it is a function of the state X or it is only a function of time. In the former case, we have a so-called closed-loop control, in the sense that a sudden change of the state of the process X(t) provides instantaneously (feedback) the optimal control for the new state configuration. In this case, the functional dependence of u = u(x, t) is for both arguments unknown and must be determined. (Notice that the

Control of Stochastic Models

135

case u = u(x) also belongs to this realm.) In the latter case, we can have u = uvw (x, t) := (v(t) + w(t) ◦ x) (◦ denotes the Hadamard product), where v, w : [t0 , T ] → Rn are the functions to be determined, whereas the dependence on x is given. In this case, u represents an open-loop control function. In [58, 57] a discussion on the advantages and limitations of these two settings is presented that motivates the study of the open-loop control given above as an approximation of the closed-loop control. This discussion originates from the need to establish a trade-off between the complexity of implementing a feedback control strategy and the performance of the controlled system. Since the cost of implementing a closed-loop control mechanism is often prohibitive and may be not justified by real applications, one can attempt to strike a balance between the desired performance of the system and the cost of implementing an effective control. A way to achieve this goal is to consider an ensemble control strategy based on the density of the ensemble of trajectories modelled by the FP equation (or the Liouville equation in the deterministic case) that accommodates uncertainty in the initial conditions and on the dynamics. The ensemble approach, together with the open-loop control mechanism given above, aims at achieving robustness with controls that are easier to implement. In the following, we discuss both closed- and open-loop settings and present a comparison with numerical experiments. Next, we specify two ensemble optimal control problems in a twodimensional setting. For the open-loop case, we consider the following stochastic process dX(t) = (v(t) + w(t) ◦ X(t)) dt + µ dW (t). (6.10) In the closed-loop case, our SDE model is given by dX(t) = u (X(t), t) dt + µ dW (t),

(6.11)

In both models, where the drift coincides with the control mechanism, we assume a constant diagonal matrix dispersion with identical diagonal elements µ > 0, and dW (t) ∈ R2 , with Ω ⊂ R2 a square domain defined by absorbing barriers. We also assume that the admissible control values are bounded such that v(t), w(t) ∈ Kvw and u(x, t) ∈ Ku , where Kvw and Ku represent compact convex subsets of R2 . Correspondingly, the sets of admissible control functions are denoted with Vad and Uad , respectively. In the closed-loop case, our ensemble optimal control problem is given by Z TZ Z min J (f, u) := ` (t, x, u(x, t)) f (x, t) dxdt + γ (x) f (x, T ) dx f,u

t0





µ2 ∆f (x, t) = 0, s.t. ∂t f (x, t) + ∇ · (u(x, t) f (x, t)) − 2 f (x, t0 ) = f0 (x) u ∈ Uad . (6.12)

136

The Sequential Quadratic Hamiltonian Method

In the open-loop case, we have a similar structure. However, it is convenient to explicitly report the control functions as follows: Z Z TZ ` (t, x, v(t), w(t)) f (x, t) dxdt + γ (x) f (x, T ) dx min J (f, v, w) := f,v,w

t0





µ2 s.t. ∂t f (x, t) + ∇ · ((v(t) + w(t) ◦ x) f (x, t)) − ∆f (x, t) = 0, 2 f (x, t0 ) = f0 (x) v, w ∈ Vad . (6.13)

6.3

The PMP Characterisation of Optimal Controls

In this section, we discuss some results concerning the solution of the FP problem (6.3) and of the related ensemble optimal control problems. Further, we illustrate the characterisation of optimal controls in the framework of the Pontryagin maximum principle. We start our discussion with the open-loop setting, in which case we have the following result [37, 53, 109]. Theorem 6.1 Let f0 ∈ L∞ (Ω) ∩ H01 (Ω), and v, w ∈ Vad . Then the forward FP problem ∂t f (x, t) + ∇ · ((v(t) + w(t) ◦ x) f (x, t)) −

µ2 ∆f (x, t) = 0, 2

(6.14)

f (x, t0 ) = f0 (x),   admits a unique solution f ∈ L2 0, T ; H 2 (Ω) ∩ L∞ 0, T ; H01 (Ω) , and this solution satisfies the following: kf kL∞ (Q) ≤ C kf0 kL∞ (Ω) , where C := C Ω, µ, T, kv + x ◦ wkL∞ (Q) , k

Pn

i=1

(6.15)

 wi kL∞ (Q) > 0.

With standard variational calculus one obtains the adjoint FP problem for (6.13). It is given by ∂t p(x, t) + ((v(t) + w(t) ◦ x) · ∇p(x, t) +

µ2 ∆p(x, t) = ` (t, x, v(t), w(t)) , 2

p (x, T ) = −γ (x) , (6.16)

Control of Stochastic Models

137

Assuming that v, w ∈ Vad , and (6.8) then ` ∈ L∞ (Q). Furthermore, with the choice (6.9), we have γ ∈ L∞ (Ω) ∩ H 1 (Ω). Thus, we  can prove existence  of a unique solution to (6.16) with p ∈ L2 0, T ; H 2 (Ω) ∩ L∞ 0, T ; H01 (Ω) ; see [109]. Furthermore, we have [53]: Theorem 6.2 For the solution to (6.16), it holds  kpkL∞ (Q) ≤ C k` (·, ·, v, w) kLq (Q) + kγkL∞ (Ω) ,  for C := C Ω, µ, T, maxi=1,...,n kv i + xi wi kL∞ (Q) > 0. Notice that, in the adjoint FP problem (6.16), the solution of the forward FP problem does not appear. This is due to the linearity of our cost functional with respect to the PDF. With these preliminary results, existence of solutions to (6.13) are proved in [53]. Our focus is the characterisation of solutions to (6.13) in the PMP framework. In this case, the Hamilton-Pontryagin function is given by  H (x, t, f, v, w, ζ) := ζ · (v + x ◦ w) − ` (t, x, v, w) f, where the arguments of H correspond to the values of the depicted functions. The term ζ stands for the value of the gradient of the adjoint variable. The following lemma provides a direct relationship between the values of the objective functional at different triples (f, v, w) and the values of the corresponding HP function. The triple (f1 , v1 , w1 ) denotes the solution f1 to the FP problem with v = v1 and w = w1 , and similarly with other indices. We have [53] Lemma 6.1 Let (f1 , v1 , w1 ) and (f2 , v2 , w2 ) be solutions to the FP problem in (6.13), and let p1 be the solution to the adjoint problem (6.16) with v = v1 and w = w1 . Then it holds J (f1 , v1 , w1 ) − J (f2 , v2 , w2 ) Z TZ   =− H (x, t, f2 , v1 , w1 , ∇p1 ) − H (x, t, f2 , v2 , w2 , ∇p1 ) dxdt. (6.17) t0



Proof. In the calculation that follows, for ease of notation, we omit to write the functional dependency on x and t. We have J (f1 , v1 , w1 ) − J (f2 , v2 , w2 ) Z TZ Z = ` (v1 , w1 ) f1 dxdt + γ f1 (·, T ) dx Z

t0 T





Z

Z



` (v2 , w2 ) f2 dxdt − t0 Z T

Z 

t0



=



γ f2 (·, T ) dx Ω

 ` (v1 , w1 ) f2 + ` (v1 , w1 ) (f1 − f2 ) − ` (v2 , w2 ) f2 dxdt

Z γ (f1 − f2 ) (·, T ) dx.

+ Ω

(6.18)

138

The Sequential Quadratic Hamiltonian Method

Next, we elaborate on the following term using the setting of the adjoint problem to replace ` (v1 , w1 ). We have Z TZ ` (v1 , w1 ) (f1 − f2 ) dxdt t0

Ω T Z

µ2 ∇p1 · ∇ (f1 − f2 ) 2 t0 Ω   + (v1 + x ◦ w1 ) · ∇p1 (f1 − f2 ) dxdt Z TZ  µ2 = − p1 ∂t (f1 − f2 ) − (∇f1 − ∇f2 ) · ∇p1 (6.19) 2 t0 Ω Z  − ∇ ((v1 + x ◦ w1 ) (f1 − f2 )) p1 dxdt − γ (f1 − f2 ) (·, T ) dx Z



=

∂t p1 (f1 − f2 ) −



Z

T

Z



 =− p1 ∇ ((v2 + x ◦ w2 ) f2 ) − ∇ ((v1 + x ◦ w1 ) f2 ) dxdt t Ω Z 0 − γ (f1 − f2 ) (·, T ) dx, Ω

Combining (6.18) and (6.19), we obtain J (f1 , v1 , w1 ) − J (f2 , v2 , w2 ) Z TZ  = ` (v1 , w1 ) f2 + ∇ ((v1 + x ◦ w1 ) f2 ) p1 − ` (v2 , w2 ) f2 Ω t0  − ∇ ((v2 + x ◦ w2 ) f2 ) p1 dxdt Z TZ  = ` (v1 , w1 ) f2 − (v1 + x ◦ w1 ) · ∇p1 f2 − ` (v2 , w2 ) f2 t0 Ω  + (v2 + x ◦ w2 ) · ∇p1 f2 dxdt Z TZ   =− H (x, t, f2 , v1 , w1 , ∇p1 ) − H (x, t, f2 , v2 , w2 , ∇p1 ) dxdt. t0



The classical way of proving the PMP characterisation of a solution to the FP ensemble optimal control (6.13) is by means of needle variations. In order to define a needle variation for given v˜, w ˜ ∈ Vad at tˆ ∈ (0, T ), we consider the ball Sk tˆ centred in tˆ as follows: ( (   v˜ (t) t ∈ (t0 , T ) \Sk tˆ w ˜ (t) t ∈ (t0 , T ) \Sk tˆ   vk (t) := , wk (t) := , v t ∈ Sk tˆ ∩ (t0 , T ) w t ∈ Sk tˆ ∩ (t0 , T ) where v, w ∈ Kvw . These variations should be understood  component-wise for all components of v and w. Notice that limk→∞ |Sk tˆ | = 0. Now, we can state the following PMP characterisation.

Control of Stochastic Models  Theorem 6.3 Let f¯, v¯, w ¯ be a solution to (6.13). Then it holds Z  H x, t, f¯(x, t), v¯(t), w(t), ¯ ∇¯ p(x, t) dx Ω Z  = max H x, t, f¯(x, t), v, w, ∇¯ p(x, t) dx, v,w∈Kvw

139

(6.20)



for almost all t ∈ (t0 , T ) where p¯ is the solution to (6.16) for v = v¯ and w = w. ¯ Proof. Since vk , wk ∈ Vad for all k ∈ N, we have with Lemma 6.1 that for any k ∈ N the following holds. (Whenever possible, we omit to write the functional dependency on x and t.) 1

  J (fk , vk , wk ) − J f¯, v¯, w ¯ ˆ |Sk t | ! Z TZ    1 ¯ ¯  =− H x, t, f , vk , wk , ∇pk − H x, t, f , v¯, w, ¯ ∇pk dxdt |Sk tˆ | t0 Ω ! Z Z    1  =− H x, t, f¯, v, w, ∇¯ p − H x, t, f¯, v¯, w, ¯ ∇¯ p dxdt |Sk tˆ | Sk (tˆ) Ω Z  Z 1  (∇pk − ∇¯ p) · (v + x ◦ w) f¯ − |Sk tˆ | Sk (tˆ) Ω   + (∇¯ p − ∇pk ) · (¯ v + x ◦ w) ¯ f¯ dxdt Z  Z    1  =− H x, t, f¯, v, w, ∇¯ p − H x, t, f¯, v¯, w, ¯ ∇¯ p dxdt |Sk tˆ | Sk (tˆ) Ω Z  Z  1  (pk − p¯) ∇ (v + x ◦ w) f¯ + |Sk tˆ | Sk (tˆ) Ω   + (¯ p − pk ) ∇ (¯ v + x ◦ w) ¯ f¯ dxdt ,

0≤

(6.21) for all v, v ∈ Kvw , and (fk , vk , wk ) denotes a FP solution triple. Now, we refer to [53] for a proof of the following result. lim kpk − p¯kL∞ (Q) = 0.

k→∞

Further analysis in [53] shows that the last line in (6.21) goes to zero as k → ∞. Thus, by taking this limit on both sides of the inequality (6.21), we obtain Z    0≥ H x, t, f¯, v, w, ∇¯ p − H x, t, f¯, v¯, w, ¯ ∇¯ p dx, Ω

for all v, w ∈ Kvw and for almost all t ∈ (t0 , T ), renaming tˆ into t.

140

The Sequential Quadratic Hamiltonian Method

We remark that the integral over Ω in (6.20) results from the fact that the controls depend only on the time variable, and so the needle variation. This is in contrast to the case where the control depends on both variables (x, t), see, e.g., [51, 52, 222], in which case the needle variation is defined in Q. In the closed-loop case, the forward FP problem with homogeneous Dirichlet boundary condition is discussed in [11, 115], and also in this case, a L∞ bound for the PDF, analogous to Theorem 6.1, can be shown based on [37, Theorem 3.1]. The adjoint FP problem for this case is given by ∂t p(x, t) + u(x, t) · ∇p(x, t) +

µ2 ∆p(x, t) = ` (t, x, u(x, t)) , 2

(6.22)

p (x, T ) = −γ (x) . Notice that, also in this case, in the adjoint FP problem the PDF does not appear. Since u ∈ Uad ⊂ L∞ (Q), one can obtain a L∞ bound for the solution of the adjoint problem that is analogous to that of Theorem 6.2. The HP function corresponding to (6.12) is given by  H (x, t, f, u, ζ) := ζ · u − ` (t, x, u) f. (6.23) A solution to (6.12) has the following PMP characterisation [53]  Theorem 6.4 Let f¯, u ¯ be a solution to (6.12). Then it holds   H x, t, f¯(x, t), u ¯(x, t), ∇¯ p(x, t) = max H x, t, f¯(x, t), u, ∇¯ p(x, t) , (6.24) u∈Ku

for almost all (x, t) ∈ Q, where p¯ is the solution to (6.22) for u = u ¯. Notice that, since the HP function is continuous on the control argument, the control obtained in (6.24) as a result of an argmax-function is measurable; see [226, 14.29 Example, 14.37 Theorem].

6.4

The Hamilton-Jacobi-Bellman Equation

In the framework of closed-loop controls and ensemble optimal control problems, a connection between the adjoint Fokker-Planck equation and the optimality condition, appearing in the PMP optimality system, and the Hamilton-Jacobi-Bellman equation has been discussed in [14] and further used and analysed in, e.g., [12, 53, 64, 231, 263]. In order to illustrate this connection, we consider the following functional Z

T

`(s, X(s), u(X(s), s))ds + γ(X(T )) | X(t0 ) = x0 ]. (6.25)

Ct0 ,x0 (u) = E[ t0

Control of Stochastic Models

141

This is the conditional expectation to the process X(t) driven by (6.11) and taking the value x0 at time t0 . The optimal control u ¯ that minimises Ct0 ,x0 (u) is given by u ¯ = argminu∈U Ct0 ,x0 (u). (6.26) Now, we consider a variable point (x0 , t0 ) as (x, t) and define a value function as follows: q(x, t) := min Ct,x (u) = Ct,x (¯ u). (6.27) u∈U

A fundamental result in stochastic optimal control theory is that the function q is the solution to the so-called Hamilton-Jacobi-Bellman (HJB) equation given by  ∂t q + H(x, t, ∇q, ∇2 q) = 0, (6.28) q(x, T ) = γ(x), with the HJB Hamiltonian function i h µ2 ∆q(x, t) . H(x, t, ∇q, ∇2 q) := min `(t, x, v) + v · ∇q(x, t) + v∈Kad 2

(6.29)

Furthermore, in the case of absorbing barriers for the stochastic process, we have that the value function must be zero at the boundary of the domain Ω. Notice that the diffusion coefficient does not depend on the control, thus the second-order differential term can be put outside the parenthesis in (6.29). The HJB framework represents an essential tool in the so-called dynamic programming approach to compute closed-loop controls [29, 36]. This framework poses the challenging task to analyse existence and uniqueness of solutions to the nonlinear HJB equation; see [94] for a fundamental work in this field. However, this task is facilitated in the case of uniform parabolicity as in our case with µ > 0. By comparison, one can see that the optimisation problem (6.26) can be equivalently stated as the ensemble optimal control problem (6.12) with f0 (x) = δ(x − x0 ) (the Dirac delta), and u ∈ U. We remark that, since our FP equation is uniformly parabolic, the PDF is almost everywhere nonnegative, and we can write the PMP condition (6.24) in the following form (∇¯ p(x, t) · u ¯(x, t) − ` (t, x, u ¯(x, t))) = max (∇¯ p(x, t) · u − ` (t, x, u)) , (6.30) u∈KU

for almost all (x, t) ∈ Q. This result and the fact that the PDF does not enter in the formulation of the adjoint FP problem imply that the optimal control u is independent of f , which is consistent with the requirement that u defines a feedback law. Now, based on the discussion above, we write the PMP condition in the form corresponding to a minimality condition with the HP function given by H (x, t, u, ζ) = (u · ζ + ` (t, x, u)) ,

142

The Sequential Quadratic Hamiltonian Method

where we omit f . Further, we refer to the adjoint variable defined in the previous section but with the opposite sign. Then, the PMP results expressed by the condition H (x, t, u(x, t), ∇p(x, t)) = min H (x, t, v, ∇p(x, t)) , v∈KU

(6.31)

for almost all (x, t) ∈ Q, where p solves the following adjoint problem ∂t p(x, t) + u(x, t) · ∇p(x, t) +

µ2 ∆p(x, t) + ` (t, x, u(x, t)) = 0, 2

(6.32)

p (x, T ) = γ (x) . Moreover, the adjoint variable satisfies homogeneous Dirichlet boundary conditions. Therefore at optimality, the adjoint equation together with the PMP condition can be written as ∂t p + H(x, t, ∇p, ∇2 p) = 0, which allows to identify p with the value function q. Equation (6.32) without the term ` resembles the Kolmogorov backward equation [158]. Notice that a similar derivation of the HJB equation holds in the deterministic case (µ = 0) with the FP equation replaced by the Liouville equation (6.4). In particular, consider the following optimal control problem governed by a control-affine system Z

T

min J(y, u) :=



`(y(t)) +

t0

 α |u(t)|2 dt + γ(y(T )) 2

0

s.t. y (t) = f (y(t)) + g(y(t)) u(t),

y(t0 ) = x0 ,

(6.33)

u ∈ L2 (t0 , T ; Rm ), where y(t) ∈ Rn , and `, f , and g are continuously differentiable maps. In this case, for a fixed (t0 , x0 ), the PMP optimality condition (in the form of minimality condition) leads to the following optimal control u(t) = −g(y(t))T p(t)/α,

(6.34)

where y and p correspond to the optimal trajectory. Now, in correspondence to (6.33), we consider the associated ensemble optimal control problem governed by the Liouville equation, where the feedback control function u ˜(x, t) is sought. However, in this case we do not have uniform parabolicity to claim that f (x, t) > 0, but the fact that the control acts on an existing state configuration x at a given time t which means f (x, t) > 0. Thus, the PMP minimality condition for our Liouville control problem gives the relation u ˜(x, t) = −g(x)> ∇˜ p(x, t)/α, (6.35) where p˜ solves the following adjoint Liouville equation ∂t p˜(x, t) + [f (x) + g(x) u ˜(x, t)] · ∇˜ p(x, t) + ` (x) + with terminal condition p˜ (x, T ) = γ (x).

α |˜ u(x, t)|2 = 0, 2

Control of Stochastic Models

143

Next, we insert (6.35) in the adjoint equation and obtain the following ∂t q(x, t) + f (x) · ∇q(x, t) + ` (x) −

1 ∇q(x, t)> g(x) g(x)> ∇q(x, t) = 0, 2α

where we have replaced the adjoint variable p˜ with the value function q as discussed above. In fact, this is the HJB equation of the dynamic programming approach. By solving this equation, with the terminal condition q (x, T ) = γ (x), we can use (6.35) to obtain the feedback law u(t) = u ˜(y(t), t). Notice that, by comparison of (6.34) with (6.35), at optimality we have p(t) = ∇˜ p(y(t), t), as proved in [86]. Furthermore, in this framework, it is recognised that the optimal trajectories characterised by the PMP optimality system for (6.33) represent the characteristic curves of the HJB equation [253].

6.5

Two SQH Methods

In correspondence to the PMP characterisation of solutions to our FP ensemble optimal control problems, we formulate two SQH procedures. In the first case, with (6.20) we implement the SQH method for solving the open-loop control problem. In the second case, in view of (6.31) we formulate a variant of the SQH method for determining the optimal close-loop control. This latter variant has similarity with a well-known implementation for solving the HJB equation [281]. For the implementation, numerical approximations of the FP and adjoint FP problems are required. For this purpose, we use the second-order accurate and positive preserving Chang-Cooper (CC) scheme combined with a secondorder backward Euler (BDF2) scheme; see, e.g., [11, 53, 72, 195]. As discussed in [11, 232], the numerical adjoint of this scheme provides a second-order accurate approximation of the adjoint FP problem. In the case of a SQH method to solve our open-loop FP ensemble optimal control problem, we consider the following augmented HP function  H (x, t, f, v, v˜, w, w, ˜ ζ) := H (x, t, f, v, w, ζ) −  |v − v˜|2 + |w − w| ˜2 . The SQH method for the open-loop setting is implemented with the following algorithm. Algorithm 6.1 (SQH method for open-loop ensemble controls) Input: initial approx. v 0 , w0 , max. number of iterations kmax , tolerance κ > 0,  > 0, σ > 1, η > 0, and ζ ∈ (0, 1); set τ > κ, k := 0. Compute f 0 by solving (6.14) with v = v 0 and w = w0 . while (k < kmax && τ > κ ) do

144

The Sequential Quadratic Hamiltonian Method

1) Compute pk by solving (6.16) with v = v k and w = wk . 2) Determine v k+1 , wk+1 ∈ Vad such that the following optimisation problem is satisfied Z  H x, t, f k , v k+1 , v k , wk+1 , wk , ∇pk dx Ω Z  = max H x, t, f k , v, v k , w, wk , ∇pk dx, v,w∈Kvw



for all t ∈ [0, T ]. 3) Compute f k+1 by solving (6.14) with v = v k+1 and w = wk+1 . 4) Compute τ = kv k+1 − v k k2L2 (t0 ,T ) + kwk+1 − wk k2L2 (t0 ,T ) .   5) If J f k+1 , v k+1 , wk+1 − J f k , v k , wk > −η τ , then increase  with  = σ  and go to Step 2.   Else if J f k+1 , v k+1 , wk+1 − J f k , v k , wk ≤ −η τ , then decrease  with  = ζ  and continue. 6) Set k := k + 1. end while Notice that the maximisation in Step 2 can always be performed, see [52, Lemma 4.1]. Further, if a control is attained that is PMP optimal, then the algorithm will stop, see [52, Lemma 4.3]. On the other hand, in Step 5, if no sufficient decrease of the value of J is attained, a larger value of  can always be found in finitely many steps such that we obtain an update of the control that satisfies the decrease condition given with η. In the case of the closed-loop control setting (6.12), we consider a variant of the SQH method that consistently implements the equivalence of the FP control problem with the HJB formulation. This variant is obtained according to (6.31) by introducing the following augmented HP function to implement a minimum principle ˜  (x, t, u, u H ˜, ζ) := (ζ · u + ` (t, x, u)) +  |u − u ˜|2 .

(6.36)

Thus, the density f is not considered in the optimisation process. However, we solve the following FP problem for evaluating the objective functional. We have ∂t f (x, t) + ∇ · (u(x, t) f (x, t)) −

µ2 ∆f (x, t) = 0, 2

f (x, t0 ) = f0 (x). (6.37)

The resulting method is named SQH direct Hamiltonian (SQH-DH), and is implemented in the following algorithm:

Control of Stochastic Models

145

Algorithm 6.2 (SQH-DH method for closed-loop ensemble controls) Input: initial approx. u0 , max. number of iterations kmax , tolerance κ > 0,  > 0, σ > 1, η > 0, and ζ ∈ (0, 1); set τ > κ, k := 0. Compute f 0 by solving (6.37) with u = u0 . while (k < kmax && τ > κ ) do 1) Compute pk by solving (6.32) with u = uk . 2) Find uk+1 ∈ Uad such that   ˜  x, t, v, uk , ∇pk , ˜  x, t, u, uk ∇pk = min H H v∈Ku

for all t ∈ [t0 , T ] . 3) Compute f k+1 by solving (6.37) with u = uk+1 . 4) Compute τ = kuk+1 − uk k2L2 (Q) .   5) If J f k+1 , uk+1 − J f k , uk > −η τ , then increase  with  = σ  and go to Step 2.   Else if J f k+1 , uk+1 − J f k , uk ≤ −η τ , then decrease  with  = ζ  and continue. 6) Set k := k + 1. end while It appears that, at convergence, the solution of the adjoint problem corresponds to solving the HJB equation as discussed in [281]. However, the gradual update of the control thanks to the quadratic penalisation makes the SQH-DH approach robust and does not suffer of instabilities for large time-step sizes. We see that, in both SQH optimisation schemes, Step 2 requires the pointwise optimisation of the augmented HP function in small dimensional compact sets. As already mentioned, these problems can be solved by, e.g., derivativefree optimisation methods [90, 171]. However, one advantage of the SQH strategy is that in many cases the pointwise optimisation of the augmented HP function can be performed beforehand by analytical means. Specifically, in the open-loop case can we consider ` (t, x, v, w) = h (t, x) +

ν (|v|2 + |w|2 ), 2

146

The Sequential Quadratic Hamiltonian Method

where h is given by (6.8). Thus, we have Z Z Z  H (x, t, f, v, v˜, w, w, ˜ ∇p) dx = ∇p · (v + x ◦ w) f dx − h f dx Ω Ω Ω Z Z  ν f dx −  |v − v˜|2 + |w − w| ˜2 dx − (|v|2 + |w|2 ) 2 Ω Ω  ν ˜2 , = v · % + w · ς − B − A (|v|2 + |w|2 ) −  |Ω| |v − v˜|2 + |w − w| 2 R f (x, t) dx, B (t) := Ω Rwhere |Ω| is the measure of Ω, A (t) := h(t, x) f (x, t) dx, and Ω Z Z % (t) := ∇p (x, t) f (x, t) dx, ς (t) := x ◦ ∇p (x, t) f (x, t) dx. Ω



Notice that the integral above, for a fixed t, gives a concave quadratic function of the components of v and w. Therefore, without constraints on these controls, the maximum is identified by setting to zero the derivatives of the integral with respect to these components. We have %i + 2  |Ω| v˜i , 2  |Ω| + A ν ςi + 2  |Ω| w ˜i w ¯i = . 2  |Ω| + A ν v¯i =

However, in the presence of box constraints with Kvw = [a, b]2 , a projection is required. Thus, we obtain the following update vi (t) = min (max (a, v¯i (t)) , b) ,

wi (t) = min (max (a, w ¯i (t)) , b) .

for any t ∈ [0, T ], and i = {1, 2}. Next, we consider the ensemble closed-loop control problem (6.12). In this case, we choose to use a minimum principle and consider ` (t, x, u) = h (t, x) +

ν 2 |u| . 2

The admissible set of values of the control is given by the interval Ku = [a, b]2 . We can proceed as above in order to determine the value of u where the following augmented HP function takes a minimum.   ν ˜  (x, t, u, u H ˜, ∇p) = ∇p · u + h (t, x) + |u|2 +  |u − u ˜|2 . 2 We obtain

    2˜ ui − ∇pi ui = min max a, ,b . 2 + ν

Control of Stochastic Models

6.6

147

Numerical Experiments

In this section, we discuss results of numerical experiments for validating the ensemble optimal control framework and the ability of the resulting open and closed-loop controls to drive stochastic processes where the controls are identified with the drift. We consider a two-dimensional setting with Ω = (−2, 2) × (−2, 2), and t0 = 0, T = 2. Uniform grids are defined on the space domain and on the time interval with number of subdivision given by Nx = 40 and Nt = 80, respectively. The initial condition for the FP problem is given by the following normalised Gaussian distribution f0 (x) =

1 − |x−x20 |2 e 2r , 2πr2

(6.38)

where r = 0.3 and x0 = (−1, 0). In the FP equation, we choose a diffusion 2 coefficient D = µ2 = 10−2 . In the ensemble objective functional, we choose h (t, x) = −

10−3 − |x−xd2(t)|2 2r , e 2πr2

where Xd : [0, T ] → R2 is given by the arc   t−1 Xd (t) = . sin (π t/2)

(6.39)

(6.40)

The terminal function γ is taken as γ(x) = h(T, x). Further, we take ν = 10−4 , a = −4 and b = 4. The parameters for both SQH Algorithms are set as follows. The initial value  = 102 , and the controls are initialised by zero functions. We choose η = 10−8 , σ = 20, ζ = 0.3, kmax = 200 and κ = 10−14 . We show the ability of the resulting optimal control problems to drive the stochastic models by using them in Monte Carlo simulation thus verifying that the ensemble of trajectories follows the desired path. Furthermore, by taking different initial conditions in the simulation of the controlled SDEs, we can verify the ability of the closed-loop control to appropriately drive the system from any initial configuration. On the other hand, we also validate the performance of the open-loop controls in approximating the closed-loop mechanism. In Figure 6.1a, we plot the open-loop optimal control functions v = (v1 , v2 ) and w = (w1 , w2 ) in [0, T ], and in Figure 6.1b, we depict the minimisation history of the cost functional along the SQH iterations. In Figure 6.2a, we plot the closed-loop optimal control functions u = (u1 , u2 ) in Ω at t = 1 and t = 1.5, and in Figure 6.2b, we depict the minimisation history of the cost functional along the SQH-DH iterations. Also in this case, notice that the functional is monotonically decreasing.

148

The Sequential Quadratic Hamiltonian Method 1.5

2 1.5 1

1

0.5 0

0.5

-0.5 -1

0 0

0.5

1

1.5

2

-0.4

-0.6

-0.6

-0.8

0

0.5

1

1.5

2

0

0.5

1

1.5

2

-0.8 -1 -1 -1.2 -1.2 -1.4

-1.4 -1.6

-1.6 0

0.5

1

1.5

2

(a) The optimal controls v and w. 10 -3

0

-0.5

-1

-1.5

-2

-2.5 0

5

10

15

20

25

30

35

40

(b) Minimisation history of the cost functional with the SQH iteration.

FIGURE 6.1: Results for the open-loop problem. In order to allow a more direct comparison of the controls obtained with the two settings, in Figure 6.3a, we plot again the closed-loop optimal control function u = (u1 , u2 ) in Ω at t = 1 and t = 1.5, and compare it with (v +w ◦x) depicted in Figure 6.3b at t = 1 and t = 1.5; notice the qualitative similarities. Next, we show that the two controls perform similarly well when the initial conditions for the two stochastic models coincide with x0 , where f0 is centred. For this purpose, in Figure 6.4 we plot the evolution of E[X(t)] for the two stochastic processes in [0, T ]. For a more detailed comparison, in the same figure, we also plot a few (10) stochastic trajectories of the two models. We see that the distribution of these trajectories confirms the plots of the mean E[X(t)]. On the other hand, we notice that the closed-loop control is more effective in attaining the tracking objective.

Control of Stochastic Models

149

(a) The optimal control u in Ω at t = 1 and t = 1.5. 10 -3

0

-0.5

-1

-1.5

-2

-2.5

-3

-3.5

-4

-4.5 0

2

4

6

8

10

12

14

16

18

(b) Minimisation history of the cost functional with the SQH-DH iteration.

FIGURE 6.2: Results for the closed-loop problem. Next, we explore the ability of the resulting controls in defining a feedback law. For this purpose, we insert the computed controls in our stochastic models and choose an initial condition X0 that differs from x0 . The resulting trajectories are plotted in Figure 6.5 and compared with the previous ones obtained with X0 = x0 . We see that the closed-loop control is able to drive the SDE to follow the desired trajectory and attain the target configuration. In comparison, the open-loop control is only partially able to perform the same task.

150

The Sequential Quadratic Hamiltonian Method

(a) The optimal control u in Ω at t = 1 (top) and t = 1.5 (bottom).

(b) The optimal control v + w ◦ x in Ω at t = 1 (top) and t = 1.5 (bottom).

FIGURE 6.3: Comparison of closed-loop (left) and open-loop controls.

Control of Stochastic Models 2

2

1.5

1.5

1

1

0.5

0.5

0

0

-0.5

-0.5

-1

-1

-1.5

-1.5

-2

151

-2

-2 2

-1.5

-1

-0.5

0

0.5

1

1.5

2

1.5

-2 2

-1.5

-1

-0.5

0

0.5

1

1.5

2

-1.5

-1

-0.5

0

0.5

1

1.5

2

1.5

1

1

0.5

0.5

0

0

-0.5

-0.5

-1

-1

-1.5

-1.5

-2

-2 -2

-1.5

-1

-0.5

0

0.5

1

1.5

2

-2

FIGURE 6.4: Evolution of E[X(t)] (circles) and a few trajectories (10); the dashed line depicts the desired trajectory. Left: the closed-loop case; right: the open-loop case. 2

2

1.5

1.5

1

1

0.5

0.5

0

0

-0.5

-0.5

-1

-1

-1.5

-1.5

-2

-2

-2 2

-1.5

-1

-0.5

0

0.5

1

1.5

2

1.5

-2 2

-1.5

-1

-0.5

0

0.5

1

1.5

2

-1.5

-1

-0.5

0

0.5

1

1.5

2

1.5

1

1

0.5

0.5

0

0

-0.5

-0.5

-1

-1

-1.5

-1.5

-2

-2 -2

-1.5

-1

-0.5

0

0.5

1

1.5

2

-2

FIGURE 6.5: Trajectories of the SDE models with the closed-loop control (top) and the open-loop control (bottom). Left: trajectories starting with X0 = (−1, 0); right: trajectories starting at X0 = (1, 1).

Chapter 7 PDE Optimal Control Problems

7.1 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 7.10 7.11

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Elliptic Optimal Control Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Sequential Quadratic Hamiltonian Method . . . . . . . . . . . . . . . . . Linear Elliptic Optimal Control Problems . . . . . . . . . . . . . . . . . . . . . . . A Problem with Discontinuous Control Costs . . . . . . . . . . . . . . . . . . . Bilinear Elliptic Optimal Control Problems . . . . . . . . . . . . . . . . . . . . . Nonlinear Elliptic Optimal Control Problems . . . . . . . . . . . . . . . . . . . A Problem with State Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A Nonsmooth Problem with L1 Tracking Term . . . . . . . . . . . . . . . . . Parabolic Optimal Control Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hyperbolic Optimal Control Problems . . . . . . . . . . . . . . . . . . . . . . . . . .

153 155 160 164 167 169 171 174 177 181 185

This chapter is devoted to the formulation and analysis of the sequential quadratic Hamiltonian method for solving optimal control problems governed by partial differential equations. The focus of the chapter is on elliptic control problems with distributed linear and bilinear control mechanisms, and controland state-constraints. Furthermore, the cases of discontinuous control costs, mixed-integer control problems, and a problem with a nonsmooth tracking term are discussed. The last two sections are devoted to parabolic control problems and to distributed or boundary optimal control problems governed by a wave equation.

7.1

Introduction

Since its formulation, the Pontryagin maximum principle has been used to characterise solutions to optimal control problems governed by ODEs, while the analysis of this framework in the context of optimal control problems with PDEs as received less attention; however, see, e.g., [34, 42, 65, 67, 146, 180, 212, 221, 222, 256, 255, 258, 277]. Similarly, much less effort has been put in the construction of numerical algorithms for PDE optimal control problems that are based on the maximum principle; see, e.g., [151, 187]. On the other hand, very recently the SQH method applied to PDE control problems has been proposed [52, 51, 53]. DOI: 10.1201/9781003152620-7

153

154

The Sequential Quadratic Hamiltonian Method

In this chapter, we discuss the SQH method applied to elliptic, parabolic and hyperbolic optimal control problems with linear and bilinear control mechanisms, and nonsmooth costs of the controls. The cases with an L1 tracking term and with state constraints are also discussed. We also present for each problem some theoretical results concerning the corresponding PMP characterisation of optimality and the well posedness of the SQH algorithm. The problems considered in this chapter are listed below. The elliptic and parabolic problems are defined on a generic open and bounded set Ω ⊂ Rn with Lipschitz boundary and subject to homogeneous Dirichlet boundary conditions. In the hyperbolic case, we consider wave problems with distributed and boundary controls. We have P.1) Linear elliptic optimal control problems  R  min J (y, u) := Ω h (y(x)) + g (u (x)) dx, s.t. (∇y, ∇v) = (u, v) + (ϕ, v), v ∈ H01 (Ω), u ∈ Uad ; P.2) Bilinear ellipticoptimal control problems  R min J (y, u) := Ω h (y(x)) + g (u (x)) dx, s.t. (∇y, ∇v) + (u y, v) = (ϕ, v), v ∈ H01 (Ω), u ∈ Uad ; P.3) Nonlinear elliptic optimal controlproblems R  min J (y, u) := Ω h (y(x)) + g (u (x)) dx,  s.t. (∇y, ∇v) + y 3 , v = (u, v) + (ϕ, v), v ∈ H01 (Ω), u ∈ Uad ; P.4) Optimal controlproblems with state  constraints R := min J (y, u) h (y(x)) + g (u (x)) dx, Ω s.t. (∇y, ∇v) = (u, v) + (ϕ, v), v ∈ H01 (Ω), y ≤ ξ, u ∈ Uad . P.5) Optimal control problems with L1 -tracking term and nonsmoothgoverning model  R min J (y, u) := Ω h (y(x)) + g (u (x)) dx, s.t. (∇y, ∇v) + (max (y, 0) , v) = (u, v) + (ϕ, v), v ∈ H01 (Ω), u ∈ Uad ; P.6) Bilinear parabolic optimal control problems  R  min J (y, u) := Q h (y(x, t)) + g (u (x, t)) dxdt, s.t. (y 0 (t) , v) + (∇y(t), ∇v) + (u(t) y(t), v) = (ϕ, v), v ∈ H01 (Ω), u ∈ Uad ; P.7) Wave optimal control problems with distributed and boundary controls  R min J (y, u) := Q h (y(x, t)) + g (u (x, t)) dxdt,

PDE Optimal Control Problems

155

s.t. (y 00 (t) , v) + (∇y(t), ∇v) = (u(t), v), v ∈ H01 (Ω), u ∈ Uad ; etc. In most of these problems, the control function u is taken from the admissible set Uad = {u ∈ Lq (Ω) | u (x) ∈ Kad a.e. in Ω} , with Kad ⊆ R compact, q = 2 for n = 1, q ≥ n2 + 1 for n ≥ 2. In the parabolic and hyperbolic cases, we replace Ω with Q = Ω × [0, T ]. For more details on the weak formulation of the PDE problems listed above see, e.g., [109]. In the cost functionals above, we take h (y) :=

1 2 (y − yd ) , 2

(7.1)

except for P.5), for which we choose h (y) = |y − yd |,

(7.2)

where yd is a desired target function. Concerning the costs of the control, we consider L2 and L1 costs as follows: g (u) :=

1 2 u + |u|. 2

(7.3)

Alternatively, in some cases we also consider the following nonconvex functions ( |u| if |u| > s g (u) := , s > 0, (7.4) 0 else and g(u) = log(1 + |u(x)|).

7.2

Elliptic Optimal Control Problems

We formulate our elliptic problems in weak form [109] by introducing the bilinear form: B : H × H → R, (y, v) 7→ B (y, v), where we choose the function space H = H01 (Ω), unless otherwise specified. Then, a general weak formulation of any of the elliptic problems given above can be made as follows: Z B (y, v) = f (x, y(x), u(x)) v (x) dx, (7.5) Ω

for all v ∈ H. The solution of (7.5) for any given u ∈ Uad defines the controlto-state map u 7→ y = S(u). For the governing model of P.1), we have the bilinear form B (y, v) := (∇y, ∇v) and the right-hand side f = u + ϕ, where we suppose ϕ ∈ L∞ (Ω). This problem has a unique solution y ∈ H01 (Ω), see [1, 109]. Analogously for

156

The Sequential Quadratic Hamiltonian Method

P.2), with the same B and f := ϕ − uy, we have a unique solution. Similarly, problem P.3) with f := u − y 3 + ϕ has a unique solution y ∈ H01 (Ω), see [1, 109]. Our problem P.5) with B as above and f := u − max (y, 0) + ϕ has a unique solution y ∈ H01 (Ω), see [91]. The case P.4) is analogous to P.1). Where we additionally require that y ≤ ξ, ξ ∈ R. The weak formulation of our general elliptic optimal control problem is as follows: Z   min J (y, u) := h (y (x)) + g (u (x)) dx Z Ω (7.6) s.t. B (y, v) = f (x, y (x) , u (x)) v (x) dx, v ∈ H, Ω

u ∈ Uad For many choices of J that result in a weakly lower semicontinuous cost functional in the appropriate space, existence of solutions to (7.6) can be proved by the technique of minimising sequences; see, e.g., [3, 182, 269]. This is the case, for example, for P.1)–P.3) with (7.1) and (7.3). Now, we assume this kind of favourable structure and assume existence of a solution to (7.6), which we denote with (¯ y, u ¯). Our purpose is to discuss the PMP characterisation of this solution. Hence, we remark that B is self-adjoint and enters in the construction of the adjoint problem as follows: Z B (p, v) = (∂y h (y(x)) + ∂y f (x, y(x), u(x)) p (x)) v (x) dx, (7.7) Ω

for all v ∈ H. In this problem, p : Ω → R denotes the adjoint variable, which we assume to exist such that (7.7) holds for all v ∈ H. Specifically, we assume that p ∈ H01 (Ω) ∩ L∞ (Ω); see Appendix A.5. Next, we introduce the following Hamilton-Pontryagin function H (z, y, u, p) := p f (x, y, u) + h (y) + g (u) .

(7.8)

Notice that the definition of H given in (7.8) leads to the equivalent Pontryagin minimum principle (see also Section 6.4). Now, similar to [222], one can prove the following theorem [51]. Theorem 7.1 Let h and f be continuously differentiable, and let g be continuous and convex. Furthermore, assume that f is Lipschitz in u and ∂y f be uniformly bounded. Then any solution (¯ y, u ¯) to (7.6) fulfils H (x, y¯(x), u ¯(x), p¯(x)) = min H (x, y¯(x), w, p¯(x)) , w∈Kad

(7.9)

for almost all x ∈ Ω, where p¯ solves (7.7) with y = y¯ and u = u ¯. An important step in the proof of Theorem 7.1 is the introduction of an intermediate (or average) adjoint variable defined as the solution to the following problem (we omit the argument x whenever possible) Z  ˜ y (y1 , y2 ) + f˜y (x, y1 , y2 , u1 ) p˜ v dx, B (˜ p, v) = h (7.10) Ω

PDE Optimal Control Problems

157

for all v ∈ H. This formulation considers two admissible solution pairs (y1 , u1 ) and (y2 , u2 ), that is, both satisfying (7.5) with u1 , u2 ∈ Uad , and introduces the following functions Z 1 ˜ y (y1 , y2 ) := ∂y h(y2 + s (y1 − y2 )) ds h 0

and f˜y (x, y1 , y2 , u1 ) :=

Z

1

∂y f (x, y2 + s (y1 − y2 ), u1 ) ds. 0

With this setting, we can prove the following lemma. Lemma 7.1 Let h and f be continuously differentiable, and let g be continuous and convex. Suppose that (y1 , u1 ) and (y2 , u2 ) are two admissible solutions. Then the following equality holds Z   J(y1 , u1 ) − J(y2 , u2 ) = H (x, y2 , u1 , p˜) − H (x, y2 , u2 , p˜) dx (7.11) Ω

Proof. J (y1 , u1 ) − J (y2 , u2 ) =

Z 

 h (y1 ) + g (u1 ) − h (y2 ) − g (u2 ) dx



=

Z Z ZΩ 

1

 ∂y h(y2 + s (y1 − y2 )) (y1 − y2 ) ds + g(u1 ) − g(u2 ) dx

0

 ˜ y (y1 , y2 ) (y1 − y2 ) + g(u1 ) − g(u2 ) dx h = Ω Z Z ˜ = B(˜ p, y1 − y2 ) − fy (x, y1 , y2 , u1 ) p˜ (y1 − y2 ) dx + (g(u1 ) − g(u2 )) dx Ω Ω Z Z = f (x, y1 , u1 ) p˜ dx − f (x, y2 , u2 ) p˜ dx Ω Ω Z Z − f (x, y1 , u1 ) p˜ dx + f (x, y2 , u1 ) p˜ dx Ω ZΩ + (h(y2 ) + g(u1 ) − h(y2 ) − g(u2 )) dx, Ω

(7.12) where for the last equality we use (7.5) with v = p˜ and we have added and subtracted h(y2 ). The lemma is proved. Now, let us denote with δy = y1 − y2 the difference of two solutions to (7.5) corresponding to two different controls u1 , u2 ∈ Uad , whose difference is denoted with δu = u1 − u2 . Similarly, we denote with δp = p1 − p2 the difference of two solutions to (7.7) corresponding to the two different pairs (y1 , u1 ) and (y2 , u2 ). We make the following assumption. Assumption 7.1 In any compact subset included in the respective domains of definitions of f , h and g it holds:

158

The Sequential Quadratic Hamiltonian Method

A.1) the functions f , ∂y f , and ∂y h are measurable in x and Lipschitz continuous in y and u in the sense that there exist constants c1 , c2 , c3 > 0 such that the following holds uniformly in x:   |f (x, y1 , u1 ) − f (x, y2 , u2 )| ≤ c1 |y1 − y2 | + |u1 − u2 |   |∂y f (x, y1 , u1 ) − ∂y f (x, y2 , u2 )| ≤ c2 |y1 − y2 | + |u1 − u2 | |∂y h(y1 ) − ∂y h(y2 )| ≤ c3 |y1 − y2 |;

A.2) kδykL2 (Ω) ≤ c4 kδukL2 (Ω) , kδpkL2 (Ω) ≤ c5 kδukL2 (Ω) ; A.3) kpkL∞ (Ω) ≤ c6 for p solving (7.7) with given y and u ∈ Uad ; (For A.3) see Appendix A.5.) With these assumptions, we can prove a sufficient condition for existence of optimal controls that solve (7.6). We have Theorem 7.2 Let (¯ y, u ¯) solve (7.5) and p¯ solve the corresponding adjoint equation (7.7) with y = y¯ and u = u ¯. Let (¯ y, u ¯, p¯) fulfil 2

H (x, y¯(x), u ¯(x), p¯(x)) + r (w − u ¯(x)) ≤ H (x, y¯(x), w, p¯(x)) ,

(7.13)

for all w ∈ Kad and for almost all x ∈ Ω and r > 0 sufficiently large. Then (¯ y, u ¯) is an optimal solution to (7.6), that is, J (y, u) ≥ J (¯ y, u ¯) for all (y, u) solving (7.5) with u ∈ Uad . Proof. We have Z 

 J (y, u) − J (¯ y, u ¯) = h (y) + g (u) − h (¯ y ) − g (¯ u) dx Ω Z   = H (x, y, u, p¯) − p¯ f (x, y, u) − H (x, y¯, u ¯, p¯) + p¯ f (x, y¯, u ¯) dx ZΩ  = H (x, y, u, p¯) − H (x, y¯, u, p¯) + H (x, y¯, u, p¯) − H (x, y¯, u ¯, p¯) Ω  − p¯ f (x, y, u) + p¯ f (x, y¯, u ¯) dx Z   ≥ r (u − u ¯)2 + H (x, y, u, p¯) − H (x, y¯, u, p¯) − p¯ f (x, y, u) + p¯ f (x, y¯, u ¯) dx Ω

Z 

r (u − u ¯ )2 +

Z

1

∂y H(x, y¯ + s (y − y¯), u, p¯) (y − y¯) ds  − p¯ f (x, y, u) + p¯ f (x, y¯, u ¯) dx Z  Z 1   ∂y H(x, y¯ + s (y − y¯), u, p¯) − ∂y H(x, y¯, u, p¯) (y − y¯) ds = r (u − u ¯ )2 + 0 Ω  + ∂y H(x, y¯, u, p¯) (y − y¯) + p¯ f (x, y¯, u ¯) − p¯ f (x, y, u) dx =



0

PDE Optimal Control Problems 159 Z 1   = r (u − u ¯ )2 + ∂y H(x, y¯ + s (y − y¯), u, p¯) − ∂y H(x, y¯, u, p¯) (y − y¯) ds Ω 0   + ∂y H(x, y¯, u ¯, p¯) (y − y¯) + ∂y f (x, y¯, u) − ∂y f (x, y¯, u ¯) (y − y¯)   + p¯ f (x, y¯, u ¯) − f (x, y, u) dx Z  Z 1   = r (u − u ¯ )2 + ∂y H(x, y¯ + s (y − y¯), u, p¯) − ∂y H(x, y¯, u, p¯) (y − y¯) ds Ω 0    + ∂y f (x, y¯, u) − ∂y f (x, y¯, u ¯) (y − y¯) dx Z 

≥ r kδuk2L2 (Ω) − c¯ kδuk2L2 (Ω) , (7.14) for all u ∈ Uad , and c¯ > 0 is built with the constants c1 , c2 , c3 c4 , c5 , c6 , as explained below. By choosing r > c¯ the claim is proved. In this calculation, in the third line we have subtracted and added H (x, y¯, u, p¯). In the fourth line, we use (7.13). In the sixth line, we use the fact that Z 1 H (x, y, u, p¯) − H (x, y¯, u ¯, p¯) = ∂y H(t, y¯ + s (y − y¯), u ¯, p¯) (y − y¯) ds. 0

We proceed noticing that in ∂y H the control does only appear in ∂y f . Then, we add and subtract ∂y f (x, y¯, u ¯) (y − y¯) and combine with ∂y H(t, y¯, u, p¯) (y − y¯). Now, notice that choosing v = p¯ in (7.5), it holds Z Z B (y, p¯) = f (x, y, u) p¯ dx and B (¯ y , p¯) = f (x, y¯, u ¯) p¯ dx. Ω



Therefore taking the difference of these two equations, we obtain Z  B (δy, p¯) = − p¯ f (x, y¯, u ¯) − f (x, y, u) dx. Ω

Further, by choosing v = δy in (7.7), we have Z B (δy, p¯) = ∂y H(x, y¯, u ¯, p¯) δy dx. Ω

The last line follows by the Assumption 7.1 A.1) – A.3). If the cost functional is only lower semicontinuous, existence of optimal controls can be proven considering an admissible set that is compact in the optimisation space; but this requirement may result to be too restrictive in applications and leads to difficulties in proving the maximum principle. On the other hand, with J lower semicontinuous, it could be possible to prove existence of quasioptimal controls in the framework of Ekeland’s variational principle and the PMP [105, 132]. In order to illustrate this case, let us suppose that

160

The Sequential Quadratic Hamiltonian Method

h and g are bounded from below and, in particular, assume that gRis lower semicontinuous. Then, the map u 7→ G (u) : Uad → R with G (u) := Ω g (u (x)) dx is lower semicontinuous on Uad . This means that for any sequence (uk )k∈N with limk→∞ kuk − u ¯kL2 (Ω) = 0 we have that lim inf k→∞ G (uk ) ≥ G (¯ u). Therefore assuming a continuous control-to-state map S : Uad → L2 (Q) and a locally Lipschitz continuous function h, we also have that the reduced cost functional Jˆ (u) := J (S (u) , u) is lower semicontinuous on Uad ; see [52]. Next, consider Uad with the metric δ0 (u1 , u2 ) := | {t| u1 (z) 6= u2 (z)} |, giving the measure of the set where u1 ∈ Uad differs from u2 ∈ Uad . The proof of [105] also applies in this setting and provides that (Uad , δ0 ) is a complete metric space. Now, notice that since Jˆ is bounded from below, there exists an element u ˜ ∈ Uad within any minimising sequence such that, for any  > 0, it holds Jˆ (˜ u) ≤ inf v∈Uad Jˆ (v) + , see [6, II Theorem 4.1]. Consequently, by applying Ekeland’s variational principle, there exists a u∗ ∈ Uad , with Jˆ (u∗ ) ≤ Jˆ (˜ u), that satisfies Jˆ (u∗ ) < Jˆ (w)+ δ0 (w, u∗ ) for all w ∈ Uad \ {u∗ }. In this sense, u∗ can be named an -minimiser for our problem. This point has a natural PMP characterisation as follows. In the inequality above, take w equal to the needle variation of u∗ at any arbitrary point x ∈ Ω, denoted with uk , see [52] for details. Then one obtains that Jˆ (uk )− Jˆ (u∗ ) > −δ0 (uk , u∗ ) = − |Sk (x) |, where |Sk (x) | is the measure of the ball centred at x ∈ Ω, where ∗ uk differs  from u , and  |Sk | converges to zero for k → ∞. Thus we have 1 ∗ ˆ ˆ |Sk (x)| J (uk ) − J (u ) > −. According to [52], the limit for k → ∞ provides H (x, y ∗ (x), u∗ (x), p∗ (x)) ≤ H (x, y ∗ (x), w, p∗ (x)) + , for all w ∈ Kad and for almost all x ∈ Ω, see [6, II Theorem 2.7], where y ∗ is the solution to state equation for u∗ and p∗ is the solution to the adjoint equation with (y ∗ , u∗ ). Notice that the PMP can be proved with Ekeland’s variational principle; see, e.g., [180].

7.3

The Sequential Quadratic Hamiltonian Method

In this section, we illustrate the SQH method for solving our class of elliptic optimal control problems and discuss its convergence properties. We define the following augmented HP function 2

H (x, y, u, v, p) := H (x, y, u, p) +  (u − v) .

(7.15)

PDE Optimal Control Problems

161

The SQH algorithm is implemented as follows: Algorithm 7.1 (SQH method for elliptic optimal control problems) Input: initial approx. u0 , max. number of iterations kmax , tolerance κ > 0,  > 0, σ > 1, η > 0, and ζ ∈ (0, 1); set τ > κ, k := 0. Compute the solution y 0 to the governing problem Z  B (y, v) = f x, y(x), u0 (x) v (x) dx, v ∈ H. Ω

while (k < kmax && τ > κ ) do 1) Compute the solution pk to the adjoint problem Z    B (p, v) = ∂y h y k (x) + ∂y f x, y k (x), uk (x) p (x) v (x) dx, Ω

for all v ∈ H. 2) Determine uk+1 such that the following optimisation problem is satisfied   H x, y k (x), uk+1 (x), uk (x), pk (x) = min H x, y k (x), w, uk (x), pk (x) , w∈Kad

for almost all x ∈ Ω. 3) Compute the solution y k+1 to the forward problem Z  B (y, v) = f x, y(x), uk+1 (x) v (x) dx,

v ∈ H.



4) Compute τ := kuk+1 − uk k2L2 (Ω) .   5) If J y k+1 , uk+1 − J y k , uk > −η τ , then increase  with  = σ  and go to Step 2.   Else if J y k+1 , uk+1 − J y k , uk ≤ −η τ , then decrease  with  = ζ  and continue. 6) Set k := k + 1. end while We remark that the Lebesgue measurability of u obtained by minimising H pointwise can be guaranteed for a convex g. However, by direct inspection this property can be proved for some nonconvex g. In particular, in [52] it is shown that the control function u results measurable when choosing the discontinuous g given in (7.4).

162

The Sequential Quadratic Hamiltonian Method

Next, we discuss the minimising properties of the SQH iterates in the following theorem.   Theorem 7.3 Let y k , uk and y k+1 , uk+1 be generated by the SQH method, Algorithm 7.1, and uk+1 , uk be measurable; let the Assumptions 7.1 hold. Then, there exists a θ > 0 independent of , k, and uk such that for the  > 0 currently chosen by Algorithm 7.1, the following holds   (7.16) J y k+1 , uk+1 − J y k , uk ≤ − ( − θ) kuk+1 − uk k2L2 (Ω) .   In particular, it holds J y k+1 , uk+1 − J y k , uk ≤ −η τ for  ≥ θ + η and τ = kuk+1 − uk k2L2 (Ω) . Proof. The proof of this theorem is similar to that of Theorem 7.2. We have Z        k+1 k+1 k k k+1 k+1 k k J y

,u

− J y ,u

h y

=

−h y

+g u

−g u

dx



=

Z 

H x, y k+1 , uk+1 , pk − pk f x, y k+1 , uk+1







− H x, y k , uk , pk + pk f x, y k , uk



=

Z 



dx

H x, y k+1 , uk+1 , pk − H x, y k , uk+1 , pk







+ H x, y k , uk+1 , pk − H x, y k , uk , pk



− pk f x, y k+1 , uk+1 + pk f x, y k , uk





Z 





dx

−  (uk+1 − uk )2 + H x, y k+1 , uk+1 , pk − H x, y k , uk+1 , pk





Ω k

− p f x, y k+1 , uk+1 + pk f x, y k , uk =



Z  Z



dx

−  (uk+1 − uk )2

Ω 1

∂y H(x, y k + s (y k+1 − y k ), uk+1 , pk ) (y k+1 − y k ) ds

+ 0 k

− p f x, y k+1 , uk+1 + pk f x, y k , uk =



Z  Z



dx

−  (uk+1 − uk )2

Ω 1

+

∂y H(x, y k + s (y k+1 − y k ), uk+1 , pk )



0

− ∂y H(x, y k , uk+1 , pk ) (y k+1 − y k ) ds



+ ∂y H(x, y k , uk+1 , pk ) (y k+1 − y k ) + pk f x, y k , uk − pk f x, y k+1 , uk+1





dx

PDE Optimal Control Problems

=

Z  Z

163

−  (uk+1 − uk )2

Ω 1

∂y H(x, y k + s(y k+1 − y k ), uk+1 , pk ) − ∂y H(x, y k , uk+1 , pk ) (y k+1 − y k ) ds



+



0

+ ∂y H(x, y k , uk , pk ) (y k+1 − y k ) + ∂y f x, y k , uk+1 − ∂y f x, y k , uk







+ pk f x, y k , uk − f x, y k+1 , uk+1



=



Z  Z

(y k+1 − y k )

 

dx

−  (uk+1 − uk )2

Ω 1

+

∂y H(x, y k + s (y k+1 − y k ), uk+1 , pk )



0

− ∂y H(x, y k , uk+1 , pk ) (y k+1 − y k ) ds



+ ∂y f x, y k , uk+1 − ∂y f x, y k , uk









(y k+1 − y k ) dx

≤ − kuk+1 − uk k2L2 (Ω) + θ kuk+1 − uk k2L2 (Ω) ,

where θ > 0 depends on the constants c1 , c2 , c3 c4 , c5 , c6 that appear in the Assumptions 7.1 A.1)–A.3). This result proves (7.16). Notice that, in the fifth line of the calculation above, we have used the fact that   H x, y k , uk+1 , pk − H x, y k , uk , pk   = H x, y k , uk+1 , uk , pk − H x, y k , uk , uk , pk −  (uk+1 − uk )2 ≤ − (uk+1 − uk )2 , where the inequality results from uk+1 being the minimiser of H x, y k , u, uk , pk in Step 2 of Algorithm 7.1. By choosing  ≥ θ + η also the second claim is proved.   Further results concerning the sequences y k and uk generated by the iterated Step 2 to Step 4. of Algorithm 7.1 (with no stopping criterion) are given in the following theorem. Theorem 7.4 Let the assumptions of Theorem 7.3 be fulfilled. If in Algorithm 7.1, at every kth iterate,  = θ + η is chosen, then the following holds  a) the sequence (J y k , uk )k=0,1,2,... is monotonically decreasing and conˆ verges to some Jˆ∗ ≥ inf u∈Uad J(u); b) it holds limk→∞ kuk+1 − uk kL2 (Ω) = 0.



164

The Sequential Quadratic Hamiltonian Method

The proof of this theorem is the same as for Theorem 2.2. A consequence of this theorem is that Algorithm 7.1 is well defined for ¯ ¯ κ > 0. This means that there is k¯ ∈ N0 such that kuk+1 − uk kL2 (Ω) ≤ κ and consequently Algorithm 7.1 stops in finitely many steps; see Step 4. in Algorithm 7.1 and Theorem 7.3.

7.4

Linear Elliptic Optimal Control Problems

In this section, we discuss the application of the SQH method for solving a linear control problem of the class P.1) with a semismooth cost functional given by 1 α 2 h (y) := (y − yd ) , g (u) := u2 + β |u|. 2 2 The control problem is defined on a domain Ω = (0, 1) × (0, 1), and we use the standard 5-point finite-difference discretisation of the Laplacian on a uniform 1 grid with mesh size 4x = 100 . In this case, existence of an optimal control [269] and its PMP characterisation can be proved; see Theorem 7.1. Moreover, the Assumption 7.1 are satisfied so that the statements of Theorems 7.3 and 7.4 hold. Our elliptic optimal control problem is formulated as follows. Find y ∈ H01 (Ω) and u ∈ Uad , such that Z   α 1 2 (y (x) − yd (x)) + u2 (x) + β |u(x)| dx min J (y, u) := 2 Ω 2 (7.17) (∇y, ∇v) = (u + ϕ, v) , v ∈ H01 (Ω) u ∈ Uad , where we choose ϕ = 0. For the cost functional, we choose α = 10−5 and β = 10−3 , yd (x) := sin (2πx1 ) cos (2πx2 ) + 1. In Uad , we have Kad = [−100, 100]. The HP function for this problem is given by H (x, y, u, p) = p(u + ϕ) +

1 α 2 (y − yd ) + u2 + β |u|. 2 2

Corresponding to (7.17), we have the following adjoint problem (∇p, ∇v) = (y − yd , v) , which admits a unique solution p ∈ H01 (Ω).

v ∈ H01 (Ω) ,

(7.18)

PDE Optimal Control Problems

165

One can verify the estimates kδykL2 (Ω) ≤ c kδukL2 (Ω) , kδpkL2 (Ω) ≤ c kδukL2 (Ω) . Furthermore, we have that y ∈ L∞ (Ω), see Appendix A.5, and thus applying Theorem A.4 to the adjoint equation considering the pointwise boundedness of the control, we have kpkL∞ (Ω) ≤ c for any solution (y, u) to the elliptic equation with u ∈ Uad . In order to quantify the accuracy with which PMP optimality is satisfied, we introduce the function   4H (x) := H (x, y, u, p) − min H (x, y, w, p) , w∈Kad

where y, u, and p are the return values from the SQH method upon converl that is the percentage of the gence. Furthermore, we report the number N% grid points at which the inequality 0 ≤ ∆H ≤ 10−l , l ∈ N, is fulfilled. This is to verify the PMP optimality (7.9) up to a tolerance, at least on a subset of grid points. In order to determine the pointwise minimum of the augmented HP function in Step 2 of Algorithm 7.1, we obtain the following analytic result. If Kad := [ulo , uup ] = [−100, 100], then the pointwise minimum of H with H given by (7.18) is attained either at     2uk − β − pk , uup , u = min max 0, 2 + α or at

   2uk + β − pk u = min max ulo , ,0 . 2 + α 

In the algorithm, we choose among these two candidate values for u based on the corresponding value of H . Alternatively, the value of u where H attains its minimum can also be calculated by, e.g., a secant method. In Figure 7.1, we depict the optimal control and the corresponding state obtained with Algorithm 7.1 for the linear elliptic optimal control problem (7.17). In this case, the parameters are as follows. We initialise with u0 = 0 and  = 1. We set and for u0 = 0, κ = 10−8 , σ = 1.1, ζ = 0.9, η = 10−5 . In Figure 7.2, we plot the convergence history of the SQH method in terms of reduction of the value of the cost functional, which demonstrates that the SQH method provides a minimising sequence. In Table 7.1, we report results of numerical experiments with different values of the tolerance κ of the SQH stopping criterion. We see that more stringent values of κ result in an increasing number of iterations and a more accurate fulfilment of the PMP optimality conditions. In this table, we denote with kup the number of successful updates within the total number ktot of SQH iterations.

166

The Sequential Quadratic Hamiltonian Method

FIGURE 7.1: Results for the linear elliptic optimal control problem with L2 and L1 costs of the control: the optimal control (top) and the corresponding state. TABLE 7.1: Convergence behaviour of the SQH method with different choices of κ. κ 10−4 10−6 10−8

kup 349 472 605

ktot 648 907 1187

maxx∈Ω ∆H (x) 1.67 · 10−5 1.77 · 10−7 1.51 · 10−9

2 N% %

4 N% %

6 N% %

8 N% %

10 N% %

100 100 100

100 100 100

76.20 100 100

65.51 75.21 100

63.45 64.68 76.46

PDE Optimal Control Problems

167

0.65

0.6

0.55

0.5

0.45

0.4

0.35

0.3

0.25

0.2

0.15 10 0

10 1

10 2

10 3

FIGURE 7.2: Results for the linear elliptic optimal control problem with L2 and L1 costs of the control: the minimisation of J.

7.5

A Problem with Discontinuous Control Costs

In the field of optimisation with discontinuous functionals, much less theoretical and numerical results are available. Among the few, we mention the theoretical works [8, 156, 235] that focus on specific applications. On the other hand, the numerical solution of discontinuous problems usually relies on some smoothing or relaxation approach, which modifies the nature of the original problem, possibly introducing sufficient features to allow to prove existence of optimal controls. In this section, while we do not claim existence of optimal controls, we demonstrate that the SQH method would provide the appropriate framework to numerically investigate discontinuous optimisation problems. We consider the following linear elliptic optimal control problem with a discontinuous cost functional. As in P.1), we have Z   1 2 min J (y, u) := (y (x) − yd (x)) + g(u(x)) dx Ω 2 (7.19) (∇y, ∇v) = (u + ϕ, v) , v ∈ H01 (Ω) u ∈ Uad . In our experiments, the discontinuous cost of the control is defined as follows: ( β|u| if |u| > s α 2 g (u) := u + , 2 0 else where we take α = 10−10 and β = 10−3 , s = 10, and Kad = [−100, 100]. Notice that we have a cost of the control that is only of L2 -type if its value is below a given threshold and it measures a combination of L2 and L1 costs otherwise; with this choice the reduced cost functional Jˆ (u) := J (S (u) , u) is discontinuous in L2 (Q).

168

The Sequential Quadratic Hamiltonian Method

The target function is given by yd (x) = sin (2πx1 ) cos (2πx2 ) + 1, and we choose ϕ = 0. The HP function is given by H (x, y, u, p) = p(u + ϕ) +

1 2 (y − yd ) + g (u) . 2

Corresponding to (7.19), the adjoint problem is given by v ∈ H01 (Ω) .

(∇p, ∇v) = (y − yd , v) ,

As in the previous optimal control problem, we can verify that the Assumption 7.1 is satisfied. Although we have changed the functional, we use the same set of parameters of the SQH method as for the solution of the previous problem. This choice should demonstrate a certain degree of robustness of the SQH method. In fact, in this case the SQH method is able to compute a numerical solution to (7.19), but requires a larger number of iterations. The resulting solution is depicted in Figure 7.3. We can see the presence of the control constraints and the action of the discontinuous cost. In Figure 7.4, we plot the convergence history of the SQH method showing a monotone reduction of the value of the cost functional. In Table 7.2, we present results of SQH calculation for solving (7.19) with different values of the tolerance κ, which shows that the present discontinuous problem is more challenging for the SQH algorithm than the previous problem. Nevertheless, the fulfilment of the PMP optimality condition appears verified to high accuracy on a large percentage of grid points. We remark that a tuning of the SQH parameters would result in an improvement of the computational performance of the algorithm. For example, for the present problem, choosing σ = 10, ζ = 0.15, η = 10−7 , for κ = 10−4 , we have convergence with ktot = 322, kup = 179, and obtain maxx∈Ω ∆H (x) = 1.93 · 10−2 and 8 N% %

10 N% %

2 N% %

= 97.74,

4 N% %

= 91.31,

6 N% %

= 90.84,

= 90.61, and = 90.53. This improves greatly on the corresponding result (κ = 10−4 ) shown in Table 7.2. TABLE 7.2: The linear elliptic optimal control problem with a discontinuous cost of the control: convergence behaviour of the SQH method with different choices of κ. κ 10−4 10−6 10−8

kup 818 1225 2948

ktot 1600 2457 1458

maxx∈Ω ∆H (x) 4.02 · 10−2 4.06 · 10−2 4.06 · 10−2

2 N% %

4 N% %

6 N% %

8 N% %

10 N% %

93.07 93.23 93.25

87.41 87.59 87.61

87.20 87.51 87.57

87.16 87.46 87.52

87.16 87.46 87.52

PDE Optimal Control Problems

169

FIGURE 7.3: Results for the linear elliptic optimal control problem with a discontinuous cost of the control: the optimal control (top) and the corresponding state.

7.6

Bilinear Elliptic Optimal Control Problems

In this section, we consider a bilinear elliptic optimal control problem of the type P.2) as follows: Z   1 2 min J (y, u) := (y (x) − yd (x)) + g (u (x)) dx Ω 2 (7.20) (∇y, ∇v) + (u y, v) = (ϕ, v) , v ∈ H01 (Ω) u ∈ Uad .

170

The Sequential Quadratic Hamiltonian Method 0.65

0.6

0.55

0.5

0.45

0.4

0.35

0.3

0.25

0.2

0.15 10 0

10 1

10 2

FIGURE 7.4: Results for the linear elliptic optimal control problem with a discontinuous cost of the control: the minimisation of J. The bilinear (or control-affine) structure, due to the term u y, appears in, e.g., Helmholtz problems, where u corresponds to the square of a variable wave number. For this reason, in Uad we choose Kad ⊆ R+ 0 ; we also take ϕ = 1. With this setting the governing problem in (7.20) admits a unique solution y ∈ H01 (Ω), which is essentially bounded by a constant; see Theorem A.5 in Appendix A.5. In the cost functional of (7.20) we take the target function: yd (x) := sin (2πx1 ) cos (2πx2 ) and, as in the preceding discontinuous case, we choose ( β |u| if |u| > s α 2 g (u) := u + , 2 0 else where α = 10−10 , β = 10−5 and s = 20. The HP function for (7.20) is given by: H (x, y, u, p) =

1 2 (y − yd ) + g (u) + p ϕ − u y p. 2

(7.21)

According to (7.7), the adjoint problem is as follows: (∇p, ∇v) + (u p, v) = (y − yd , v) ,

v ∈ H01 (Ω) .

(7.22)

This problem admits a unique solution p ∈ H01 (Ω), which is essentially bounded by a constant; see Theorem A.5 in Appendix A.5. Based on the preceding results and further analysis, one can verify that the Assumption 7.1 is satisfied. Now, we apply the SQH method to solve our bilinear optimal control problem. For this purpose, we can take the values of the SQH parameters as mentioned at the end of the previous section, that is, σ = 10, ζ = 0.15, η = 10−7 , for κ = 10−8 and an initial  = 1. The results of the SQH algorithm are shown in Figure 7.6. We can see the action of the control constraints and of the discontinuous cost. In Figure 7.5, we depict the convergence history of the SQH method corresponding to

PDE Optimal Control Problems

171

0.1285

0.128

0.1275

0.127

0.1265

0.126

0.1255 10 0

10 1

10 2

10 3

FIGURE 7.5: Results for the bilinear elliptic optimal control problem with a discontinuous cost of the control: the minimisation of J. a monotone reduction of the value of the cost functional. At convergence at N2 ktot = 1126, kup = 622, we obtain maxx∈Ω ∆H (x) = 1.28 10−4 and %% = 100, 4 N% %

= 99.83,

7.7

6 N% %

= 98.16,

8 N% %

= 98.16, and

10 N% %

= 99.14.

Nonlinear Elliptic Optimal Control Problems

In this section, we discuss the following optimal control problem with a nonlinear elliptic PDE constraint Z   1 2 min J (y, u) := (y (x) − yd (x)) + g (u (x)) dx Ω 2  (7.23) (∇y, ∇v) + y 3 , v = (u + ϕ, v) , v ∈ H01 (Ω) u ∈ Uad , where we take the target function yd (x) := sin (2πx1 ) cos (2πx2 ), ϕ = 0, and the cost of the control is given by ( β |u| if |u| > s α 2 g (u) = u + . 2 0 else We make the choice α = 10−10 , β = 10−3 , s = 10, and Kad = [−100, 100]. The corresponding HP function is given by H (x, y, u, p) =

 1 2 (y − yd ) + g (u) + p u − y 3 + pϕ. 2

Further, the adjoint problem is formulated as follows:  (∇p, ∇v) + 3y 2 p, v = (y − yd , v) , v ∈ H01 (Ω) . Also in this setting one can prove existence of a unique solution p ∈ H01 (Ω).

172

The Sequential Quadratic Hamiltonian Method

FIGURE 7.6: Results for the bilinear elliptic optimal control problem with a discontinuous cost of the control: the optimal control (top) and the corresponding state. Notice that our nonlinear elliptic operator is monotone and the validity of Assumption 7.1 for the present control problem can be verified. We have kδykL2 (Ω) ≤ c kδukL2 (Ω) and kδpkL2 (Ω) ≤ c kδukL2 (Ω) analogously to the linear case in Section 7.5. Moreover, because of the favourable structure of the nonlinearity, one can prove that y ∈ L∞ (Ω), following the same line of reasoning given in Appendix A.5, and thus applying Theorem A.4 to the adjoint problem considering the pointwise boundedness of the control, we have kpkL∞ (Ω) ≤ c for any solution (y, u) to the state equation with u ∈ Uad . Since y is bounded by a constant independent of u ∈ Uad , we can verify the Lipschitz continuity of f (y, u) = u − y 3 .

The parameters of the SQH Algorithm 7.1 are chosen as follows: $\sigma = 10$, $\zeta = 0.15$, $\eta = 10^{-7}$, $\kappa = 10^{-6}$, and an initial $\epsilon = 1$. The nonlinear elliptic problem is solved by a Picard iteration until the $L^2$-norm of its residuum is less than $10^{-6}$; for this reason, we also take $\kappa$ with the same value. The results obtained with the SQH algorithm are shown in Figure 7.7. We can see the presence of the control constraints and the action of the discontinuous cost. In Figure 7.8, we plot the convergence history of the SQH method showing a monotone reduction of the value of the cost functional. At convergence, with $k_{tot} = 424$, $k_{up} = 235$, we have $\max_{x\in\Omega}\Delta H(x) = 1.02 \cdot 10^{-2}$ and $N_\%^{2} = 99.81$, $N_\%^{4} = 89.18$, $N_\%^{6} = 86.54$, $N_\%^{8} = 86.21$, and $N_\%^{10} = 86.13$.
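The Picard (fixed-point) solve mentioned above can be sketched as follows: the cubic nonlinearity is lagged so that each step is a linear elliptic solve. This is a minimal sketch; the mesh size, the 5-point Laplacian, and the lagging strategy ($y_{old}^2\,y$) are illustrative choices, not taken from the book's codes.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_semilinear(u, phi, n=64, tol=1e-6, max_iter=200):
    """Picard iteration for -Laplace(y) + y^3 = u + phi on the unit square,
    homogeneous Dirichlet conditions, 5-point finite differences.
    Each step solves the linearised problem -Laplace(y) + y_old^2 * y = u + phi."""
    h = 1.0 / (n + 1)
    I = sp.identity(n)
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
    A = (sp.kron(I, T) + sp.kron(T, I)) / h**2      # 2D Laplacian (interior nodes)
    rhs = (u + phi).ravel()
    y = np.zeros(n * n)
    for _ in range(max_iter):
        Ak = A + sp.diags(y**2)                      # lag the nonlinearity
        y = spla.spsolve(Ak.tocsc(), rhs)
        # discrete L2 norm of the nonlinear residuum
        if h * np.linalg.norm(A @ y + y**3 - rhs) < tol:
            break
    return y.reshape(n, n)
```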

FIGURE 7.7: Results for the nonlinear elliptic optimal control problem with a discontinuous cost of the control: the optimal control (top) and the corresponding state.


FIGURE 7.8: Results for the nonlinear elliptic optimal control problem with a discontinuous cost of the control: the minimisation of J.

7.8 A Problem with State Constraints

This section is devoted to solving the following elliptic optimal control problem with state constraints

\[ \begin{aligned} &\min J(y,u) := \int_\Omega \Big( h\big(y(x)\big) + g\big(u(x)\big) \Big)\,dx \\ &\ \text{s.t. } (\nabla y, \nabla v) = (u + \varphi, v), \quad v \in H_0^1(\Omega), \\ &\ y \le \xi, \quad u \in U_{ad}. \end{aligned} \qquad (7.24) \]

In this problem, we take $h(y) := \tfrac{1}{2}(y - y_d)^2$, and the cost of the control is given by

\[ g(u) := \frac{\alpha}{2}u^2 + \beta |u|, \]

where $\alpha = 10^{-4}$ and $\beta = 10^{-3}$. For the admissible set of the controls, we choose $K_{ad} := [-50, 50]$. The constraint on the state is given by $\xi = 0.3$, and we choose $y_d(x) := \sin(2\pi x_1)\cos(2\pi x_2)$ and $\varphi = 0$.

We assume that (7.24) admits a unique solution, denoted with $(\bar y, \bar u)$. In this case, PMP optimality involves multipliers that are implicitly characterised by inequalities; see, e.g., [65, 67, 221]. However, we can consider a simpler approach, based on the idea of the augmented Lagrangian as discussed in [150]. In this approach, the optimal control problem (7.24) is approximated with the following

\[ \begin{aligned} &\min J(y,u;\xi,\gamma) := \int_\Omega \Big( h_\xi\big(y(x);\gamma\big) + g\big(u(x)\big) \Big)\,dx \\ &\ \text{s.t. } (\nabla y, \nabla v) = (u, v), \quad v \in H_0^1(\Omega), \quad u \in U_{ad}, \end{aligned} \qquad (7.25) \]

where $h_\xi(y;\gamma) := h(y) + \gamma\,\big(\max(0, y - \xi)\big)^3$, $\gamma \ge 0$.

We assume that (7.25) admits a solution for any $\gamma \ge 0$. Thus, for fixed $\gamma$, we can proceed and apply Theorem 7.1, which states that this solution is characterised by the PMP. Next, we discuss the connection between the two problems (7.24) and (7.25). For this purpose, we consider a monotone increasing sequence $(\gamma_k)$ and denote with $(y_k, u_k)$ the solution to (7.25) for $\gamma = \gamma_k$; notice that $y_k$ does not necessarily satisfy the state constraint. Further, let us denote with $(\bar y, \bar u)$ the solution to (7.24). We have the following theorem.

Theorem 7.5 Let $\lim_{k\to\infty}\gamma_k = \infty$, let $(y_k, u_k)$ be the solution to (7.25) for $\gamma = \gamma_k$, with the corresponding set $M_k := \{x \in \Omega : y_k(x) > \xi\}$; further, let $(\bar y, \bar u)$ be the solution to (7.24). Then it holds that $\lim_{k\to\infty}\int_{M_k} (y_k(x) - \xi)^3\,dx = 0$. Moreover, we have $J(y_k, u_k) \le J(\bar y, \bar u)$ for all $k \in \mathbb{N}$.

Proof. The set $M_k$ is measurable as $y_k$ is measurable, see [7, Theorem X.1.9]; thus, integration over $M_k$ is well defined. We have

\[ J(\bar y, \bar u) = \int_\Omega \Big( h(\bar y(x)) + g(\bar u(x)) + \gamma\,\big(\max(0, \bar y(x) - \xi)\big)^3 \Big)\,dx = J(\bar y, \bar u; \xi, \gamma), \]

as $\bar y \le \xi$, and thus for an optimal solution $(y_k, u_k)$ to (7.25) it holds that $J(y_k, u_k; \xi, \gamma) \le J(\bar y, \bar u)$, that is,

\[ \int_\Omega \Big( h(y_k(x)) + g(u_k(x)) + \gamma_k\,\big(\max(0, y_k(x) - \xi)\big)^3 \Big)\,dx \le J(\bar y, \bar u). \qquad (7.26) \]

Now, if we assume that there is an $\varepsilon > 0$ such that $\int_\Omega \big(\max(0, y_k(x) - \xi)\big)^3 dx = \int_{M_k} (y_k(x) - \xi)^3\,dx > \varepsilon$ for all $k \in \mathbb{N}$, then we have a contradiction to (7.26) due to the lower boundedness of $h$ and $g$. Also from (7.26) we have that

\[ J(\bar y, \bar u) \ge \int_\Omega \Big( h(y_k(x)) + g(u_k(x)) + \gamma_k\,\big(\max(0, y_k(x) - \xi)\big)^3 \Big)\,dx \ge \int_\Omega \Big( h(y_k(x)) + g(u_k(x)) \Big)\,dx. \]

This theorem states that increasing $\gamma_k$ improves the solution to (7.25) with respect to the original task of solving the state-constrained optimal control problem (7.24), and the measure of the violation of the state constraint by the corresponding solution goes to zero for increasing $\gamma_k$. Moreover, in the case $\alpha > 0$ and $\beta = 0$, it can be proven that for increasing $\gamma_k$ the sequence $(y_k, u_k)$ converges to $(\bar y, \bar u)$; see [150, Lemma 3.6].

Next, we present results obtained with the SQH algorithm solving (7.25) for a fixed value of $\gamma$. However, independently of the value of $\gamma$, we choose the following values of the SQH parameters: $\sigma = 1.1$, $\zeta = 0.9$, $\eta = 10^{-5}$, $\kappa = 10^{-8}$, and an initial $\epsilon = 1$. Clearly, we can make the calculations for the different $\gamma$ sequentially, by implementing an outer loop that increments $\gamma$ and an inner loop that solves the corresponding optimisation problem; a sketch of this continuation strategy is given below.

TABLE 7.3: Results that numerically validate Theorem 7.5.

  gamma    | max_{x in Omega} y(x) | |M_k|  | J(y,u;xi,gamma)
  1        | 0.5137                | 0.072  | 9.44e-2
  10       | 0.4379                | 0.056  | 9.51e-2
  100      | 0.3654                | 0.036  | 9.58e-2
  1000     | 0.3267                | 0.023  | 9.63e-2
  10000    | 0.3098                | 0.014  | 9.65e-2
  100000   | 0.3033                | 0.010  | 9.66e-2

In Table 7.3, we show results validating Theorem 7.5. We can see that, for increasing $\gamma$, the maximum of the state variable $y$ converges to the desired upper bound, whereas the measure of the set $M_k$, where the state variable violates the upper bound, becomes smaller. In Figure 7.9, we plot the convergence history of the SQH method showing a monotone reduction of the value of the cost functional. The control and state functions obtained with the SQH algorithm are shown in Figure 7.10. Concerning the validation of PMP optimality, we have the following result. At convergence, for the case $\gamma = 100000$, with $k_{tot} = 412$, $k_{up} = 219$, we have $\max_{x\in\Omega}\Delta H(x) = 1.15 \cdot 10^{-5}$ and $N_\%^{2} = 100$, $N_\%^{4} = 100$, $N_\%^{6} = 87.70$, $N_\%^{8} = 42.28$, and $N_\%^{10} = 37.92$.
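The outer continuation loop over the penalty weight can be sketched as follows. This is a minimal sketch under assumptions: `solve_sqh(u_init, gamma)` is a hypothetical interface for the inner SQH loop at fixed $\gamma$, and the warm-starting of the control between the values of $\gamma$ is an implementation choice.

```python
import numpy as np

def sqh_continuation(solve_sqh, u0, gammas=(1, 10, 100, 1000, 10000, 100000)):
    """Outer loop over the penalty weight gamma: each inner call solves the
    penalised problem (7.25) by the SQH method, warm-started with the
    control computed for the previous gamma."""
    u = u0
    history = []
    for gamma in gammas:
        y, u, J = solve_sqh(u, gamma)              # inner SQH loop, fixed gamma
        violation = np.maximum(0.0, y - 0.3)       # xi = 0.3 as in the text
        history.append((gamma, y.max(), (violation > 0).mean(), J))
    return u, history
```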


FIGURE 7.9: Results for the elliptic optimal control problem with state constraints (γ = 100000): the minimisation of J.


FIGURE 7.10: Results for the elliptic optimal control problem with state constraints (γ = 100000): the optimal control (top) and the corresponding state.

7.9 A Nonsmooth Problem with L1 Tracking Term

Consider the following nonsmooth optimal control problem

\[ \begin{aligned} &\min J(y,u) := \int_\Omega \Big( |y(x) - y_d(x)| + \beta \log\big(1 + |u(x)|\big) \Big)\,dx \\ &\ \text{s.t. } (\nabla y, \nabla v) + (\max(0, y), v) = (u + \varphi, v), \quad v \in H_0^1(\Omega), \quad u \in U_{ad}. \end{aligned} \qquad (7.27) \]

We have that $h(y) = |y - y_d|$ and $f(y, u) = u + \varphi - \max(0, y)$ are not differentiable with respect to $y$, and the cost of the control is nonconvex. In this case, a PMP characterisation of a solution to (7.27) with the technique in [222] is not possible. On the other hand, similar nondifferentiable structures have been considered in different contexts; see, e.g., [75, 76, 87, 233, 257] (see, in particular, [75, 257] for a regularisation approach based on duality). Therefore it is reasonable to consider the implementation and application of the SQH Algorithm 7.1 to solve (7.27). Furthermore, this problem provides the appropriate motivation for illustrating the so-called abs-linearisation technique [276] in combination with the SQH method, as proposed in [50].

In the present nonsmooth case, the HP function is given by

\[ H(x, y, u, p) = |y - y_d| + \beta \log(1 + |u|) + p\,(u + \varphi) - p\,\max(0, y). \]

Thus, we cannot straightforwardly use the definition of the adjoint problem given in (7.7). On the other hand, for a function $\psi(y)$ which is Lipschitz continuous with respect to $y$, we have the existence of a function $\psi'$ such that

\[ \psi(y_1) - \psi(y_2) = \int_0^1 \psi'\big(y_2 + \eta\,(y_1 - y_2)\big)\,(y_1 - y_2)\,d\eta; \qquad (7.28) \]

see, e.g., [9, Theorem 7.3]. Therefore, in correspondence of $\partial_y h$ we introduce

\[ h_1(y) := \begin{cases} 1 & \text{if } y \ge y_d \\ -1 & \text{else,} \end{cases} \]

and related to $\partial_y f$ we have

\[ f_1(y) := \begin{cases} -1 & \text{if } y \ge 0 \\ 0 & \text{else.} \end{cases} \]

Thus, corresponding to (7.7), we define our adjoint problem as follows:

\[ B(p, v) = \int_\Omega \big( h_1(y(x)) + f_1(y(x))\,p(x) \big)\,v(x)\,dx. \qquad (7.29) \]

Notice that $h_1$ and $f_1$ are bounded and measurable [88] and therefore elements of $L^\infty(\Omega)$. Thus (7.29) is uniquely solvable, and the solution is bounded, $\|p\|_{L^\infty(\Omega)} \le c$, $c > 0$, as $f_1 \le 0$; see Theorem A.4 in Appendix A.

We solve (7.27) with $y_d(x) = \sin(2\pi x_1)\sin(2\pi x_2)$, $K_{ad} = [-40, 40]$, $\beta = 10^{-5}$ and $\varphi = 0$. The nonlinear elliptic problem is solved by a Picard iteration until the $L^2$-norm of its residuum is less than $10^{-6}$. The parameters of the SQH Algorithm 7.1 are chosen as follows: $\sigma = 10$, $\zeta = 0.15$, $\eta = 10^{-7}$, $\kappa = 10^{-6}$, and an initial $\epsilon = 1$.

In Figure 7.11, we present the convergence history of the SQH method showing a monotone reduction of the value of the cost functional, and in Figure 7.12 we depict the control and state functions obtained with the SQH algorithm. We can see the presence of the control constraints. At convergence, with $k_{tot} = 557$, $k_{up} = 306$, we have $\max_{x\in\Omega}\Delta H(x) = 5.35 \cdot 10^{-2}$ and $N_\%^{2} = 83.88$, $N_\%^{4} = 73.62$, $N_\%^{6} = 73.54$, $N_\%^{8} = 73.54$, and $N_\%^{10} = 73.54$.

FIGURE 7.11: Results for the nonlinear elliptic optimal control problem with L1 tracking and nonconvex cost of the control: the minimisation of J.

It appears that the SQH method performs well on the nonsmooth problem above, but we recognise the main difficulty in the analysis of (7.27): the nondifferentiability of the cost functional with respect to the state variable, which complicates the derivation of the adjoint equation. A way to circumvent this problem and solve (7.27) with the SQH method is discussed in [50]. The proposed approach is based on the so-called abs-linearisation technique [126, 276], where additional control functions are introduced such that the nonsmooth Lipschitz terms appearing in (7.27) are replaced by bilinear structures involving the state and new control variables. Specifically, in the cost functional, the tracking term could be replaced as follows:

\[ \int_\Omega |y(x) - y_d(x)|\,dx \;\to\; \int_\Omega \sigma(x)\,\big(y(x) - y_d(x)\big)\,dx, \]

where the auxiliary control function $\sigma \in \Sigma_{ad} := \{\sigma \in L^2(\Omega) : \sigma(x) \in [-1, 1] \text{ a.e.}\}$ should be such that it approximates sufficiently well the function $\mathrm{sign}(y - y_d)$. Similarly, one could introduce another auxiliary control function $\mu \in M_{ad} := \{\mu \in L^2(\Omega) : \mu(x) \in [0, 1] \text{ a.e.}\}$, and rewrite the governing model as follows:

\[ (\nabla y, \nabla v) + (\mu\,y, v) = (u + \varphi, v), \quad v \in H_0^1(\Omega), \]

where $\mu$ should be such that $\mu(x)\,y(x) = \max(0, y(x))$. However, an iterative optimisation procedure involving abs-linearisation can have this equality satisfied only asymptotically, while its approximate validity is achieved by

augmenting the cost functional in (7.27) as follows:

\[ J(y, u, \mu) = \int_\Omega |y(x) - y_d(x)|\,dx + \beta \int_\Omega \log\big(1 + |u(x)|\big)\,dx + \gamma \int_\Omega \big| \max(0, y(x)) - \mu(x)\,y(x) \big|^3\,dx. \]

Notice the similarity of this approach with the one for state constraints discussed in Section 7.8. The reason for considering a cubic exponent in the third term of $J$ above is to guarantee twice continuous differentiability [50]. Also the role of $\gamma > 0$ is similar, and one can adapt the proof of Theorem 7.5 in order to prove that, as $\gamma = \gamma_k \to \infty$, then $\lim_{k\to\infty} \int_{M_k} \big| \max(0, y_k(x)) - \mu_k(x)\,y_k(x) \big|^3\,dx = 0$, where $M_k := \{x \in \Omega : \max(0, y_k(x)) - \mu_k(x)\,y_k(x) \ne 0\}$; see [50] for more details.

FIGURE 7.12: Results for the nonlinear elliptic optimal control problem with L1 tracking and nonconvex cost of the control: the optimal control (top) and the corresponding state.

7.10 Parabolic Optimal Control Problems

In this section, we discuss parabolic optimal control problems and their solution by the SQH method. In principle, any of the optimal control problems with elliptic models discussed in the previous sections can be reformulated as a corresponding parabolic control problem with a fixed endtime $T$. Therefore, for brevity of illustration of the SQH ability to solve the latter problems, we consider only two settings.

The weak formulation of a parabolic equation is given by

\[ (y'(\cdot,t), v) + B(y(\cdot,t), v) = \int_\Omega f\big(x, t, y(x,t), u(x,t)\big)\,v(x)\,dx, \qquad (7.30) \]

for almost every $t \in [0,T]$ and all $v \in H_0^1(\Omega)$, where $(\cdot,\cdot)$ is the $L^2(\Omega)$ scalar product, $y' := \frac{\partial}{\partial t} y$, and $f : \mathbb{R}^n \times \mathbb{R}^+_0 \times \mathbb{R} \times K_{ad} \to \mathbb{R}$. We consider the bilinear form $B(y, v) = (\nabla y, \nabla v)$ as in the elliptic cases. Further, we assume that the control function $u$ is an element of the following admissible set of controls

\[ U_{ad} = \{u \in L^q(Q) \mid u(x,t) \in K_{ad} \text{ a.e. in } Q\}, \]

where $Q = \Omega \times [0,T]$ denotes the space-time cylinder, the set $K_{ad} \subseteq \mathbb{R}$ is compact, and $q = 2$ for $n = 1$, $q > n/2$ for $n \ge 2$. We require that (7.30), with homogeneous Dirichlet boundary conditions and initial condition $y_0 \in H_0^1(\Omega) \cap L^\infty(\Omega)$, is well defined and that $y : Q \to \mathbb{R}$ fulfils (7.30) for almost all $t \in [0,T]$ and all $v \in H_0^1(\Omega)$.

In our terminology, a linear parabolic model is obtained from (7.30) with $f = u + \varphi$, whereas a bilinear model results with $f = -u\,y + \varphi$, where $\varphi \in L^\infty(Q)$ is a given function. In both cases, one can prove existence of a unique solution $y \in L^2(0,T; H^2(\Omega)) \cap L^\infty(0,T; H_0^1(\Omega))$; see, e.g., [109, 187]. Essential boundedness of this solution can be proved as discussed in Appendix A.5.

In this framework, our parabolic optimal control problem is formulated as follows:

\[ \begin{aligned} &\min J(y,u) := \int_Q \Big( h\big(y(x,t)\big) + g\big(u(x,t)\big) \Big)\,dx\,dt \\ &\ \text{s.t. } (y'(\cdot,t), v) + B(y(\cdot,t), v) = \int_\Omega f\big(x,t,y(x,t),u(x,t)\big)\,v(x)\,dx, \quad u \in U_{ad}, \end{aligned} \qquad (7.31) \]

for almost every $t \in [0,T]$ and all $v \in H_0^1(\Omega)$.

The corresponding adjoint equation is given by

\[ -(p'(\cdot,t), v) + B(p(\cdot,t), v) = \int_\Omega \Big( \partial_y h\big(y(x,t)\big) + \partial_y f\big(x,t,y(x,t),u(x,t)\big)\,p(x,t) \Big)\,v(x)\,dx, \qquad (7.32) \]

with the terminal condition $p(\cdot,T) = 0$. We require that there exists a unique solution $p : Q \to \mathbb{R}$ such that (7.32) holds for almost every $t \in [0,T]$ and all $v \in H_0^1(\Omega)$. The existence of such a $p$ can be proved in the linear and bilinear cases, and it results $p \in L^2(0,T; H^2(\Omega)) \cap L^\infty(0,T; H_0^1(\Omega))$.

The HP function for (7.31) is given by

\[ H(x,t,y,u,p) = h(y) + g(u) + p\,f(x,t,y,u). \qquad (7.33) \]

Now, we can formulate the necessary PMP optimality conditions as stated in Theorem 7.1; see [222] for a proof. If $(y,u)$ solves (7.31) and $p$ solves (7.32), then it holds

\[ H\big(x,t,y(x,t),u(x,t),p(x,t)\big) = \min_{w \in K_{ad}} H\big(x,t,y(x,t),w,p(x,t)\big), \qquad (7.34) \]

for almost all $(x,t) \in Q$. For the analysis of PMP conditions in the case of parabolic time-optimal control problems, we refer to, e.g., [166, 223, 278].

Next, we present numerical results for the following one-dimensional bilinear parabolic optimal control problem

\[ \begin{aligned} &\min J(y,u) := \int_Q \Big( \tfrac{1}{2}\big(y(x,t) - y_d(x,t)\big)^2 + g\big(u(x,t)\big) \Big)\,dx\,dt, \\ &\ \text{s.t. } (y'(\cdot,t), v) + (\nabla y(\cdot,t), \nabla v) + (u(\cdot,t)\,y(\cdot,t), v) = (\varphi(\cdot,t), v), \quad u \in U_{ad}, \end{aligned} \]

where the parabolic constraint, with initial condition $y_0 = 0$, is satisfied for almost every $t \in [0,T]$ and all $v \in H_0^1(\Omega)$. Further, we have $T = 1$, $\Omega = (0,1)$, and $\varphi = 1$ is a constant function. The cost of the control is given by

\[ g(u) := \frac{\alpha}{2}u^2 + \begin{cases} \beta |u| & \text{if } |u| > s \\ 0 & \text{else,} \end{cases} \]

where $\alpha = 10^{-6}$, $\beta = 10^{-6}$ and $s = 20$. The set of admissible values for the control is $K_{ad} = [-30, 50]$. We construct a target function as a step function centred on the trajectory $\bar x(t) = 0.5 + 0.2\sin(2\pi t)$ as follows:

\[ y_d(x,t) = \begin{cases} \tfrac{1}{2} & \text{if } \bar x(t) - 0.1 \le x \le \bar x(t) + 0.1 \\ 0 & \text{else.} \end{cases} \]

The parabolic model and its adjoint are approximated by standard finite differences and the implicit Euler scheme on a uniform mesh with $\Delta t = 1/200$ and $\Delta x = 1/100$. We make the following choice of values of the SQH parameters: $\sigma = 1.1$, $\zeta = 0.9$, $\eta = 10^{-5}$, $\kappa = 10^{-8}$, and an initial $\epsilon = 1$. The control and state functions obtained with the SQH algorithm (an obvious extension of the SQH Algorithm 7.1) are shown in Figure 7.13. We can recognise the presence of the control constraints and of the discontinuity in the control cost. At convergence, with $k_{tot} = 1064$, $k_{up} = 553$, we have $\max_{(x,t)\in Q}\Delta H(x,t) = 2.68 \cdot 10^{-4}$ and $N_\%^{2} = 100$, $N_\%^{4} = 96.11$, $N_\%^{6} = 89.46$, $N_\%^{8} = 87.06$, and $N_\%^{10} = 84.00$. In Figure 7.14, we plot the convergence history of the SQH method showing a monotone reduction of the value of the cost functional.
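The implicit Euler forward solve for the 1D bilinear model can be sketched as follows. This is a minimal sketch consistent with the discretisation stated above; using the control at the current time index on the implicit side is an illustrative choice.

```python
import numpy as np

def forward_parabolic(u, phi=1.0, T=1.0, nx=100, nt=200):
    """Implicit Euler / central differences for y_t - y_xx + u*y = phi,
    y(x,0) = 0, homogeneous Dirichlet conditions. `u` has shape
    (nt, nx-1), holding the control at the interior nodes."""
    dx, dt = 1.0 / nx, T / nt
    m = nx - 1                                   # interior points
    lap = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
           + np.diag(np.ones(m - 1), -1)) / dx**2
    y = np.zeros((nt + 1, m))
    for k in range(nt):
        # (I - dt*lap + dt*diag(u)) y^{k+1} = y^k + dt*phi
        A = np.eye(m) - dt * lap + dt * np.diag(u[k])
        y[k + 1] = np.linalg.solve(A, y[k] + dt * phi)
    return y
```

The adjoint equation is marched backward in time with the same building blocks, starting from the terminal condition $p(\cdot,T) = 0$.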

FIGURE 7.13: Results for the parabolic optimal control problem: the optimal control (top) and the corresponding state.


FIGURE 7.14: Results for the parabolic optimal control problem: the minimisation of J.

We conclude this section by presenting results obtained with the SQH algorithm solving a parabolic optimal control problem in the case where the set $K_{ad}$ is discrete. Specifically, we consider a distributed linear control mechanism and $K_{ad} = \{-10, -8, -6, -4, -2, 0, 2, 4, 6, 8, 10\}$, that is, we have a discrete set of admissible control values, as in the case of mixed-integer control problems. In the function $g(u)$ above, we take $\alpha = 10^{-3}$ and $\beta = 0$, and the target function is given by $y_d(x,t) = 2\sin(2\pi t/T)\exp[-(x - 1/2)^2/2]$. All other parameters are as above. The resulting optimal control and the corresponding state are depicted in Figure 7.15.
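For a discrete admissible set, the pointwise HP minimisation reduces to enumerating the candidates, which is what makes the SQH method attractive for mixed-integer problems. A minimal sketch for the linear control mechanism (where the state-dependent terms of $H$ do not depend on the candidate and can be dropped) follows; the function signature is an assumption for illustration.

```python
import numpy as np

def hp_min_discrete(p, u_old, eps, alpha=1e-3,
                    K_ad=(-10, -8, -6, -4, -2, 0, 2, 4, 6, 8, 10)):
    """Pointwise minimisation of the augmented HP function over a discrete
    admissible set: evaluate every candidate and select the minimiser
    gridpoint-wise. `p` and `u_old` are arrays over the space-time grid."""
    vals = [0.5 * alpha * w**2 + p * w + eps * (w - u_old)**2 for w in K_ad]
    return np.asarray(K_ad)[np.argmin(np.stack(vals), axis=0)]
```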

FIGURE 7.15: Results for the parabolic optimal control problem with a discrete set of admissible control values: the optimal control (top) and the corresponding state.

7.11 Hyperbolic Optimal Control Problems

This section illustrates the application of the SQH method for solving optimal control problems governed by a wave equation. We discuss the cases of distributed and boundary control problems with control constraints and a tracking functional with $L^2$ and $L^1$ costs. We consider fixed endtime problems and refer to, e.g., [164, 165, 247] for related time-optimal control problems.

We remark that in [251] the SQH method has been successfully applied to the Liouville equation (6.4), that is, a conservative transport equation. The Liouville equation represents the fundamental building block of kinetic models in statistical mechanics, which motivates the formulation and analysis of related ensemble optimal control problems [22, 23]. However, since this equation has the same structure as the Fokker-Planck equation with zero diffusion, we point to Chapter 6 for further details and the implementation of a SQH algorithm. We also refer to [3, 273] for further discussion and references on optimal control problems governed by first-order hyperbolic equations.

In this section, we focus on the following wave equation

\[ \partial^2_{tt} y(x,t) - v^2\,\partial^2_{xx} y(x,t) = f(x,t), \qquad (7.35) \]

where $v$ represents the constant wave speed, and $x \in \Omega \subset \mathbb{R}$ and $t \in (0,T)$, $T > 0$. We define the space-time cylinder (strip) $Q = \Omega \times [0,T]$ and $\Sigma = \partial\Omega \times [0,T]$. The initial conditions for our evolution model are given by

\[ y(x,0) = y_0(x), \quad \partial_t y(x,0) = y_1(x), \quad x \in \Omega. \qquad (7.36) \]

Further, on $\Sigma$ we consider Dirichlet or Neumann boundary conditions, as specified below. In our one-dimensional space setting, the problem (7.35)-(7.36) could model a vibrating string. Notice that our wave model can be rewritten in the form of Goursat and Darboux; see, e.g., [27, 213, 214] for related results, also involving optimal control problems. It is also possible to rewrite (7.35) as a first-order-in-time system as follows [56]:

\[ \partial_t y(x,t) = z(x,t), \qquad \partial_t z(x,t) = v^2\,\partial^2_{xx} y(x,t) + f(x,t). \]

In this case, the initial conditions become $y(x,0) = y_0(x)$ and $z(x,0) = y_1(x)$. Clearly, this is a convenient setting for extending many results and techniques

discussed in the previous chapters to control problems governed by the wave equation.

Our first optimal control problem governed by the wave equation is given by

\[ \begin{aligned} &\min J(y,u) := \frac{\alpha}{2}\|y - y_d\|^2_{L^2(Q)} + \frac{\beta}{2}\|y(\cdot,T) - y_T\|^2_{L^2(\Omega)} + \frac{\nu}{2}\|u\|^2_{L^2(Q)} + \gamma\,\|u\|_{L^1(Q)} \\ &\ \text{s.t. } \partial^2_{tt} y(x,t) - v^2\,\partial^2_{xx} y(x,t) = u(x,t), \quad \text{in } Q \\ &\ y(x,0) = y_0(x), \quad \partial_t y(x,0) = y_1(x), \quad \text{in } \Omega \\ &\ y(x,t) = 0, \quad \text{on } \Sigma \\ &\ u \in U_{ad}, \end{aligned} \qquad (7.37) \]

where $\alpha, \beta, \gamma \ge 0$, and $\nu > 0$. In this case, we have a distributed control as forcing term; see, e.g., [159, 182] for a discussion of this class of problems. We choose the following set of admissible controls:

\[ U_{ad} = \{ u \in L^2(Q) \mid u(x,t) \in K_{ad} \text{ a.e. in } Q \}, \]

where $K_{ad} \subset \mathbb{R}$ is an interval. We take $y_d \in L^2(0,T; L^2(\Omega))$ and $y_T \in L^2(\Omega)$. We have homogeneous Dirichlet boundary conditions, and assume $y_0 \in H_0^1(\Omega)$ and $y_1 \in L^2(\Omega)$ such that the corresponding solution to the wave problem is $y \in C([0,T]; H_0^1(\Omega))$; see, e.g., [182]. Existence of solutions to optimal control problems similar to (7.37) is proved in [3, 180, 182], and its PMP characterisation is presented in, e.g., [180]; see also [212, 214, 258, 265].

In order to formulate the PMP optimality conditions, we introduce the following adjoint problem

\[ \begin{aligned} &\partial^2_{tt} p(x,t) - v^2\,\partial^2_{xx} p(x,t) = -\alpha\,\big(y(x,t) - y_d(x,t)\big), \quad \text{in } Q \\ &p(x,T) = 0, \quad \partial_t p(x,T) = \beta\,\big(y(x,T) - y_T(x)\big), \quad \text{in } \Omega \\ &p(x,t) = 0, \quad \text{on } \Sigma. \end{aligned} \qquad (7.38) \]

The HP function for (7.37) is given by

\[ H(x,t,y,u,p) = p\,u - \frac{\nu}{2}u^2 - \gamma\,|u|. \qquad (7.39) \]

Now, we can state the PMP maximisation condition as follows: if $(y,u)$ is the optimal process for (7.37), and $p$ solves (7.38), then it holds

\[ H\big(x,t,y(x,t),u(x,t),p(x,t)\big) = \max_{w \in K_{ad}} H\big(x,t,y(x,t),w,p(x,t)\big), \qquad (7.40) \]

for almost all $(x,t) \in Q$.

In the present setting, in the SQH algorithm the augmented HP function is as follows:

\[ H_\epsilon(x, y, u, v, p) = H(x, y, u, p) - \epsilon\,(u - v)^2. \]

In fact, in contrast to the previous section, we are referring to a maximisation condition. Our next step is to discuss the SQH method for solving (7.37). In this case, we would like to mention the work [151] concerning the analysis of a successive approximations scheme for a Goursat-Darboux control problem, and refer to [31] for remarks concerning the Sakawa-Shindo strategy for a problem of this class.

We present results of experiments with the optimal control problem (7.37) and the following setting. We have $\Omega = (0,L)$ and $T = 1$, and the initial conditions for the wave model are given by

\[ y_0(x) = \sin(\pi x/L), \qquad y_1(x) = 0. \]

We take $L = 10$ and we choose $v = 10$. The optimisation weights of the cost functional have the values $\alpha = 1$, $\beta = 1$, $\nu = 10^{-5}$ and $\gamma = 10^{-3}$. The set of admissible control values is $K_{ad} = [-30, 30]$. The target functions are given by

\[ y_d(x,t) = (1-t)\,\sin(\pi x/L) + t\,\sin(2\pi x/L), \qquad y_T(x) = y_d(x,T). \]

In order to implement the SQH scheme similarly to Algorithm 7.1, we approximate the wave model and its adjoint by the standard 5-point finite difference scheme [193], on a uniform mesh of $N_t = 1000$ time subintervals and $N_x = 100$ space subintervals. We make the following choice of the values of the SQH parameters: $\sigma = 1.1$, $\zeta = 0.9$, $\eta = 10^{-5}$, $\kappa = 10^{-8}$, and an initial $\epsilon = 1$. With this setting, the SQH algorithm achieves convergence in $k_{tot} = 1569$ and $k_{up} = 761$ iterations, and the minimisation history of the cost functional is plotted in Figure 7.16. The optimal control and the state function that are obtained with the SQH algorithm are depicted in space and time in Figure 7.17. However, in order to show the action of the box constraints on the control, and the fact that the $L^1$ penalisation term in the cost functional promotes sparsity of the control, we depict in Figure 7.18 two snapshots of the control and state functions at $t = T/2$ and $t = 3T/4$.
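For this cost structure, the pointwise maximisation of the augmented HP function has a closed form (a soft-thresholding-type update with projection). The following Python sketch evaluates it branchwise; the defaults reflect the experiment above, but the branch strategy itself is an illustrative derivation, not quoted from the text.

```python
import numpy as np

def hp_argmax(p, u_old, eps, nu=1e-5, gamma=1e-3, k_lo=-30.0, k_up=30.0):
    """Closed-form pointwise maximiser of
    p*u - (nu/2)*u^2 - gamma*|u| - eps*(u - u_old)^2 over [k_lo, k_up].
    The concave objective is maximised on the branches u >= 0 and u <= 0
    (u = 0 is contained in both after clipping); arrays broadcast."""
    d = nu + 2.0 * eps
    u_pos = np.clip((p - gamma + 2.0 * eps * u_old) / d, 0.0, k_up)
    u_neg = np.clip((p + gamma + 2.0 * eps * u_old) / d, k_lo, 0.0)
    def val(u):
        return p * u - 0.5 * nu * u**2 - gamma * np.abs(u) - eps * (u - u_old)**2
    return np.where(val(u_pos) >= val(u_neg), u_pos, u_neg)
```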


FIGURE 7.16: Minimisation history of the cost functional for the wave optimal control problem with distributed control.

Next, we focus on a wave optimal control problem with a Neumann boundary control mechanism; see, e.g., [159, 174, 182, 198]. However, for illustration, in this case we choose a setting that aims at driving the system towards a terminal state, but without tracking in $[0,T]$. Notice that the resulting optimal control problem has similarities with the boundary controllability problems discussed in, e.g., [127, 183], and with wave stabilisation problems [183, 236]. Our wave optimal control problem with a Neumann boundary control is formulated as follows:

\[ \begin{aligned} &\min J(y,u) := \frac{1}{2}\|y(\cdot,T) - y_T(\cdot)\|^2_{L^2(\Omega)} + \frac{\nu}{2}\|u\|^2_{L^2(\Sigma)} + \gamma\,\|u\|_{L^1(\Sigma)} \\ &\ \text{s.t. } \partial^2_{tt} y(x,t) - v^2\,\partial^2_{xx} y(x,t) = 0, \quad \text{in } Q \\ &\ y(x,0) = y_0(x), \quad \partial_t y(x,0) = y_1(x), \quad \text{in } \Omega \\ &\ \partial_n y(x,t) = u(x,t), \quad \text{on } \Sigma \\ &\ u \in U_{ad}, \end{aligned} \qquad (7.41) \]

where $\gamma \ge 0$ and $\nu > 0$, and $\partial_n y$ represents the normal derivative of $y$ on $\Sigma$. In this case, we have the following set of admissible controls:

\[ U_{ad} = \{ u \in L^2(\Sigma) \mid u(x,t) \in K_{ad} \text{ a.e. in } \Sigma \}. \qquad (7.42) \]

Assuming that $y_0 \in H^1(\Omega)$, $y_1 \in L^2(\Omega)$, $y_T \in L^2(\Omega)$, and $u \in U_{ad}$, it results that the solution to the governing wave problem is $y \in C([0,T]; H^{1/2}(\Omega))$; see [198] for further details and references. Existence of optimal controls for (7.41) is proved in [3, 182], and its PMP characterisation is proved in [198]. The adjoint problem for (7.41) is obtained from (7.38) with $\alpha = 0$, $\beta = 1$, and homogeneous Neumann boundary conditions for the adjoint variable: $\partial_n p(x,t) = 0$ on $\Sigma$.

The HP function for (7.41) is given by

\[ H(x,t,y,u,p) = v^2\,p\,u - \frac{\nu}{2}u^2 - \gamma\,|u|, \qquad (7.43) \]

which is defined for the variables $p$ and $u$ on $\Sigma$.

FIGURE 7.17: Results for the wave optimal control problem with distributed control: the optimal control (top) and the corresponding state.

For the numerical solution of the wave problem and of the adjoint wave problem, we use the same 5-point scheme mentioned above. However, in order to implement the Neumann boundary control, we use a standard technique where the finite-difference stencil of $\partial^2_{xx}$ is also considered on the boundary points and combined with the centred second-order approximation to $\partial_n y = u$, so as to remove the need for a ghost point.

It is clear that a change of the boundary value can influence the solution in the entire domain within a characteristic time $t_0 = L/v$. This is the time a wave needs to travel from one end of the string to the other. Thus, for our purpose, it is natural to choose $T > t_0$. Specifically, we take $T = 4$ and the setting of the previous experiment, with $L = 10$ and $v = 10$, so that $t_0 = 1$.


FIGURE 7.18: Results for the wave optimal control problem with distributed control. At the top, the optimal control (left) and the corresponding state compared with the target at t = 1/2; at the bottom, the optimal control (left) and the corresponding state compared with the target at t = 3/4.

Further, we choose the following initial conditions and target function:

\[ y_0(x) = \sin(\pi x/L), \qquad y_1(x) = 0, \qquad y_T(x) = \cos(3\pi x/L). \]

Notice that the target function is asymmetric with respect to the midpoint of the interval $[0,L]$. In the cost functional, we set $\nu = 10^{-12}$ and $\gamma = 2 \cdot 10^{-1}$. The set of admissible control values is $K_{ad} = [-0.04, 0.04]$. These values are chosen ad hoc to show active control constraints; see Figure 7.19, where one can also see the effect of the $L^1(\Sigma)$ penalisation in (slightly) promoting sparsity along the time evolution. The wave propagation driven by the optimal Neumann boundary control is shown in Figure 7.20. Also in this figure, we depict the resulting state at final time, $y(x,T)$, and see that it approximates the desired target $y_T(x)$.

A typical exact controllability problem consists in determining a control function that drives the system to rest in a given finite time. In the following,


FIGURE 7.19: The optimal Neumann boundary control function on the left boundary point (top) and on the right boundary point.

we consider a wave optimal control problem having a similar aim and using a Dirichlet boundary control mechanism. Dirichlet boundary control problems are discussed in many of the references already mentioned in this section. In addition, we refer to [197, 238] for a detailed discussion and analysis of these optimal control problems in the PMP framework. The aim of driving the wave to rest at the end of the given time horizon means that $y(x,T)$ and $\partial_t y(x,T)$ should be as small as possible. Therefore, we formulate the following wave optimal control problem with a Dirichlet boundary control:

\[ \begin{aligned} &\min J(y,u) := \frac{1}{2}\|y(\cdot,T)\|^2_{L^2(\Omega)} + \frac{1}{2}\|\partial_t y(\cdot,T)\|^2_{L^2(\Omega)} + \frac{\nu}{2}\|u\|^2_{L^2(\Sigma)} + \gamma\,\|u\|_{L^1(\Sigma)} \\ &\ \text{s.t. } \partial^2_{tt} y(x,t) - v^2\,\partial^2_{xx} y(x,t) = 0, \quad \text{in } Q \\ &\ y(x,0) = y_0(x), \quad \partial_t y(x,0) = y_1(x), \quad \text{in } \Omega \\ &\ y(x,t) = u(x,t), \quad \text{on } \Sigma \\ &\ u \in U_{ad}. \end{aligned} \qquad (7.44) \]


FIGURE 7.20: The wave propagation driven by the optimal Neumann boundary control (top) and comparison at final time of the controlled state with the given target function.

In correspondence to (7.44), we have the following adjoint problem:

\[ \begin{aligned} &\partial^2_{tt} p(x,t) - v^2\,\partial^2_{xx} p(x,t) = 0, \quad \text{in } Q \\ &p(x,T) = -\partial_t y(x,T), \quad \partial_t p(x,T) = y(x,T), \quad \text{in } \Omega \\ &p(x,t) = 0, \quad \text{on } \Sigma. \end{aligned} \qquad (7.45) \]

The HP function for (7.44) is given by

\[ H(x,t,y,u,p) = -v^2\,\partial_n p\,u - \frac{\nu}{2}u^2 - \gamma\,|u|. \qquad (7.46) \]

Next, we report results of a numerical experiment with the SQH method applied to solving (7.44). In this experiment, we make the same choice of initial conditions and optimisation weights as in the previous experiment. We also


keep the same values of the parameters of the SQH algorithm. In Figure 7.21, we plot the values of the control function on the boundary points, and in Figure 7.22 we depict the evolution of the wave driven by this control.

FIGURE 7.21: The optimal Dirichlet boundary control function on the left boundary point (top) and on the right boundary point (bottom).

We conclude this section with calculations showing that, in the above cases of wave models with linear boundary control mechanisms, the PMP condition is sufficient to characterise an optimal control. For this purpose, we focus on the following optimal control problem. We have

\[ \begin{aligned} &\min J(y,u) := \frac{1}{2}\|y(\cdot,T)\|^2_{L^2(\Omega)} + \frac{1}{2}\|\partial_t y(\cdot,T)\|^2_{L^2(\Omega)} + \frac{\nu}{2}\|u\|^2_{L^2(\Sigma)} + \gamma\,\|u\|_{L^1(\Sigma)} \\ &\ \text{s.t. } \partial^2_{tt} y(x,t) - v^2\,\partial^2_{xx} y(x,t) = 0, \quad \text{in } Q \\ &\ y(x,0) = y_0(x), \quad \partial_t y(x,0) = y_1(x), \quad \text{in } \Omega \\ &\ y(0,t) = 0, \quad a\,y(L,t) + b\,\partial_x y(L,t) = u(t), \quad t \in [0,T] \\ &\ u \in U_{ad}. \end{aligned} \qquad (7.47) \]

FIGURE 7.22: The wave propagation driven by the optimal Dirichlet boundary control function.

At the boundary $x = 0$, we set homogeneous Dirichlet boundary conditions, and we consider a more general Robin boundary control mechanism on the boundary $x = L$. The corresponding adjoint problem is given by (7.45), where the boundary condition at $x = L$ is given by $a\,p(L,t) + b\,\partial_x p(L,t) = 0$. This is also the setting in [238].

Now, we consider two admissible controls $u$ and $\bar u$, the corresponding states $y$ and $\bar y$, and adjoint variables $p$ and $\bar p$. We denote with $\mathcal{L}$ the d'Alembert operator: $\mathcal{L}(y) = \partial^2_{tt} y - v^2\,\partial^2_{xx} y$. Therefore, for $\delta y = y - \bar y$ and $\delta u = u - \bar u$, it holds

\[ \begin{aligned} &\mathcal{L}(\delta y) = 0, \\ &\delta y(x,0) = 0, \quad \partial_t \delta y(x,0) = 0, \\ &\delta y(0,t) = 0, \quad a\,\delta y(L,t) + b\,\partial_x \delta y(L,t) = \delta u(t). \end{aligned} \]

We also have $\mathcal{L}(\bar p) = 0$. Consequently, we have

\[ I := \int_0^T \int_0^L \big( \bar p\,\mathcal{L}(\delta y) - \delta y\,\mathcal{L}(\bar p) \big)\,dx\,dt = 0. \]

On the other hand, by integration by parts and using the conditions given above, we obtain

\[ I = \int_0^L \big( \bar p\,\partial_t \delta y - \partial_t \bar p\,\delta y \big)(T)\,dx - v^2 \int_0^T \big( \bar p\,\partial_x \delta y - \partial_x \bar p\,\delta y \big)(L)\,dt =: I_1 - I_2. \]

Therefore $I_1 = I_2$. Next, we evaluate $I_2$ for the different boundary control mechanisms. We have

Case 1: Dirichlet control, $a = 1$, $b = 0$; $I_2 = -v^2 \int_0^T \partial_x \bar p\,\delta u\,dt$.

Case 2: Neumann control, $a = 0$, $b = 1$; $I_2 = v^2 \int_0^T \bar p\,\delta u\,dt$.

Case 3: Robin control, $a = 1$, $b = 1$; $I_2 = -v^2 \int_0^T \partial_x \bar p\,\delta u\,dt = v^2 \int_0^T \bar p\,\delta u\,dt$, since $\partial_x \bar p(L,t) = -\bar p(L,t)$.

Now, we analyse the difference of the cost functionals $J(y,u) - J(\bar y,\bar u)$, and to simplify the notation we define

\[ g(u) = \frac{\nu}{2}u^2 + \gamma\,|u|. \]

Therefore we have

\[ J(y,u) - J(\bar y,\bar u) = \frac{1}{2}\int_0^L \big( y^2(T) - \bar y^2(T) \big)\,dx + \frac{1}{2}\int_0^L \big( (\partial_t y(T))^2 - (\partial_t \bar y(T))^2 \big)\,dx + \int_0^T \big( g(u) - g(\bar u) \big)\,dt. \]

Now, we elaborate on the first two integrals by using the equality $r^2 - s^2 = 2s(r-s) + (r-s)^2$. We have

\[ \begin{aligned} J(y,u) - J(\bar y,\bar u) &= \int_0^L \Big( \bar y(T)\,\delta y(T) + \tfrac{1}{2}\big(\delta y(T)\big)^2 \Big)\,dx + \int_0^L \Big( \partial_t \bar y(T)\,\partial_t \delta y(T) + \tfrac{1}{2}\big(\partial_t \delta y(T)\big)^2 \Big)\,dx + \int_0^T \big( g(u) - g(\bar u) \big)\,dt \\ &\ge \int_0^L \Big( \bar y(T)\,\delta y(T) + \partial_t \bar y(T)\,\partial_t \delta y(T) \Big)\,dx + \int_0^T \big( g(u) - g(\bar u) \big)\,dt \\ &= \int_0^L \Big( \partial_t \bar p(T)\,\delta y(T) - \bar p(T)\,\partial_t \delta y(T) \Big)\,dx + \int_0^T \big( g(u) - g(\bar u) \big)\,dt. \end{aligned} \]

Notice that the before-last integral equals $-I_1$ and can be replaced with $-I_2$. Therefore, in Case 1, we obtain

\[ J(y,u) - J(\bar y,\bar u) \ge v^2 \int_0^T \partial_x \bar p\,(u - \bar u)\,dt + \int_0^T \big( g(u) - g(\bar u) \big)\,dt = -\int_0^T \big( H(y,u,\bar p) - H(y,\bar u,\bar p) \big)\,dt,
\]

where $H$ is the HP function given by (7.46). In Case 2, we obtain

\[ J(y,u) - J(\bar y,\bar u) \ge -v^2 \int_0^T \bar p\,(u - \bar u)\,dt + \int_0^T \big( g(u) - g(\bar u) \big)\,dt = -\int_0^T \big( H(y,u,\bar p) - H(y,\bar u,\bar p) \big)\,dt, \]

where $H$ is the HP function given by (7.43).

In Case 3, we obtain

\[ J(y,u) - J(\bar y,\bar u) \ge \frac{1}{2}v^2 \int_0^T \partial_x \bar p\,(u - \bar u)\,dt - \frac{1}{2}v^2 \int_0^T \bar p\,(u - \bar u)\,dt + \int_0^T \big( g(u) - g(\bar u) \big)\,dt = -\int_0^T \big( H(y,u,\bar p) - H(y,\bar u,\bar p) \big)\,dt, \]

where

\[ H(y,u,p) = \frac{1}{2}v^2\,(p - \partial_x p)\,u - \frac{\nu}{2}u^2 - \gamma\,|u|. \]

We see that, in all cases, if $\bar u$ satisfies the PMP maximality condition, then $J(y,u) \ge J(\bar y,\bar u)$ holds for all admissible $u$ and corresponding $y = y(u)$; moreover, equality holds only if $u = \bar u$. Notice that similar calculations and stability estimates of the form $\|\delta y\| + \|\partial_t \delta y\| \le c\,\|\delta u\|$, see, e.g., [151, 197], allow one to prove well-posedness of the SQH method, as already illustrated in previous chapters.

Chapter 8
Identification of a Diffusion Coefficient

8.1 Introduction
8.2 An Inverse Diffusion Coefficient Problem
8.3 The SQH Method
8.4 Finite Element Approximation
8.5 Numerical Experiments

This chapter illustrates the formulation of an inverse problem for the identification of the diffusion coefficient of an elliptic model. The resulting optimisation problem is solved with the sequential quadratic Hamiltonian method. In this case, the proof of the Pontryagin maximum principle appears to be an open problem. However, subject to assumptions of sufficient regularity, wellposedness of the SQH method is proved, and its computational performance is validated on a diffusion problem approximated by finite elements.

8.1 Introduction

In many application problems, the need arises to identify the diffusion coefficient of elliptic models based on measurements of the state of the model and on a priori information on the coefficient sought. In particular, this problem appears in medical imaging with electrical impedance tomography [45, 271], in parameter identification procedures in groundwater hydrology [155, 284], and in other applications; see, e.g., [21].

A representative model for inverse diffusion coefficient problems is the following elliptic partial differential equation (PDE)

\[ -\nabla \cdot (\mu\,\nabla y) = f, \quad \text{in } \Omega, \qquad (8.1) \]

where $\Omega \subset \mathbb{R}^n$ is a bounded Lipschitz domain, $n \ge 1$, and $\mu$ denotes the diffusion coefficient. Further, $f \in L^\infty(\Omega)$ is a given function, and appropriate boundary conditions for $y$ on $\partial\Omega$ are specified.

A typical inverse problem with (8.1) is to identify $\mu$ so that the solution $y$ of the corresponding elliptic problem matches given observation data $y_d$ in some optimal sense. In particular, this data could be collected at various points of the domain and interpolated on $\Omega$, and optimality is defined in the sense of minimising an objective functional consisting of a least-squares bestfit term and a regularisation term for $\mu$. For this and many other classical inverse problems, we refer to, e.g., [21, 107, 144, 169, 186].

In this chapter, we aim at extending the range of applicability of the SQH method to solve the classical inverse problem of estimating the diffusion coefficient in a finite element framework; see, e.g., [18, 80, 260]. However, along with this effort, we point out some fundamental difficulties in proving the PMP for this class of problems.

8.2 An Inverse Diffusion Coefficient Problem

Our inverse diffusion coefficient problem is formulated as follows:

\[ \begin{aligned} &\min J(y,\mu) := \int_\Omega \Big( \tfrac{1}{2}\big(y(x) - y_d(x)\big)^2 + g\big(\mu(x)\big) \Big)\,dx, &&(8.2) \\ &\ \text{s.t. } (\mu\,\nabla y, \nabla v) + (b\,y, v) = (\varphi, v), \quad v \in H_0^1(\Omega), &&(8.3) \\ &\ \mu \in \Sigma, &&(8.4) \end{aligned} \]

where $\Omega$ is an open bounded set of $\mathbb{R}^2$, $x = (x_1, x_2) \in \mathbb{R}^2$, $y_d \in L^\infty(\Omega)$ represents the measured state of the system, $b \in L^\infty(\Omega)$ is a nonnegative reaction coefficient, and $\varphi \in L^\infty(\Omega)$ is a given function. The diffusion coefficient $\mu$ is sought in the following admissible set

\[ \Sigma := \{ \mu \in L^2(\Omega) : \mu(x) = \mu_0 \text{ in } \Omega \setminus \omega, \ \mu(x) \in K_\Sigma \text{ a.e. in } \Omega \}, \qquad (8.5) \]

where $\mu_0 \in K_\Sigma$ is the known value of the diffusion coefficient near the boundary of $\Omega$, and $\omega \subset \Omega$ is the subdomain where the value of $\mu$ is sought in $K_\Sigma$, which is the bounded set $K_\Sigma = [\mu_{lo}, \mu_{up}]$, where $\mu_{lo} > 0$ and $\mu_{up} > \mu_{lo}$ are the lower and upper bounds for the coefficient; we assume $|\mu_{lo}| + |\mu_{up}| < +\infty$, which implies that $\mu$ is in $L^\infty(\Omega)$ and that (8.3) is uniformly elliptic. Therefore, classical results apply [124] that guarantee existence of a unique weak solution $y \in H_0^1(\Omega)$ to our governing model for all $\mu$ in the admissible set.

Concerning the Tikhonov functional $J$, we have the bestfit term $\|y - y_d\|^2_{L^2(\Omega)}$, with given observation data $y_d$, and the regularisation term $g : \mathbb{R} \to \mathbb{R}$ is assumed to have the following structure

\[ g(\mu) := \frac{\alpha}{2}(\mu - \mu_0)^2, \qquad (8.6) \]

where $\alpha \ge 0$.

We remark that the analysis of (8.2)–(8.4) involves the investigation of the boundedness of the gradient of the solution to (8.3). This is a long-standing topic in the theory of PDEs; see, e.g., [77, 124, 170] and references therein. In our setting, we assume that $\nabla y \in L^\infty_{loc}(\Omega)$ holds, which is true if $\mu$ is continuous; see [192] and references therein. This property becomes global subject to further conditions on the setting of the problem, e.g., the smoothness of $\partial\Omega$ and the type of boundary conditions; see, e.g., [204, 268]. In this context, we also have the following stability estimates, which also prove continuity of the map $\mu \mapsto y = S(\mu)$ between the appropriate spaces.

Lemma 8.1 Let two pairs $(y_1, \mu_1), (y_2, \mu_2) \in H_0^1(\Omega) \times \Sigma$ both satisfy (8.3); moreover, assume that $\nabla y_1, \nabla y_2 \in L^\infty(\Omega)$ holds. Then there exists a constant $C_1 > 0$ such that the following estimate holds

\[ \|y_1 - y_2\|_{L^2(\Omega)} \le C_1\,\|\mu_1 - \mu_2\|_{L^2(\Omega)}. \]

Furthermore, there exists a constant $C_2 > 0$ such that

\[ \|y_1 - y_2\|_{H^1(\Omega)} \le C_2\,\|\mu_1 - \mu_2\|_{L^2(\Omega)}. \]

Proof. To prove this lemma, one considers the following two equations

\[ (\mu_1\,\nabla y_1, \nabla v) = (\varphi, v) \quad \text{and} \quad (\mu_2\,\nabla y_2, \nabla v) = (\varphi, v), \]

which hold for all $v \in H_0^1(\Omega)$; for simplicity of notation, we omit the reaction term, whose contribution to the estimate below is nonnegative. Therefore, since $(\mu_1\,\nabla y_1, \nabla v) = (\mu_2\,\nabla y_2, \nabla v)$, subtracting $(\mu_1\,\nabla y_2, \nabla v)$ from both sides we obtain

\[ (\mu_1\,\nabla(y_1 - y_2), \nabla v) = \big( (\mu_2 - \mu_1)\,\nabla y_2, \nabla v \big). \qquad (8.7) \]

Choosing $v = y_1 - y_2$ in (8.7), we obtain

\[ \big( \mu_1\,\nabla(y_1 - y_2), \nabla(y_1 - y_2) \big) = \big( (\mu_2 - \mu_1)\,\nabla y_2, \nabla(y_1 - y_2) \big). \]

Now, because of the bound $\mu_1 \ge \mu_{lo} > 0$ and $\nabla y_2 \in L^\infty(\Omega)$, we have a constant $C > 0$, depending on $\|\nabla y_2\|_{L^\infty(\Omega)}$ and $\mu_{lo}$, such that

\[ \|\nabla(y_1 - y_2)\|_{L^2(\Omega)} \le C\,\|\mu_2 - \mu_1\|_{L^2(\Omega)}. \qquad (8.8) \]

Using the Poincaré–Friedrichs inequality $\|y_1 - y_2\|^2_{L^2(\Omega)} \le C_*\,\|\nabla(y_1 - y_2)\|^2_{L^2(\Omega)}$, where $C_* = C_*(\Omega)$, we obtain

\[ \|y_1 - y_2\|_{L^2(\Omega)} \le C_1\,\|\mu_2 - \mu_1\|_{L^2(\Omega)}, \]

where $C_1 = \sqrt{C_*}\,C$. Thus we have proved the first claim. The second claim follows from this result and the estimate (8.8).

Our next step towards the solution of (8.2)–(8.4) is the characterisation of the optimal coefficient sought in the PMP framework. Unfortunately, in order to achieve the essential boundedness of the gradient of the solution of

the elliptic model, the diffusion coefficient should belong to a space more regular than $L^2$, such as $H^1$. Then the classical needle variation becomes impossible. Moreover, a lack of sufficient regularity of the diffusion coefficient also implies that the gradient of the adjoint variable is not sufficiently regular; but this regularity is required to prove the PMP. In this situation, while the proof of the PMP remains an open problem, it seems reasonable to consider only classical variations of the diffusion coefficient and resort to a characterisation in the Lagrange framework, or similarly to formulate a linearised maximum principle as in [254]. In the following, we tacitly assume sufficient regularity of the components defining our problem, such that we can assume the validity of the PMP characterisation given below.

Therefore, we can define the HP function for our optimisation problem as follows:

\[ H(x, y, \mu, p) := \frac{1}{2}(y - y_d)^2 + g(\mu) - \mu\,\nabla y \cdot \nabla p. \qquad (8.9) \]

We assume that $p \in H_0^1(\Omega)$ denotes the adjoint variable that solves the following problem

\[ (\mu\,\nabla p, \nabla v) + (b\,p, v) = (y - y_d, v), \quad v \in H_0^1(\Omega). \qquad (8.10) \]

Notice that the adjoint problem has the same structure as the governing model. We suppose that the gradients of the state and adjoint functions are essentially bounded. We remark that, formally, in correspondence to a strong formulation of the governing model, the HP function is as follows:

\[ H(x, y, \mu, p) := \frac{1}{2}(y - y_d)^2 + g(\mu) + p\,\nabla \cdot (\mu\,\nabla y). \qquad (8.11) \]

Now, subject to our assumptions, we assume the following PMP characterisation. If $(y, \mu, p)$ is an optimal solution to our optimisation problem, then it must satisfy the following PMP optimality condition

\[ H\big(x, y(x), \mu(x), p(x)\big) = \min_{w \in K_\Sigma} H\big(x, y(x), w, p(x)\big), \quad \text{a.e. in } \Omega, \qquad (8.12) \]

where $y$ and $p$ are the state and the adjoint variables corresponding to the optimal $\mu$.

8.3 The SQH Method

For the formulation of the SQH method for solving the inverse problem (8.2)–(8.4), we consider the augmented HP function given by

\[ H_\epsilon(x, y, \mu, \upsilon, p) := H(x, y, \mu, p) + \epsilon\,(\mu - \upsilon)^2, \qquad (8.13) \]

where $\epsilon > 0$. This choice is motivated by the theoretical analysis given below, and it is implemented in the following SQH algorithm.

Algorithm 8.1 (SQH method for identification of diffusion)
Input: initial approx. $\mu^0$, max. number of iterations $k_{max}$, tolerance $\kappa > 0$, $\epsilon > 0$, $\sigma > 1$, $\eta > 0$ and $\zeta \in (0,1)$; set $\tau > \kappa$, $k := 0$.
Compute the solution $y^0$ to the governing problem

\[ (\mu^0\,\nabla y, \nabla v) + (b\,y, v) = (\varphi, v), \quad v \in H_0^1(\Omega). \]

while ($k < k_{max}$ && $\tau > \kappa$) do
1) Compute the solution $p^k$ to the adjoint problem
\[ (\mu^k\,\nabla p, \nabla v) + (b\,p, v) = (y^k - y_d, v), \quad v \in H_0^1(\Omega). \]
2) Determine $\mu^{k+1}$ such that the following optimisation problem is satisfied
\[ H_\epsilon\big(x, y^k(x), \mu^{k+1}(x), \mu^k(x), p^k(x)\big) = \min_{w \in K_\Sigma} H_\epsilon\big(x, y^k(x), w, \mu^k(x), p^k(x)\big), \]
for almost all $x \in \Omega$.
3) Compute the solution $y^{k+1}$ to the forward problem
\[ (\mu^{k+1}\,\nabla y, \nabla v) + (b\,y, v) = (\varphi, v), \quad v \in H_0^1(\Omega). \]
4) Compute $\tau := \|\mu^{k+1} - \mu^k\|^2_{L^2(\Omega)}$.
5) If $J(y^{k+1}, \mu^{k+1}) - J(y^k, \mu^k) > -\eta\,\tau$, then increase $\epsilon$ with $\epsilon = \sigma\,\epsilon$ and go to Step 2.
Else if $J(y^{k+1}, \mu^{k+1}) - J(y^k, \mu^k) \le -\eta\,\tau$, then decrease $\epsilon$ with $\epsilon = \zeta\,\epsilon$ and continue.
6) Set $k := k + 1$.
end while

Next, we prove the following result, stating that, in Step 5 of Algorithm 8.1, if an update of $\mu$ does not attain a sufficient decrease of the value of the functional $J$, then it is possible to improve descent by choosing a larger $\epsilon$ in $H_\epsilon$.

Theorem 8.1 Let $(y^k, \mu^k)$ and $(y^{k+1}, \mu^{k+1})$ be generated by the SQH method, Algorithm 8.1, applied to (8.2)–(8.4), and let $\mu^{k+1}$, $\mu^k$ be measurable; assume that the gradients $\nabla y^k$ and $\nabla p^k$ are essentially bounded. Then there exists a $\theta > 0$, independent of $\epsilon$, $k$, and $\mu^k$, such that for the $\epsilon > 0$ currently chosen by Algorithm 8.1 the following holds

\[ J(y^{k+1}, \mu^{k+1}) - J(y^k, \mu^k) \le -(\epsilon - \theta)\,\|\mu^{k+1} - \mu^k\|^2_{L^2(\Omega)}. \]

The Sequential Quadratic Hamiltonian Method   In particular, it holds J y k+1 , µk+1 − J y k , µk ≤ −η τ for  ≥ θ + η and τ = kµk+1 − µk k2L2 (Ω) . Proof. We have the following HP function H(x, y, µ, p) = h(y) + g(µ) − µ ∇y · ∇p, where h(y) = that

1 2

2

(y − yd ) and g(µ) =

α 2 (µ

− µ0 )2 . In Algorithm 8.1, we have

  H x, y k , µk+1 , µk , pk ≤ H x, y k , w, µk , pk , for all w ∈ KΣ . Therefore   H x, y k , µk+1 , µk , pk ≤ H x, y k , µk , µk , pk = H(x, y k , µk , pk ),

(8.14)

for almost all x ∈ Ω. Then, we have Z 

 J(y ,µ ) − J(y , µ ) = h(y k+1 ) + g(µk+1 ) − h(y k ) − g(µk ) dx Ω Z   = h(y k+1 ) + g(µk+1 ) − µk+1 ∇y k+1 · ∇pk + µk+1 ∇y k+1 · ∇pk dx ZΩ  + − h(y k ) − g(µk ) + µk ∇y k · ∇pk − µk ∇y k · ∇pk dx ZΩ   = H(x, y k+1 , µk+1 , pk ) − H(x, y k , µk , pk ) dx ZΩ  + µk+1 ∇y k+1 · ∇pk − µk ∇y k · ∇pk dx ΩZ  = H(x, y k+1 , µk+1 , pk ) − H(x, y k , µk+1 , pk ) Ω  + H(x, y k , µk+1 , pk ) − H(x, y k , µk , pk ) dx Z   + µk+1 ∇y k+1 · ∇pk − µk ∇y k · ∇pk dx Ω Z Z k+1 k 2 + (µ − µ ) dx −  (µk+1 − µk )2 dx Ω Ω Z   ≤ H(x, y k+1 , µk+1 , pk ) − H(x, y k , µk+1 , pk ) dx ZΩ Z  + µk+1 ∇y k+1 · ∇pk − µk ∇y k · ∇pk dx −  (µk+1 − µk )2 dx; k+1



k+1

k

k



Identification of a Diffusion Coefficient

203

notice that the inequality results from (8.14). We continue our calculation as follows: Z   = h(y k+1 ) − h(y k ) − µk+1 ∇y k+1 · ∇pk + µk+1 ∇y k · ∇pk dx ZΩ Z  + µk+1 ∇y k+1 · ∇pk − µk ∇y k · ∇pk dx −  (µk+1 − µk )2 dx Ω Z ZΩ    k+1 k k+1 k k k (µk+1 − µk )2 dx = h(y ) − h(y ) + µ − µ ∇y · ∇p dx −  Ω Ω Z    1 = (y k − yd ) (y k+1 − y k ) + (y k+1 − y k )2 + µk+1 − µk ∇y k · ∇pk dx 2 Ω Z − (µk+1 − µk )2 dx Z Ω   1 = µk ∇(y k+1 − y k ) · ∇pk + (y k+1 − y k )2 + µk+1 − µk ∇y k · ∇pk dx 2 Ω Z − (µk+1 − µk )2 dx; Ω

for the last equality, we have used the adjoint equation (8.10) with v = ∇pk . We proceed with the last part of our calculation as follows: Z  1 = (µk − µk+1 ) ∇y k+1 · ∇pk + (y k+1 − y k )2 2 Ω   k+1 k k k + µ − µ ∇y · ∇p dx Z − (µk+1 − µk )2 dx Ω Z    1 (µk − µk+1 ) ∇ y k+1 − y k · ∇pk + (y k+1 − y k )2 dx = 2 Ω Z − (µk+1 − µk )2 dx Ω

 1 ≤ k∇pk kL∞ (Ω) kµk+1 − µk kL2 (Ω) k∇ y k+1 − y k kL2 (Ω) + ky k+1 − y k k2L2 (Ω) 2 −  kµk+1 − µk k2L2 (Ω) ≤ θ kµk+1 − µk k2L2 (Ω) −  kµk+1 − µk k2L2 (Ω) In the last two steps, we have used the Cauchy-Schwarz inequality and the estimates of Lemma (8.1); clearly θ depends on k∇pk kL∞ (Ω) , C1 and C2 . This theorem proves that, if (µk , y k ) is not already optimal, it is possible to choose  > θ to obtain a minimising step. With Theorem 8.1, further results on the convergence of the SQH method discussed in the previous chapters can be extended to our case. In particular, we have

Theorem 8.2 Let the sequences $(y^k)$ and $(\mu^k)$ be generated by Algorithm 8.1 (loop over Step 2 to Step 4). Then the sequence of functional values $J(y^k, \mu^k)$ monotonically decreases, with

\[ \lim_{k\to\infty} \big( J(y^{k+1}, \mu^{k+1}) - J(y^k, \mu^k) \big) = 0 \quad \text{and} \quad \lim_{k\to\infty} \|\mu^{k+1} - \mu^k\|_{L^2(\Omega)} = 0. \]

Theorem 8.2 guarantees that Algorithm 8.1 is well defined for $\kappa > 0$. Hence, there is an iteration number $k_0 \in \mathbb{N}_0$ such that $\|\mu^{k_0+1} - \mu^{k_0}\|_{L^2(\Omega)} \le \kappa$, and therefore Algorithm 8.1 stops in finitely many steps.

8.4 Finite Element Approximation

In this section, we illustrate the finite element (FEM) approximation to the governing model (8.3) and to its optimisation adjoint (8.10). Notice that these two problems have the same structure and are solved separately in the SQH method. Therefore, we focus on (8.3), since the same considerations apply to (8.10), replacing $\varphi$ with $(y - y_d)$.

In the FEM method [18, 80, 260], a triangulation $\mathcal{T}_h$ is constructed over the closed domain $\bar\Omega$, that is, $\bar\Omega$ is partitioned into a finite union of triangular finite elements $K_\ell \in \mathcal{T}_h$, $\ell = 1, \ldots, N_K$. In this notation, $h$ usually denotes the maximum diameter of the elements in the partition of the domain. For simplicity, we take $\Omega = (0,1) \times (0,1)$ and consider a uniform triangulation as the one shown in Figure 8.1. On the given finite element, we consider piecewise linear polynomials, which are used to construct a set of functions $(\phi_i)_{i=1,\ldots,N_h}$ that form a basis of

FIGURE 8.1: A uniform FEM triangulation of a square domain; NK = 128.

FIGURE 8.2: A finite element basis function.

the finite dimensional space $V_h$ given by $V_h = \mathrm{span}\{\phi_1, \ldots, \phi_{N_h}\}$, where $N_h$ denotes the dimension of $V_h$. The resulting basis functions appear as in Figure 8.2. Notice that we have homogeneous Dirichlet boundary conditions (also called essential boundary conditions) for $y$ and $p$, so that only basis functions centred on the internal vertices of the triangulation are considered. In the following, we denote with $V_i = (x_1^i, x_2^i)$, $i = 1, 2, 3$, the set of coordinates of the vertices (or nodes) of a generic triangular finite element $K$. However, in our case, it is possible to enumerate the basis functions with the Cartesian coordinates $(l, m)$ of the vertices of the triangles. Then, for the basis function $\phi_{lm}$ centred at the interior node $(x_1^l, x_2^m)$, we have

\[ \phi_{lm}(x_1, x_2) = \begin{cases} 1 - \dfrac{x_1 - x_1^l}{h} - \dfrac{x_2 - x_2^m}{h}, & (x_1, x_2) \in 1 \\[4pt] 1 - \dfrac{x_2 - x_2^m}{h}, & (x_1, x_2) \in 2 \\[4pt] 1 - \dfrac{x_1 - x_1^l}{h}, & (x_1, x_2) \in 3 \\[4pt] 1 - \dfrac{x_1^l - x_1}{h} - \dfrac{x_2^m - x_2}{h}, & (x_1, x_2) \in 4 \\[4pt] 1 - \dfrac{x_2^m - x_2}{h}, & (x_1, x_2) \in 5 \\[4pt] 1 - \dfrac{x_1^l - x_1}{h}, & (x_1, x_2) \in 6 \\[4pt] 0 & \text{otherwise,} \end{cases} \]

where $1, 2, \ldots, 6$ enumerate the triangles surrounding the node $(x_1^l, x_2^m)$; see [18, 260] for more details. In this FEM setting, a function $y_h \in V_h$ is defined as follows:

\[ y_h(x_1, x_2) = \sum_{i=1}^{N_h} y_i\,\phi_i(x_1, x_2), \qquad (8.15) \]

where $y_i$, $i = 1, \ldots, N_h$, denote the coefficients of $y_h$. Notice that $y_h(V_i) = y_i$, that is, the vector $(y_i)_{i=1,\ldots,N_h}$ represents the FEM function $y_h$ at the nodes of

the triangulation. We have the same representation for the numerical adjoint variable: $p_h(x_1, x_2) = \sum_{i=1}^{N_h} p_i\,\phi_i(x_1, x_2)$.

The FEM approach is based on the weak formulation of (8.3) as follows: find $y \in V := H_0^1(\Omega)$ such that

\[ \int_\Omega \mu\,\nabla y \cdot \nabla v\,dx + \int_\Omega b\,y\,v\,dx = \int_\Omega \varphi\,v\,dx, \quad v \in H_0^1(\Omega). \qquad (8.16) \]

Now, let $V_h$ be the finite dimensional subspace of $H_0^1(\Omega)$ described above. The FEM approximation of the solution to (8.3) is given by the $y_h \in V_h$ that satisfies the following problem

\[ \int_\Omega \mu\,\nabla y_h \cdot \nabla v_h\,dx + \int_\Omega b\,y_h\,v_h\,dx = \int_\Omega \varphi\,v_h\,dx, \quad v_h \in V_h. \qquad (8.17) \]

In this formulation, by choosing $v_h$ to be any of the basis functions $\phi_j$, we obtain the following algebraic problem: find $Y = (y_1, \ldots, y_{N_h})^T \in \mathbb{R}^{N_h}$ such that

\[ \sum_{i=1}^{N_h} \left( \int_\Omega \mu\,\nabla\phi_i \cdot \nabla\phi_j\,dx + \int_\Omega b\,\phi_i\,\phi_j\,dx \right) y_i = \int_\Omega \varphi\,\phi_j\,dx, \quad j = 1, \ldots, N_h. \]



and

Z Fj =

ϕ φj dx. Ω

The task of computing the entries aij and Fj can be made efficiently by performing this computation on each element K separately [18, 172]. Specifically, the computation of the stiffness matrix A can be restricted to the computation of the local element stiffness matrix AK = (aK ij ) with Z Z aK = µ ∇φ · ∇φ dx + b φi φj dx. i j ij K

K

Similarly, the vector F can be obtained assembling the contribution to the integral above on each element K. However, because we have a variable coefficient problem the integrals above have to be computed by quadrature schemes. Now, while we refer to [18] for a detailed discussion on integration schemes and on the FEM implementation, we R proceed with a partial description of the quadrature problem for the term µ ∇φi · ∇φj dx based on a simple barycentric formula as discussed in [172]; K see also [18]. The purpose of this discussion is to identify the appropriate way to represent the function µ on the FEM mesh.

Identification of a Diffusion Coefficient

207

Consider the basis function φi on K with vertices Vi , Vj , and Vk . Since φi is linear on K, we can write it in the following form φi (x1 , x2 ) = ai + bi x1 + ci x2 ,

(x1 , x2 ) ∈ K,

where φi (Vj ) = δij . For the coefficients ai , bi , and ci we have ai =

xj1 xk2 − xk1 xj2 , 2|K|

bi =

xj2 − xk2 , 2|K|

ci =

xk1 − xj1 . 2|K|

The gradient of φi is given by ∇φi = (bi , ci )T . Therefore, we have Z Z µ ∇φi · ∇φj dx = (bi bj + ci cj ) µ dx. K

K

Now, let V¯ = (Vi + Vj + Vk )/3 be the ‘barycentre’ of the finite element K; ¯ V = (¯ x1 , x ¯2 ). We have the following simple quadrature formula Z µ dx ≈ µ ¯ |K|, K

where µ ¯ = µ(¯ x1 , x ¯2 ). Hence, we obtain Z µ ∇φi · ∇φj dx ≈ µ ¯ |K| (bi bj + ci cj ). K

R With a similar reasoning, we can define a quadrature for K b φi φj dx and R for K ϕ φj dx; see [18]. This means that we need to specify the functions µ, b and ϕ at the barycentre of each triangle K. In the case of b and ϕ, we provide this data directly assuming that these functions are continuous. On the other hand, we define our numerical approximation to µ, denoted with µh = (µ1 , . . . , µNK ), as the vector of values representing µ ¯, that is, the values of µ at the barycentres of the triangles. With this preparation, it appears clearly that Step 2 in Algorithm 8.1 consists of a sweep, through all finite elements, that updates µh as follows: α  µ` = argmin (v − µ0 )2 − v ∇y`k · ∇pk` +  (v − µk` )2 . 2 v∈KΣ for ` = 1, . . . , NK . Since KΣ = [µlo , µup ], the minimum can be determined analytically and it is given by     2  µk` + α µ0 + ∇y`k · ∇pk` , µup . µ` = min max µlo , 2 + α Notice that, in this update process, we need to compute ∇y` and ∇p` (we omit k), that is, the gradients of yh and ph at the barycentre of K` . For this purpose, we use the fact that yh is linear in K` and its values on the vertices are available. Therefore we can determine a ¯, ¯b, and c¯ such that yh (x1 , x2 )|K` = a ¯ +¯b x1 +¯ c x2 , thus we have ∇y` = (¯b, c¯)T , and similarly for ph . For the solution

208

The Sequential Quadratic Hamiltonian Method

of the FEM linear systems for computing yh and ph , many efficient methods are available as, e.g., the conjugate gradient method (CG) [237]. However, in our experiments we use a direct solver. We would like to mention that, in the realm of inverse diffusion coefficient problems, only a few approximation estimates are available. In particular, we refer to the work [110] that considers the following optimisation problem 2

min J(a) := ku(a) − zkL2 (Ω) , s.t. − div(a∇u) = f, in Ω, ∂u = g, on ∂Ω, a ∂n where z is a measurement R Rof u, and the functions f and g satisfy the compatibility condition Ω f dx+ ∂Ω gds = 0. The main result in this work states that the L2 -error between the exact solution a and its FEM approximation ah is of size O(hr + h−2 δ), provided that the problem is approximated by polynomials of degree r, and kz − ukL2 (Ω) ≤ δ. Further works that focus on similar inverse diffusion coefficient problems can be found in [70, 154, 168, 169, 186, 261].

8.5

Numerical Experiments

In this section, we validate the ability of the proposed FEM SQH method to solve inverse diffusion coefficient problems with the structure (8.2)–(8.4). We take Ω = (0, 1) × (0, 1), and the continuous functions b(x, y) = x y and ϕ(x, y) = x − y. In order to construct a synthetic data representing measurements of the state, we solve (8.3) with µ given by  µ0 + 0.3 (x1 , x2 ) ∈ [0.4, 0.6] × [0.4, 0.6] µd (x1 , x2 ) = µ0 otherwise, where µ0 = 1. Further, we consider a small perturbing function e given by ej = (5/100) sin(4π j/Nh ),

j = 1, . . . , Nh ,

and take yd = y(µd ) (1 + e) as the measured state. For the weight of the cost of the control we choose α = 10−12 , and the set of admissible µ is defined with µlo = 0.1 and µup = 10. Moreover, we take ω = (0.1, 0.9) × (0.1, 0.9). Now, we compute the solution to (8.2)–(8.4) with the SQH method with the initialisation  = 1 and µ0 = µ0 . Moreover, we take κ = 10−8 , η = 10−8 , ζ = 0.9 and σ = 1.1. For Ω we consider a regular triangulation similar to that shown in Figure 8.1, with NK = 2048. In Figure 8.3, we depict µd and

FIGURE 8.3: The $\mu_d$ used to generate the synthetic data (top) and the computed $\mu$.

In Figure 8.3, we depict $\mu_d$ and the $\mu$ resulting from the FEM SQH computation. Notice that the FEM SQH method is able to satisfactorily recover a discontinuous diffusion coefficient, and a similar result is obtained with $\alpha = 0$; compare with [70], where similar results are obtained provided that sophisticated regularisation schemes are employed. In Figure 8.4, we plot the convergence history of the SQH method in terms of the reduction of the value of the cost functional, which demonstrates that the SQH method provides a minimising sequence.

Next, we consider a similar problem, but with homogeneous Neumann boundary conditions and no regularisation. We have
$$\min \; J(y, \mu) := \frac{1}{2} \int_\Omega \left( y(x) - y_d(x) \right)^2 dx, \tag{8.18}$$
$$\text{s.t.} \quad (\mu \nabla y, \nabla v) + (b\,y, v) = (\varphi, v), \quad v \in H^1(\Omega), \tag{8.19}$$
$$\mu \in \Sigma, \tag{8.20}$$
where all functions and the set $\Sigma$ are chosen as above.


FIGURE 8.4: Convergence history of the SQH method: the minimisation of J.

FIGURE 8.5: The $\mu_d$ used to generate the synthetic data (top) and the computed $\mu$ in the case with Neumann boundary conditions.

We solve (8.18)–(8.20) with the SQH method, with the same initialisation and choice of parameters given above. In Figure 8.5, we depict $\mu_d$ and the $\mu$ resulting from the FEM SQH computation. In Figure 8.6, we plot the convergence history of the SQH method, showing that it provides a minimising sequence.


FIGURE 8.6: Convergence history of the SQH method: the minimisation of J in the case with Neumann boundary conditions.
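The convergence histories in Figures 8.4 and 8.6 are produced by the SQH iteration; the following Python sketch shows its structure with the parameters $\kappa$, $\eta$, $\zeta$, $\sigma$ used above (a schematic sketch under our assumptions, not the book's accompanying code; `J`, `solve_state`, `solve_adjoint` and `update_mu` are hypothetical problem-specific callables, and in practice $\tau$ would carry a mesh-dependent weighting):

```python
def sqh_loop(mu, J, solve_state, solve_adjoint, update_mu,
             eps=1.0, kappa=1e-8, eta=1e-8, zeta=0.9, sigma=1.1, max_iter=10000):
    y = solve_state(mu)
    for _ in range(max_iter):
        p = solve_adjoint(y, mu)
        mu_new = update_mu(mu, y, p, eps)           # clipped element-wise update
        tau = float(((mu_new - mu) ** 2).sum())     # squared size of the update
        y_new = solve_state(mu_new)
        if J(y_new, mu_new) - J(y, mu) > -eta * tau:
            eps *= sigma                            # reject: penalise larger updates
        else:
            mu, y = mu_new, y_new                   # accept: relax the penalty
            eps *= zeta
            if tau < kappa:                         # stopping criterion
                break
    return mu
```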

Appendix A

Results of Analysis

A.1 Some Function Spaces
    A.1.1 Spaces of Continuous Functions
    A.1.2 Spaces of Integrable Functions
    A.1.3 Sobolev Spaces
A.2 The Grönwall Inequality
A.3 Derivatives in Banach Spaces
A.4 The Implicit Function Theorem
A.5 $L^\infty$ Estimates

The purpose of this appendix is to provide some of the essential theoretical background for the discussion in this book.

A.1 Some Function Spaces

In this section, we give a very short overview of definitions and basic properties of some spaces of continuous functions, spaces of integrable functions, and Sobolev spaces that are considered in this book. For more details see, e.g., [5].

A.1.1 Spaces of Continuous Functions

We start describing function spaces that consist of continuous and continuously differentiable functions on an open interval $I = (a,b)$, $a < b$.

We denote with $C^k(I)$ the set of all continuous real-valued functions defined on $I$ such that $u^{(m)} := \frac{d^m}{dx^m} u$ is continuous on $I$ for all $m$ with $m \le k$. If $m = 1$, we denote $u^{(1)}$ with $u'$; similarly, if $m = 2$, we denote $u^{(2)}$ with $u''$.

Assuming that $I$ is bounded, we denote with $C^k(\bar I)$ the set of all $u$ in $C^k(I)$ such that $u^{(m)}$ can be extended from $I$ to a continuous function on $\bar I$ (the closure of the set $I$) for all $m \le k$. The space $C^k(\bar I)$ can be equipped with the norm
$$\|u\|_{C^k(\bar I)} := \sum_{m \le k} \sup_{x \in I} |u^{(m)}(x)|.$$
With this norm, the space $C^k(\bar I)$ is a Banach space.


When $k = 0$, we omit the index and write $C(\bar I)$ instead of $C^0(\bar I)$. We have
$$\|u\|_{C(\bar I)} = \sup_{x \in I} |u(x)| = \max_{x \in \bar I} |u(x)|.$$
Similarly, if $k = 1$, we have
$$\|u\|_{C^1(\bar I)} = \sup_{x \in I} |u(x)| + \sup_{x \in I} |u'(x)|.$$

The support of a continuous function $u$ on $I$, denoted $\operatorname{supp} u$, is defined as the closure in $I$ of the set $\{x \in I : u(x) \neq 0\}$. That is, $\operatorname{supp} u$ is the smallest closed subset of $I$ such that $u = 0$ in $I \setminus \operatorname{supp} u$. For example, let $w$ be the function defined on $\mathbb{R}$ given by
$$w(x) = \begin{cases} e^{-\frac{1}{1 - |x|^2}}, & |x| < 1, \\ 0, & \text{otherwise}. \end{cases}$$
Clearly, $\operatorname{supp} w$ is the closed interval $\{x \in \mathbb{R} : |x| \le 1\}$.

We denote with $C_0^k(I)$ the set of all $u \in C^k(I)$ such that $\operatorname{supp} u \subset I$ and $\operatorname{supp} u$ is bounded. With these spaces, we construct the following (non Banach) space $C_0^\infty(I) = \cap_{k \ge 0} C_0^k(I)$. The function $w$ defined above belongs to $C_0^\infty(\mathbb{R})$.

A function $u : [a,b] \to \mathbb{R}$ is said to be absolutely continuous (AC) on $[a,b]$, and we write $u \in AC([a,b]; \mathbb{R})$, if for every $\epsilon > 0$ there is a $\delta > 0$ such that, whenever a finite collection of pairwise disjoint sub-intervals $(x_k, y_k) \subset I$ satisfies
$$\sum_k (y_k - x_k) < \delta, \quad \text{then} \quad \sum_k |u(y_k) - u(x_k)| < \epsilon.$$
The set of all absolutely continuous functions on $[a,b]$ is denoted with $AC[a,b]$. By construction, we have that every AC function is uniformly continuous and thus continuous, and every Lipschitz-continuous function is absolutely continuous.

We also consider the space of piecewise $C^1$ functions as follows:

Definition A.1 A function $y \in C[a,b]$ is called piecewise in $C^1$ if there are at most finitely many points $a = x_0 < x_1 < \ldots < x_{N+1} = b$ such that $y \in C^1[x_k, x_{k+1}]$, $k = 0, \ldots, N$. We denote this space with $C^1_{pw}[a,b]$.

A.1.2 Spaces of Integrable Functions

A non-negative measurable function $u$ is called Lebesgue integrable if its Lebesgue integral is finite. An arbitrary measurable function is integrable if $u^+$ and $u^-$ are each Lebesgue integrable; here, $u^+$ and $u^-$ denote the positive and negative parts of $u$, respectively. Next, we illustrate a class of spaces that consists of Lebesgue integrable functions.

Let $p$ be a real number, $1 \le p < \infty$. We denote by $L^p(I)$ the set of all real-valued functions defined on $I$ such that
$$\int_a^b |u(x)|^p \, dx < \infty.$$
Functions which are equal almost everywhere (i.e. equal, except on a set of measure zero) on $I$ are identified with each other. $L^p(I)$ is endowed with the norm
$$\|u\|_{L^p(I)} := \left( \int_a^b |u(x)|^p \, dx \right)^{1/p}.$$

With this norm, the space $L^p(I)$ is a Banach space. If $1 \le p \le q < \infty$ and $I$ is bounded, then $L^q(I) \subseteq L^p(I)$, and for $u \in L^q(I)$ it holds that
$$\|u\|_{L^p(I)} \le (b-a)^{1/p - 1/q} \, \|u\|_{L^q(I)};$$
this inclusion is strict in general, as the example below illustrates. In the case $p = 2$, the space $L^2(I)$ can be equipped with the inner product $(u,v) := \int_a^b u(x)\,v(x) \, dx$, and we have $\|u\|_{L^2(I)} = (u,u)^{1/2}$. It follows that $L^2(I)$ is a Hilbert space. Thus, the following Cauchy-Schwarz inequality holds
$$|(u,v)| \le \|u\|_{L^2(I)} \, \|v\|_{L^2(I)}, \qquad u, v \in L^2(I).$$
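For illustration (our example): on $I = (0,1)$, the function $u(x) = x^{-1/3}$ satisfies
$$\int_0^1 |u(x)|^2 \, dx = \int_0^1 x^{-2/3} \, dx = 3 < \infty, \qquad \int_0^1 |u(x)|^4 \, dx = \int_0^1 x^{-4/3} \, dx = \infty,$$
so $u \in L^2(0,1)$ but $u \notin L^4(0,1)$; hence $L^4(0,1) \subsetneq L^2(0,1)$.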

A corollary of this result is the validity of the triangle inequality: $\|u+v\|_{L^2(I)} \le \|u\|_{L^2(I)} + \|v\|_{L^2(I)}$ for all $u, v \in L^2(I)$.

Saying that $u$ is absolutely continuous is equivalent to the statement that $u$ has a derivative $u'$ almost everywhere, this derivative is Lebesgue integrable, and it holds
$$u(x) = u(a) + \int_a^x u'(t) \, dt, \qquad x \in [a,b].$$
We remark that $u$ being absolutely continuous is also equivalent to the existence of a Lebesgue integrable function $g$ on $(a,b)$ such that
$$u(x) = u(a) + \int_a^x g(t) \, dt, \qquad x \in [a,b].$$

A.1.3 Sobolev Spaces

These function spaces consist of functions u ∈ L2 (I) whose so-called weak derivatives Dm u are also elements of L2 (I). To illustrate these derivatives,


let us first suppose that $u \in C^k(I)$, and let $v \in C_0^\infty(I)$. Then, we have the following integration-by-parts formula
$$\int_a^b u^{(m)}(x) \, v(x) \, dx = (-1)^m \int_a^b u(x) \, v^{(m)}(x) \, dx, \qquad m \le k,$$

for all $v \in C_0^\infty(I)$. This formula is the starting point in the formulation of the weak derivative of order $m$ for a function $u$ that is not $m$ times differentiable in the usual sense. Specifically, assume that $u$ is locally integrable on $I$ (i.e. $u \in L^1(\omega)$ for each bounded open set $\omega$ with $\bar\omega \subset I$). We would like to find a locally integrable function, denoted with $w_m$, such that it holds
$$\int_a^b w_m(x) \, v(x) \, dx = (-1)^m \int_a^b u(x) \, v^{(m)}(x) \, dx, \qquad m \le k,$$

for all $v \in C_0^\infty(I)$. If such a function exists, then we call it the weak derivative of $u$ of order $m$ and write $D^m u := w_m$. We adopt the notational convention that $D^0 u := u$. Clearly, if $u \in C^k(I)$ then its weak derivatives of order $m \le k$ coincide with those in the classical (pointwise) sense.

For illustration, we discuss the first-order weak derivative of the hat function $u(x) = (1-|x|)^+ := \max(0, 1-|x|)$ defined on $I = \mathbb{R}$. This function is not differentiable at the points $x = 0$ and $x = \pm 1$. However, $u$ is locally integrable on $I$, and its weak derivative is obtained as follows. Take any $v \in C_0^\infty(I)$; we have
$$\begin{aligned}
\int_{-\infty}^{+\infty} u(x) \, v'(x) \, dx &= \int_{-\infty}^{+\infty} (1-|x|)^+ \, v'(x) \, dx = \int_{-1}^{0} (1+x) \, v'(x) \, dx + \int_{0}^{1} (1-x) \, v'(x) \, dx \\
&= -\int_{-1}^{0} v(x) \, dx + (1+x) v(x) \Big|_{-1}^{0} + \int_{0}^{1} v(x) \, dx + (1-x) v(x) \Big|_{0}^{1} \\
&= -\left( \int_{-1}^{0} (+1) \, v(x) \, dx + \int_{0}^{1} (-1) \, v(x) \, dx \right) = -\int_{-\infty}^{+\infty} w_1(x) \, v(x) \, dx.
\end{aligned}$$
Therefore $w_1$ is defined as follows:
$$w_1(x) = \begin{cases} 0, & x < -1, \\ 1, & x \in (-1, 0), \\ -1, & x \in (0, 1), \\ 0, & x > 1. \end{cases}$$
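This identity can also be checked numerically; the following Python sketch (our illustration) compares both sides of $\int u\,v'\,dx = -\int w_1 v\,dx$ for a smooth test function supported in $(-1,1)$:

```python
import numpy as np
from scipy.integrate import quad

u  = lambda x: max(0.0, 1.0 - abs(x))                     # hat function (1-|x|)^+
w1 = lambda x: 1.0 if -1.0 < x < 0.0 else (-1.0 if 0.0 < x < 1.0 else 0.0)

# a C_0^infinity bump supported in (-1, 1), and its classical derivative
v  = lambda x: np.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0
dv = lambda x: v(x) * (-2.0 * x) / (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

lhs = quad(lambda x: u(x) * dv(x), -1.0, 1.0, points=[0.0])[0]
rhs = -quad(lambda x: w1(x) * v(x), -1.0, 1.0, points=[0.0])[0]
assert abs(lhs - rhs) < 1e-6   # integration by parts holds up to quadrature error
```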


This piecewise constant function is the first weak derivative of the continuous piecewise linear function $u(x) = (1-|x|)^+$. If no confusion may arise, we could write $u' = w_1$ in place of $D^1 u = w_1$.

Now, let $k$ be a non-negative integer. The Sobolev space of order $k$ is defined by
$$H^k(I) = \{ u \in L^2(I) : D^m u \in L^2(I), \ m \le k \}.$$
It is equipped with the (Sobolev) norm
$$\|u\|_{H^k(I)} := \left( \sum_{m \le k} \|D^m u\|_{L^2(I)}^2 \right)^{1/2}$$
and the inner product $(u,v)_{H^k(I)} := \sum_{m \le k} (D^m u, D^m v)$. With this inner product, $H^k(I)$ is a Hilbert space. Further, we define the Sobolev semi-norm
$$|u|_{H^k(I)} := \|D^k u\|_{L^2(I)}.$$
Thus, we can write $\|u\|_{H^k(I)} = \left( \sum_{m=0}^{k} |u|_{H^m(I)}^2 \right)^{1/2}$.

The set of all functions $u$ in $H^1(I)$ such that $u = 0$ at the boundary points $\partial I = \{a,b\}$ defines the Sobolev space
$$H_0^1(I) = \{ u \in H^1(I) : u(x) = 0, \ x \in \partial I \}.$$
This is a Hilbert space with the same norm and inner product as $H^1(I)$. On $H_0^1(I)$, we have the following result, known as the Poincaré-Friedrichs inequality.

Lemma A.1 Let $u \in H_0^1(I)$, on the bounded interval $I = [a,b]$; then there exists a constant $c_\star(I)$, independent of $u$, such that
$$\int_a^b |u(x)|^2 \, dx \le c_\star \int_a^b |u'(x)|^2 \, dx.$$
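For a quick sanity check (our illustration), take $I = (0,1)$ and $u(x) = \sin(\pi x) \in H_0^1(0,1)$. Then
$$\int_0^1 \sin^2(\pi x) \, dx = \frac{1}{2}, \qquad \int_0^1 |\pi \cos(\pi x)|^2 \, dx = \frac{\pi^2}{2},$$
so any admissible constant must satisfy $c_\star \ge 1/\pi^2$; the value of $c_\star$ given below shows that this bound is attained, i.e. the constant is sharp.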

The constant $c_\star$ is given by $c_\star = (b-a)^2/\pi^2$; see [26]. A consequence of this result is that $|u|_{H^1(I)}$ represents a norm on $H_0^1(I)$.

Notice that, in general, one considers the Sobolev spaces
$$W^{k,p}(I) = \{ u \in L^p(I) : D^m u \in L^p(I), \ m \le k \},$$
of which $H^k(I) := W^{k,2}(I)$ is a particular case. The space $W^{k,p}(I)$ is a Banach space.

Next, consider an open interval $I$ and assume that $u$ is locally integrable on $I$ and has a weak derivative $u' \in L^1(I)$. Then there exists an absolutely continuous function $v$ such that $v(x) = u(x)$ for almost all $x \in I$, and it holds
$$u'(x) = \lim_{h \to 0} \frac{v(x+h) - v(x)}{h},$$


almost everywhere in $I$. Further, we have that each element $u$ of the space $W^{1,p}(I)$ coincides almost everywhere with an absolutely continuous function $v$ having derivative $v' \in L^p(I)$.

Finally, let us remark that, on the bounded interval $I$, we have the compact embedding $W^{j+k,p}(I) \subset\subset C^j(\bar I)$ for $kp > 1$. This means that from any sequence $(u_\ell)$ which is bounded in $W^{j+k,p}(I)$ one can extract a subsequence $(u_{\ell'})$ that converges in $C^j(\bar I)$.

A.2 The Grönwall Inequality

As general references concerning Grönwall-type inequalities, we refer to [19, 194]. Let $0 \le x_0 < x_1 < x_2 < \ldots$ and $\lim_{k\to\infty} x_k = \infty$. Denote with $C_{pw}([x_0, \infty); \mathbb{R}^+)$ the set of all functions $u : [x_0, \infty) \to \mathbb{R}^+$ which are continuous on $(x_k, x_{k+1})$, with discontinuity of the first kind at the points $x_k$, $k \in \mathbb{N}$, $u(x_k + 0) - u(x_k - 0) < \infty$ and $u(x_k) = u(x_k - 0)$.

Theorem A.1 Assume that, for $x \ge x_0$, the following inequality holds
$$u(x) \le a(x) + \int_{x_0}^{x} g(x,s) \, u(s) \, ds + \sum_{x_0 < x_k < x} \beta_k(x) \, u(x_k),$$

A.3 Derivatives in Banach Spaces

$T(V, y)$ is weakly closed in $H$ and convex if $V$ is convex. In the definition, if $w \neq 0$, then $t_n \to \infty$. Roughly speaking, $T(V, y)$ can be seen as a local approximation to $V$ at $y$, and if $y \in \operatorname{int}(V)$ then $T(V, y) \subset V$. Easier to visualise is the set of feasible directions, defined as follows:
$$F(V, y) = \{ w \in H : \exists \, \epsilon_0 > 0 \ \text{s.t.} \ \forall \, \epsilon \in (0, \epsilon_0), \ y + \epsilon w \in V \}.$$
Then $T(V, y)$ is the closure of $F(V, y)$ in $H$.
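As a simple illustration (our example, not from the text): let $H = \mathbb{R}^2$ and $V = [0,\infty) \times [0,\infty)$. At the boundary point $y = (0,1) \in V$, a direction $w$ satisfies $y + \epsilon w \in V$ for all small $\epsilon > 0$ precisely when $w_1 \ge 0$, so
$$F(V, y) = \{ w \in \mathbb{R}^2 : w_1 \ge 0 \} = T(V, y),$$
while at an interior point, such as $y = (1,1)$, one has $F(V, y) = T(V, y) = \mathbb{R}^2$, consistent with $T(V, y)$ being a local approximation to $V$ at $y$.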

A.4 The Implicit Function Theorem

Let $X$, $Y$ and $Z$ be Banach spaces over $\mathbb{R}$ or $\mathbb{C}$, and let the function $E$ map an open subset of $X \times Y$ into $Z$. In this general setting, one can prove the following implicit function theorem; see, e.g., [5].


Theorem A.3 Let $(x_0, y_0) \in X \times Y$ with $E(x_0, y_0) = 0$ be given. Assume that the map $E : X \times Y \to Z$ is $m$ times continuously Fréchet differentiable in a neighbourhood of $(x_0, y_0)$ and that the Fréchet derivative $\partial_y E(x_0, y_0) \in \mathcal{L}(Y, Z)$ is bijective (i.e. continuously invertible). Then there exist a function $f : X \to Y$ and $\delta, \epsilon > 0$ such that, for all $(x,y) \in B_\delta(x_0) \times B_\epsilon(y_0)$, the statements $y = f(x)$ and $E(x,y) = 0$ are equivalent. Furthermore, the mapping $f : X \to Y$ is $m$ times Fréchet differentiable in $B_\delta(x_0)$. The derivative of $f$ is given by
$$\partial_x f(x) = -\left( \partial_y E(x, f(x)) \right)^{-1} \partial_x E(x, f(x)).$$

For a proof of this theorem see, e.g., [46].
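A classical finite-dimensional illustration (our example): take $X = Y = Z = \mathbb{R}$ and $E(x,y) = x^2 + y^2 - 1$ with $(x_0, y_0) = (0, 1)$. Then $\partial_y E(x_0, y_0) = 2 y_0 = 2$ is invertible, $f(x) = \sqrt{1 - x^2}$ near $x_0 = 0$, and the derivative formula gives
$$\partial_x f(x) = -\left( \partial_y E(x, f(x)) \right)^{-1} \partial_x E(x, f(x)) = -\frac{2x}{2 f(x)} = -\frac{x}{\sqrt{1-x^2}},$$
in agreement with differentiating $f$ directly.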

A.5 $L^\infty$ Estimates

In the PMP framework, including the numerical analysis of the SQH method, the $L^\infty(\Omega)$ boundedness of solutions of the governing PDE model and of the related optimisation adjoint PDE is crucial. In this appendix, we provide these estimates for the PDEs of the problems P.1) to P.4) in Chapter 7.

We first consider the elliptic case for an open and bounded domain $\Omega$. We have the following:
$$B(y, v) = (h, v) \ \text{in } \Omega, \qquad y = 0 \ \text{on } \partial\Omega, \tag{A.6}$$
where $B(y,v) : H_0^1 \times H_0^1 \to \mathbb{R}$ is a bilinear map with the coercivity condition $\beta \|y\|_{H_0^1(\Omega)}^2 \le B(y,y)$, $\beta > 0$, and $B(-k, v) \le 0$ for $k \ge 0$ if $v \ge 0$, and $h \in L^q(\Omega)$, $q \ge \frac{n}{2} + 1$. We assume that (A.6) has a unique solution $y \in H_0^1(\Omega)$. Then the following theorem holds.

Theorem A.4 The boundary value problem (A.6) has an essentially bounded solution for which it holds $\|y\|_{L^\infty(\Omega)} \le C \, \|h\|_{L^q(\Omega)}$, where $C > 0$.

Proof. The proof is based on [283, Theorem 4.2.1]. We assume that $h$ is not the zero function; in the case $h = 0$, the solution $y = 0$ solves (A.6) and thus the statement is true. We choose a constant $k \ge 0$. As $y - k \in H^1(\Omega)$, we have that $(y-k)^+ := \max(y-k, 0) \in H_0^1(\Omega)$; see [97, Chapter 4, Proposition 6]. Then, we choose $v = (y-k)^+$ in (A.6) and obtain $B(y-k, (y-k)^+) \le (h, (y-k)^+)$, where we use that $B(y, (y-k)^+) \ge B(y, (y-k)^+) + B(-k, (y-k)^+) = B(y-k, (y-k)^+)$, and thus
$$\beta \, \|(y-k)^+\|_{H_0^1(\Omega)}^2 \le \left( h, (y-k)^+ \right), \tag{A.7}$$
as $(y-k)^+ = 0$ if $y - k \le 0$, $B(y-k, (y-k)^+) = B((y-k)^+, (y-k)^+)$ if $y - k > 0$, and $\beta \, \|(y-k)^+\|_{H_0^1(\Omega)}^2 \le B((y-k)^+, (y-k)^+)$. We remark

that the function $(y-k)^+ \in H_0^1(\Omega)$ is also an element of $L^p(\Omega)$, with $\|(y-k)^+\|_{L^p(\Omega)} \le M \, \|(y-k)^+\|_{H_0^1(\Omega)}$, $M > 0$, where
$$2 \le p \begin{cases} \le \infty, & n = 1, \\ < \infty, & n = 2, \\ \le \frac{2n}{n-2}, & n \ge 3. \end{cases} \tag{A.8}$$
Combining this embedding with (A.7), we obtain
$$\|(y-k)^+\|_{L^p(\Omega)}^2 \le \tilde\beta \left( h, (y-k)^+ \right), \tag{A.9}$$
with $\tilde\beta := M^2/\beta > 0$. Next, we define $A_k := \{x \in \Omega \,|\, y(x) > k\}$, which is measurable, see [7, X, Theorem 1.9], and $|A_k|$ is the measure of $A_k$. Due to $(y(x)-k)^+ = 0$ for $x \in \Omega \setminus A_k$, we consequently have from (A.9) the following
$$\|(y-k)^+\|_{L^p(A_k)}^2 \le \tilde\beta \int_{A_k} h(x) \, (y-k)^+(x) \, dx. \tag{A.10}$$
In the next step, we have the estimate by Hölder's inequality (see [109, page 622])
$$\|(y-k)^+\|_{L^p(A_k)}^2 \le \tilde\beta \left( \int_{A_k} |h(x)|^{\frac{n}{2}+1} \, dx \right)^{\frac{1}{\frac{n}{2}+1}} \left( \int_{A_k} \left( (y-k)^+(x) \right)^{\frac{2+n}{n}} \, dx \right)^{\frac{n}{2+n}},$$
which can be applied as $(y-k)^+ \in L^{\frac{n+2}{n}}(\Omega)$. This is true because, in the case $n = 1$ and $n = 2$, we have $(y-k)^+ \in L^p$, $2 \le p < \infty$, and, in the case $n \ge 3$, we have $\frac{2n}{n-2} \ge \frac{2+n}{n}$, equivalently $n^2 \ge -4$, [1, Theorem 2.14]; consequently,
$$\|(y-k)^+\|_{L^p(A_k)}^2 \le \tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \left( \int_{A_k} \left( (y-k)^+(x) \right)^{\frac{2+n}{n}} \, dx \right)^{\frac{n}{2+n}}. \tag{A.11}$$

We apply Hölder's inequality again, with $\frac{1}{\tilde p} + \frac{1}{\tilde q} = 1$, $\tilde q = \frac{\tilde p}{\tilde p - 1}$, and we obtain the following
$$\|(y-k)^+\|_{L^p(A_k)}^2 \le \tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \left( \int_{A_k} 1 \, dx \right)^{\frac{\tilde p - 1}{\tilde p} \frac{n}{2+n}} \left( \int_{A_k} \left( (y-k)^+(x) \right)^{\frac{2+n}{n} \tilde p} \, dx \right)^{\frac{n}{\tilde p (2+n)}}. \tag{A.12}$$
We choose $p = \frac{2+n}{n} \tilde p$; thus, for a given $\tilde p$, we have

and conclude from (A.12), for $\|(y-k)^+\|_{L^{\frac{2+n}{n} \tilde p}(A_k)} > 0$, the following
$$\left( \int_{A_k} |(y-k)^+(x)|^{\frac{2+n}{n} \tilde p} \, dx \right)^{\frac{n}{\tilde p (2+n)}} = \|(y-k)^+\|_{L^{\frac{2+n}{n} \tilde p}(A_k)} \le \tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \left( \int_{A_k} 1 \, dx \right)^{\frac{\tilde p - 1}{\tilde p} \frac{n}{2+n}}, \tag{A.13}$$
which is also true in the case $\|(y-k)^+\|_{L^{\frac{2n}{n-2}}(A_k)} = 0$.

Furthermore, for $m > k$, we have that $A_m \subseteq A_k$. Additionally, it is $y > m$ on $A_m$, and thus $y \ge y - k > m - k$ on $A_m$, due to $k \ge 0$. Since $y - k = (y-k)^+$ on $A_m$, we obtain
$$\int_{A_k} |(y-k)^+(x)|^{\frac{2+n}{n} \tilde p} \, dx \ge \int_{A_m} (y-k)^{\frac{2+n}{n} \tilde p}(x) \, dx \ge (m-k)^{\frac{2+n}{n} \tilde p} \int_{A_m} 1 \, dx. \tag{A.14}$$
We combine (A.14) with (A.13) and obtain
$$(m-k) \, |A_m|^{\frac{n}{\tilde p (2+n)}} \le \tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \, |A_k|^{\frac{\tilde p - 1}{\tilde p} \frac{n}{2+n}}$$
and, equivalently,
$$|A_m| \le \left( \frac{\tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)}}{m-k} \right)^{\frac{2+n}{n} \tilde p} |A_k|^{\tilde p - 1}. \tag{A.15}$$

In order to apply [283, Lemma 4.1.1], we need that $\tilde p - 1 > 1$ and that $p$ fulfils (A.8). For the case $n = 1$ and $n = 2$, we can choose any $\tilde p > 2$, for example $\tilde p = 3$. For the case $n \ge 3$, we have to ensure that $\frac{2+n}{n} \tilde p \le \frac{2n}{n-2}$, which is $\tilde p \le \frac{2n^2}{n^2-4}$. Since the statement $\frac{2n^2}{n^2-4} > 2$ for $n \ge 3$ is equivalent to $0 > -8$, and thus always true, we can choose $\tilde p = \frac{2n^2}{n^2-4}$ in the case $n \ge 3$; then we also have that $\tilde p - 2 > 0$. By applying [283, Lemma 4.1.1], we obtain that $|A_m| = 0$ for
$$m \ge \tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \, 2^{\frac{\tilde p - 1}{\tilde p - 2}} \, |\Omega|^{\frac{n (\tilde p - 2)}{\tilde p (2+n)}},$$
where $|\Omega|$ is the measure of $\Omega$. This means that the set where
$$y > \tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \, 2^{\frac{\tilde p - 1}{\tilde p - 2}} \, |\Omega|^{\frac{n (\tilde p - 2)}{\tilde p (2+n)}}$$

is of measure zero. With the same arguments, we have, for $(y+k)^- := \min(y+k, 0)$ and $A_k := \{x \in \Omega \,|\, y(x) < -k\}$, that the set where
$$y < -\tilde\beta \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)} \, 2^{\frac{\tilde p - 1}{\tilde p - 2}} \, |\Omega|^{\frac{n (\tilde p - 2)}{\tilde p (2+n)}}$$
is of measure zero. Therefore, we obtain that $\|y\|_{L^\infty(\Omega)} \le C \, \|h\|_{L^{\frac{n}{2}+1}(\Omega)}$ with $C := \tilde\beta \, 2^{\frac{\tilde p - 1}{\tilde p - 2}} \, |\Omega|^{\frac{n (\tilde p - 2)}{\tilde p (2+n)}}$. $\Box$

Next, we check that a similar $L^\infty(\Omega)$ result holds for the problems in Chapter 7. In the cases P.1) and P.4), Theorem A.4 holds immediately. For the case P.2), this result holds if we assume $K_U \subseteq \mathbb{R}_0^+$, because then we have that $-u k v \le 0$ for $v \ge 0$, and we can continue the proof of Theorem A.4 from (A.7) to obtain the following theorem.

Theorem A.5 For the solution $y$ of an elliptic boundary value problem as in P.2) with $K_U \subseteq \mathbb{R}_0^+$, we have $\|y\|_{L^\infty(\Omega)} \le d \, \|\varphi\|_{L^q(\Omega)}$, with $d > 0$, for any right-hand side $\varphi \in L^q(\Omega)$.

For the case P.3), we have the following consideration, such that the proof of Theorem A.4 can be followed from (A.7). We have that $\left( \nabla y, \nabla (y-k)^+ \right) \le \left( \nabla y, \nabla (y-k)^+ \right) + \left( y^3, (y-k)^+ \right)$, as $y^3 \, (y-k)^+ \ge 0$ due to $(y-k)^+ = 0$ if $y \le k$, $k \ge 0$. With similar considerations, the $L^\infty(\Omega)$ result is proved for P.5), as $\max(0, y) \ge 0$.

For the parabolic case, we have results analogous to that in Theorem A.4, proved in [52, Appendix]. With this result, a similar $L^\infty(Q)$ result to Theorem A.5 holds immediately in the bilinear case, assuming that $K_U \subseteq \mathbb{R}_0^+$.

Bibliography

[1] R. A. Adams and J. Fournier. Sobolev Spaces, volume 140 of Pure and Applied Mathematics (Amsterdam). Elsevier/Academic Press, Amsterdam, second edition, 2003. [2] C. C. Aggarwal. Neural Networks and Deep Learning - A Textbook. Springer, Berlin, Heidelberg, 2018. [3] N. U. Ahmed and K. L. Teo. Optimal Control of Distributed Parameter Systems. Elsevier Science Inc., USA, 1981. [4] A.V. Albul, B.N. Sokolov, and F.L. Chernous’ko. A method of calculating the control in antagonistic situations. USSR Computational Mathematics and Mathematical Physics, 18(5):21–27, 1978. [5] H. Amann and J. Escher. Analysis I-II-III. Birkhäuser, 2002. [6] H. Amann and J. Escher. Analysis I. Birkhäuser Basel, 2006. [7] H. Amann and J. Escher. Analysis III. Birkhäuser Basel, 2009. [8] A. Ambrosetti and R. E.L. Turner. Some discontinuous variational problems. Diff. & Integral Equat, 1:341–349, 1988. [9] L. Ambrosio, G. Da Prato, and A.C.G. Mennucci. Introduction to Measure Theory and Integration. Edizioni della Normale, 2011. [10] R. Andreani, G. Haeser, M. L. Schuverdt, L. D. Secchin, and P. J. S. Silva. On scaled stopping criteria for a safeguarded augmented lagrangian method with theoretical guarantees. Mathematical Programming Computation, 14(1):121–146, 2022. [11] M. Annunziato and A. Borzì. A Fokker-Planck control framework for multidimensional stochastic processes. J. Comput. Appl. Math., 237(1):487–507, 2013. [12] M. Annunziato and A. Borzì. A Fokker–Planck control framework for stochastic systems. EMS Surveys in Mathematical Sciences, 5:65–98, 2018.


[13] M. Annunziato and A. Borzì. A sequential quadratic Hamiltonian scheme to compute optimal relaxed controls. ESAIM: COCV, 27:49, 2021.
[14] M. Annunziato, A. Borzì, F. Nobile, and R. Tempone. On the connection between the Hamilton-Jacobi-Bellman and the Fokker-Planck control frameworks. Applied Mathematics, 5:2476–2484, 2014.
[15] M. Annunziato and A. Borzì. A Fokker-Planck-based control of a two-level open quantum system. Mathematical Models and Methods in Applied Sciences (M3AS), 23:2039–2064, 2013.
[16] E. Assémat, M. Lapert, Y. Zhang, M. Braun, S. J. Glaser, and D. Sugny. Simultaneous time-optimal control of the inversion of two spin-1/2 particles. Phys. Rev. A, 82:013415, 2010.
[17] M. Athans and P.L. Falb. Optimal Control: An Introduction to the Theory and Its Applications. Dover Publications, 2007.
[18] O. Axelsson and V. A. Barker. Finite Element Solution of Boundary Value Problems. Society for Industrial and Applied Mathematics, 2001.
[19] D.D. Bainov and P.S. Simeonov. Integral Inequalities and Applications. Kluwer Academic Publishers, Dordrecht, 1992.
[20] J. M. Ball. A version of the fundamental theorem for Young measures. In M. Rascle, D. Serre, and M. Slemrod, editors, PDEs and Continuum Models of Phase Transitions, pages 207–215, Berlin, Heidelberg, 1989. Springer Berlin Heidelberg.
[21] H. T. Banks and K. Kunisch. Estimation Techniques for Distributed Parameter Systems. Systems & Control: Foundations & Applications. Birkhäuser Boston, 1989.
[22] J. Bartsch, A. Borzì, F. Fanelli, and S. Roy. A theoretical investigation of Brockett's ensemble optimal control problems. Calculus of Variations and Partial Differential Equations, 58:162, 2019.
[23] J. Bartsch, A. Borzì, F. Fanelli, and S. Roy. A numerical investigation of Brockett's ensemble optimal control problems. Numerische Mathematik, 149(1):1–42, 2021.
[24] J. Bartsch and A. Borzì. MOCOKI: A Monte Carlo approach for optimal control in the force of a linear kinetic model. Computer Physics Communications, 266:108030, 2021.
[25] J. Bartsch, G. Nastasi, and A. Borzì. Optimal control of the Keilson-Storer master equation in a Monte Carlo framework. Journal of Computational and Theoretical Transport, 50(5):454–482, 2021.


[26] M. Bebendorf. A note on the Poincaré inequality for convex domains. Zeitschrift für Analysis und ihre Anwendungen, 22(4):751–756, 2003. [27] S. A. Belbas. The dynamic programming approach to the optimal control of Goursat-Darboux systems. In 1989 American Control Conference, pages 1731–1731, 1989. [28] D. J. Bell and D. H. Jacobson. Singular Optimal Control Problems. Mathematics in Science and Engineering. Academic Press, 1975. [29] R. Bellman. Dynamic Programming. Princeton University Press, Princeton, 1957. [30] H. Benker and A. Hamel. Remarks on the algorithm of Sakawa for optimal control problems. Numerical Functional Analysis and Optimization, 19(3-4):257–272, 1998. [31] H. Benker and M. Handschug. An algorithm for abstract optimal control problems using maximum principles and applications to a class of distributed parameter systems. In R. Bulirsch, A. Miele, J. Stoer, and K. Well, editors, Optimal Control: Calculus of Variations, Optimal Control Theory and Numerical Methods, pages 31–42. Birkhäuser Basel, Basel, 1993. [32] A. Bensoussan. Estimation and Control of Dynamical Systems. Interdisciplinary Applied Mathematics. Springer International Publishing, 2018. [33] J. Bergmann and K. Nolte. Zur Konvergenz des Algorithmus von Krylow and Černous’ko, I. Mathematische Operationsforschung und Statistik. Series Optimization, 8(3):401–410, 1977. [34] M. Bergounioux and H. Zidani. Pontryagin maximum principle for optimal control of variational inequalities. SIAM Journal on Control and Optimization, 37(4):1273–1290, 1999. [35] L. D. Berkovitz and N. G. Medhin. Nonlinear Optimal Control Theory. Chapman & Hall/CRC Applied Mathematics & Nonlinear Science. CRC Press, Boca Raton, 2012. [36] D. Bertsekas. Dynamic Programming and Optimal Control: Volume I & II. Athena Scientific, 2012. [37] S. Bianchini, M. Colombo, G. Crippa, and L.V. Spinolo. Optimality of integrability estimates for advection–diffusion equations. Nonlinear Differential Equations and Applications, 24(4):33, 2017. [38] S. Bittanti, A. Locatelli, and C. Maffezzoni. Second-variation methods in periodic optimization. Journal of Optimization Theory and Applications, 14(1):31–49, 1974.


[39] F. Bloch. Nuclear induction. Phys. Rev., 70:460–474, Oct 1946.
[40] V. G. Boltyanskiĭ, R. V. Gamkrelidze, and L. S. Pontryagin. On the theory of optimal processes. Dokl. Akad. Nauk SSSR (N.S.), 110:7–10, 1956.
[41] L. Boltzmann. Vorlesungen über Gastheorie: Theorie van der Waals'; Gase mit zusammengesetzten Molekülen; Gasdissociation; Schlussbemerkungen. Vorlesungen über Gastheorie. J. A. Barth, Leipzig, 1896.
[42] F. Bonnans and E. Casas. An extension of Pontryagin's principle for state-constrained optimal control of semilinear elliptic equations and variational inequalities. SIAM Journal on Control and Optimization, 33(1):274–298, 1995.
[43] J. F. Bonnans. On an algorithm for optimal control using Pontryagin's maximum principle. SIAM Journal on Control and Optimization, 24(3):579–588, 1986.
[44] B. Bonnard and M. Chyba. Singular Trajectories and their Role in Control Theory. Mathématiques et Applications. Springer-Verlag, Berlin Heidelberg, 2003.
[45] L. Borcea. Electrical impedance tomography. Inverse Problems, 18(6):R99–R136, 2002.


[46] A. Borzì. Modelling with Ordinary Differential Equations: A Comprehensive Approach. Numerical Analysis and Scientific Computing Series. Chapman & Hall/CRC, Abingdon and Boca Raton, 2020. [47] A. Borzì, G. Ciaramella, and M. Sprengel. Formulation and Numerical Solution of Quantum Control Problems. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2017. [48] A. Borzì, G. Stadler, and U. Hohenester. Optimal quantum control in nanostructures: Theory and application to a generic three-level system. Phys. Rev. A, 66:053811, 2002. [49] U. Boscain, M. Sigalotti, and D. Sugny. Introduction to the Pontryagin maximum principle for quantum optimal control. PRX Quantum, 2:030203, 2021. [50] T. Breitenbach. On the SQH method for solving optimal control problems with non-smooth state cost functionals or constraints. Journal of Computational and Applied Mathematics, 415:114515, 2022. [51] T. Breitenbach and A. Borzì. On the SQH scheme to solve nonsmooth PDE optimal control problems. Numerical Functional Analysis and Optimization, 40(13):1489–1531, 2019.


[52] T. Breitenbach and A. Borzì. A sequential quadratic Hamiltonian method for solving parabolic optimal control problems with discontinuous cost functionals. Journal of Dynamical and Control Systems, 25(3):403–435, 2019. [53] T. Breitenbach and A. Borzì. The Pontryagin maximum principle for solving Fokker–Planck optimal control problems. Computational Optimization and Applications, 76:499–533, 2020. [54] T. Breitenbach and A. Borzì. A sequential quadratic Hamiltonian scheme for solving non-smooth quantum control problems with sparsity. Journal of Computational and Applied Mathematics, 369:112583, 2020. [55] A. Bressan. Noncooperative differential games. Milan Journal of Mathematics, 79(2):357–427, 2011. [56] H. Brezis. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Universitext. Springer New York, 2010. [57] R. Brockett. Notes on the control of the Liouville equation. In Control of Partial Differential Equations, pages 101–129. Springer, 2012. [58] R. W. Brockett. Optimal control of the Liouville equation. In Proceedings of the International Conference on Complex Geometry and Related Fields, volume 39 of AMS/IP Stud. Adv. Math., pages 23–35. Amer. Math. Soc., Providence, RI, 2007. [59] A.E. Bryson and Y.C. Ho. Applied Optimal Control. Taylor & Francis Group, New York, 1975. [60] R. Byrd, P. Lu, J. Nocedal, and Ciyou Zhu. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput., 16:1190– 1208, 1995. [61] F. Calà Campana and A. Borzì. On the SQH method for solving differential Nash games. Journal of Dynamical and Control Systems, 28: 739–755, 2022. [62] F. Calà Campana, G. Ciaramella, and A. Borzì. Nash equilibria and bargaining solutions of differential bilinear games. Dynamic Games and Applications, 11: 1–28, 2021. [63] F. Calà Campana, A. De Marchi, A. Borzì, and M. Gerdts. On the numerical solution of a free end-time homicidal chauffeur game. ESAIM: ProcS, 71:33–42, 2021. [64] G. Carlier and J. Salomon. A monotonic algorithm for the optimal control of the Fokker-Planck equation. In 47th IEEE Conference on Decision and Control, pages 269–273, December 2008.


[65] E. Casas. Pontryagin’s principle for state-constrained boundary control problems of semilinear parabolic equations. SIAM Journal on Control and Optimization, 35(4):1297–1327, 1997. [66] E. Casas. Second order analysis for bang-bang control problems of PDEs. SIAM Journal on Control and Optimization, 50(4):2355–2372, 2012. [67] E. Casas, J.P. Raymond, and H. Zidani. Pontryagin’s principle for local solutions of control problems with mixed control-state constraints. SIAM Journal on Control and Optimization, 39(4):1182–1203, 2000. [68] C. Castaing, P. Raynaud de Fitte, and M. Valadier. Young Measures on Topological Spaces: With Applications in Control Theory and Probability Theory. Springer Netherlands, 2004. [69] L. Cesari. Optimization - Theory and Applications: Problems with Ordinary Differential Equations. Applications of mathematics. Springer New York, 1983. [70] T. Chan and X.-C. Tai. Identification of discontinuous coefficients in elliptic problems using total variation regularization. SIAM Journal on Scientific Computing, 25:881–904, 2003. [71] B. Chang, L. Meng, E. Haber, L. Ruthotto, D. Begert, and E. Holtham. Reversible architectures for arbitrarily deep residual neural networks. In AAAI-18, volume 32, pages 2811–2818, New Orleans, April 2018. [72] J.S. Chang and G. Cooper. A practical difference scheme for FokkerPlanck equations. Journal of Computational Physics, 6(1):1–16, 1970. [73] T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. Duvenaud. Neural ordinary differential equations. In S. Bengio, H. M. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, NeurIPS, pages 6572–6583, 2018. [74] F. L. Chernous’ko and A. A. Lyubushin. Method of successive approximations for solution of optimal control problems. Optimal Control Applications and Methods, 3(2):101–114, 1982. [75] C. Christof, C. Meyer, S. Walther, and C. Clason. Optimal control of a non-smooth semilinear elliptic equation. Mathematical Control and Related Fields, 8(1):247–276, 2018. [76] C. Christof and G. Müller. Multiobjective optimal control of a nonsmooth semilinear elliptic partial differential equation. ESAIM: COCV, 27:S13, 2021.


[77] A. Cianchi and V. Maz’ya. Global gradient estimates in elliptic problems under minimal data and domain regularity. Communications on Pure & Applied Analysis, 14:285, 2015. [78] G. Ciaramella and A. Borzì. Quantum optimal control problems with a sparsity cost functional. Numerical Functional Analysis and Optimization, 37(8):938–965, 2016. [79] G. Ciaramella, A. Borzì, G. Dirr, and D. Wachsmuth. Newton methods for the optimal control of closed quantum spin systems. SIAM Journal on Scientific Computing, 37(1):A319–A346, 2015. [80] P. G. Ciarlet. The Finite Element Method for Elliptic Problems. Society for Industrial and Applied Mathematics, 2002. [81] P. G. Ciarlet. Linear and Nonlinear Functional Analysis with Applications. Society for Industrial and Applied Mathematics, 2013. [82] J.A. Cid Araujo, R. López Pouso, and J. Rodríguez López. New Lipschitz–type conditions for uniqueness of solutions of ordinary differential equations. Journal of Mathematical Analysis and Applications, 514(2):126349, 2022. [83] F. Clarke. The maximum principle under minimal hypotheses. SIAM Journal on Control and Optimization, 14(6):1078–1091, 1976. [84] F. Clarke. The Pontryagin maximum principle and a unified theory of dynamic optimization. Proceedings of the Steklov Institute of Mathematics, 268(1):58–69, 2010. [85] F. Clarke. Functional Analysis, Calculus of Variations and Optimal Control. Graduate Texts in Mathematics. Springer London, 2013. [86] F. H. Clarke and R. B. Vinter. The relationship between the maximum principle and dynamic programming. SIAM Journal on Control and Optimization, 25(5):1291–1311, 1987. [87] C. Clason, B. Jin, and K. Kunisch. A semismooth Newton method for L1 data fitting with automatic choice of regularization parameters and noise calibration. SIAM Journal on Imaging Sciences, 3(2):199–231, 2010. [88] D. L. Cohn. Measure Theory. Springer, 2013. [89] F. Colonius. Optimal Periodic Control. Lecture Notes in Mathematics. Springer-Verlag Berlin Heidelberg, 2006. [90] A. R. Conn, K. Scheinberg, and L. N. Vicente. Introduction to Derivative-Free Optimization. Society for Industrial and Applied Mathematics, 2009.


[91] C. Constantin, C. Meyer, S. Walther, and C. Clason. Optimal control of a non-smooth semilinear elliptic equation. Mathematical Control & Related Fields, 8(1):247–276, 2018. [92] J. Cortes. Discontinuous dynamical systems. IEEE Control Systems Magazine, 28(3):36–73, 2008. [93] D. R. Cox and H. D. Miller. The Theory of Stochastic Processes. Wiley publications in statistics. Wiley, 1965. [94] M.G. Crandall, I. Hitoshi, and P.-L. Lions. User’s guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1):1–67, 1992. [95] I. Csiszár and J. Körner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press, 2011. [96] B. Dacorogna. Direct Methods in the Calculus of Variations. Applied Mathematical Sciences. Springer New York, 2007. [97] R. Dautray and J.-L. Lions. Mathematical Analysis and Numerical Methods for Science and Technology. Vol. 2. Springer-Verlag, Berlin, 1988. [98] A. De Marchi and M. Gerdts. Free finite horizon LQR: A bilevel perspective and its application to model predictive control. Automatica, 100:299–311, 2019. [99] M. d. R. de Pinho and J. F. Rosenblueth. Necessary conditions for constrained problems under Mangasarian–Fromowitz conditions. SIAM Journal on Control and Optimization, 47(1):535–552, 2008. [100] A. V. Dmitruk. On the development of Pontryagin’s maximum principle in the works of A.Ya. Dubovitskii and A.A. Milyutin. Control and Cybernetics, 38(4A):923–957, 2009. [101] A. V. Dmitruk and N. P. Osmolovskii. On the proof of Pontryagin’s maximum principle by means of needle variations. Journal of Mathematical Sciences, 218(5):581–598, 2016. [102] E. J. Dockner, S. Jorgensen, N. Van Long, and G. Sorger. Differential Games in Economics and Management Science. Cambridge University Press, 2000. [103] A.Ya. Dubovitskii and A.A. Milyutin. Extremum problems in the presence of restrictions. USSR Comput. Math. and Math. Phys., 5(3):1–80, 1965. [104] P. Eichmeir, T. Lauß, S. Oberpeilsteiner, K. Nachbagauer, and W. Steiner. The Adjoint Method for Time-Optimal Control Problems. Journal of Computational and Nonlinear Dynamics, 16(2), 11 2020.


[105] I. Ekeland. On the variational principle. Journal of Mathematical Analysis and Applications, 47(2):324–353, 1974. [106] E. Emmrich. Discrete versions of Gronwall’s lemma and their application to the numerical analysis of parabolic problems. Tech. Rep. 637. TU Berlin, 1999. [107] H.W. Engl, M. Hanke, and A. Neubauer. Regularization of Inverse Problems. Mathematics and Its Applications. Springer Netherlands, 2000. [108] J. Engwerda. LQ Dynamic Optimization and Differential Games. Wiley, 2005. [109] L. C. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 1998. [110] R. S. Falk. Error Estimates for the Numerical Identification of a Variable Coefficient. Mathematics of Computation - Math. Comput., 40, 03 1983. [111] H. O. Fattorini. Infinite Dimensional Optimization and Control Theory. Cambridge University Press, New York, 1999. [112] A.V. Fiacco and G.P. McCormick. Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Classics in Applied Mathematics. SIAM, Philadelphia, 1990. [113] A. F. Filippov. Differential Equations with Discontinuous Righthand Sides. Mathematics and its Applications. Kluwer Academic Publishers, 1988. [114] R.A. Fisher. The use of multiple measurements in taxonomic problems. Annual Eugenics, 7(Part II):179–188, 1936. [115] A. Fleig and R. Guglielmi. Optimal control of the Fokker–Planck equation with space-dependent controls. Journal of Optimization Theory and Applications, 174(2):408–427, 2017. [116] W.H. Fleming and R.W. Rishel. Deterministic and stochastic optimal control. Applications of mathematics. Springer-Verlag, 1975. [117] A. D. Fokker. Die mittlere Energie rotierender elektrischer Dipole im Strahlungsfeld. Annalen der Physik, 348(5):810–820, 1914. [118] M. Frank and P. Wolfe. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95–110, 1956. [119] A. Friedman. Differential Games. Wiley-Interscience, 1971. [120] A. Friedman. Stochastic Differential Equations and Applications. Academic Press, 1975.


[121] R. V. Gamkrelidze. Principles of Optimal Control Theory. Plenum Press, New York and London, 1978.
[122] J.A. Gibson and J.F. Lowinger. A predictive min-H method to improve convergence to optimal solutions. International Journal of Control, 19(3):575–592, 1974.
[123] Ĭ. Ī. Gīhman and A.V. Skorokhod. Stochastic Differential Equations. Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer-Verlag, 1972.
[124] D. Gilbarg and N.S. Trudinger. Elliptic Partial Differential Equations of Second Order. Springer-Verlag Berlin Heidelberg New York, 1998.
[125] R.G. Gottlieb. Rapid convergence to optimum solutions using a min-h strategy. AIAA Journal, 5(2):322–329, 1967.
[126] A. Griewank. On stable piecewise linearization and generalized algorithmic differentiation. Optimization Methods and Software, 28(6):1139–1178, 2013.
[127] M. Gugat, G. Leugering, and G. Sklyar. Lp-optimal boundary control for the wave equation. SIAM Journal on Control and Optimization, 44(1):49–74, 2005.
[128] E. Haber and L. Ruthotto. Stable architectures for deep neural networks. Inverse Problems, 34(014004), 2018.
[129] W. Hager. Runge-Kutta methods in optimal control and the transformed adjoint system. Numerische Mathematik, 87:247–282, 2000.
[130] E. Hairer, C. Lubich, and G. Wanner. Geometric Numerical Integration - Structure-Preserving Algorithms for Ordinary Differential Equations. Springer Science and Business Media, Berlin Heidelberg, second edition, 2006, 2013.
[131] H. Halkin. A maximum principle of the Pontryagin type for systems described by nonlinear difference equations. SIAM Journal on Control, 4:90–111, 1966.
[132] A. Hamel. Suboptimality theorems in optimal control. In W. H. Schmidt, K. Heier, L. Bittner, and R. Bulirsch, editors, Variational Calculus, Optimal Control and Applications, pages 61–68, Basel, 1998. Birkhäuser Basel.
[133] M. Han, G. Feichtinger, and R.F. Hartl. Nonconcavity and proper optimal periodic control. Journal of Economic Dynamics and Control, 18(5):975–990, 1994.
[134] R. F. Hartl, S. P. Sethi, and R. G. Vickson. A survey of the maximum principles for optimal control problems with state constraints. SIAM Review, 37(2):181–218, 1995.
[135] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.


[134] R. F. Hartl, S. P. Sethi, and R. G. Vickson. A survey of the maximum principles for optimal control problems with state constraints. SIAM Review, 37(2):181–218, 1995. [135] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016. [136] D. Henrion, M. Kružík, and T. Weisser. Optimal control problems with oscillations, concentrations and discontinuities. Automatica, 103:159 – 165, 2019. [137] M. R. Hestenes. Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4(5):303–320, 1969. [138] S. Hofmann and A. Borzì. A sequential quadratic hamiltonian algorithm for training explicit RK neural networks. Journal of Computational and Applied Mathematics, 405:113943, 2022. [139] F. J. M. Horn and R. C. Lin. Periodic processes: A variational approach. Industrial & Engineering Chemistry Process Design and Development, 6(1):21–30, 1967. [140] J. How. Lecture 7: Numerical solution in Matlab. In Principles of Optimal Control—MIT Course No. 16.323. MIT, Cambridge MA, 2008. MIT OpenCourseWare. [141] G. Huang, Z. Liu, and K. Q. Weinberger. Densely connected convolutional networks. CoRR, abs/1608.06993, 2016. [142] A.D. Ioffe and V.M. Tikhomirov. Theory of Extremal Problems. Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Company, 1979. [143] R. Isaacs. Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. John Wiley and Sons., 1965. [144] V. Isakov. Inverse Problems for Partial Differential Equations. Applied Mathematical Sciences. Springer New York, 2013. [145] K. Ito and K. Kunisch. Lagrange Multiplier Approach to Variational Problems and Applications. Advances in design and control. Society for Industrial and Applied Mathematics, 2008. [146] K. Ito and K. Kunisch. Optimal control with Lp (Ω), p ∈ (0, 1), control cost. SIAM Journal on Control and Optimization, 52(2):1251–1275, 2014.


[147] B. Järmark. A new convergence control technique in differential dynamic programming. Technical Report TRITA-REG-7502, The Royal Institute of Technology, Stockholm, Sweden, Department of Automatic Control, 1975. [148] S. Jørgensen and G. Zaccour. Differential Games in Marketing. Springer US, 2003. [149] I. Karatzas and S.E. Shreve. Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics (113) (Book 113). Springer New York, 1991. [150] V. Karl and D. Wachsmuth. An augmented Lagrange method for elliptic state constrained optimal control problems. Computational Optimization and Applications, 69(3):857–880, 2018. [151] M.A. Kazemi-Dehkordi. A method of successive approximations for optimal control of distributed parameter systems. Journal of Mathematical Analysis and Applications, 133(2):484–497, 1988. [152] H. J. Kelley, R. E. Kopp, and H. G. Moyer. Successive approximation techniques for trajectory optimization. In Proc. IAS Symp. on Vehicle System Optimization, pages 10–25, New York, November 1961. [153] D.E. Kirk. Optimal Control Theory: An Introduction. Dover Publications, 2004. [154] I. Knowles. Parameter identification for elliptic problems. Journal of Computational and Applied Mathematics, 131(1):175–194, 2001. [155] I. Knowles and R. Wallace. A variational solution of the aquifer transmissivity problem. Inverse Problems, 12:953–963, 1996. [156] R. V. Kohn and G. Strang. Optimal design and relaxation of variational problems, i. Communications on Pure and Applied Mathematics, 39(1):113–137, 1986. [157] T. G. Kolda, R. M. Lewis, and V. Torczon. Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45(3):385–482, 2003. [158] A. Kolmogorov. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Mathematische Annalen, 104(1):415–458, 1931. [159] A. Kröner, K. Kunisch, and B. Vexler. Semismooth Newton methods for optimal control of the wave equation with control constraints. SIAM Journal on Control and Optimization, 49(2):830–858, 2011.


[160] M. Kružík and T. Roubíček. Optimization problems with concentration and oscillation effects: Relaxation theory and numerical approximation. Numerical Functional Analysis and Optimization, 20(5-6):511–530, 1999. [161] I. A. Krylov and F. L. Chernous’ko. On a method of successive approximations for the solution of problems of optimal control. USSR Computational Mathematics and Mathematical Physics, 2(6):1371–1382, 1963. Transl. of Zh. Vychisl. Mat. Mat. Fiz., 1962, Vol. 2, Nr. 6, 1132–1139. [162] I. A. Krylov and F. L. Chernous’ko. An algorithm for the method of successive approximations in optimal control problems. USSR Computational Mathematics and Mathematical Physics, 12(1):15–38, 1972. [163] S. Kullback and R. A. Leibler. On information and sufficiency. Ann. Math. Statist., 22(1):79–86, 03 1951. [164] K. Kunisch and D. Wachsmuth. On time optimal control of the wave equation and its numerical realization as parametric optimization problem. SIAM Journal on Control and Optimization, 51(2):1232–1262, 2013. [165] K. Kunisch and D. Wachsmuth. On time optimal control of the wave equation, its regularization and optimality system. ESAIM: Control, Optimisation and Calculus of Variations, 19(2):317–336, 2013. [166] K. Kunisch and L. Wang. Time optimal control of the heat equation with pointwise control constraints. ESAIM: Control, Optimisation and Calculus of Variations, 19(2):460–485, 2013. [167] A. J. Kurdila and M. Zabarankin. Convex Functional Analysis. Systems & Control: Foundations & Applications. Birkhäuser Basel, 2005. [168] T. Kärkkäinen. A linearization technique and error estimates for distributed parameter identification in quasilinear problems. Numerical Functional Analysis and Optimization, 17(3-4):345–364, 1996. [169] T. Kärkkäinen. An equation error method to recover diffusion from the distributed observation. Inverse Problems, 13(4):1033–1051, 1997. [170] O.A. Ladyzhenskaia and N.N. Ural’tseva. Linear and Quasilinear Elliptic Equations. Academic Press, 1968. [171] J. Larson, M. Menickelly, and S. M. Wild. Derivative-free optimization methods. Acta Numerica, 28:287–404, 2019. [172] M. G. Larson and F. Bengzon. The Finite Element Method: Theory, Implementation, and Applications. Springer, 2013.


[173] G. Larsson, M. Maire, and G. Shakhnarovich. Fractalnet: Ultra-deep neural networks without residuals. CoRR, abs/1605.07648, 2016. [174] I. Lasiecka and R. Triggiani. Exact controllability of the wave equation with Neumann boundary control. Applied Mathematics and Optimization, 19(1):243–290, 1989. [175] M.M. Lavrentiev. Some Improperly Posed Problems of Mathematical Physics. Springer Berlin Heidelberg, 1967. [176] Y. LeCun. A theoretical framework for back-propagation. In Proceedings of the 1988 Connectionist Models Summer School, pages 21–28, CMU, Pittsburg, Pa, 1988. Morgan Kaufmann. [177] S. Lenhart and J. T. Workman. Optimal Control Applied to Biological Models. Chapman & Hall/CRC, Boca Raton, 2007. [178] Q. Li, L. Chen, C. Tai, and W. E. Maximum principle based algorithms for deep learning. Journal of Machine Learning Research, 18:1–29, 2018. [179] Q. Li and S. Hao. An optimal control approach to deep learning and applications to discrete-weight neural networks. In Proceedings of Machine Learning Research, volume 80, 2018. [180] X. Li and J. Yong. Optimal Control Theory for Infinite Dimensional Systems. Optimal Control Theory for Infinite Dimensional Systems. Birkhäuser, 1995. [181] X. Lin and J. Frank. Symplectic Runge-Kutta discretization of a regularized forward-backward sweep iteration for optimal control problems. Journal of Computational and Applied Mathematics, 2021. [182] J.-L. Lions. Optimal Control of Systems governed by Partial Differential Equations. Springer-Verlag, 1971. [183] J.L. Lions. Exact controllability, stabilization and perturbations for distributed systems. SIAM Review, 30(1):1–68, 1988. [184] A. Locatelli. Optimal Control of a Double Integrator: A Primer on Maximum Principle. Studies in Systems, Decision and Control. Springer International Publishing, 2016. [185] Y. Lu, A. Zhong, Q. Li, and B. Dong. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 3276–3285, Stockholmsmässan, Stockholm Sweden, 10-15 Jul 2018. PMLR.


[186] R. Luce and S. Perez. Parameter identification for an elliptic partial differential equation with distributed noisy data. Inverse Problems, 15(1):291–307, 1999. [187] A.T. Luk’yanov and S.YA. Serovaiskii. The method of successive approximations in a problem of the optimal control of one non-linear parabolic system. USSR Computational Mathematics and Mathematical Physics, 24(6):23–30, 1984. [188] L. Markus. Optimal control limit cycles or what control theory can do to cure a heart attack of cause one. Lect. Notes Math., 312:108–173, 1973. [189] M. McAsey, L. Mou, and W. Han. Convergence of the forward-backward sweep method in optimal control. Computational Optimization and Applications, 53(1):207–226, 2012. [190] I. Meghea. Ekeland Variational Principle: With Generalizations and Variants. Éd. des Archives contemporaines, 2009. [191] C. Meyer, A. Rösch, and F. Tröltzsch. Optimal control of PDEs with regularized pointwise state constraints. Computational Optimization and Applications, 33(2):209–228, 2006. [192] G. Mingione. Regularity of minima: An invitation to the dark side of the calculus of variations. Applications of Mathematics, 51(4):355, 2006. [193] A.R. Mitchell and D.F. Griffiths. The Finite Difference Method in Partial Differential Equations. John Wiley & Sons, 1980. [194] D.S. Mitrinovic, J. Pecaric, and A.M. Fink. Inequalities Involving Functions and Their Integrals and Derivatives. Mathematics and its Applications. Springer Netherlands, 2012. [195] M. Mohammadi and A. Borzì. Analysis of the Chang-Cooper discretization scheme for a class of Fokker-Planck equations. J. Numer. Math., 23(3):271–288, 2015. [196] B. Sh. Mordukhovich. Existence of optimum controls (appendix to the article by Gabasov and Kirillova, “methods of optimum control”). Journal of Soviet Mathematics, 7(5):850–886, 1977. [197] B.S. Mordukhovich and J.-P. Raymond. Dirichlet boundary control of hyperbolic equations in the presence of state constraints. Applied Mathematics and Optimization, 49(2):145–157, 2004. [198] B.S. Mordukhovich and J.-P. Raymond. Neumann boundary control of hyperbolic equations with pointwise state constraints. SIAM Journal on Control and Optimization, 43(4):1354–1372, 2004.


[199] J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):48–49, 1950. [200] J. F. Nash. Non-cooperative games. Annals of Mathematics, 54(2):286– 295, 1951. [201] C. Natemeyer and D. Wachsmuth. A proximal gradient method for control problems with non-smooth and non-convex control cost. Computational Optimization and Applications, 80(2):639–677, 2021. [202] H. Nikaido and K. Isoda. Note on noncooperative convex games. Pacific Journal of Mathematics 5, Supp. 1, pages 807–815, 1955. [203] P. Nistri. Periodic control problems for a class of nonlinear periodic differential systems. Nonlinear Analysis: Theory, Methods & Applications, 7(1):79–90, 1983. [204] R. Nittka. Regularity of solutions of linear second order elliptic and parabolic boundary value problems on Lipschitz domains. Journal of Differential Equations, 251(4):860–880, 2011. [205] J. Nocedal and S. Wright. Numerical Optimization. Springer-Verlag New York, 2006. [206] I. Nowak. Relaxation and Decomposition Methods for Mixed Integer Nonlinear Programming. International series of numerical mathematics. Birkhäuser, 2005. [207] B. Øksendal. Stochastic Differential Equations: An Introduction with Applications. Universitext. Springer Berlin Heidelberg, 2003. [208] N.P. Osmolovskii and H. Maurer. Applications to Regular and Bang-bang Control. Society for Industrial and Applied Mathematics, Philadelphia, 2012. [209] P. Pedregal. Parametrized Measures and Variational Principles. Progress in nonlinear differential equations and their applications. Birkhäuser, 1997. [210] M. Planck. Über einen Satz der statistischen Dynamik und seine Erweiterung in der Quantentheorie, pages 324–341. Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1917. [211] V. I. Plotnikov and M. I. Sumin. Necessary conditions in a nonsmooth problem of optimal control. Mathematical notes of the Academy of Sciences of the USSR, 32(2):574–579, 1982. [212] V.I. Plotnikov and M.I. Sumin. The construction of minimizing sequences in problems of the control of systems with distributed parameters. USSR Computational Mathematics and Mathematical Physics, 22(1):49–57, 1982.


[213] V.I. Plotnikov and M.I. Sumin. Optimal control of distributed parameter systems described by nonsmooth Goursat–Darboux systems with constraints of inequality type. Differ. Uravn., 20(5):851–860, 1984. [214] V.I. Plotnikov and V.I. Sumin. The optimization of objects with distributed parameters described by Goursat-Darboux systems. USSR Computational Mathematics and Mathematical Physics, 12(1):73–92, 1972. [215] V.T. Polyak and N.V. Tret’yakov. The method of penalty estimates for conditional extremum problems. USSR Computational Mathematics and Mathematical Physics, 13(1):42–58, 1973. [216] L. S. Pontryagin. Linear differential games. SIAM Journal on Control, 12(2):262–267, 1974. [217] L. S. Pontryagin, V. G. Boltyanski˘ı, R. V. Gamkrelidze, and E. F. Mishchenko. The Mathematical Theory of Optimal Processes. John Wiley & Sons, New York-London, 1962. [218] L.S. Pontryagin. On the theory of differential games. Russian Mathematical Surveys, 21(4):193–246, 1966. Uspekhi Mat. Nauk, 1966, Volume 21, Issue 4(130), 219–274. [219] V. A. Popov. Convergence of the method of successive approximations in some optimal control problems. Soviet Mathematics, 33(4):68–74, 1989. [220] M.J.D. Powell. A method for nonlinear constraints in minimization problems. In R. Fletcher, editor, Optimization, pages 283–298, New York, NY, 1969. Academic Press. [221] J.-P. Raymond and H. Zidani. Pontryagin’s principle for stateconstrained control problems governed by parabolic equations with unbounded controls. SIAM Journal on Control and Optimization, 36(6):1853–1879, 1998. [222] J.-P. Raymond and H. Zidani. Hamiltonian Pontryagin’s principles for control problems governed by semilinear parabolic equations. Applied Mathematics and Optimization, 39(2):143–177, 1999. [223] J. P. Raymond and H. Zidani. Time optimal problems with boundary controls. Differential and Integral Equations, 13(7-9):1039 – 1072, 2000. [224] H. Risken. The Fokker-Planck Equation: Methods of Solution and Applications. Springer, Berlin Heidelberg, 2012. [225] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5):877–898, 1976.


[226] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis, volume 317. Springer Science & Business Media, 2009. [227] R.T. Rockafellar. Convex Analysis. Princeton landmarks in mathematics and physics. Princeton University Press, 1970. [228] R.T. Rockafellar. Augmented Lagrange multiplier functions and duality in nonconvex programming. SIAM Journal on Control, 12(2):268–285, 1974. [229] I.M. Ross. A Primer on Pontryagin’s Principle in Optimal Control. Collegiate Publishers, 2009. [230] T. Roubiček. Relaxation in Optimization Theory and Variational Calculus. De Gruyter, Berlin and New York, 1997. [231] S. Roy, M. Annunziato, and A. Borzì. A Fokker-Planck feedback controlconstrained approach for modelling crowd motion. J. Comput. Theor. Transp., 45(6):442–458, 2016. [232] S. Roy, M. Annunziato, A. Borzì, and C. Klingenberg. A Fokker-Planck approach to control collective motion. Computational Optimization and Applications, 69(2):423–459, 2018. [233] S. Roy and A. Borzì. A new optimization approach to sparse reconstruction of log-conductivity in acousto-electric tomography. SIAM Journal on Imaging Sciences, 11(2):1759–1784, 2018. [234] L. I. Rozonoèr. Pontryagin maximum principle in the theory of optimum systems. Avtomat. i Telemeh., 20:1320–1334, 1959. English transl. in Automat. Remote Control, 20 (1959), 1288–1302. [235] G.I.N. Rozvany. Optimal plastic design with discontinuous cost functions. Journal of Applied Mechanics, 41(1):309–310, 1974. [236] D.L. Russell. Controllability and stabilizability theory for linear partial differential equations: Recent progress and open questions. SIAM Review, 20(4):639–739, 1978. [237] Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, Second Edition, 2003. [238] I.S. Sadek, J.M. Sloss, S. Adali, and J.C. Bruch, JR. Optimal boundary control of the longitudinal vibrations of a rod using a maximum principle. Journal of Vibration and Control, 3(2):235–254, 1997. [239] S. Sager. Numerical Methods for Mixed-Integer Optimal Control Problems. Der andere Verlag, Marburg, 2005.
[240] Y. Sakawa. Trajectory planning of a free-flying robot by using the optimal control. Optimal Control Applications and Methods, 20(5):235–248, 1999.
[241] Y. Sakawa and Y. Shindo. On global convergence of an algorithm for optimal control. IEEE Transactions on Automatic Control, 25(6):1149–1153, 1980.
[242] Y. Sakawa, Y. Shindo, and Y. Hashimoto. Optimal control of a rotary crane. Journal of Optimization Theory and Applications, 35(4):535–557, 1981.
[243] A.V. Sarychev and D.F.M. Torres. Lipschitzian regularity of minimizers for optimal control problems with control-affine dynamics. Applied Mathematics and Optimization, 41(2):237–254, 2000.
[244] W. Schmidt. Iterative methods for optimal control processes governed by integral equations. In R. Bulirsch, A. Miele, J. Stoer, and K. Well, editors, Optimal Control: Calculus of Variations, Optimal Control Theory and Numerical Methods, pages 69–82. Birkhäuser, Basel, 1993.
[245] S.Ya. Serovaiskii. Counterexamples in Optimal Control Theory. Inverse and Ill-Posed Problems Series. Walter de Gruyter, 2011.
[246] Y. Shindo and Y. Sakawa. Local convergence of an algorithm for solving optimal control problems. Journal of Optimization Theory and Applications, 46(3):265–293, 1985.
[247] T. Singh and H. Alli. Exact time-optimal control of the wave equation. Journal of Guidance, Control, and Dynamics, 19(1):130–134, 1996.
[248] R.V. Southwell. Relaxation Methods in Engineering Science: A Treatise on Approximate Computation. Oxford University Press, 1940.
[249] J. Speyer and R. Evans. A second variational theory for optimal periodic processes. IEEE Transactions on Automatic Control, 29(2):138–148, 1984.
[250] A. Stachurski and Y. Sakawa. Convergence properties of an algorithm for solving non-differentiable optimal control problems. Numerical Functional Analysis and Optimization, 10(7–8):765–786, 1989.
[251] M. Steinlein. The Pontryagin maximum principle for solving Liouville optimal control problems. Master's thesis, Universität Würzburg, Würzburg, Germany, September 2020.
[252] R.F. Stengel. Optimal Control and Estimation. Dover Publications, New York, 1994.
[253] N.N. Subbotina. The method of characteristics for Hamilton-Jacobi equations and applications to dynamical optimization. Journal of Mathematical Sciences, 135(3):2955–3091, 2006.
[254] M.I. Sumin. Optimal control of objects described by quasilinear elliptic equations. Differential Equations, 25(8):1004–1012, 1989. Translation of Differ. Uravn., 25(8):1406–1416, 1989.
[255] M.I. Sumin. Optimal control of semilinear elliptic equation with state constraint: maximum principle for minimizing sequence, regularity, normality, sensitivity. Control and Cybernetics, 29(2):449–472, 2000.
[256] M.I. Sumin. The first variation and Pontryagin's maximum principle in optimal control for partial differential equations. Computational Mathematics and Mathematical Physics, 49(6):958–978, 2009.
[257] M.I. Sumin. Regularization of the Pontryagin maximum principle in a convex optimal boundary control problem for a parabolic equation with an operator equality constraint. Ural Mathematical Journal, 2(2):72–86, 2016.
[258] M.B. Suryanarayana. Necessary conditions for optimization problems with hyperbolic partial differential equations. SIAM Journal on Control, 11(1):130–147, 1973.
[259] H.J. Sussmann and J.C. Willems. 300 years of optimal control: from the brachystochrone to the maximum principle. IEEE Control Systems, 17(3):32–44, 1997.
[260] E. Süli. Lecture Notes on Finite Element Method for Partial Differential Equations. University of Oxford, 2019.
[261] X.-C. Tai and T. Kärkkäinen. Identification of a nonlinear parameter in a parabolic equation from a linear equation. Computational and Applied Mathematics, 14:157–184, 1995.
[262] K.L. Teo, C.J. Goh, and K.H. Wong. A Unified Computational Approach to Optimal Control Problems. Pitman Monographs and Surveys in Pure and Applied Mathematics. John Wiley & Sons Inc., 1991.
[263] V. Thalhofer, M. Annunziato, and A. Borzì. Stochastic modelling and control of antibiotic subtilin production. Journal of Mathematical Biology, 73(3):727–749, 2016.
[264] J.A. Thomas and T.M. Cover. Elements of Information Theory. Wiley-Interscience, 2006.
[265] A.A. Tolstonogov. A theorem of existence of an optimal control for the Goursat-Darboux problem without convexity assumptions. Izvestiya: Mathematics, 64(4):807–826, 2000.
[266] L. Tonelli. Sur une méthode directe du calcul des variations. Rendiconti del Circolo Matematico di Palermo (1884-1940), 39(1):233–264, 1915.
[267] D.F.M. Torres. Lipschitzian regularity of the minimizing trajectories for nonlinear optimal control problems. Mathematics of Control, Signals and Systems, 16(2):158–174, 2003.
[268] G.M. Troianiello. Elliptic Differential Equations and Obstacle Problems. Plenum Press, 1987.
[269] F. Tröltzsch. Optimal Control of Partial Differential Equations: Theory, Methods and Applications, volume 112 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2010.
[270] J.L. Troutman. Variational Calculus with Elementary Convexity. Undergraduate Texts in Mathematics. Springer, New York, 1983.
[271] G. Uhlmann. Electrical impedance tomography and Calderón's problem. Inverse Problems, 25(12):123011, 2009.
[272] M. Ulbrich. Semismooth Newton Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces. Society for Industrial and Applied Mathematics, 2011.
[273] S. Ulbrich. A sensitivity and adjoint calculus for discontinuous solutions of hyperbolic conservation laws with source terms. SIAM Journal on Control and Optimization, 41(3):740–797, 2002.
[274] P. Varaiya. N-person nonzero sum differential games with linear dynamics. SIAM Journal on Control, 8(4):441–449, 1970.
[275] L.N. Vicente and A.L. Custódio. Analysis of direct searches for discontinuous functions. Mathematical Programming, 133(1):299–325, 2012.
[276] A. Walther, O. Weiß, A. Griewank, and S. Schmidt. Nonsmooth optimization by successive abs-linearization in function spaces. Applicable Analysis, 101(1):225–240, 2022.
[277] G. Wang. Optimal controls of 3-dimensional Navier–Stokes equations with state constraints. SIAM Journal on Control and Optimization, 41(2):583–606, 2002.
[278] G. Wang, L. Wang, Y. Xu, and Y. Zhang. Time Optimal Control of Evolution Equations. Progress in Nonlinear Differential Equations and Their Applications. Springer International Publishing, 2018.
[279] J. Warga. Optimal Control of Differential and Functional Equations. Academic Press, New York, 1972.
[280] P.J. Werbos. Approximate dynamic programming for real-time control and neural modeling. In D. White and D. Sofge, editors, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, volume 15, pages 493–525. Van Nostrand Reinhold, New York, 1992.
[281] J.H. Witte and C. Reisinger. A penalty method for the numerical solution of Hamilton-Jacobi-Bellman (HJB) equations in finance. SIAM Journal on Numerical Analysis, 49(1):213–231, 2011.
[282] K. Wu and D. Xiu. An explicit neural network construction for piecewise constant function approximation. arXiv preprint arXiv:1808.07390, 2018.
[283] Z. Wu, J. Yin, and C. Wang. Elliptic & Parabolic Equations. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2006.
[284] W.W.-G. Yeh. Review of parameter identification procedures in groundwater hydrology: The inverse problem. Water Resources Research, 22:95–108, 1986.
[285] L.C. Young. Lectures on the Calculus of Variations and Optimal Control Theory. W.B. Saunders Company, Philadelphia, 1969.
[286] M. Zhu and C. Fu. Convolutional neural networks combined with Runge-Kutta methods. arXiv preprint arXiv:1802.08831, 2018.

Index

abs-linearisation technique, 178, 180
activation function, 116, 127
adjoint (or costate) equation, 15
admissible process, 3, 24
augmented Lagrangian approach, 63, 68
autonomous system, 23
bang-bang control, 38, 88
canonical Pontryagin optimal control problem, 24
Carathéodory's conditions, 4
Cesari's conditions, 12
chattering control, 80, 83, 87
Clarke's conditions, 5
condition of sufficient decrease, 52
control and state (mixed) constraints, 29
control-affine systems, 2, 54, 142, 170
control-to-state map, 5, 9
cost (or objective) functional, running cost, terminal observation, 2
Dirichlet boundary control, 191
dynamic programming approach, 141
Ekeland's variational principle, 13, 160
FEM barycentric formula, 206
Fokker-Planck equation, 132
Fréchet differentiable functional, 219
free endpoint problem, 48
Hamilton-Jacobi-Bellman equation, 141, 143
Hamilton-Pontryagin (HP) function, 17
integral constraints, 25
intermediate (or average) adjoint variable, 35, 156
Iris data set, 129
Lagrange functional, 17
Legendre-Clebsch condition, 19, 41
linear-quadratic Nash games, 108
Liouville equation, 133, 142
Mangasarian-Fromovitz constraint qualifications, 29
method of penalty estimates, 63, 68
mixed control and state constraints, 29, 43, 62
mixed-integer optimal control, 58, 69, 184
Nash equilibrium, 103
needle variation, 20, 21, 138, 200
Neumann boundary control, 187
Nikaido-Isoda function, 104
open- and closed-loop controls, 135
optimal periodic control problem, 27
optimal periodic process, 27
optimal steady-state process, 27
optimality system, 17, 24, 85, 119, 135
path constraints, 29
Poincaré-Friedrichs inequality, 217
probability density function, 85
quasioptimal control, 13, 159
reduced cost functional, 7
relaxed optimal control, 84
Robin boundary control, 194
singular control, 41
sparsity-promoting cost functional, 60
state constraints, 25, 29, 174
time boundary-value problem, 67
time transformation for free end-time, 65
time-optimal control problem, 26, 38, 48, 182, 185
time-optimal process, 26
