English Pages 251 [252] Year 2021
Leonid T. Ashchepkov Dmitriy V. Dolgy Taekyun Kim Ravi P. Agarwal
Optimal Control Second Edition
Leonid T. Ashchepkov
Far Eastern Federal University
Department of Mathematics, Institute of Mathematics and Computer Technologies
Vladivostok, Russia

Dmitriy V. Dolgy
Department of Mathematics, Institute of Mathematics and Computer Technologies
Far Eastern Federal University
Vladivostok, Russia
Kwangwoon Glocal Education Center, Kwangwoon University
Seoul, Korea (Republic of)

Taekyun Kim
Department of Mathematics
Kwangwoon University
Seoul, Korea (Republic of)

Ravi P. Agarwal
Mathematics
Texas A&M University - Kingsville
Kingsville, TX, USA
ISBN 978-3-030-91028-0
ISBN 978-3-030-91029-7 (eBook)
https://doi.org/10.1007/978-3-030-91029-7
Mathematics Subject Classification: 49-XX; 49-01; 93-XX; 93C05; 93C10 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
This material is based on lectures from a one-year course given at Irkutsk State University (Irkutsk, Russia), Far Eastern Federal University (Vladivostok, Russia), and Kwangwoon University (Seoul, Republic of Korea), as well as on workshops on optimal control offered to students of various university mathematics departments. The main themes of the theory of linear and nonlinear systems are considered, including the basic problem of establishing necessary and sufficient conditions for the optimality of processes. In the first part of the course, the theory of linear control systems is built on the separation theorem and the concept of a reachability set. We prove that the reachability set is closed in the class of piecewise continuous controls, and we also consider the problems of controllability, observability, identification, performance, and terminal control. The second part of the course is devoted to nonlinear control systems. Using the method of variations and the Lagrange multiplier rule for nonlinear problems, we prove Pontryagin’s maximum principle for problems with movable ends of trajectories. A generalization of the maximum principle to systems with intermediate phase states and discontinuous right-hand sides is presented. The fundamentals of the theory of the field of extremals developed by Velichenko are considered, including the questions of the sufficiency of the maximum principle and the invariance of perturbed systems. The sufficient conditions for optimality formulated by Krotov and their connection with Bellman’s method of dynamic programming are briefly discussed. Exercises and many additional tasks are provided as practical training to help the reader consolidate the theoretical material.
Preface
This manual is based on the lectures and practical exercises of the optimal control course taught by the authors at Irkutsk State University (Irkutsk, Russia), Far Eastern Federal University (Vladivostok, Russia), and Kwangwoon University (Seoul, Republic of Korea) for students in the Faculty of Mathematics. The course begins with the optimal control theory of linear systems. First, the fundamental concepts of mathematical models of controlled objects are introduced; then control and trajectory are used to define the reachable set of a linear system. The properties of this set are established: convexity, boundedness, closedness, and continuity with respect to time. Almost all control problems for linear systems, from the problem of controllability to the problem of identification, are formulated in terms of the reachable set. This concept is rich in meaning and has a clear geometric interpretation, so it is natural to use the separation theorem for convex sets to obtain some of the results. The Cauchy formula, the concept of reachability, and the separation theorem form the basis of the whole theory [8–10] of linear controlled systems. The second part of the course is devoted to the theory of nonlinear systems. The material is presented in order of increasing complexity, from the simplest optimal control problem to the general nonlinear control problem. The focus of this theory is Pontryagin’s maximum principle [13]; its justification, analysis, application, and modification depend on the type of problem for general nonlinear systems. Of the many known proofs of the maximum principle, the simplest one, based on the formula for small increments of a trajectory, is chosen. From there it is only one step to the formula for small increments of a functional, proposed by Rozonoer [14], and to the maximum principle for the simplest problem of optimal control.
The problems of optimal control with constraints on the ends of a trajectory are studied on the basis of the well-known formula for increments of the functional and the Lagrange multiplier rule [1, 4] for nonlinear problems. The use of nonlinear optimization methods in optimal control is attractive not only because it simplifies the technique of proving the maximum principle but also because it improves the methodology. A continuity between finite-dimensional and infinite-dimensional optimization is established, which gives confidence that simple and clear ideas underpin complex constructions. This methodology has been successfully applied [2] to the study of control systems of differential equations with discontinuous right-hand sides [5]. The final part of the course is based on the works of Velichenko [6, 7] on the theory of a field of extremals and on the invariance theory of controlled systems. The purpose of this material is to introduce one of the most beautiful sections of the modern calculus of variations, the theory of the field of extremals. The sufficiency of the maximum principle is established in terms of field theory, which completes the discussion of the related issues. Along the way, applications of the theory of the field of extremals to the synthesis of invariant controlled systems [6] are highlighted. The latter is important in both cognitive and applied respects: a good theory is always practical. The exposition of the theory of the field of extremals departs from the author's original version [7]: the concept of L-continuity of the field uses a Lipschitz-type condition on the proximity of two processes with respect to the initial conditions. Some changes have also been made in the derivation of the formula for large increments of the functional in the field of extremals. For the sake of completeness, the sufficient optimality conditions of Krotov [11] are concisely presented in the course, and their relation to the method of dynamic programming introduced by Bellman [3] is shown. The appendix provides auxiliary information on multidimensional geometry, convex analysis, and the theory of extremal problems, all of which are used in the main parts of the text. The appendix makes the course self-contained, so that the reader does not need to consult other textbooks.
A considerable part of the textbook is devoted to exercises and tasks. The purpose of the exercises is to reinforce the assimilation of the theoretical material and to let the reader independently apply new knowledge to similar or more complex theoretical problems. The practical training material aims to develop the skills and techniques required to obtain analytical solutions for certain classes of optimal control problems. Most of these tasks were used directly in practical classes and as homework. Our teachers, including professors Gabasov, Kirillova, and Vasiliev, significantly influenced the selection of the material and the structure of the course. The communication with professors Srochko and Tyatyushkin, our colleagues from the University of Irkutsk, was also extremely useful. We would like to take this opportunity to express our sincere appreciation and thanks to all of them.
Vladivostok, Russia Calgary, AB, Canada Vladivostok, Russia Seoul, Republic of Korea Seoul, Republic of Korea Kingsville, TX, USA
Leonid T. Ashchepkov Dmitriy Dolgy Taekyun Kim Ravi P. Agarwal
Preface to the Second Edition
In the second edition, the book has been supplemented with the new Chaps. 13 and 14. The expansion was motivated, on the one hand, by the desire to demonstrate on complex control problems the universality of the technique used to obtain necessary optimality conditions such as the maximum principle and, on the other hand, by the wish to complete the discussion of the sufficiency of these conditions in terms of the field theory of extremals. Changes have been made to the proof of the maximum principle for the G-problem. Several other minor technical and editorial improvements have also been made.
Vladivostok, Russia Calgary, AB, Canada Vladivostok, Russia Seoul, Republic of Korea Seoul, Republic of Korea Kingsville, TX, USA
Leonid T. Ashchepkov Dmitriy Dolgy Taekyun Kim Ravi P. Agarwal
Contents

Part I  Introduction

1  The Subject of Optimal Control
   1.1  «Mass-Spring» Example
   1.2  Subject and Problems of Optimal Control
   1.3  Place of Optimal Control

2  Mathematical Model for Controlled Object
   2.1  Controlled Object
   2.2  Control and Trajectory
   2.3  Mathematical Model
   2.4  Existence and Uniqueness of a Process
   2.5  Linear Models
   2.6  Example

Part II  Control of Linear Systems

3  Reachability Set
   3.1  Cauchy Formula
   3.2  Properties of the Fundamental Matrix
   3.3  Examples
   3.4  Definition of a Reachability Set
   3.5  Limitation and Convexity
   3.6  Closure
   3.7  Continuity
   3.8  Extreme Principle
   3.9  Application of the Extreme Principle

4  Controllability of Linear Systems
   4.1  Point-to-Point Controllability
   4.2  Analysis of the Point-to-Point Controllability Criteria
   4.3  Auxiliary Lemma
   4.4  Kalman Theorem
   4.5  Control with Minimal Norm
   4.6  Construction of Control with Minimum Norm
   4.7  Total Controllability of Linear System
   4.8  Synthesis of Control with a Minimal Norm
   4.9  Krasovskii Theorem
   4.10  Total Controllability of Stationary System
   4.11  Geometry of a Non-controllable System
   4.12  Transformation of Non-controllable System
   4.13  Controllability of Transformed System

5  Minimum Time Problem
   5.1  Statement of the Problem
   5.2  Existence of a Solution of the Minimum Time Problem
   5.3  Criterion of Optimality
   5.4  Maximum Principle for the Minimum Time Problem
   5.5  Stationary Minimum Time Problem

6  Synthesis of the Optimal System Performance
   6.1  General Scheme to Apply the Maximum Principle
   6.2  Control of Acceleration of a Material Point
   6.3  Concept of Optimal Control Synthesis
   6.4  Examples of Synthesis of Optimal Systems Performance

7  The Observability Problem
   7.1  Statement of the Problem
   7.2  Criterion of Observability
   7.3  Observability in Homogeneous System
   7.4  Observability in Nonhomogeneous System
   7.5  Observability of an Initial State
   7.6  Relation Between Controllability and Observability
   7.7  Total Observability of a Stationary System

8  Identification Problem
   8.1  Statement of the Problem
   8.2  Criterion of Identifiability
   8.3  Restoring the Parameter Vector
   8.4  Total Identification of Stationary System

Part III  Control of Nonlinear Systems

9  Types of Optimal Control Problems
   9.1  General Characteristics
   9.2  Objective Functionals
   9.3  Constraints on the Ends of a Trajectory. Terminology
   9.4  The Simplest Problem
   9.5  Two-Point Minimum Time Problem
   9.6  General Optimal Control Problem
   9.7  Problem with Intermediate States
   9.8  Common Problem of Optimal Control

10  Small Increments of a Trajectory
   10.1  Statement of a Problem
   10.2  Evaluation of the Increment of Trajectory
   10.3  Representation of Small Increments of Trajectory
   10.4  Relation of the Ends of Trajectories

11  The Simplest Problem of Optimal Control
   11.1  Simplest Problem. Functional Increment Formula
   11.2  Maximum Principle for the Simplest Problem
   11.3  Boundary Value Problem of the Maximum Principle
   11.4  Continuity of the Hamiltonian
   11.5  Sufficiency of the Maximum Principle
   11.6  Applying the Maximum Principle to the Linear Problem
   11.7  Solution of the Mass-Spring Example

12  General Optimal Control Problem
   12.1  General Problem. Functional Increment Formula
   12.2  Variation of the Process
   12.3  Necessary Conditions for Optimality
   12.4  Lagrange Multiplier Rule
   12.5  Universal Lagrange Multipliers
   12.6  Maximum Principle for the General Problem
   12.7  Comments
   12.8  Sufficiency of the Maximum Principle
   12.9  Maximum Principle for Minimum Time Problem
   12.10  Maximum Principle and Euler-Lagrange Equation
   12.11  Maximum Principle and Optimality of a Process

13  Problem with Intermediate States
   13.1  Problem with Intermediate State. Functional Increment Formula
   13.2  Preliminary Necessary Conditions of Optimality
   13.3  Maximum Principle for the Problem with an Intermediate State
   13.4  Discontinuous Systems

14  Extremals Field Theory
   14.1  Specifying of the Problem
   14.2  Field of Extremals
   14.3  Exact Formula for Large Increments of a Functional
   14.4  Sufficiency of the Maximum Principle
   14.5  Invariance of the Systems
   14.6  Examples of an Invariant System

15  Sufficient Optimality Conditions
   15.1  Common Problem of Optimal Control
   15.2  Basic Theorems
   15.3  Analytical Construction of the Controller
   15.4  Relation with Dynamic Programming

Conclusion

Appendix

Bibliography
Notations

$A \Rightarrow B$    B follows from A
$A \Leftrightarrow B$    A is equivalent to B
$i = m, \ldots, n$    i represents integer values m, m + 1, ..., n
$i = m, m+1, \ldots$    i represents integer values m, m + 1, ...
$a_m, \ldots, a_n$    Finite sequence of elements $a_i$, $i = m, \ldots, n$
$\{a_i\} = a_1, a_2, \ldots$    Sequence of elements $a_i$, $i = 1, 2, \ldots$
$R$    Set of real numbers
$R^n$    Linear space of vectors x of dimension $n \times 1$ with real coordinates
$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$    Column vector from $R^n$
$x' = (x_1, \ldots, x_n)$    Row vector from $R^n$
$c'x = \sum_{i=1}^{n} c_i x_i$    Dot product of vectors $c, x \in R^n$
$\|x\| = (x'x)^{1/2}$    Euclidean norm of vector x
$z = (x, y)$    Vector $z = (x_1, \ldots, x_m, y_1, \ldots, y_n)$ composed from the vectors $x = (x_1, \ldots, x_m)$, $y = (y_1, \ldots, y_n)$
$x_k \to x$    $\lim_{k \to \infty} \|x - x_k\| = 0$

Comparison of vectors $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ in $R^n$:
$x = y$    $x_i = y_i$, $i = 1, \ldots, n$
$x \ge y$    $x_i \ge y_i$, $i = 1, \ldots, n$
$x > y$    $x_i > y_i$, $i = 1, \ldots, n$
$x \ne y$    $x_i \ne y_i$ for some $i \in \{1, \ldots, n\}$

$R^n_+$    Set of vectors $x \in R^n$, $x \ge 0$
$x \in X$    x belongs to set X
$x \notin X$    x does not belong to set X
$\emptyset$    Empty set
$\partial X$    Set of boundary points of set X
$\operatorname{int} X$    Set of interior points of set X
$P(x),\ x \in X$    Property P(x) holds for all $x \in X$
$\{x \in X : P\}$    Set of all elements x from X with property P
$X \subset Y$    X is a subset of Y
$Y \setminus X = \{y \in Y : y \notin X\}$    Difference of sets X and Y
$X \cup Y$    Union of sets X and Y
$X \cap Y$    Intersection of sets X and Y
$X + Y = \{x + y : x \in X,\ y \in Y\}$    Sum of sets X and Y
$X \times Y = \{(x, y) : x \in X,\ y \in Y\}$    Cartesian product of set X to set Y
$A = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \cdots & \cdots & \cdots \\ a_{m1} & \ldots & a_{mn} \end{pmatrix}$    Matrix A of size $m \times n$
$A' = \begin{pmatrix} a_{11} & \ldots & a_{m1} \\ \cdots & \cdots & \cdots \\ a_{1n} & \ldots & a_{mn} \end{pmatrix}$    Transposed matrix A of size $n \times m$
$\|A\|$    Norm of matrix A. $\|A\| = \lambda^{1/2}$, where $\lambda$ is the largest eigenvalue of matrix $A'A$. $\|A\| \le \left( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \right)^{1/2}$
$B^{-1}$    Inverse matrix for square matrix B
$f : X \to Y$    Function of argument $x \in X$ with values $y = f(x) \in Y$
$f(X) = \{f(x) : x \in X\}$    Range of f on X
$x \to g(x, y)$    Function g of argument x with fixed y
$y \to g(x, y)$    Function g of argument y with fixed x
$o(\|x\|)$    Small value of order higher than $\|x\|$: $\|o(\|x\|)\| / \|x\| \to 0$ for $\|x\| \to 0$
$\operatorname{sign} z$    $\operatorname{sign} z = z/|z|$, $z \ne 0$, $\operatorname{sign} 0 \in [-1, 1]$
$K(T \to U)$    Class of piecewise continuous functions $t \to u(t)$ acting from set $T \subset R$ to set $U \subset R^r$
$L_2(T \to R^r)$    Class of vector functions $t \to u(t)$ acting from set $T \subset R$ to $R^r$ with a summable $\|u(t)\|^2$ on T
$C(X \to Y)$    Class of continuous vector functions $y = f(x)$ acting from $X \subset R^m$ to $Y \subset R^n$
$C^k(X \to Y)$    Class of vector functions $f \in C(X \to Y)$ whose coordinate functions are continuous on X together with all their partial derivatives up to order $k \ge 1$
$\dot{x}(t)$    $dx(t)/dt$
$\Phi_x(x) = (\Phi_{x_1}(x), \ldots, \Phi_{x_n}(x))$    Gradient of scalar function $\Phi(x)$ in a point $x = (x_1, \ldots, x_n)$
$f_x(x) = \begin{pmatrix} f_{1x_1}(x) & \ldots & f_{1x_n}(x) \\ \cdots & \cdots & \cdots \\ f_{mx_1}(x) & \ldots & f_{mx_n}(x) \end{pmatrix}$    Matrix of partial derivatives of vector function $f(x) = (f_1(x), \ldots, f_m(x))$ in a point $x = (x_1, \ldots, x_n)$
$[x, y] = \{z = x + \alpha (y - x) : 0 \le \alpha \le 1\}$    Segment with the ends x, y in $R^n$
$J(z) \to \min,\ z \in D$    Extreme problem of determining a global minimum of function J on set D
$z = \arg\min_{\tilde{z} \in D} J(\tilde{z})$    Global minimum point of function J on D
$Z = \arg\min_{\tilde{z} \in D} J(\tilde{z})$    Set of global minimum points of J on D
Part I
Introduction
Chapter 1
The Subject of Optimal Control
Abstract Using the control of a mechanical system as an example, we illustrate the features of the optimal control problem. The main issues of optimal control theory are covered: the reasons for its emergence, its subject, its objectives, and its relation to other mathematical disciplines.
Optimal control theory began to take shape as a mathematical discipline in the 1950s. Its development was motivated by pressing problems of automatic control, satellite navigation, aircraft control, chemical engineering, and a number of other engineering fields. As a first example of an optimal control problem, consider a simple mechanical system.
1.1 «Mass-Spring» Example

On a smooth horizontal rod, there is a cylindrical body of mass m at rest, attached to a fixed support by a spring with a coefficient of elasticity k > 0 (Fig. 1.1). The body can slide without friction along the rod under the action of a time-varying force F directed along the rod and bounded in magnitude by $F_0$: $|F| \le F_0$. The question is how the force F should act in order to move the body as far as possible to the right at a given moment of time $t_1 > 0$. A system of coordinates is introduced as in Fig. 1.1. Let x(t) denote the position of the center of the body, in terms of its coordinate on the axis, at the moment of time $t \ge 0$. Under these conditions, we are interested in the solution of the extremal problem

$$x(t_1) \to \max. \tag{1.1}$$
The position $x(t_1)$ of the center of the body depends on the time-varying force F(t) in a complicated way. To obtain this relation, we use Newton's second law. The resistance force $F_1(t)$ of the spring is proportional to x(t) and opposes the displacement:

$$F_1(t) = -k x(t).$$

According to Newton's second law, the total force $m\ddot{x}(t)$ acting on the body is equal to the sum of the forces $F_1(t)$ and F(t):

$$m\ddot{x}(t) = -k x(t) + F(t). \tag{1.2}$$

Fig. 1.1 System of coordinates in the mass-spring example. The coordinate axis is directed along the rod, and the origin of the system coincides with the center of the body at rest

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_1
For the second-order differential equation (1.2), the statement of the problem gives the following initial conditions:

$$x(0) = 0, \quad \dot{x}(0) = 0. \tag{1.3}$$
To determine the position $x(t_1)$ of the body for a given force F(t), it is necessary to find the solution x(t) of Eq. (1.2) with the initial conditions (1.3) and then to calculate the value of this solution at $t = t_1$. Mathematically, the above problem consists of determining a function F(t) that satisfies the condition

$$|F(t)| \le F_0, \quad 0 \le t \le t_1, \tag{1.4}$$

and maximizes the numerical value (1.1) on the solutions of the differential equation (1.2) with the initial conditions (1.3). Analytically, F(t) belongs to a class of functions given in advance. At first, one might think that the solution is straightforward and that the largest movement of the body to the right is provided by the maximum force F(t) constantly directed to the right. However, this is generally not true. The correct solution involves swinging the body by applying a force that is alternately directed to the right and to the left so that by the time $t = t_1$ the body is shifted to the right as far as possible from the initial position. It is impossible to find the function F(t) by simple enumeration, since at each moment of time t from the segment $[0, t_1]$ it is necessary to choose the magnitude from the continuum of available values from $-F_0$ to $F_0$ (Fig. 1.2). Guessing is useless, so we need a good theory to solve this problem.
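The claim that a constant push is generally not the best choice can be checked numerically. The sketch below is only an illustration under stated assumptions: the parameter values m = k = F0 = 1 and t1 = 2π are ours, the function names are hypothetical, and the switching rule used for the "swinging" force, $F(t) = F_0\,\operatorname{sign}\sin\omega(t_1 - t)$ with $\omega = \sqrt{k/m}$, is taken from the standard variation-of-constants formula for Eq. (1.2) rather than derived in this section.

```python
import math


def final_position(force, m=1.0, k=1.0, t1=2.0 * math.pi, steps=20000):
    """Integrate m*x'' = -k*x + F(t), x(0) = x'(0) = 0, by classical RK4.

    `force` is a function t -> F(t); the return value approximates x(t1).
    """
    h = t1 / steps
    t, x, v = 0.0, 0.0, 0.0

    def acc(time, pos):
        # Acceleration from Eq. (1.2): x'' = (-k*x + F(t)) / m
        return (-k * pos + force(time)) / m

    for _ in range(steps):
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + 0.5 * h * k1v, acc(t + 0.5 * h, x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, acc(t + 0.5 * h, x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, acc(t + h, x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return x


# Illustrative parameter values (assumptions, not from the text).
M, K, F0, T1 = 1.0, 1.0, 1.0, 2.0 * math.pi
OMEGA = math.sqrt(K / M)


def constant_push(t):
    # Maximum force constantly directed to the right.
    return F0


def swinging_push(t):
    # Assumed switching rule: F0 * sign(sin(omega * (t1 - t))).
    return math.copysign(F0, math.sin(OMEGA * (T1 - t)))


x_const = final_position(constant_push, M, K, T1)
x_swing = final_position(swinging_push, M, K, T1)
print(f"constant force: x(t1) = {x_const:.4f}")
print(f"swinging force: x(t1) = {x_swing:.4f}")
```

For these particular values the constant force returns the body to the origin at $t_1$, because $x(t_1) = (F_0/k)(1 - \cos\omega t_1) = 0$, while the swinging force reaches $4F_0/k$. Both forces satisfy the constraint (1.4).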
Fig. 1.2 Possible graphs of function F(t) (horizontal axis: t from 0 to t1; vertical axis: F from -F0 to F0)

1.2 Subject and Problems of Optimal Control
Before proceeding to the systematic study of optimal control, it is useful to form a general idea of the subject of this science and its place among other mathematical disciplines. The objects of study in optimal control are mathematical models of controlled systems. By a mathematical model we understand a set of mathematical relations (differential or integral equations, recurrence relations, systems of equations and inequalities, etc.) that describe with some precision the motion of an object under the action of controls. In the mass-spring example, the mathematical model is given by the second-order differential equation (1.2) with initial conditions (1.3). A number of problems are traditionally associated with mathematical models of controlled systems: identification, the specification of the parameters of a mathematical model from the results of experiments or observations; controllability, the possibility of transferring the object from one position to another; observability, the restoration of unknown positions of the controlled object at certain times using the available information; existence, the solvability of the optimal control problem in a given class of controls; optimality criteria, the necessary and sufficient conditions for optimal control; invariance, the preservation of some characteristics of the controlled object under the action of perturbations; computational methods, the development of numerical methods for determining optimal controls; and a number of others. It is not possible to cover all of these problems in detail, so we briefly highlight only some of them and thoroughly analyze the problem of optimality criteria, which rightfully occupies a central place in the theory of optimal control.
To explain its role, we draw a parallel between the problem of optimal control and the problem, well known from mathematical analysis, of finding an extremum of a differentiable function $y = f(x)$ on a given interval $a < x < b$. By the necessary condition for an extremum, the first derivative of the function at an extremum point is equal to zero:

$$f'(x) = 0. \tag{1.5}$$

From the continuum of points x of the interval (a, b), condition (1.5) generally selects a finite set of points at which the function $y = f(x)$ could have a maximum or a minimum. Along with the required extremum points, this set can contain extraneous points, for example, points of inflection. The final screening of extraneous points is made by means of the sufficient conditions for an extremum, that is, by the change in the sign of the first derivative in a neighborhood of a suspected point or by the sign of the second derivative at that point. The necessary and sufficient conditions of optimality in optimal control have the same meaning. The necessary conditions determine the properties that distinguish optimal controls from non-optimal ones, and the sufficient optimality conditions allow us to learn which of the controls that meet the necessary conditions are really optimal. The further investigation of the problem of optimality criteria also concerns other issues: the effect of small changes in control on the solutions of differential equations and on the indicator of the quality of control (the objective functional); the solution of typical optimal control problems; the sufficient conditions of optimality; and the dynamic programming method.
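The two-stage screening described above can be sketched in a few lines. The example function $f(x) = 3x^4 - 4x^3$ on the interval (-1, 2) is our own choice for illustration, and the helper name `classify` is hypothetical. Its derivative $f'(x) = 12x^2(x - 1)$ vanishes at x = 0 and x = 1, so the necessary condition (1.5) yields two candidates; the sign change of $f'$ then separates the genuine extremum from the point of inflection.

```python
def classify(fprime, x0, eps=1e-3):
    """Apply the sufficient condition: look at the sign of f' on both sides of x0."""
    left, right = fprime(x0 - eps), fprime(x0 + eps)
    if left < 0 < right:
        return "minimum"
    if left > 0 > right:
        return "maximum"
    return "not an extremum"


def fprime(x):
    # Derivative of the illustrative function f(x) = 3x^4 - 4x^3.
    return 12 * x**3 - 12 * x**2


# Candidates from the necessary condition f'(x) = 12 x^2 (x - 1) = 0.
candidates = [0.0, 1.0]
labels = {x0: classify(fprime, x0) for x0 in candidates}
for x0, label in labels.items():
    print(f"x = {x0}: {label}")
```

Here x = 0 satisfies the necessary condition but is screened out, because $f'$ is negative on both sides of it, whereas x = 1 is confirmed as a minimum.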
1.3 Place of Optimal Control

Optimal control is regarded as a modern branch of the classical calculus of variations, the field of mathematics that emerged about three centuries ago at the junction of mechanics, mathematical analysis, and the theory of differential equations. The calculus of variations studies extremal problems in which it is necessary to find the maximum or the minimum of some numerical characteristic (functional) defined on a set of curves, surfaces, or other mathematical objects of a complex nature. The development of the calculus of variations is associated with the names of famous scientists, including Bernoulli, Euler, Newton, Lagrange, Weierstrass, Hamilton, and others. Optimal control problems differ from variational problems in the additional requirements imposed on the desired solution, and these requirements are sometimes difficult or even impossible to take into account with the methods of the calculus of variations. The need for practical methods stimulated the further development of the calculus of variations, which ultimately led to the formation of the modern theory of optimal control. This theory absorbed all previous achievements of the calculus of variations and was enriched with new results and new content. The central results of the theory, Pontryagin’s maximum principle and Bellman’s dynamic programming method, became widely known in the scientific and engineering community and are now widely used in various academic fields.
Chapter 2
Mathematical Model for Controlled Object
Abstract The basic concepts of optimal control (controlled object, control, trajectory, process, and mathematical model) are introduced. The question of the correctness of the mathematical model, that is, of the unambiguous description of processes, is discussed. We introduce the types of linear models and give illustrative examples.
2.1
Controlled Object
Consider a controlled object as a device equipped with «rudders» that is able to move in space at different speeds when the position of the rudders changes. The spatial position of the object at time $t$ will be characterized by real numbers $x_1, \dots, x_n$, and the position of the rudders by numbers $u_1, \dots, u_r$. The first are called state variables or phase variables, and the second are called control variables. The phase and control variables form the phase and control vectors $x = (x_1, \dots, x_n)$, $u = (u_1, \dots, u_r)$.
2.2
Control and Trajectory
With time, the phase and control vectors change, that is, they become functions $x = x(t)$, $u = u(t)$ of time $t$. A piecewise continuous function $u(t)$ that is defined on the real line $R$ and takes values in a given compact (closed and bounded) set $U \subset R^r$ is referred to as a control. Denote the set of all controls by $K(R \to U)$. Let us clarify this definition. Piecewise continuity of a control on the real line means that the control has a finite number of points of discontinuity, at which it has finite one-sided limits, and that it is continuous at all other points in the usual sense. In other words, piecewise continuity allows an instantaneous transition of the rudders of the controlled system from one position to another at given moments of time. Such a mathematical idealization is useful in building the theory and, in many cases, ensures the existence of optimal controls. The set of controls (the range) $U$ is introduced to take into consideration the real-life technological, technical or operational requirements for the rudder positions of a controlled object. The value of the control at the points of discontinuity does not play a significant role. To resolve the ambiguity, we assume that the control is continuous from the right:
$$u(t) = u(t + 0) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} u(t + \varepsilon).$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_2
Fig. 2.1 Continuous control from the right with three points of discontinuity (horizontal axis $t$ with points $\tau_1, \tau_2, \tau_3$; vertical axis $u$ with levels $-1$ and $1$)
Fig. 2.2 Causal relationship between control and trajectory (input $u(t)$, controlled system, output $x(t)$)
A typical graph of a control that is continuous from the right, with three points of discontinuity and the set of controls $U = [-1, 1]$, is shown in Fig. 2.1. Sometimes we need to consider the restriction of a control to a segment $[t_0, t_1] \subset R$ or, conversely, the extension of a control defined on the segment $[t_0, t_1]$ to the whole of $R$. In the first case, we assume that the control is continuous at the ends of the segment and put
$$u(t_0) = u(t_0 + 0), \quad u(t_1) = u(t_1 - 0).$$
In the second case, unless otherwise stated, we assume
$$u(t) = u(t_0 + 0),\ t < t_0; \qquad u(t) = u(t_1 - 0),\ t > t_1.$$
The curve in the space $R^n$ traced by the point $x(t)$ as time $t$ changes is referred to as the trajectory of the controlled object, or simply the trajectory, and a pair of functions $x(t)$, $u(t)$ is called a process. The roles of the functions that make up a process are not the same: the control is independent and primary, while the trajectory is the reaction of the controlled object to the action of the control, that is, it is dependent and secondary. The causal relationship between the control and the trajectory is shown schematically in Fig. 2.2.
2.3
Mathematical Model
A mathematical model of a controlled object is a description, by mathematical means, of the law that transforms controls into trajectories. Such a law can be defined using differential equations, recurrence relations or otherwise. We restrict ourselves to the class of mathematical models described by systems of ordinary differential equations in the normal form
$$\begin{aligned} \dot{x}_1 &= f_1(x_1, \dots, x_n, u_1, \dots, u_r, t), \\ &\ \ \vdots \\ \dot{x}_n &= f_n(x_1, \dots, x_n, u_1, \dots, u_r, t) \end{aligned}$$
or in vector notation
$$\dot{x} = f(x, u, t). \tag{2.1}$$
Here the dot denotes differentiation with respect to time $t$, and the symbols $\dot{x}$, $f(x, u, t)$ denote the vector $\dot{x} = (\dot{x}_1, \dots, \dot{x}_n)$ and the vector function $f(x, u, t) = (f_1(x, u, t), \dots, f_n(x, u, t))$. We make the following assumptions concerning the right side of Eq. (2.1):
1. the function $f(x, u, t)$ is defined on the Cartesian product $R^n \times U \times R$,
2. in its domain each coordinate function $f_i(x, u, t)$ and each partial derivative $f_{i x_j}(x, u, t) = \partial f_i(x, u, t) / \partial x_j$, $i, j = 1, \dots, n$, are continuous in all arguments.
We will use these assumptions further on without special mention.
2.4
Existence and Uniqueness of a Process
Choose some control $u(t)$ from the set $K(R \to U)$. Substituting this control into the right side of (2.1), we obtain a system of differential equations
$$\dot{x} = f(x, u(t), t) \tag{2.2}$$
whose right-hand side is piecewise continuous with respect to $t$. A solution of the system of differential equations (2.2) on an interval $I \subset R$ is a continuous function $x(t) : I \to R^n$ that satisfies the identity
$$\dot{x}(t) \equiv f(x(t), u(t), t) \tag{2.3}$$
for all points $t \in I$ with the possible exception of the points of discontinuity of the control $u(t)$. It follows from (2.3) that a solution $x(t)$ has a piecewise continuous first derivative on $I$. According to the well-known existence theorem [12], a solution of (2.2) exists locally on each interval of continuity of the control, but it is not unique. For example, the differential equation $\dot{x} = ux$ for $u = 1$ has the family of solutions $x(t) = c e^{t}$, $t \in I$, depending on the constant of integration $c$ and the interval $I \subset R$. Each of these solutions satisfies the identity (2.3) on $I$. To avoid the dependence of a solution on its interval of definition (that is, its domain), we consider the so-called non-extendable solutions, which are defined on maximal intervals. In the above example, the non-extendable solutions are defined on the whole line $R$. The dependence of the solutions on the constants of integration is eliminated by introducing the initial condition $x(t_0) = x_0$, which requires the trajectory to pass through the point $x_0 \in R^n$ at time $t_0$. Adding the initial condition to the system (2.2), we obtain the Cauchy problem
$$\dot{x} = f(x, u(t), t), \quad x(t_0) = x_0. \tag{2.4}$$
The non-extendable solution of the Cauchy problem (2.4) in the sense of the above definition is unique. It is first constructed on the closure of the interval of continuity of the control $u(t)$ that contains the point $t_0$, and is then continuously extended to the left and to the right over adjacent intervals of continuity of the control as far as possible. Uniqueness follows from this construction and from the uniqueness of the non-extendable solutions of (2.2) on the intervals of continuity of the control $u(t)$, which is guaranteed by the existence and uniqueness theorem [12]. To summarize: under the assumptions of Sect. 2.3, the mathematical model (2.1) of the controlled object assigns to each control $u(t)$ and each initial condition $x(t_0) = x_0$ a unique trajectory $x(t)$ defined on a maximal interval $I \subset R$.
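The construction just described (integrate on one interval of continuity of the control, then continue the state across each switching point) can be sketched numerically. The following is a minimal illustration and not part of the original text: the scalar model $\dot{x} = ux$ is the one from the example above, while the switching time, the step count and the helper names `rk4_segment` and `solve_cauchy` are assumptions made for the sketch.

```python
import math

def rk4_segment(f, x, t_start, t_end, steps):
    """Integrate x' = f(x, t) from t_start to t_end with the classical RK4 scheme."""
    h = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        k1 = f(x, t)
        k2 = f(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(x + h * k3, t + h)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def solve_cauchy(f, pieces, breakpoints, x0, steps=200):
    """Solve x' = f(x, u(t), t), x(breakpoints[0]) = x0, where the control equals
    the constant pieces[i] on [breakpoints[i], breakpoints[i+1]).
    The state is carried continuously across every switching point."""
    x = x0
    for u_i, a, b in zip(pieces, breakpoints[:-1], breakpoints[1:]):
        x = rk4_segment(lambda x_, t_, u=u_i: f(x_, u, t_), x, a, b, steps)
    return x

# Hypothetical data: x' = u x, x(0) = 1, with u = 1 on [0, 1) and u = -1 on [1, 2].
f = lambda x, u, t: u * x
x_mid = solve_cauchy(f, [1.0], [0.0, 1.0], 1.0)              # state at t = 1
x_end = solve_cauchy(f, [1.0, -1.0], [0.0, 1.0, 2.0], 1.0)   # state at t = 2
```

For this choice the exact trajectory is $e^{t}$ on $[0, 1]$ and $e^{2 - t}$ on $[1, 2]$, so `x_mid` should be close to $e$ and `x_end` close to 1; the trajectory is continuous at the switching point although its derivative jumps there.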
2.5
Linear Models
Important special cases of model (2.1) are the systems of differential equations of the form
$$\dot{x} = A(t)x + b(u, t), \qquad \dot{x} = A(t)x + B(t)u, \qquad \dot{x} = Ax + Bu.$$
These are referred to, respectively, as systems linear with respect to the state variables, linear systems with variable coefficients, and linear systems with constant coefficients (stationary systems). Here $x$, $\dot{x}$, $u$ are column vectors of dimensions $n$, $n$, $r$ respectively, $b(u, t)$ is a vector function with coordinate functions $b_i(u, t)$, $i = 1, \dots, n$, continuous on $U \times R$, and $A(t)$, $B(t)$ are matrices of size $n \times n$, $n \times r$ respectively with elements $a_{ij}(t)$, $b_{ik}(t)$, $i, j = 1, \dots, n$, $k = 1, \dots, r$, continuous on $R$.
The systems linear with respect to the state variables obviously include the linear systems with variable coefficients, and those, in turn, include the linear stationary systems. For linear systems, the law that transforms controls into trajectories can be written explicitly using the Cauchy formula, so they are easier to study and can be investigated more thoroughly than non-linear systems. In matrix notation, the linear stationary system has the form
$$\begin{pmatrix} \dot{x}_1 \\ \vdots \\ \dot{x}_n \end{pmatrix} = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} b_{11} & \dots & b_{1r} \\ \vdots & & \vdots \\ b_{n1} & \dots & b_{nr} \end{pmatrix} \begin{pmatrix} u_1 \\ \vdots \\ u_r \end{pmatrix}.$$
Carrying out the matrix operations according to the rules of linear algebra and equating the coordinates of the vectors on the left-hand and right-hand sides, we obtain
$$\begin{aligned} \dot{x}_1 &= \sum_{j=1}^{n} a_{1j} x_j + \sum_{k=1}^{r} b_{1k} u_k, \\ &\ \ \vdots \\ \dot{x}_n &= \sum_{j=1}^{n} a_{nj} x_j + \sum_{k=1}^{r} b_{nk} u_k \end{aligned}$$
or in shorthand
$$\dot{x}_i = \sum_{j=1}^{n} a_{ij} x_j + \sum_{k=1}^{r} b_{ik} u_k, \quad i = 1, \dots, n.$$
Similarly, we can write the remaining linear systems in coordinate form. A few words about notation. Going forward, we shall often use the vector-matrix representation of systems of differential equations and perform various operations with vectors and matrices. Therefore, let us agree to consider all vectors involved in operations and formulas as columns, even if they are written as rows to save space. We use the Euclidean norm in the space $R^n$ of vectors $x$. According to this agreement, $\|x\| = (x'x)^{1/2}$, where the prime denotes transposition.
2.6
Example
Let us illustrate the concepts introduced above with Example 1.1. The object of control is a mechanical mass-spring system. A mathematical model of the controlled object is the second-order linear differential equation
$$m\ddot{x} = -kx + F$$
with constant coefficients $m > 0$, $k > 0$. Introducing the new phase variables $x_1 = x$, $x_2 = \dot{x}$, we rewrite the second-order differential equation as a system of two first-order differential equations
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1 + bu, \tag{2.5}$$
where, in the notation of Example 1.1, $\omega = (k/m)^{1/2}$, $b = F_0/m$, $u = F/F_0$. The phase variables $x_1$, $x_2$ have the physical meaning of distance and velocity, and the control variable $u$ is a dimensionless quantity. Rewriting formulas (1.3) and (1.4) in the new notation, we obtain the initial conditions
$$x_1(0) = 0, \quad x_2(0) = 0 \tag{2.6}$$
for the system of Eq. (2.5), and the restriction $|u| \le 1$ on the control variable. The domain $U$ of control in this case is the segment $[-1, 1]$ of $R$. Let $u$ in Eq. (2.5) be a constant control. The corresponding general solution has the form
$$x_1 = \omega^{-2} bu + r\cos(\omega t + \varphi), \quad x_2 = -r\omega \sin(\omega t + \varphi),$$
where $r$, $\varphi$ are arbitrary constants. To find the constants of integration we use the initial conditions (2.6). We obtain the system of equations
$$\omega^{-2} bu + r\cos\varphi = 0, \quad -r\omega\sin\varphi = 0,$$
which has the solution $r = -\omega^{-2} bu$, $\varphi = 0$. With these values of the constants, the general solution yields the particular solution
$$x_1(t) = \omega^{-2} bu\,(1 - \cos\omega t), \quad x_2(t) = \omega^{-1} bu \sin\omega t. \tag{2.7}$$
This is the solution of the Cauchy problem (2.5), (2.6). It is easy to see that for $u \ne 0$ the functions (2.7) identically in $t$ satisfy the equation
$$\left(\frac{x_1 - \omega^{-2} bu}{\omega^{-2} bu}\right)^{2} + \left(\frac{x_2}{\omega^{-1} bu}\right)^{2} = 1,$$
i.e., the trajectory of the controlled object lies on the ellipse with center $x_1 = \omega^{-2} bu$, $x_2 = 0$ and semi-axes $\omega^{-2}|bu|$, $\omega^{-1}|bu|$ (Fig. 2.3). Note that model (2.5) of the controlled object is a linear stationary system of two differential equations with the matrices of coefficients
$$A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ b \end{pmatrix}.$$
Fig. 2.3 Under a constant control $u$, the point $x(t)$ with coordinates (2.7) moves along one of the ellipses in a clockwise direction (axes $x_1$, $x_2$; the ellipses for $u < 0$ and $u > 0$ are centered at $x_1 = \omega^{-2} bu$)
The eigenvalues of $A$ are purely imaginary, and from a mathematical point of view this causes the cyclic motion of the phase point along ellipses under constant control. Physically, the cycling is explained by the interaction of the two forces acting on the body. Let, for example, $u > 0$. Then the body is exposed to a constant force that is directed to the right and at first forces the body to move to the right from its initial position. As the body moves, the resistance force of the spring increases and begins to pull the body to the left. The body then moves back, the resistance of the spring changes, the constant force again pushes the body to the right, and so on. A similar pattern holds for $u < 0$.
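The closed-form solution (2.7) and the ellipse equation can be checked numerically. The sketch below is an illustration added for this purpose, not part of the original example; the numerical values of $\omega$, $b$, $u$ and the function names are assumptions, and the derivative is approximated by central differences.

```python
import math

def closed_form(t, u, omega, b):
    """Solution (2.7) of the Cauchy problem (2.5), (2.6) for a constant control u."""
    x1 = b * u * (1.0 - math.cos(omega * t)) / omega**2
    x2 = b * u * math.sin(omega * t) / omega
    return x1, x2

def ellipse_residual(x1, x2, u, omega, b):
    """Left-hand side of the ellipse equation minus 1 (zero on the trajectory)."""
    c = b * u / omega**2                      # center of the ellipse on the x1 axis
    return ((x1 - c) / c) ** 2 + (x2 / (b * u / omega)) ** 2 - 1.0

def rhs_residual(t, u, omega, b, h=1e-5):
    """Central-difference check of x1' = x2 and x2' = -omega^2 x1 + b u."""
    x1p, x2p = closed_form(t + h, u, omega, b)
    x1m, x2m = closed_form(t - h, u, omega, b)
    x1, x2 = closed_form(t, u, omega, b)
    r1 = (x1p - x1m) / (2 * h) - x2
    r2 = (x2p - x2m) / (2 * h) - (-omega**2 * x1 + b * u)
    return r1, r2

omega, b, u = 2.0, 3.0, 0.5                   # illustrative values, not from the text
samples = [0.1 * k for k in range(1, 40)]
max_ellipse = max(abs(ellipse_residual(*closed_form(t, u, omega, b), u, omega, b))
                  for t in samples)
max_rhs = max(max(abs(r) for r in rhs_residual(t, u, omega, b)) for t in samples)
start = closed_form(0.0, u, omega, b)         # should reproduce the initial conditions (2.6)
```

Both residuals should be at the level of rounding and finite-difference error, and `start` should equal the origin, in agreement with (2.6).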
Part II
Control of Linear Systems
Chapter 3
Reachability Set
Abstract For linear control systems, we prove the Cauchy formula, which represents the trajectory of the system by means of the fundamental matrix. We list the properties of the fundamental matrix, introduce the notion of the reachability set of a linear system and establish its basic properties: boundedness (limitation), convexity, closure, and continuity. We show how a special family of extreme controls is related to the boundary of the reachability set.
3.1
Cauchy Formula
Consider a system that is linear with respect to the state variable
$$\dot{x} = A(t)x + b(u, t), \quad x(t_0) = x_0. \tag{3.1}$$
We keep the assumptions of Sect. 2.5 concerning the matrix $A(t)$ and the vector function $b(u, t)$, and we consider the initial values $x_0 \in R^n$, $t_0 \in R$ to be given. Choose an arbitrary fixed control $u(t)$ and substitute it in Eq. (3.1). The result is the Cauchy problem
$$\dot{x} = A(t)x + b(u(t), t), \quad x(t_0) = x_0 \tag{3.2}$$
with a known function $b(u(t), t)$. According to the existence and uniqueness theorem [12], a solution $x(t)$ of the Cauchy problem (3.2) exists and is unique on the whole line $R$, where the coefficients are defined. Consequently, with the possible exception of the points of discontinuity of the control $u(t)$, the following identity holds:
$$\dot{x}(\tau) = A(\tau)x(\tau) + b(u(\tau), \tau), \quad \tau \in R.$$
Let $F(t, \tau)$ be an arbitrary square matrix of order $n$ that is continuous on $R \times R$ and has elements differentiable with respect to $\tau$. Multiplying the identity by $F(t, \tau)$ on the left, we obtain
$$F(t, \tau)\dot{x}(\tau) = F(t, \tau)A(\tau)x(\tau) + F(t, \tau)b(u(\tau), \tau). \tag{3.3}$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_3
Next, we will integrate and differentiate matrix functions of a scalar argument. The operations of integration and differentiation involve arithmetic operations with matrices (addition, subtraction and multiplication by real numbers). According to the rules of linear algebra, these operations are performed elementwise. Therefore, the integral or derivative of a matrix with respect to a scalar argument is naturally understood as the matrix of the same size composed of the integrals or derivatives of its elements. Many of the known properties of integrals and derivatives then carry over to integrals and derivatives of matrices. For example, provided the order of the factors is kept, the formula for the partial derivative of the product $F(t, \tau)x(\tau)$ with respect to $\tau$ holds:
$$[F(t, \tau)x(\tau)]_\tau = F_\tau(t, \tau)x(\tau) + F(t, \tau)\dot{x}(\tau),$$
as does the Newton-Leibniz formula
$$\int_{t_0}^{t} [F(t, \tau)x(\tau)]_\tau \, d\tau = F(t, t)x(t) - F(t, t_0)x(t_0).$$
In particular, it follows from the last two formulas that the analogue of the well-known formula for integration by parts is
$$\int_{t_0}^{t} F(t, \tau)\dot{x}(\tau)\, d\tau = F(t, t)x(t) - F(t, t_0)x(t_0) - \int_{t_0}^{t} F_\tau(t, \tau)x(\tau)\, d\tau.$$
Using the foregoing considerations, let us return to identity (3.3). Integrate this identity with respect to $\tau$ over the segment $[t_0, t]$. Then
$$\int_{t_0}^{t} F(t, \tau)\dot{x}(\tau)\, d\tau = \int_{t_0}^{t} F(t, \tau)A(\tau)x(\tau)\, d\tau + \int_{t_0}^{t} F(t, \tau)b(u(\tau), \tau)\, d\tau.$$
Replace the integral on the left-hand side of the equality by the formula for integration by parts. After grouping the terms, we obtain
$$F(t, t)x(t) = F(t, t_0)x(t_0) + \int_{t_0}^{t} [F_\tau(t, \tau) + F(t, \tau)A(\tau)]\,x(\tau)\, d\tau + \int_{t_0}^{t} F(t, \tau)b(u(\tau), \tau)\, d\tau.$$
Putting here
$$F_\tau(t, \tau) = -F(t, \tau)A(\tau), \quad F(t, t) = E \tag{3.4}$$
($E$ is the identity matrix of order $n$), and using the initial condition (3.2), we obtain the Cauchy formula
$$x(t) = F(t, t_0)x_0 + \int_{t_0}^{t} F(t, \tau)b(u(\tau), \tau)\, d\tau. \tag{3.5}$$
The matrix $F(t, \tau)$ is well defined as the solution of the linear matrix Cauchy problem (3.4). The existence, uniqueness, and differentiability of $F(t, \tau)$ with respect to the arguments $t$ and $\tau$ on $R \times R$ follow from the theory of linear differential equations [12]. Following tradition, we refer to $F(t, \tau)$ as the fundamental matrix of solutions of the homogeneous system of differential equations
$$\dot{x} = A(t)x \tag{3.6}$$
or simply as the fundamental matrix. By means of the fundamental matrix, the Cauchy formula expresses the solution of the Cauchy problem (3.2) explicitly in terms of the initial values $x_0$, $t_0$ and the non-homogeneous part $b(u(t), t)$ of the system of differential equations.
3.2
Properties of the Fundamental Matrix
We establish the properties of the fundamental matrix that are needed further on. Theorem 3.1 The fundamental matrix $F(t, \tau)$ for any real $t$, $s$, $\tau$ satisfies the conditions
$$F(t, s)F(s, \tau) = F(t, \tau), \tag{3.7}$$
$$F^{-1}(t, \tau) = F(\tau, t), \tag{3.8}$$
$$F_t(t, \tau) = A(t)F(t, \tau), \quad F(\tau, \tau) = E. \tag{3.9}$$
Fig. 3.1 The solutions of the system of differential equations (3.11) with initial values $c$, $t$ and $\varphi(s; c, t)$, $s$ coincide at a given moment of time $\tau$
Proof The fundamental matrix is defined by relations (3.4) for any $t, \tau \in R$. Transposing these relations and multiplying them on the right by an arbitrary constant vector $c \in R^n$, we obtain
$$F_\tau(t, \tau)'c = -A(\tau)'F(t, \tau)'c, \quad F(t, t)'c = c.$$
From this, we see that for fixed $t$, $c$ the function
$$\varphi(\tau; c, t) = F(t, \tau)'c \tag{3.10}$$
is a solution of the Cauchy problem
$$\dot{\psi}(\tau) = -A(\tau)'\psi(\tau), \quad \psi(t) = c. \tag{3.11}$$
By analogy, the function $\varphi(\tau; \varphi(s; c, t), s)$ is a solution of the Cauchy problem
$$\dot{\psi}(\tau) = -A(\tau)'\psi(\tau), \quad \psi(s) = \varphi(s; c, t).$$
The function $\varphi(\tau; c, t)$ also solves this second problem, because it satisfies the same differential equation and takes the value $\varphi(s; c, t)$ at $\tau = s$. By the uniqueness of the solution of the Cauchy problem, the two solutions coincide at any moment of time $\tau$:
$$\varphi(\tau; c, t) - \varphi(\tau; \varphi(s; c, t), s) = 0$$
(Fig. 3.1). Hence, using (3.10), we obtain $F(t, \tau)'c - F(s, \tau)'[F(t, s)'c] = 0$, or
$$[F(t, \tau) - F(t, s)F(s, \tau)]'c = 0.$$
The latter equality is valid for any vector $c$ if and only if the matrix in the brackets is zero:
$$F(t, \tau) - F(t, s)F(s, \tau) = 0.$$
As a result, we obtain the equality (3.7). Replace $\tau$ by $t$ and $s$ by $\tau$ in (3.7). Taking into account the initial condition (3.4), we get
$$F(t, \tau)F(\tau, t) = E. \tag{3.12}$$
Therefore, the matrix $F(\tau, t)$ is the inverse of the matrix $F(t, \tau)$, which proves (3.8). In the identity (3.12), the matrices $F(\tau, t)$, $F(t, \tau)$ are differentiable with respect to $t$: the first as a solution of the differential equation (3.4), and the second as the inverse of the first. Differentiating the identity (3.12) with respect to $t$ yields
$$F_t(t, \tau)F(\tau, t) + F(t, \tau)F_t(\tau, t) = 0.$$
From here
$$F_t(t, \tau) = -F(t, \tau)F_t(\tau, t)F^{-1}(\tau, t)$$
or, using Eq. (3.4) and formula (3.8),
$$F_t(t, \tau) = -F(t, \tau)[-F(\tau, t)A(t)]F(t, \tau).$$
Removing the brackets and using (3.12), we obtain the first of the relations (3.9); the second follows from the initial condition (3.4) on replacing $t$ by $\tau$. Thus, the theorem is proven.
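Properties (3.7) to (3.9) can be checked numerically for a concrete system. The sketch below is an added illustration, not part of the proof. It assumes the constant matrix $A$ of the mass-spring model of Sect. 2.6 and takes, as an assumption stated without derivation here, the closed-form matrix `F` written in the code, which can be verified directly against (3.4); the values of `OMEGA`, `t`, `s`, `tau` and the helper names are hypothetical.

```python
import math

OMEGA = 1.5                                   # illustrative frequency (assumption)
A = [[0.0, 1.0], [-OMEGA**2, 0.0]]            # oscillator matrix from Sect. 2.6

def F(t, tau):
    """Candidate fundamental matrix of x' = A x for the oscillator."""
    c, s = math.cos(OMEGA * (t - tau)), math.sin(OMEGA * (t - tau))
    return [[c, s / OMEGA], [-OMEGA * s, c]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

def dF_dt(t, tau, h=1e-6):
    """Central-difference approximation of the partial derivative F_t(t, tau)."""
    Fp, Fm = F(t + h, tau), F(t - h, tau)
    return [[(Fp[i][j] - Fm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

t, s, tau = 0.7, -0.4, 1.9
E = [[1.0, 0.0], [0.0, 1.0]]
err_37 = max_diff(matmul(F(t, s), F(s, tau)), F(t, tau))   # composition rule (3.7)
err_38 = max_diff(matmul(F(t, tau), F(tau, t)), E)         # inverse rule (3.8), via (3.12)
err_39 = max_diff(dF_dt(t, tau), matmul(A, F(t, tau)))     # differential equation (3.9)
```

The first two errors should be at rounding level; the third also contains the finite-difference error of `dF_dt`.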
3.3
Examples
The construction of a fundamental matrix $F(t, \tau)$ can be reduced to the solution of simpler vector Cauchy problems. Let $e^1, \dots, e^n$ be the columns of the identity matrix $E$ of order $n$. Multiplying the equalities (3.9) on the right by the vector $e^i$, $i = 1, \dots, n$, we obtain
$$[F(t, \tau)e^i]_t = A(t)[F(t, \tau)e^i], \quad F(t, \tau)e^i\big|_{t = \tau} = e^i.$$
This shows that the $i$-th column $x = F(t, \tau)e^i$ of the fundamental matrix is the solution of the Cauchy problem
$$\dot{x} = A(t)x, \quad x|_{t = \tau} = e^i, \quad i = 1, \dots, n.$$
Example 3.1 Find the solution $x(t)$ of the Cauchy problem
$$\dot{x} = a(t)x + b(t), \quad x(t_0) = x_0$$
with continuous coefficients $a(t)$, $b(t)$. Here $n = 1$, $A(t) = a(t)$, $b(u, t) = b(t)$. Conditions (3.9), which define the fundamental matrix, become
$$F_t = a(t)F, \quad F|_{t = \tau} = 1.$$
The solution of this Cauchy problem is the exponential function
$$F(t, \tau) = \exp\left(\int_{\tau}^{t} a(s)\, ds\right).$$
By the Cauchy formula (3.5), we obtain
$$x(t) = \exp\left(\int_{t_0}^{t} a(s)\, ds\right)x_0 + \int_{t_0}^{t} \exp\left(\int_{\tau}^{t} a(s)\, ds\right)b(\tau)\, d\tau.$$
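The formula of Example 3.1 can be evaluated numerically and compared with a case in which the answer is known in closed form. In the sketch below, which is an added illustration, the coefficients $a(t) = 2t$, $b(t) = t$, the data $t_0 = 0$, $x_0 = 1$, the composite Simpson rule and the function names are all assumptions; for this choice, direct substitution into the differential equation shows that $x(t) = e^{t^2}x_0 + (e^{t^2} - 1)/2$.

```python
import math

def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    if lo == hi:
        return 0.0
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def cauchy_formula(a, b, t0, x0, t):
    """x(t) = F(t, t0) x0 + integral over [t0, t] of F(t, tau) b(tau) dtau,
    where F(t, tau) = exp(integral over [tau, t] of a(s) ds)."""
    F = lambda t_, tau: math.exp(simpson(a, tau, t_))
    return F(t, t0) * x0 + simpson(lambda tau: F(t, tau) * b(tau), t0, t)

a = lambda s: 2.0 * s                          # illustrative coefficient a(t) = 2t
b = lambda s: s                                # illustrative coefficient b(t) = t
x_num = cauchy_formula(a, b, 0.0, 1.0, 1.0)    # value x(1) from the Cauchy formula
x_exact = math.exp(1.0) * 1.0 + (math.exp(1.0) - 1.0) / 2.0
```

The two values should agree up to the quadrature error of the outer integral.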
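Example 3.2 Solve the Cauchy problem
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1 + bu(t), \quad x_1(0) = 0, \quad x_2(0) = 0 \tag{3.13}$$
from Example 1.1. Here $n = 2$, $r = 1$,
$$A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ b \end{pmatrix}, \quad t_0 = 0, \quad x_0 = 0,$$
and $u(t)$ is an arbitrary fixed control. The homogeneous system of differential equations has the form
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1$$
and has the general solution
$$x_1 = r\cos(\omega t + \varphi), \quad x_2 = -r\omega\sin(\omega t + \varphi) \tag{3.14}$$
with constants of integration $r$, $\varphi$. We first find the first column $F(t, \tau)e^1$ of the fundamental matrix. For this we impose the initial conditions
$$x_1|_{t = \tau} = r\cos(\omega\tau + \varphi) = 1, \quad x_2|_{t = \tau} = -r\omega\sin(\omega\tau + \varphi) = 0.$$
Hence, we find $r = 1$, $\varphi = -\omega\tau$. Substituting these values into the formula (3.14), we obtain
$$F(t, \tau)e^1 = \begin{pmatrix} \cos\omega(t - \tau) \\ -\omega\sin\omega(t - \tau) \end{pmatrix}.$$
In a similar manner, the second column $F(t, \tau)e^2$ of the fundamental matrix can be found, resulting in
$$F(t, \tau) = \left(F(t, \tau)e^1,\ F(t, \tau)e^2\right) = \begin{pmatrix} \cos\omega(t - \tau) & \omega^{-1}\sin\omega(t - \tau) \\ -\omega\sin\omega(t - \tau) & \cos\omega(t - \tau) \end{pmatrix}.$$
We write the solution of the Cauchy problem (3.13) by formula (3.5):
$$\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} = \int_{0}^{t} \begin{pmatrix} \cos\omega(t - \tau) & \omega^{-1}\sin\omega(t - \tau) \\ -\omega\sin\omega(t - \tau) & \cos\omega(t - \tau) \end{pmatrix} \begin{pmatrix} 0 \\ b \end{pmatrix} u(\tau)\, d\tau,$$
and in coordinate form
$$x_1(t) = \omega^{-1} b \int_{0}^{t} \sin\omega(t - \tau)\,u(\tau)\, d\tau, \quad x_2(t) = b \int_{0}^{t} \cos\omega(t - \tau)\,u(\tau)\, d\tau.$$
For a constant control $u$ these formulas reduce to (2.7).
3.4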
Definition of a Reachability Set
Consider a state-linear model of a controlled object
$$\dot{x} = A(t)x + b(u, t), \quad x(t_0) = x_0. \tag{3.15}$$
For any fixed control $u(t)$, we write the corresponding solution of the Cauchy problem (3.15) by the Cauchy formula
$$x(t) = F(t, t_0)x_0 + \int_{t_0}^{t} F(t, \tau)b(u(\tau), \tau)\, d\tau$$
with the aid of the fundamental matrix $F(t, \tau)$. Fixing a moment of time $t = t_1$, $t_1 \ge t_0$, we obtain the point
$$x(t_1) = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)b(u(t), t)\, dt \tag{3.16}$$
on the trajectory $x(t)$. If the control $u(t)$ runs over the entire class $K(R \to U)$, then the points (3.16) fill some set $Q(t_1)$ in the space $R^n$ (Fig. 3.2). This set is referred to as the reachability set of the system (3.15) at time $t_1$.
Fig. 3.2 The reachability set $Q(t_1)$
Example 3.3 Construct the reachability set $Q(1)$ of the simplest linear system
$$\dot{x}_1 = u_1, \quad \dot{x}_2 = u_2, \quad x_1(0) = 0, \quad x_2(0) = 0, \quad |u_1| \le 1, \quad |u_2| \le 1.$$
Substitute an arbitrary fixed control $u(t) = (u_1(t), u_2(t))$ into the given differential equations. By direct integration, we find the coordinates of the corresponding point of the reachability set $Q(1)$:
$$x_1(1) = \int_{0}^{1} u_1(t)\, dt, \quad x_2(1) = \int_{0}^{1} u_2(t)\, dt.$$
The properties of the definite integral and the constraints on the control give the following estimates of the coordinates:
$$|x_1(1)| \le \int_{0}^{1} |u_1(t)|\, dt \le 1, \quad |x_2(1)| \le \int_{0}^{1} |u_2(t)|\, dt \le 1.$$
These estimates are sharp: they are attained on the constant controls $u(t) = (\pm 1, \pm 1)$. Moreover, every point $(v_1, v_2)$ with $|v_1| \le 1$, $|v_2| \le 1$ is reached by the constant control $u(t) = (v_1, v_2)$. Therefore, the reachability set is the closed square $[-1, 1] \times [-1, 1]$ on the phase plane (Fig. 3.3).
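Example 3.3 can be probed by sampling controls. The sketch below is an added illustration under stated assumptions: it restricts attention to piecewise-constant controls on equal subintervals of $[0, 1]$ (a subclass of the admissible controls), draws their values at random, and uses the hypothetical helper `endpoint`; for such controls the integrals above are simply averages of the values.

```python
import random

def endpoint(u1_values, u2_values):
    """x(1) for x1' = u1, x2' = u2, x(0) = 0, when each control is piecewise
    constant with the listed values on equal subintervals of [0, 1]."""
    n = len(u1_values)
    return sum(u1_values) / n, sum(u2_values) / n

random.seed(0)
points = []
for _ in range(1000):
    n = random.randint(1, 5)                       # number of constant pieces
    u1 = [random.uniform(-1, 1) for _ in range(n)]
    u2 = [random.uniform(-1, 1) for _ in range(n)]
    points.append(endpoint(u1, u2))

# Every sampled endpoint should lie in the square [-1, 1] x [-1, 1].
inside = all(abs(x1) <= 1 and abs(x2) <= 1 for x1, x2 in points)

# The four corners are reached by the constant controls (+-1, +-1).
corners = [endpoint([s1], [s2]) for s1 in (-1, 1) for s2 in (-1, 1)]
```

Sampling can only suggest the shape of $Q(1)$; the argument in the example is what establishes that the set is exactly the square.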
Fig. 3.3 Reachability set $Q(1)$ in Example 3.3 (the square $[-1, 1] \times [-1, 1]$ in the plane of $x_1$, $x_2$)
3.5
Limitation and Convexity
For simplicity we confine ourselves to the linear model
$$\dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0, \quad u \in U \tag{3.17}$$
with a convex compact domain of control $U \subset R^r$. Convexity of the set $U$ means that, together with any two of its points $u^1$, $u^2$, it contains the segment $[u^1, u^2]$ connecting them.
3.5.1
Limitation
We first obtain an estimate of the norm of a definite integral. Let $\varphi(s)$ be a function of the class $C([a, b] \to R^n)$. Divide the segment $[a, b]$ by points $a = s_0 < s_1 < \dots < s_{m+1} = b$ into partial segments $[s_k, s_{k+1}]$ of length $\Delta s_k = s_{k+1} - s_k$, $k = 0, \dots, m$. Choose an arbitrary sample point $\sigma_k$ in each segment $[s_k, s_{k+1}]$ and form the integral sum $\sum_{k=0}^{m} \varphi(\sigma_k)\Delta s_k$. By the triangle inequality for the norm of a vector,
$$\left\| \sum_{k=0}^{m} \varphi(\sigma_k)\Delta s_k \right\| \le \sum_{k=0}^{m} \|\varphi(\sigma_k)\|\Delta s_k.$$
Due to the continuity of the norm, passing to the limit in the last inequality as $m \to \infty$ (with the lengths of the partial segments tending to zero), we get the required estimate
$$\left\| \int_{a}^{b} \varphi(s)\, ds \right\| \le \int_{a}^{b} \|\varphi(s)\|\, ds.$$
Let us check that the set $Q(t_1)$ is bounded. By definition, any point $x \in Q(t_1)$ can be represented in the form
$$x = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u(t)\, dt,$$
where $u(t)$ is the corresponding control. Therefore, the triangle inequality and the estimate of the norm of the integral give
$$\|x\| = \left\| F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u(t)\, dt \right\| \le \|F(t_1, t_0)x_0\| + \left\| \int_{t_0}^{t_1} F(t_1, t)B(t)u(t)\, dt \right\| \le \|F(t_1, t_0)x_0\| + \int_{t_0}^{t_1} \|F(t_1, t)B(t)u(t)\|\, dt.$$
Since the range of control $U$ is bounded and the matrices $F(t_1, t)$, $B(t)$ are continuous, the integrand $\|F(t_1, t)B(t)u(t)\|$ is bounded on the segment $[t_0, t_1]$ uniformly over all controls $u(t)$. Consequently, there is a constant $C > 0$, independent of the control, such that
$$\|F(t_1, t_0)x_0\| + \int_{t_0}^{t_1} \|F(t_1, t)B(t)u(t)\|\, dt \le C.$$
From this and the previous inequality, we obtain $\|x\| \le C$ for each point $x \in Q(t_1)$. Consequently, the set $Q(t_1)$ is bounded.
3.5.2
Convexity
Let us show that the set $Q(t_1)$, along with any two of its points $x^1$, $x^2$, contains all points $x = (1 - \lambda)x^1 + \lambda x^2$, $0 \le \lambda \le 1$, of the segment $[x^1, x^2]$. Choose an arbitrary number $\lambda \in [0, 1]$ and points $x^1, x^2 \in Q(t_1)$ generated by some controls $u^1(t)$, $u^2(t)$. By the Cauchy formula
$$x^1 = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u^1(t)\, dt, \tag{3.18}$$
$$x^2 = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u^2(t)\, dt. \tag{3.19}$$
We multiply equalities (3.18) and (3.19) by $1 - \lambda$ and $\lambda$ respectively and add them. Then
$$(1 - \lambda)x^1 + \lambda x^2 = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)\left[(1 - \lambda)u^1(t) + \lambda u^2(t)\right] dt.$$
The function $u(t) = (1 - \lambda)u^1(t) + \lambda u^2(t)$, formed as a sum of piecewise continuous functions, is also piecewise continuous on $R$, and by the convexity of $U$ it takes values in the set $U$ for every $t \in R$; that is, $u(t)$ is a control. Then the inclusion $(1 - \lambda)x^1 + \lambda x^2 \in Q(t_1)$ follows from the last equality. Since the points $x^1$, $x^2$ of $Q(t_1)$ and the number $\lambda$ of $[0, 1]$ were chosen arbitrarily, the convexity of the reachability set is proven.
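The key step of the convexity argument, that the endpoint $x(t_1)$ depends affinely on the control, can be observed numerically. The sketch below is an added illustration with assumed data: the oscillator $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1 + u$ with $U = [-1, 1]$, zero initial state, $t_1 = 2$, two sample controls and the weight $\lambda = 0.3$; the RK4 integrator and the name `endpoint` are likewise hypothetical.

```python
import math

def endpoint(u, t1=2.0, steps=400):
    """RK4 approximation of x(t1) for x1' = x2, x2' = -x1 + u(t), x(0) = (0, 0)."""
    f = lambda x, t: (x[1], -x[0] + u(t))
    h = t1 / steps
    x, t = (0.0, 0.0), 0.0
    for _ in range(steps):
        k1 = f(x, t)
        k2 = f((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]), t + 0.5 * h)
        k3 = f((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]), t + 0.5 * h)
        k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]), t + h)
        x = tuple(x[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                  for i in range(2))
        t += h
    return x

u1 = lambda t: 1.0 if t < 1.0 else -1.0        # piecewise-constant control in [-1, 1]
u2 = lambda t: math.cos(3.0 * t)               # continuous control in [-1, 1]
lam = 0.3
u_mix = lambda t: (1 - lam) * u1(t) + lam * u2(t)   # convex combination of controls

xa, xb, xm = endpoint(u1), endpoint(u2), endpoint(u_mix)
# Distance between x(t1) for the mixed control and the same combination of endpoints.
gap = max(abs(xm[i] - ((1 - lam) * xa[i] + lam * xb[i])) for i in range(2))
```

Because the scheme is applied to a linear system with the same time nodes for all three controls, the discrete endpoint is itself affine in the control, so `gap` should be at rounding level.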
3.6
Closure
In general, a reachability set is not closed in the class of piecewise continuous controls. This can be illustrated by the following example. Example 3.4 Let the model of the controlled object be given in the form
$$\dot{x} = b(t)u, \quad x(0) = 0, \quad |u| \le 1,$$
where the function
$$b(t) = t\sin\frac{1}{t}, \ t \ne 0, \qquad b(0) = 0$$
is continuous on $R$ and has a countable set of roots in any small neighborhood of zero. Let $Q(1)$ be the set of points $x = \int_{0}^{1} b(t)u(t)\, dt$ corresponding to all controls $u(t)$ of the class $K([0, 1] \to [-1, 1])$. Then
$$|x| = \left| \int_{0}^{1} b(t)u(t)\, dt \right| \le \int_{0}^{1} |b(t)u(t)|\, dt = \int_{0}^{1} |b(t)||u(t)|\, dt \le \int_{0}^{1} |b(t)|\, dt = B$$
and, consequently, $Q(1) \subset [-B, B]$. Construct the sequence of controls
$$u_k(t) = 0, \ 0 \le t < \frac{1}{k}; \qquad u_k(t) = \operatorname{sign} b(t), \ \frac{1}{k} \le t \le 1, \quad k = 1, 2, \dots$$
and the corresponding sequence of points
$$x_k = \int_{0}^{1} b(t)u_k(t)\, dt = \int_{1/k}^{1} |b(t)|\, dt = B - \int_{0}^{1/k} |b(t)|\, dt$$
of the set $Q(1)$. Obviously, $x_k \to B$. If we replace $u_k(t)$ by $-u_k(t)$, then the corresponding sequence of points $-x_k \in Q(1)$ converges to $-B$. From the convexity of the reachability set, it follows that $(-B, B) \subset Q(1)$. The exact equality $\int_{0}^{1} b(t)u(t)\, dt = B$ would require the function $u(t) = \operatorname{sign} b(t)$, $0 \le t \le 1$, which has a countable set of points of discontinuity and therefore is not a member of the class $K([0, 1] \to [-1, 1])$. Hence, $B \notin Q(1)$. By analogy, $-B \notin Q(1)$. As a result, we obtain $Q(1) = (-B, B)$. As can be seen, the non-closure of the reachability set in the example is caused by the incompleteness of the class of controls. A natural question then arises: will the reachability set be closed in the class of piecewise continuous controls if we impose some additional conditions on the linear model (3.17)? A positive answer to the question is given in terms of the regularity conditions. Let us refer to a unit vector $c \in R^n$ as a direction. Note that the set of all directions forms the sphere $C \subset R^n$ of radius 1 centered at the origin. By the Weierstrass theorem, for arbitrary fixed $c \in C$, $t \in [t_0, t_1]$ the linear function $u \to c'F(t_1, t)B(t)u$ has a maximum point on the compact set $U$:
$$u(t, c) = \arg\max_{u \in U} c'F(t_1, t)B(t)u. \tag{3.20}$$
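Before turning to regularity, the sequence $x_k$ of Example 3.4 can be examined numerically. The sketch below is an added illustration; the midpoint rule, the grid size `N` and the chosen values of $k$ are assumptions, and the quadrature only approximates the integrals. Since $|b(t)| \le t$, the gap $B - x_k = \int_{0}^{1/k} |b(t)|\, dt$ is at most $1/(2k^2)$, which the computed values should reflect.

```python
import math

def b(t):
    """b(t) = t sin(1/t) for t != 0, b(0) = 0, as in Example 3.4."""
    return t * math.sin(1.0 / t) if t != 0.0 else 0.0

N = 100_000
h = 1.0 / N
mid = [(i + 0.5) * h for i in range(N)]          # midpoint-rule nodes on [0, 1]

B = sum(abs(b(t)) for t in mid) * h              # approximates the integral of |b| over [0, 1]

def x_k(k):
    """Approximates x_k, the integral of |b| over [1/k, 1]
    (the control u_k is zero before 1/k and sign b(t) afterwards)."""
    return sum(abs(b(t)) for t in mid if t >= 1.0 / k) * h

ks = (1, 2, 5, 10, 50)
gaps = [B - x_k(k) for k in ks]                  # approximations of B - x_k
```

The gaps decrease toward zero as $k$ grows, while no piecewise continuous control closes the gap exactly.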
We say that the linear model (3.17) is regular on the segment $[t_0, t_1]$ if for any $c \in C$ the maximum point (3.20) is unique for every $t \in [t_0, t_1]$, except, possibly, for a finite subset of points $T(c) \subset [t_0, t_1]$. By Theorem A.3.1 in the Appendix, the function $t \to u(t, c)$ is continuous on the set $[t_0, t_1] \setminus T(c)$. We define it at the points of $T(c)$ by continuity from the right and extend it by constant values beyond the segment $[t_0, t_1]$. As a result, we obtain a control; we denote it by $u(t, c)$ and refer to it as an extreme control. The Cauchy formula
$$x(c) = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u(t, c)\, dt \tag{3.21}$$
assigns a point $x(c)$ of the reachability set $Q(t_1)$ to each extreme control $u(t, c)$. This defines a map of the unit sphere $C$ into the set $Q(t_1)$. Let us find some of its properties. Lemma 3.1 Under the regularity condition, the formula (3.21) assigns to each direction $c$ a unique extreme point $x(c)$ of the set $Q(t_1)$ with the property
$$c'x(c) > c'x, \quad x \in Q(t_1), \quad x \ne x(c). \tag{3.22}$$
Proof We fix an arbitrary direction $c$, an arbitrary control $u(t)$ and the corresponding point
$$x = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u(t)\, dt \tag{3.23}$$
of the set $Q(t_1)$. If $x \ne x(c)$, then the controls $u(t)$ and $u(t, c)$ differ in at least one common point $s \in (t_0, t_1)$ of their continuity. Therefore, by the properties of continuous functions, $u(t) \ne u(t, c)$ in a neighborhood $S \subset (t_0, t_1)$ of the point $s$. By the regularity condition, we have
$$c'F(t_1, t)B(t)u(t, c) - c'F(t_1, t)B(t)u(t) \ \begin{cases} > 0, & t \in S, \\ \ge 0, & t \in [t_0, t_1] \setminus S. \end{cases}$$
Integrating these inequalities over the segment $[t_0, t_1]$, we obtain
$$\int_{t_0}^{t_1} \left[c'F(t_1, t)B(t)u(t, c) - c'F(t_1, t)B(t)u(t)\right] dt > 0.$$
We then add the term $c'F(t_1, t_0)x_0$ to the left-hand side and subtract it. Rewriting the result in the notation (3.21), (3.23), we obtain the inequality $c'x(c) - c'x > 0$, which is equivalent to (3.22). We now show that $x(c)$ is an extreme point of the set $Q(t_1)$, i.e., the equality
$$x(c) = \frac{1}{2}x^1 + \frac{1}{2}x^2$$
is impossible for any $x^1, x^2 \in Q(t_1)$ other than $x(c)$. Otherwise, taking into account the inequalities
$$c'x^1 < c'x(c), \quad c'x^2 < c'x(c)$$
that follow from (3.22), we arrive at the contradictory inequality
$$c'x(c) = c'\left(\frac{1}{2}x^1 + \frac{1}{2}x^2\right) = \frac{1}{2}c'x^1 + \frac{1}{2}c'x^2 < \frac{1}{2}c'x(c) + \frac{1}{2}c'x(c) = c'x(c).$$
Finally, if we assume that two different extreme points $x^1(c)$, $x^2(c)$ of the set $Q(t_1)$ correspond to one direction $c$, then, in view of (3.22), we obtain the contradiction
$$c'x^1(c) > c'x^2(c), \quad c'x^2(c) > c'x^1(c).$$
Hence the point $x(c)$ is unique, and the lemma is proven. Lemma 3.2 Under the regularity assumption, the function $x(c)$ defined by (3.21) is continuous on $C$. Proof By Lemma 3.1, the maximum of the linear function $c'x$ on $Q(t_1)$ is attained for every $c \in C$ at the unique point $x(c)$:
$$M(c) = \max_{x \in Q(t_1)} c'x = c'x(c).$$
The desired result now follows from Theorem A.3.1 and Remark A.3.2. Theorem 3.2 Under the regularity conditions, the set $Q(t_1)$ is closed. Proof By Lemma 3.2, the function $x(c)$ is continuous on the unit sphere $C$, and thus its range $x(C)$ is compact. According to Lemma 3.1, $x(C)$ consists of extreme points of the set $Q(t_1)$. We construct the convex hull $X = \operatorname{co} x(C)$. By Theorem A.2.5, the set $X$ is compact, and by construction and the convexity of $Q(t_1)$ it satisfies the inclusion $X \subset Q(t_1)$. Suppose there is a point $\hat{x} \in Q(t_1)$ that does not belong to $X$. By Theorem A.2.1, the point $\hat{x}$ is strictly separated from the compact set $X$ by some plane with normal vector $c$, i.e.,
$$c'\hat{x} > c'x, \quad x \in X.$$
Setting $x = x(c)$ in this inequality, we obtain $c'\hat{x} > c'x(c)$, which contradicts Lemma 3.1. Therefore, $X = Q(t_1)$. Since $X$ is compact, the set $Q(t_1)$ is closed, and the theorem is proved.
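For a scalar control with $U = [-1, 1]$, the maximizer in (3.20) is the sign of the coefficient $c'F(t_1, t)B(t)$, so an extreme control and the point $x(c)$ of (3.21) are easy to compute. The sketch below is an added illustration under assumptions: it uses the oscillator of Sect. 2.6 with $x_0 = 0$, the closed-form product $F(t_1, t)B$ written in `column` (taken as given), a midpoint quadrature, and hypothetical parameter values and helper names. It compares $c'x(c)$ with $c'x$ for randomly generated piecewise-constant controls, in the spirit of (3.22).

```python
import math
import random

OMEGA, B_COEF, T1, N = 1.0, 1.0, 2.0, 2000       # illustrative values (assumptions)
H = T1 / N
NODES = [(i + 0.5) * H for i in range(N)]        # midpoint quadrature nodes on [0, T1]

def column(t):
    """F(T1, t) B for the oscillator with B = (0, b)' (closed form assumed)."""
    d = OMEGA * (T1 - t)
    return (B_COEF * math.sin(d) / OMEGA, B_COEF * math.cos(d))

def endpoint(u):
    """x(T1) from the Cauchy formula with x0 = 0, approximated by the midpoint rule."""
    cols = [column(t) for t in NODES]
    return tuple(sum(col[i] * u(t) for col, t in zip(cols, NODES)) * H
                 for i in range(2))

def extreme_control(c):
    """u(t, c): maximizer of c'F(T1, t)B u over |u| <= 1, i.e. the sign of c'F(T1, t)B."""
    def u(t):
        g = sum(ci * gi for ci, gi in zip(c, column(t)))
        return 1.0 if g >= 0 else -1.0
    return u

random.seed(1)
c = (math.cos(0.8), math.sin(0.8))               # a direction (unit vector)
x_c = endpoint(extreme_control(c))
support = c[0] * x_c[0] + c[1] * x_c[1]          # approximates c'x(c)

def random_control():
    """A piecewise-constant control with four random values in [-1, 1]."""
    vals = [random.uniform(-1, 1) for _ in range(4)]
    return lambda t: vals[min(int(4 * t / T1), 3)]

others = [endpoint(random_control()) for _ in range(50)]
best_other = max(c[0] * x[0] + c[1] * x[1] for x in others)
```

In this discretized setting `support` should not be smaller than `best_other`, since the sign control maximizes the integrand at every node.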
3.7 Continuity
A reachability set $Q(t_1)$ can be interpreted as the value at $t = t_1$ of a multi-valued function $t \to Q(t)$ that assigns a certain set $Q(t) \subset R^n$ to each moment of time $t \ge t_0$. We show that $Q(t)$ depends continuously on $t$. We first introduce the necessary formal concepts. The symbol $Q_\varepsilon(t_1)$ denotes the $\varepsilon$-neighborhood of the set $Q(t_1)$, that is, the union of all open balls of radius $\varepsilon > 0$ centered at points of $Q(t_1)$ (Fig. 3.4).
Fig. 3.4 The union of all open balls of radius $\varepsilon > 0$ centered at points of $Q(t_1)$ constitutes the $\varepsilon$-neighborhood $Q_\varepsilon(t_1)$ of the set $Q(t_1)$
The multi-valued function $Q(t)$ is said to be continuous at the time $t_1 > t_0$ if for any $\varepsilon > 0$ there is $\delta > 0$ such that, when $|t - t_1| < \delta$, the following two inclusions hold:

$$Q(t) \subset Q_\varepsilon(t_1), \quad Q(t_1) \subset Q_\varepsilon(t). \qquad (3.24)$$
This definition generalizes the concept of continuity of a function at a given point. If, for example, the range of control $U$ consists of a single point, then $Q(t)$ is a vector function with values in $R^n$. Then the requirements (3.24) are equivalent to the inequality $\|Q(t) - Q(t_1)\| < \varepsilon$ and are analogous to the usual condition of proximity of the values of a vector function.

We show the continuity of the multi-valued function $Q(t)$ at any given time $t_1 > t_0$. We choose a moment of time $t > t_0$ and, for an arbitrary control $u(\tau)$, the corresponding point

$$x(t) = F(t,t_0)x_0 + \int_{t_0}^{t} F(t,\tau)B(\tau)u(\tau)\,d\tau$$

of the set $Q(t)$. Using the properties of the fundamental matrix, we represent $x(t)$ in the form

$$x(t) = F(t,t_1)F(t_1,t_0)x_0 + \int_{t_0}^{t} F(t,t_1)F(t_1,\tau)B(\tau)u(\tau)\,d\tau = F(t,t_1)\left[F(t_1,t_0)x_0 + \int_{t_0}^{t} F(t_1,\tau)B(\tau)u(\tau)\,d\tau\right].$$
Add and subtract the integral $\int_{t}^{t_1} F(t_1,\tau)B(\tau)u(\tau)\,d\tau$ in the brackets. Taking into consideration that, by the Cauchy formula,

$$x(t_1) = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,\tau)B(\tau)u(\tau)\,d\tau,$$

we obtain

$$x(t) = F(t,t_1)\left[x(t_1) - \int_{t}^{t_1} F(t_1,\tau)B(\tau)u(\tau)\,d\tau\right].$$

Consequently,

$$x(t) - x(t_1) = (F(t,t_1) - E)\,x(t_1) - \int_{t}^{t_1} F(t,\tau)B(\tau)u(\tau)\,d\tau.$$
From the known properties of the norm, we have

$$\begin{aligned}
\|x(t) - x(t_1)\| &= \left\|(F(t,t_1) - E)x(t_1) - \int_{t}^{t_1} F(t,\tau)B(\tau)u(\tau)\,d\tau\right\| \\
&\le \|(F(t,t_1) - E)x(t_1)\| + \left\|\int_{t}^{t_1} F(t,\tau)B(\tau)u(\tau)\,d\tau\right\| \\
&\le \|(F(t,t_1) - E)x(t_1)\| + \left|\int_{t}^{t_1} \|F(t,\tau)B(\tau)u(\tau)\|\,d\tau\right| \\
&\le \|F(t,t_1) - E\|\,\|x(t_1)\| + \left|\int_{t}^{t_1} \|F(t,\tau)B(\tau)\|\,\|u(\tau)\|\,d\tau\right|.
\end{aligned}$$

Since the sets $Q(t_1)$ and $U$ are bounded, there exist positive constants $\alpha$, $\beta$ such that

$$\|x(t_1)\| \le \alpha; \quad \|u(\tau)\| \le \beta, \quad -\infty < \tau < \infty.$$

Then we strengthen the previous estimate to obtain

$$\|x(t) - x(t_1)\| \le \varphi(t), \qquad \varphi(t) = \alpha\|F(t,t_1) - E\| + \beta\left|\int_{t}^{t_1} \|F(t,\tau)B(\tau)\|\,d\tau\right|.$$

Since the matrix $F(t,\tau)$ is defined and continuous on $R \times R$, the function $\varphi(t)$ is defined, non-negative and continuous on $R$. In particular, it is continuous at the point $t = t_1$, where $\varphi(t_1) = 0$. Then for any $\varepsilon > 0$ we can find $\delta > 0$ such that, when $|t - t_1| < \delta$, the inequality

$$|\varphi(t) - \varphi(t_1)| = |\varphi(t) - 0| = \varphi(t) < \varepsilon$$

holds. From this and the preceding inequality, it follows that $\|x(t) - x(t_1)\| < \varepsilon$ if $|t - t_1| < \delta$. Thus, for any point $x(t) \in Q(t)$ there is a point $x(t_1) \in Q(t_1)$ whose $\varepsilon$-neighborhood contains $x(t)$. Hence the inclusion $Q(t) \subset Q_\varepsilon(t_1)$ holds for $|t - t_1| < \delta$. Interchanging the roles of $x(t)$ and $x(t_1)$ in the last inequality, we arrive in the same way at the inclusion $Q(t_1) \subset Q_\varepsilon(t)$. Thus, the continuity of the function $Q(t)$ at any time $t_1 > t_0$ is proven.
3.8 Extreme Principle
According to Lemma 3.1, each extreme control generates an extreme point, which is in fact a boundary point of the reachability set. Now we discuss the converse statement: some extreme control corresponds to each boundary point of the reachability set. In other words, there exists a correspondence between the boundary points of the reachability set and the extreme controls.

Theorem 3.3 (extreme principle) Under the regularity conditions for model (3.17) on the segment $[t_0, t_1]$, the extreme controls correspond to the boundary points of the reachability set $Q(t_1)$, and only to them.

Proof By Lemma 3.1, to each direction $c$ and extreme control $u(t,c)$ there corresponds an extreme point $x(c)$ of the set $Q(t_1)$. If $x(c)$ were an interior point of $Q(t_1)$, then there would exist a closed ball of small radius centered at $x(c)$ and entirely contained in $Q(t_1)$. Then $x(c)$ would be the half-sum of the ends of a diameter of this ball, which contradicts the definition of an extreme point. Therefore, $x(c)$ is a boundary point of the set $Q(t_1)$, and the converse statement remains to be proven.

Let $\bar{x}$ be an arbitrary boundary point of the set $Q(t_1)$. Since $Q(t_1)$ is convex and compact, according to Theorem A.2.2 it has a reference plane with normal $c$ at the point $\bar{x}$. By the definition of a reference plane,

$$c'x \le c'\bar{x}, \quad x \in Q(t_1).$$

Write the points $x$, $\bar{x}$ using the Cauchy formula

$$x = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,t)B(t)u(t)\,dt, \qquad \bar{x} = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,t)B(t)\bar{u}(t)\,dt$$

for the corresponding controls $u(t)$, $\bar{u}(t)$ and substitute them in the last inequality. After obvious transformations, we obtain

$$\int_{t_0}^{t_1} c'F(t_1,t)B(t)[\bar{u}(t) - u(t)]\,dt \ge 0.$$

Using the arbitrariness of the control $u(t)$, we take it in the form

$$u(t) = v, \ t \in [\tau, \tau+\varepsilon); \qquad u(t) = \bar{u}(t), \ t \notin [\tau, \tau+\varepsilon),$$

where $v$, $\tau$ are any fixed points of the sets $U$, $[t_0, t_1)$, respectively, and $\varepsilon$ is a small positive parameter (Fig. 3.5). Then the last inequality takes the form

$$\int_{\tau}^{\tau+\varepsilon} c'F(t_1,t)B(t)[\bar{u}(t) - v]\,dt \ge 0.$$

For a sufficiently small $\varepsilon$, the integrand is continuous, so by the mean value theorem of calculus

$$\int_{\tau}^{\tau+\varepsilon} c'F(t_1,t)B(t)[\bar{u}(t) - v]\,dt = \varepsilon\, c'F(t_1,\theta)B(\theta)[\bar{u}(\theta) - v] \ge 0, \quad \tau \le \theta \le \tau+\varepsilon.$$
Fig. 3.5 Needle variation $u(t)$ of the control $\bar{u}(t)$
Hence, in the limit as $\varepsilon \to 0$, we have

$$c'F(t_1,\tau)B(\tau)[\bar{u}(\tau) - v] \ge 0, \quad v \in U, \ \tau \in [t_0, t_1).$$

By continuity, this inequality also holds for $\tau = t_1$. Rewriting it in the form

$$c'F(t_1,t)B(t)\bar{u}(t) \ge c'F(t_1,t)B(t)u, \quad u \in U, \ t \in [t_0, t_1],$$

we are convinced that $\bar{u}(t)$ is an extreme control, and the theorem is proven.

Corollary 3.1 Under the assumptions of Theorem 3.3, we have the equality

$$\max_{x \in Q(t_1)} c'x = c'F(t_1,t_0)x_0 + \int_{t_0}^{t_1} \max_{u \in U} c'F(t_1,t)B(t)u\,dt.$$
To check this corollary, it is sufficient to represent the inequality (3.22) in the equivalent form

$$\max_{x \in Q(t_1)} c'x = c'x(c)$$

and use the relations (3.21), (3.20).

Corollary 3.2 Under the conditions of Theorem 3.3, each boundary point of the reachability set $Q(t_1)$ is an extreme point of this set.

In fact, let $\bar{x}$ be an arbitrary fixed boundary point of the set $Q(t_1)$. Assume that $\bar{x}$ is the half-sum

$$\bar{x} = \tfrac{1}{2}x^1 + \tfrac{1}{2}x^2$$

of two points $x^1, x^2 \in Q(t_1)$ that do not coincide with $\bar{x}$. By Theorem A.2.2, the convex compact set $Q(t_1)$ has a reference plane with a normal $c$ at the boundary point $\bar{x}$. According to the definition of the reference plane, the inequality $c'\bar{x} \ge c'x$ holds for all $x \in Q(t_1)$. In particular, $c'\bar{x} \ge c'x^1$, $c'\bar{x} \ge c'x^2$. The half-sum of these inequalities gives

$$c'\bar{x} \ge c'\left(\tfrac{1}{2}x^1 + \tfrac{1}{2}x^2\right) = c'\bar{x}.$$

Hence,

$$c'x^1 = c'x^2 = c'\bar{x} \ge c'x, \quad x \in Q(t_1).$$

From the last relation, by analogy with the proof of Theorem 3.3, we conclude that the controls $u^1(t,c)$, $u^2(t,c)$ generating the points $x^1, x^2 \in Q(t_1)$ are extreme. The regularity condition implies that $u^1(t,c) = u^2(t,c)$ on $[t_0, t_1]$ except for a finite number of discontinuity points of the controls. Then the Cauchy formula gives $x^1 = x^2 = \bar{x}$. This contradicts the assumption, so $\bar{x}$ is an extreme point of the set $Q(t_1)$.

A closed convex set $M \subset R^n$ is said to be strictly convex if its boundary $\partial M$ does not contain a non-degenerate interval.

Corollary 3.3 Under the regularity conditions, the reachability set $Q(t_1)$ is strictly convex.

Indeed, if the boundary $\partial Q(t_1)$ contained a non-degenerate interval, its midpoint would be a boundary point and, by Corollary 3.2, an extreme point of the set $Q(t_1)$, which contradicts the definition of an extreme point. Hence, the set $Q(t_1)$ is strictly convex.
3.9 Application of the Extreme Principle
In accordance with the extreme principle, the points $x(c) \in \partial Q(t_1)$ are generated by extreme controls $u(t,c)$. If the direction $c$ runs over the sphere $C$, the point $x(c)$ describes the boundary of the set $Q(t_1)$. This is the theoretical and practical value of the extreme principle: it makes it possible to describe the boundary of a reachability set by means of extreme controls. Here is a simple illustrative example.

Example 3.5 Let the model of a controlled object be given in the form

$$\dot{x}_1 = x_2, \quad \dot{x}_2 = u, \quad x_1(0) = 0, \quad x_2(0) = 0, \quad |u| \le 1.$$

According to the rules given in Sect. 3.3, we construct the fundamental matrix

$$F(t,\tau) = \begin{pmatrix} 1 & t-\tau \\ 0 & 1 \end{pmatrix}.$$

From condition (3.20), we find the extreme controls

$$u(t,c) = \arg\max_{|u| \le 1} c'F(1,t)Bu$$

for directions $c = (c_1, c_2)$. We calculate the expression under the maximum sign,

$$c'F(1,t)Bu = (c_1, c_2)\begin{pmatrix} 1 & 1-t \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix}u = [(1-t)c_1 + c_2]\,u,$$

and its maximum point to obtain

$$u(t,c) = \operatorname{sign}[(1-t)c_1 + c_2].$$

The function $p(t,c) = (1-t)c_1 + c_2$ is referred to as a switching function. Since it is affine in $t$ and has no more than one root for $c_1^2 + c_2^2 = 1$, the extreme control $u(t,c)$ is a piecewise constant function that switches from $-1$ to $+1$ or vice versa no more than once. If the direction $c$ runs over the unit circle, the switching points $\tau(c) = 1 + c_2/c_1$ fill the entire real line. For further calculations, it is convenient to write the restriction of the extreme controls to the segment $[0,1]$ in the form

$$v(t) = +e, \ t < \tau; \qquad v(t) = -e, \ t \ge \tau,$$

considering $e \in \{-1, 1\}$, $\tau \in [0,1]$ as parameters. Using the Cauchy formula, we find the boundary points $x$ of the set $Q(1)$ corresponding to the control $v(t)$:

$$x = \int_0^1 F(1,t)Bv(t)\,dt.$$
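Or in vector-matrix notation

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \int_0^1 \begin{pmatrix} 1 & 1-t \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} v(t)\,dt = \int_0^1 \begin{pmatrix} (1-t)v(t) \\ v(t) \end{pmatrix} dt$$

and in coordinate representation

$$x_1 = \int_0^1 (1-t)v(t)\,dt, \qquad x_2 = \int_0^1 v(t)\,dt.$$

Calculating the integrals, we obtain

$$x_1 = \int_0^1 (1-t)v(t)\,dt = e\left[\int_0^\tau (1-t)\,dt - \int_\tau^1 (1-t)\,dt\right] = e\left[-(1-\tau)^2 + 1/2\right],$$

$$x_2 = \int_0^1 v(t)\,dt = e\left[\int_0^\tau dt - \int_\tau^1 dt\right] = e(2\tau - 1).$$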
Fig. 3.6 The boundary of a reachability set $Q(1)$ in Example 3.5

The equations

$$x_1 = e\left[-(1-\tau)^2 + 1/2\right], \quad x_2 = e(2\tau - 1), \quad 0 \le \tau \le 1$$

parametrically describe the boundary of the set $Q(1)$ (Fig. 3.6). After eliminating the parameter $\tau$, we find an explicit coordinate description of the boundary of $Q(1)$ by two parabolas

$$x_1 = -\frac{e(1 - e x_2)^2}{4} + \frac{e}{2}, \quad |x_2| \le 1, \quad e = \pm 1.$$
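The boundary formulas of Example 3.5 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `boundary_point`, `parabola_x1` and `simulate` are illustrative, and the integration scheme is an assumption chosen for simplicity. It compares the closed-form boundary point with a direct integration of the system under the one-switch control $v(t)$ and with the parabola obtained after eliminating $\tau$.

```python
# Sketch: boundary of Q(1) for x1' = x2, x2' = u, |u| <= 1, x(0) = 0.
# All names below are illustrative and not taken from the book.

def boundary_point(e, tau):
    """Closed-form endpoint for v(t) = +e (t < tau), v(t) = -e (t >= tau)."""
    x1 = e * (0.5 - (1.0 - tau) ** 2)
    x2 = e * (2.0 * tau - 1.0)
    return x1, x2


def parabola_x1(e, x2):
    """Explicit boundary branch obtained by eliminating tau."""
    return e / 2.0 - e * (1.0 - e * x2) ** 2 / 4.0


def simulate(e, tau, steps=10000):
    """Integrate the system over [0, 1] under the one-switch control v(t).

    Each step is exact for a control that is constant on the step, so the
    only error comes from the single step that contains the switch.
    """
    h = 1.0 / steps
    x1 = x2 = 0.0
    for k in range(steps):
        v = e if k * h < tau else -e
        x1 += h * x2 + 0.5 * h * h * v
        x2 += h * v
    return x1, x2


if __name__ == "__main__":
    for e in (1, -1):
        for tau in (0.0, 0.25, 0.5, 0.75, 1.0):
            print(e, tau, boundary_point(e, tau), simulate(e, tau))
```

Running the script prints, for each pair $(e, \tau)$, the closed-form point next to the simulated one; the two should agree up to the discretization error of the step that contains the switch.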
Exercise Set

1. Let the range of control $U \subset R^n$ be symmetric about a point $\bar{u}$, i.e., the condition $\bar{u} + v \in U$ implies $\bar{u} - v \in U$. Show that the reachability set $Q(t_1)$ of a linear system has the same symmetry property with respect to the point

$$\bar{x} = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,t)B(t)\bar{u}\,dt.$$

2. Show that, regardless of the convexity of the range of control $U \subset R^n$, the reachability set $Q(1)$ of the system

$$\dot{x} = b(u), \quad x(0) = 0, \quad u \in U$$

is convex. Hint: if points $x^1, x^2 \in Q(1)$ correspond to controls $u^1(t)$, $u^2(t)$, then for $0 < \lambda < 1$ the point $x = (1-\lambda)x^1 + \lambda x^2$ corresponds to the control

$$u(t) = u^2\!\left(\frac{t}{\lambda}\right), \ t < \lambda; \qquad u(t) = u^1\!\left(\frac{t-\lambda}{1-\lambda}\right), \ t \ge \lambda.$$

3. Check that the convexity of a reachability set implies the convexity of its closure.

4. Graph in the plane $x, \dot{x}$ the reachability set $Q(1)$ of the second-order linear differential equation

$$\ddot{x} + a_1\dot{x} + a_2 x = u, \quad x(0) = 0, \quad \dot{x}(0) = 0, \quad |u| \le 1$$

with constant coefficients $a_1$, $a_2$ for different roots of the auxiliary equation. Hint: reduce the equation to its canonical form by choosing a suitable coordinate system, and then use the extreme principle. Example 3.5 can serve as a sample; it corresponds to the case $a_1 = a_2 = 0$.

5. Is the following statement true or false? If we suppose the non-uniqueness of the maximum points $u(t,c)$ in the regularity condition (3.20) on an interval $T(c) \subset [t_0, t_1]$ for some directions $c$, then the reachability set $Q(t_1)$ will be closed in the class of piecewise continuous controls.
Chapter 4
Controllability of Linear Systems
Abstract We study the controllability of linear systems – the existence of processes with specified conditions on the ends of a trajectory. The criteria of point-to-point and complete controllability are established. We investigate the features of uncontrolled systems.
The theory of controllability establishes criteria for the transfer of controlled systems from one position to another, based on the features of the mathematical model and the corresponding class of controls. The object of our attention is the linear control system

$$\dot{x} = A(t)x + B(t)u, \quad u \in U. \qquad (4.1)$$

4.1 Point-to-Point Controllability
Suppose points $x_0, x_1 \in R^n$ and moments of time $t_0, t_1$, $t_0 < t_1$, are given. We say that system (4.1) is point-to-point controllable from position $(x_0, t_0)$ to $(x_1, t_1)$ if there is a process $x(t)$, $u(t)$ that satisfies the conditions (4.1) and $x(t_0) = x_0$, $x(t_1) = x_1$.

The criterion for point-to-point controllability is obtained under the assumption of regularity of system (4.1) on the segment $[t_0, t_1]$, as well as convexity and compactness of the set $U \subset R^r$. Obviously, if system (4.1) is controllable, then $x_1 \in Q(t_1)$. Therefore, by Lemma 3.1, we have

$$c'x(c) - c'x_1 \ge 0 \qquad (4.2)$$

for any direction $c$. By Lemma 3.2, the left-hand side of (4.2) depends continuously on the direction $c$. Since the set $C$ of all directions is compact, it follows from (4.2) that

$$\min_{c \in C}\,[c'x(c) - c'x_1] \ge 0. \qquad (4.3)$$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_4

The converse statement is also true: if condition (4.3) holds, then $x_1 \in Q(t_1)$. Indeed, assume the contrary: inequality (4.3) is true, but $x_1 \notin Q(t_1)$. Apply the theorem on separation of convex sets to the sets $\{x_1\}$ and $Q(t_1)$ (which are convex). According to the theorem, there exists a plane with normal $c$ that strictly separates these sets:

$$c'x - c'x_1 < 0, \quad x \in Q(t_1).$$

Hence, due to the arbitrariness of $x \in Q(t_1)$, the compactness of $Q(t_1)$, and Lemma 3.1, we have

$$\max_{x \in Q(t_1)} c'x - c'x_1 = c'x(c) - c'x_1 < 0.$$

Strengthening the last inequality, we obtain

$$\min_{c \in C}\,[c'x(c) - c'x_1] < 0,$$

which contradicts the inequality (4.3). Thus, the conditions $x_1 \in Q(t_1)$ and (4.3) are equivalent. The formulas (3.20) and (3.21) can be used to represent the inequality (4.3) in its expanded form

$$\min_{\|c\| = 1}\left[c'F(t_1,t_0)x_0 + \int_{t_0}^{t_1} \max_{u \in U} c'F(t_1,t)B(t)u\,dt - c'x_1\right] \ge 0. \qquad (4.4)$$

We now formulate the final conclusion.

Theorem 4.1 (criterion of point-to-point controllability) Let system (4.1) be regular on the segment $[t_0, t_1]$ and let the set $U \subset R^r$ be convex and compact. Then system (4.1) is point-to-point controllable from $(x_0, t_0)$ into $(x_1, t_1)$ if and only if the inequality (4.4) holds.
4.2 Analysis of the Point-to-Point Controllability Criteria
Define a function

$$\pi(z) = z'F(t_1,t_0)x_0 + \int_{t_0}^{t_1} \max_{u \in U} z'F(t_1,t)B(t)u\,dt - z'x_1, \quad z \in R^n.$$

By Theorem A.3.1, the function $\pi(z)$ is continuous in its domain. In addition, it is positively homogeneous:

$$\pi(\lambda z) = \lambda\pi(z), \quad \lambda \ge 0.$$

Setting $c = z/\|z\|$, $z \ne 0$, we can write the criterion for point-to-point controllability (4.4) in the form

$$\min_{z \ne 0} \frac{\pi(z)}{\|z\|} \ge 0.$$

As can be seen, the verification of the criterion reduces to the solution of an extremal problem, namely the minimization of the ratio of two convex functions. The convexity of the function $\pi(z)$ can be verified directly.

Let us clarify the geometric meaning of the function

$$\pi(c) = c'F(t_1,t_0)x_0 + \int_{t_0}^{t_1} \max_{u \in U} c'F(t_1,t)B(t)u\,dt - c'x_1 \qquad (4.5)$$

in criterion (4.4). Using Corollary 3.1 and Lemma 3.1, we can write

$$\pi(c) = \max_{x \in Q(t_1)} c'x - c'x_1 = c'x(c) - c'x_1 = c'[x(c) - x_1].$$

Denote by $Q(t_1) - x_1$ the shift of the set $Q(t_1)$ by the vector $x_1$, that is, the set of vectors $x - x_1$, $x \in Q(t_1)$. Then

$$\max_{y \in Q(t_1) - x_1} c'y = \max_{x \in Q(t_1)} c'(x - x_1) = \max_{x \in Q(t_1)} c'x - c'x_1.$$

Comparing the last two expressions, we conclude that

$$\pi(c) = \max_{y \in Q(t_1) - x_1} c'y = c'[x(c) - x_1]. \qquad (4.6)$$
Denote by $\alpha$ the angle between a vector $y$ and a direction $c$. By a known formula from analytic geometry, the inner product of $c$ and $y$,

$$c'y = \|c\|\,\|y\|\cos\alpha = \|y\|\cos\alpha,$$

is equal to the projection of $y$ on $c$ (Fig. 4.1). Then, due to the equality (4.6), the function $\pi(c)$ is the maximum projection of the vectors of the set $Q(t_1) - x_1$ on the direction $c$ (Fig. 4.2).

Fig. 4.1 Geometric sense of the inner product $c'y$

Fig. 4.2 Geometric sense of the function $\pi(c)$

Example 4.1 Verify the point-to-point controllability of the system

$$\dot{x} = u, \quad \|u\| \le 1$$

from position $x_0 = 0$, $t_0 = 0$ to position $x_1 = e$, $t_1 = 1$. Here $n = r \ge 2$, $A(t) = 0$, $B(t) = E$, $e = (1, \ldots, 1) \in R^n$. We solve the Cauchy problem (3.9) to find the fundamental matrix $F(t,\tau) = E$. The function (4.5) takes the form

$$\pi(c) = \int_0^1 \max_{\|u\| \le 1} c'u\,dt - c'e.$$

The maximization problem under the integral sign is easily solved using the Cauchy-Schwarz inequality $|c'u| \le \|c\|\,\|u\|$. It gives the upper bound of the function $c'u$ on the ball $\|u\| \le 1$: $c'u \le \|c\|\,\|u\| \le \|c\| = 1$. The upper bound is attained at the extreme control $u(t,c) = c$. As a result, we obtain

$$\pi(c) = \int_0^1 dt - c'e = 1 - c'e.$$

By analogy, we compute

$$\min_{\|c\| = 1} \pi(c) = \min_{\|c\| = 1} (1 - c'e) = 1 - \max_{\|c\| = 1} c'e = 1 - \|e\| = 1 - \sqrt{n} < 0.$$
Consequently, the system in question is not controllable from the position $x_0 = 0$, $t_0 = 0$ to the position $x_1 = e$, $t_1 = 1$. The same conclusion follows if we directly construct the reachability set $Q(1)$. According to the extreme principle, each boundary point $x(c) = \int_0^1 c\,dt = c$ of the set $Q(1)$ corresponds to the extreme control $u(t,c) = c$. All of these points $x(c)$ form the unit sphere $S$. As a consequence, the set $Q(1)$ is the unit ball $\|x\| \le 1$ (Fig. 4.3). Since $\|e\| = \sqrt{n} \ge \sqrt{2} > 1$, we have $e \notin Q(1)$.

Fig. 4.3 The reachability set in Example 4.1 for $n = 2$
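The computation in Example 4.1 can be reproduced numerically. The sketch below is illustrative only: the names `pi_value`, `sampled_min_pi` and `exact_min_pi` are not from the book, and random sampling of the unit sphere is an assumed, deliberately crude way of approximating the minimum in criterion (4.4). It evaluates $\pi(c) = \|c\| - c'x_1$ for the system $\dot{x} = u$, $\|u\| \le 1$ on $[0,1]$ and compares the sampled minimum with the closed form $1 - \|x_1\|$.

```python
import math
import random


def pi_value(c, x1):
    """pi(c) = max_{||u|| <= 1} c'u - c'x1 for x' = u, x(0) = 0, t1 = 1."""
    norm_c = math.sqrt(sum(ci * ci for ci in c))
    return norm_c - sum(ci * xi for ci, xi in zip(c, x1))


def sampled_min_pi(x1, samples=20000, seed=0):
    """Approximate the minimum of pi(c) over the unit sphere by sampling."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in x1]
        norm_g = math.sqrt(sum(gi * gi for gi in g))
        c = [gi / norm_g for gi in g]  # random direction on the unit sphere
        best = min(best, pi_value(c, x1))
    return best


def exact_min_pi(x1):
    """Closed form 1 - ||x1||, attained at c = x1 / ||x1||."""
    return 1.0 - math.sqrt(sum(xi * xi for xi in x1))


if __name__ == "__main__":
    e = [1.0, 1.0]
    print(sampled_min_pi(e), exact_min_pi(e))  # both negative: e is not reachable
```

A negative minimum signals that the target lies outside $Q(1)$; for a target inside the unit ball, such as $(0.5, 0.5)$, the same functions return a positive value.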
4.3 Auxiliary Lemma
Let us clarify the question of the point-to-point controllability of the linear system

$$\dot{x} = B(t)u, \quad u \in R^r \qquad (4.7)$$

from $(0, t_0)$ to $(x_1, t_1)$. The controllability criterion (4.4) is not applicable because the range $U = R^r$ of control is unbounded. The solution $x(t)$ of system (4.7) with the initial condition $x(t_0) = 0$ for an arbitrary control $u(t)$ is easily found by direct integration:

$$x(t) = \int_{t_0}^{t} B(\tau)u(\tau)\,d\tau.$$

This solution satisfies the condition $x(t_1) = x_1$ if and only if the system of integral equations

$$x_1 = \int_{t_0}^{t_1} B(t)u(t)\,dt \qquad (4.8)$$

is solvable with respect to $u(t)$. The latter is equivalent to the solvability of the system of algebraic equations with respect to $z$

$$x_1 = \left(\int_{t_0}^{t_1} B(t)B(t)'\,dt\right) z. \qquad (4.9)$$
Indeed, if system (4.9) has a solution $z$, then the control $u_z(t) = B(t)'z$ is a solution of system (4.8). We now show that the solvability of the integral equations (4.8) implies the solvability of the system of algebraic equations (4.9). Assume the contrary: system (4.8) has a solution $\bar{u}(t)$,

$$x_1 = \int_{t_0}^{t_1} B(t)\bar{u}(t)\,dt, \qquad (4.10)$$

and system (4.9) is not solvable with respect to $z$. In other words, the point $x_1$ does not belong to the range $Y \subset R^n$ of the right-hand sides of (4.9) when $z$ ranges over the whole space $R^n$. Obviously, the set $Y$ is a linear subspace of $R^n$ and is therefore convex and closed, so the point $x_1$ can be strictly separated from it: there exists a vector $c$ such that $c'x_1 > c'y$, $y \in Y$, or in detail

$$c'x_1 > c'\left(\int_{t_0}^{t_1} B(t)B(t)'\,dt\right) z, \quad z \in R^n. \qquad (4.11)$$

Due to the arbitrariness of $z$ in (4.11), it follows that

$$c'\left(\int_{t_0}^{t_1} B(t)B(t)'\,dt\right) = 0.$$

We multiply this equality by the vector $c$ on the right to obtain

$$c'\left(\int_{t_0}^{t_1} B(t)B(t)'\,dt\right) c = \int_{t_0}^{t_1} c'B(t)B(t)'c\,dt = \int_{t_0}^{t_1} \|c'B(t)\|^2\,dt = 0.$$

Hence, by the continuity and non-negativity of the integrand, we conclude that $c'B(t) = 0$, $t_0 \le t \le t_1$. Then from (4.10) and (4.11) it follows simultaneously that $c'x_1 = 0$ and $c'x_1 > 0$. This contradiction proves the solvability of system (4.9).

Thus, for point-to-point controllability of system (4.7) from the position $(0, t_0)$ to $(x_1, t_1)$, it is necessary and sufficient that the linear system (4.9) be solvable. It is useful to consider this conclusion from another point of view. We introduce the linear integral operator

$$Lu = \int_{t_0}^{t_1} B(t)u(t)\,dt,$$

acting from the space $C([t_0, t_1] \to R^r)$ to the space $R^n$, and the linear transformation of the space $R^n$ into $R^n$ with the matrix $W$:

$$Wz = \left(\int_{t_0}^{t_1} B(t)B(t)'\,dt\right) z.$$

In accordance with the above conclusion, the systems of equations (4.8) and (4.9) are simultaneously consistent. This means that the ranges $LC([t_0, t_1] \to R^r)$ and $WR^n$ of the operator $L$ and the transformation $W$ consist of the same vectors. In other words, the following assertion holds.

Lemma 4.1 $LC([t_0, t_1] \to R^r) = WR^n$.

Lemma 4.1 plays an important role in deriving the criteria for controllability, observability and identifiability of linear systems. It allows us to decide on the solvability of a complex system of linear integral equations by considering the solvability of a simpler system of linear algebraic equations.
4.4 Kalman Theorem
We apply Lemma 4.1 to obtain the criterion for point-to-point controllability of the linear system (4.1) from $(x_0, t_0)$ into $(x_1, t_1)$ for $U = R^r$. By the Cauchy formula, point-to-point controllability is equivalent to the solvability, with respect to $u(t)$, of the system of integral equations

$$x_1 - F(t_1,t_0)x_0 = \int_{t_0}^{t_1} F(t_1,t)B(t)u(t)\,dt. \qquad (4.12)$$

According to Lemma 4.1, this means that the vector $x_1 - F(t_1,t_0)x_0$ belongs to the range of the linear transformation $W(t_0,t_1)z : R^n \to R^n$ with the matrix of coefficients

$$W(t_0,t_1) = \int_{t_0}^{t_1} F(t_1,t)B(t)B(t)'F(t_1,t)'\,dt. \qquad (4.13)$$

The result is the following.

Theorem 4.2 (Kalman) The linear system (4.1) is point-to-point controllable from $(x_0, t_0)$ into $(x_1, t_1)$ for $U = R^r$ if and only if the system of linear algebraic equations

$$W(t_0,t_1)z = x_1 - F(t_1,t_0)x_0 \qquad (4.14)$$

with matrix of coefficients (4.13) is consistent.
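As a hedged illustration of Theorem 4.2 (not an example from the text), consider the double integrator of Example 3.5 with the control constraint dropped, so that $U = R$, on the segment $[t_0, t_1] = [0, 1]$. The target $x_1 = (1, 0)'$ and the initial point $x_0 = 0$ are assumptions chosen here only to make the arithmetic short.

```latex
% Double integrator \dot x_1 = x_2, \dot x_2 = u, u \in R, on [0, 1]
F(1,t)B = \begin{pmatrix} 1-t \\ 1 \end{pmatrix}, \qquad
W(0,1) = \int_0^1 \begin{pmatrix} (1-t)^2 & 1-t \\ 1-t & 1 \end{pmatrix} dt
       = \begin{pmatrix} 1/3 & 1/2 \\ 1/2 & 1 \end{pmatrix}, \qquad
\det W(0,1) = \tfrac{1}{12} \ne 0 .

% For x_0 = 0 and x_1 = (1, 0)' the system (4.14) reads W(0,1) z = (1, 0)'
z = W(0,1)^{-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix}
  = \begin{pmatrix} 12 & -6 \\ -6 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}
  = \begin{pmatrix} 12 \\ -6 \end{pmatrix} .
```

Since $W(0,1)$ is nonsingular, system (4.14) is consistent for every right-hand side, so under these assumptions the double integrator is point-to-point controllable between any two states on $[0, 1]$.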
4.5 Control with Minimal Norm
A control that takes the controlled object along a trajectory of the linear system (4.1) from the state $x(t_0) = x_0$ to the state $x(t_1) = x_1$ is referred to as admissible. If admissible controls exist, then the Kalman theorem allows us to find one of them explicitly. In fact, let the system of linear algebraic equations (4.14) have a solution $z$. Substituting matrix (4.13) into equality (4.14), we see that the control

$$u_z(t) = B(t)'F(t_1,t)'z \qquad (4.15)$$

ensures that condition (4.12) is satisfied, i.e., it is admissible. The control $u_z(t)$ has the minimal integral characteristic

$$J(u_z) = \int_{t_0}^{t_1} u_z(t)'u_z(t)\,dt$$

among all admissible controls $u(t)$. Indeed, in view of the admissibility of the controls $u(t)$, $u_z(t)$, we have

$$x_1 = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,t)B(t)u(t)\,dt, \qquad x_1 = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,t)B(t)u_z(t)\,dt$$

by the Cauchy formula. Subtracting the second equality from the first one, we obtain

$$0 = \int_{t_0}^{t_1} F(t_1,t)B(t)[u(t) - u_z(t)]\,dt.$$

We multiply the last equality by the vector $z'$ on the left. In notation (4.15) we have

$$0 = \int_{t_0}^{t_1} u_z(t)'[u(t) - u_z(t)]\,dt.$$

Using this relationship, we can write

$$J(u) = \int_{t_0}^{t_1} u(t)'u(t)\,dt = \int_{t_0}^{t_1} [u_z(t) + (u(t) - u_z(t))]'[u_z(t) + (u(t) - u_z(t))]\,dt = J(u_z) + J(u - u_z) \ge J(u_z).$$

The characteristic $J(u)$ is the square of the norm of $u(t)$ in the space $L_2([t_0, t_1] \to R^r)$:

$$J(u) = \int_{t_0}^{t_1} u(t)'u(t)\,dt = \int_{t_0}^{t_1} \|u(t)\|^2_{R^r}\,dt = \|u\|^2_{L_2}.$$

As we have seen, the control $u_z(t)$ has the minimal norm among all admissible controls $u(t)$ in this space.
4.6 Construction of Control with Minimum Norm
The construction of the control with minimum norm by formula (4.15) involves

1. computing the matrix $W(t_0,t_1)$,
2. determining the vector $x_1 - F(t_1,t_0)x_0$,
3. obtaining the solution of system (4.14).

The first and second operations can be reduced to the solution of the matrix and vector Cauchy problems

$$\dot{V} = A(t)V + VA(t)' + B(t)B(t)', \quad V(t_0) = 0; \qquad \dot{y} = A(t)y, \quad y(t_0) = x_0.$$

By direct verification using the properties of the fundamental matrix, we can check that the matrix and vector functions

$$V(t) = \int_{t_0}^{t} F(t,\tau)B(\tau)B(\tau)'F(t,\tau)'\,d\tau, \qquad y(t) = F(t,t_0)x_0$$

are the solutions of the corresponding Cauchy problems. By the uniqueness of the solutions, we have

$$V(t_1) = \int_{t_0}^{t_1} F(t_1,t)B(t)B(t)'F(t_1,t)'\,dt = W(t_0,t_1), \qquad x_1 - y(t_1) = x_1 - F(t_1,t_0)x_0.$$

Then the system of equations (4.14) takes the form

$$V(t_1)z = x_1 - y(t_1).$$

If it has no solution, then the linear system (4.1) is not point-to-point controllable; if a solution $z$ exists, then point-to-point controllability takes place. Solving one more Cauchy problem

$$\dot{\psi} = -A(t)'\psi, \quad \psi(t_1) = z$$

backward in time yields the function $\psi_z(t) = F(t_1,t)'z$, and we construct the desired admissible control (4.15) with minimal norm in the form

$$u_z(t) = B(t)'\psi_z(t).$$

It is easy to see that the controls $u_z(t)$ have components of order $1/\varepsilon$ for a time interval of short length $\varepsilon > 0$. Thus, moving an object over a finite distance in a short time requires large control "actions".
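The three steps above can be sketched in code for a concrete system. What follows is a minimal illustration under stated assumptions, not the book's implementation: the test system is the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with unbounded control, the matrix $W(t_0,t_1)$ is computed by Simpson quadrature of (4.13) instead of solving the Cauchy problem for $V$, and all function names are hypothetical.

```python
# Sketch of the three-step construction for the double integrator
# x1' = x2, x2' = u with unbounded control (an assumed test system).

def fundamental_matrix(t, tau):
    """F(t, tau) for A = [[0, 1], [0, 0]]."""
    return [[1.0, t - tau], [0.0, 1.0]]


B = [0.0, 1.0]  # column B of the double integrator


def fb(t1, t):
    """Vector F(t1, t) B."""
    F = fundamental_matrix(t1, t)
    return [F[0][0] * B[0] + F[0][1] * B[1],
            F[1][0] * B[0] + F[1][1] * B[1]]


def gramian(t0, t1, n=200):
    """W(t0, t1) from (4.13) by composite Simpson quadrature (n even)."""
    h = (t1 - t0) / n
    W = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        weight = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)
        g = fb(t1, t0 + k * h)
        for i in range(2):
            for j in range(2):
                W[i][j] += weight * h / 3.0 * g[i] * g[j]
    return W


def solve_2x2(M, b):
    """Solve M z = b by Cramer's rule; fails if M is numerically singular."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if abs(det) < 1e-12:
        raise ValueError("W(t0, t1) is singular: (4.14) may be inconsistent")
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def min_norm_control(x0, x1, t0, t1):
    """Return z and the control u_z(t) = B' F(t1, t)' z from (4.15)."""
    W = gramian(t0, t1)                                  # step 1
    F = fundamental_matrix(t1, t0)
    rhs = [x1[0] - (F[0][0] * x0[0] + F[0][1] * x0[1]),  # step 2
           x1[1] - (F[1][0] * x0[0] + F[1][1] * x0[1])]
    z = solve_2x2(W, rhs)                                # step 3

    def u(t):
        g = fb(t1, t)
        return g[0] * z[0] + g[1] * z[1]

    return z, u


def simulate(u, x0, t0, t1, steps=2000):
    """Integrate the double integrator using the midpoint value of u per step."""
    h = (t1 - t0) / steps
    x1, x2 = x0
    for k in range(steps):
        v = u(t0 + (k + 0.5) * h)
        x1 += h * x2 + 0.5 * h * h * v
        x2 += h * v
    return [x1, x2]
```

For example, `min_norm_control([0.0, 0.0], [1.0, 0.0], 0.0, 1.0)` yields a control that is affine in $t$, and `simulate` can be used to confirm that it steers the origin to the target up to discretization error.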
4.7 Total Controllability of Linear System
Let $U = R^r$ as before. We say that system (4.1) is totally (completely) controllable on a segment $[t_0, t_1]$ if it is point-to-point controllable from $(x_0, t_0)$ to $(x_1, t_1)$ for any $x_0, x_1 \in R^n$. By the Kalman theorem, the complete controllability of system (4.1) on the segment $[t_0, t_1]$ is equivalent to the consistency of the system of linear algebraic equations (4.14) for all $x_0, x_1 \in R^n$. In turn, the latter is equivalent to the condition

$$\operatorname{rank} W(t_0,t_1) = n. \qquad (4.16)$$

This is the criterion of complete controllability. It can be expressed in another form as follows.

Lemma 4.2 The following three statements are equivalent:
(a) system (4.1) with $U = R^r$ is completely controllable on $[t_0, t_1]$;
(b) $\operatorname{rank} W(t_0,t_1) = n$;
(c) for every direction $c$, the vector function $c'F(t_1,t)B(t)$ is nontrivial on $[t_0, t_1]$.
Proof The equivalence (a) $\Leftrightarrow$ (b) was established earlier. We show the equivalence (b) $\Leftrightarrow$ (c), that is, we establish the implications (b) $\Rightarrow$ (c) $\Rightarrow$ (b).

(b) $\Rightarrow$ (c). Suppose that assertion (b) is true and (c) is false, that is, for some direction $c$ the identity

$$c'F(t_1,t)B(t) \equiv 0, \quad t_0 \le t \le t_1$$

holds. Multiply this identity by the matrix $B(t)'F(t_1,t)'$ on the right and integrate it over the segment $[t_0, t_1]$ with respect to $t$. With notation (4.13), we have $c'W(t_0,t_1) = 0$, i.e., a homogeneous system of linear algebraic equations of maximal rank $n$ has a solution $c \ne 0$. This is a contradiction, and hence statement (c) is true.

(c) $\Rightarrow$ (b). From condition (c), for any direction $c$ we have

$$\int_{t_0}^{t_1} c'F(t_1,t)B(t)B(t)'F(t_1,t)'c\,dt = \int_{t_0}^{t_1} \|c'F(t_1,t)B(t)\|^2\,dt > 0$$

or, using notation (4.13),

$$c'W(t_0,t_1)c > 0.$$

Consequently, the quadratic form $c'W(t_0,t_1)c$ is positive definite, and therefore the rank of the matrix $W(t_0,t_1)$ is equal to $n$. The lemma is proven.

The equivalence of (b) and (c) is Gram's criterion of the linear independence of the rows of the matrix $F(t_1,t)B(t)$ on the segment $[t_0, t_1]$. In this case, the integral term in the Cauchy formula

$$x(t_1) = F(t_1,t_0)x_0 + \int_{t_0}^{t_1} F(t_1,t)B(t)u(t)\,dt$$

acts as a linear integral operator

$$Lu = \int_{t_0}^{t_1} F(t_1,t)B(t)u(t)\,dt$$

from $C([t_0, t_1] \to R^r)$ onto the whole space $R^n$, which explains the complete controllability of system (4.1).
4.8 Synthesis of Control with a Minimal Norm
Consider the problem of transferring system (4.1) from an arbitrary initial position $(x_0, t_0)$ to the fixed terminal position $(0, t_1)$ under the assumptions that $t_0 < t_1$, $\operatorname{rank} W(t_0,t_1) = n$, $U = R^r$. By Lemma 4.2, an admissible control $v(t; x_0, t_0)$ of minimal norm exists. When $x_1 = 0$, the system of linear algebraic equations (4.14) takes the form

$$W(t_0,t_1)z = -F(t_1,t_0)x_0$$

and has the solution

$$z = -W^{-1}(t_0,t_1)F(t_1,t_0)x_0.$$

Substituting the solution $z$ in formula (4.15), we find

$$v(t; x_0, t_0) = -B(t)'F(t_1,t)'W^{-1}(t_0,t_1)F(t_1,t_0)x_0.$$

From this formula for $t = t_0$ we obtain the so-called synthetic control

$$v(t_0; x_0, t_0) = -B(t_0)'F(t_1,t_0)'W^{-1}(t_0,t_1)F(t_1,t_0)x_0$$

as a function of the initial position $(x_0, t_0)$ of the controlled object. Replacing $(x_0, t_0)$ by $(x, t)$ and putting $v(t; x, t) = u(x, t)$, we write the synthetic control in the form

$$u(x,t) = -B(t)'K(t)x, \qquad (4.17)$$

where

$$K(t) = F(t_1,t)'W^{-1}(t,t_1)F(t_1,t). \qquad (4.18)$$

The matrix $K(t)$ is continuous, symmetric and positive definite for $t < t_1$. In its domain, it satisfies the matrix Riccati equation

$$\dot{K} = -A(t)'K - KA(t) + KB(t)B(t)'K. \qquad (4.19)$$

To see this, it suffices to differentiate function (4.18) using the derivatives

$$\dot{F}(t_1,t) = -F(t_1,t)A(t), \qquad \frac{d}{dt}W^{-1}(t,t_1) = -W^{-1}(t,t_1)\dot{W}(t,t_1)W^{-1}(t,t_1), \qquad \dot{W}(t,t_1) = -F(t_1,t)B(t)B(t)'F(t_1,t)'$$

and rewrite the result using notation (4.18). Substituting the synthetic control (4.17) into system (4.1) leads to the closed system of homogeneous differential equations

$$\dot{x} = [A(t) - B(t)B(t)'K(t)]\,x. \qquad (4.20)$$
The coefficient matrix of system (4.20) is continuous for $t < t_1$ and has a singularity at the moment of time $t_1$, since $\|K(t)\| \to \infty$ as $t \to t_1$. Therefore, a solution $x(t)$ of system (4.20) with initial values $x_0$, $t_0$ is defined for $t < t_1$. Let

$$\psi(t) = K(t)x(t), \quad t < t_1. \qquad (4.21)$$

The function $\psi(t)$ is differentiable and, as a consequence of the expressions (4.19) and (4.20), it satisfies the conjugate system $\dot{\psi} = -A(t)'\psi$ in its domain. The function (4.21) is extended by continuity to the whole real line as a solution of the same system of conjugate equations. Then a finite limit exists:

$$\psi(t_1) = \lim_{t \to t_1} \psi(t) = \lim_{t \to t_1} K(t)x(t).$$

From the existence of the finite limit $\psi(t_1)$ it necessarily follows that $\|x(t)\| \to 0$ as $t \to t_1$.

The problem of the synthesis of a minimal norm control that transfers the system $\dot{x} = A(t)x + B(t)u$ from an arbitrary initial position $(x_0, t_0)$ into a fixed terminal position $(x_1, t_1)$ can be solved in a similar manner. Under the previous assumptions and notation, the synthetic control has the form

$$u(x,t) = B(t)'K(t)[F(t,t_1)x_1 - x].$$
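The behavior of the closed-loop system near $t_1$ can be illustrated on the simplest case. The sketch below is a hedged example that is not taken from the text: it assumes the scalar system $\dot{x} = u$ (so $A = 0$, $B = 1$, $F = 1$, $W(t,t_1) = t_1 - t$ and $K(t) = 1/(t_1 - t)$), takes the feedback with the sign $u = -B'K(t)x$ that corresponds to the terminal state $x_1 = 0$, and uses hypothetical function names. Integration stops at $t_{end} < t_1$ because the gain is singular at $t_1$.

```python
# Sketch: feedback (synthetic) control for the scalar system x' = u.
# Assumed data: A = 0, B = 1, F = 1, W(t, t1) = t1 - t, K(t) = 1 / (t1 - t).

def gain(t, t1):
    """K(t) from (4.18) for the scalar system x' = u; singular at t = t1."""
    return 1.0 / (t1 - t)


def closed_loop(x0, t0, t1, t_end, steps=1000):
    """Euler integration of x' = -K(t) x on [t0, t_end] with t_end < t1.

    Returns the final state and the list of control values that were applied.
    """
    h = (t_end - t0) / steps
    x, controls = x0, []
    for k in range(steps):
        t = t0 + k * h
        u = -gain(t, t1) * x  # feedback u(x, t) = -B' K(t) x
        controls.append(u)
        x += h * u
    return x, controls


if __name__ == "__main__":
    x_end, controls = closed_loop(2.0, 0.0, 1.0, 0.99)
    print(x_end, min(controls), max(controls))
```

In this scalar case the state decays linearly to zero while the applied control stays essentially constant at $-x_0/(t_1 - t_0)$, which agrees with the open-loop minimal-norm control for the same data even though the gain $K(t)$ grows without bound.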
4.9 Krasovskii Theorem
Here is one sufficient condition for complete controllability of the linear system

$$\dot{x} = A(t)x + B(t)u, \quad u \in R^r \qquad (4.22)$$

on a segment $[t_0, t_1]$, expressed directly in terms of the coefficients of the system. We assume that the matrices $A(t)$, $B(t)$ are $n-1$ times differentiable. Using the recurrence relation

$$P_{k+1}(t) = -A(t)P_k(t) + \dot{P}_k(t), \quad k = 0, \ldots, n-2; \qquad P_0(t) = B(t),$$

we define the matrices $P_0(t), \ldots, P_{n-1}(t)$ of size $n \times r$ and compose from them the matrix

$$P(t) = (P_0(t), \ldots, P_{n-1}(t))$$

of size $n \times nr$.

Theorem 4.3 (Krasovskii) Let the matrices $A(t)$, $B(t)$ have derivatives up to order $n-1$. For the linear system (4.22) to be completely controllable on a segment $[t_0, t_1]$, it is sufficient that the rows of the matrix $P(t)$ be linearly independent at least at one point of $[t_0, t_1]$.
Proof Suppose that the system (4.22) is not completely controllable on a segment $[t_0, t_1]$. Then, by Lemma 4.2, there exists a direction $c$ such that
$$c'F(t_1, t)B(t) = c'F(t_1, t)P_0(t) \equiv 0, \quad t_0 \le t \le t_1.$$
Differentiating this identity $n-1$ times successively and taking into consideration the properties of the fundamental matrix $F(t_1, t)$, we obtain
$$\frac{d}{dt}\bigl(c'F(t_1, t)P_0(t)\bigr) = c'F(t_1, t)\bigl(-A(t)P_0(t) + \dot P_0(t)\bigr) = c'F(t_1, t)P_1(t) \equiv 0,$$
$$\ldots$$
$$\frac{d}{dt}\bigl(c'F(t_1, t)P_{n-2}(t)\bigr) = c'F(t_1, t)\bigl(-A(t)P_{n-2}(t) + \dot P_{n-2}(t)\bigr) = c'F(t_1, t)P_{n-1}(t) \equiv 0,$$
or briefly
$$c'F(t_1, t)P(t) \equiv 0, \quad t_0 \le t \le t_1.$$
Due to the non-degeneracy of the fundamental matrix, the vectors $F(t_1, t)'c$ are nonzero for all $t$. Then the last identity means that the rows of the matrix $P(t)$ are linearly dependent at each point of $[t_0, t_1]$. Taking the contrapositive of this conclusion, we obtain the statement of the theorem.
Example 4.2 We apply the Krasovskii theorem to the stationary system
$$\dot x = Ax + Bu, \quad u \in R^r \tag{4.23}$$
with constant matrix coefficients $A$, $B$. We use the recurrence relation
$$P_{k+1} = -AP_k + \dot P_k, \quad k = 0, \ldots, n-2; \qquad P_0 = B$$
and find $P_0 = B$, $P_1 = -AB$, $\ldots$, $P_{n-1} = (-1)^{n-1}A^{n-1}B$. As a result, the matrix $P$ takes the form
$$P = \bigl(B, -AB, \ldots, (-1)^{n-1}A^{n-1}B\bigr).$$
According to Theorem 4.3, the stationary system (4.23) is completely controllable on any non-degenerate segment of time $[t_0, t_1]$ if the rows of $P$ are linearly independent, which is equivalent to the requirement $\operatorname{rank} P = n$. Changing the signs of columns does not change the rank of $P$, so we can write the latter condition in the form
$$\operatorname{rank}\bigl(B, AB, \ldots, A^{n-1}B\bigr) = n. \tag{4.24}$$
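Condition (4.24) is easy to test numerically. The following is a minimal sketch, assuming NumPy is available; the function names and the two sample pairs $(A, B)$ are illustrative choices and do not come from the text.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack the blocks (B, AB, ..., A^{n-1} B) side by side."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])   # next block is A times the previous one
    return np.hstack(blocks)

def satisfies_rank_condition(A, B):
    """True when rank (B, AB, ..., A^{n-1} B) equals n, i.e. condition (4.24)."""
    rank = np.linalg.matrix_rank(controllability_matrix(A, B))
    return bool(rank == A.shape[0])

# Illustrative pair 1: double integrator, control enters the second equation.
A1 = np.array([[0.0, 1.0], [0.0, 0.0]])
B1 = np.array([[0.0], [1.0]])

# Illustrative pair 2: two decoupled states, control acts on the first one only.
A2 = np.array([[-1.0, 0.0], [0.0, -2.0]])
B2 = np.array([[1.0], [0.0]])

print(satisfies_rank_condition(A1, B1))  # the rank condition holds here
print(satisfies_rank_condition(A2, B2))  # the rank condition fails here
```

The rank is computed in floating point, so for badly scaled matrices the tolerance of `numpy.linalg.matrix_rank` may need adjusting.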
4.10
Total Controllability of Stationary System
We show that the sufficient condition (4.24) is at the same time necessary for complete controllability of the system (4.23) on any non-degenerate segment $[t_0, t_1]$.
First, we verify sufficiency directly. Assume the contrary: condition (4.24) is satisfied and the stationary system (4.23) is not completely controllable on a segment $[t_0, t_1]$, $t_0 < t_1$. Then by Lemma 4.2 there exists a direction $c$ such that
$$c'F(t_1, t)B \equiv 0, \quad t_0 \le t \le t_1,$$
where $F(t_1, t)$ is the solution of the matrix Cauchy problem
$$\dot F(t_1, t) = -F(t_1, t)A, \quad F(t_1, t_1) = E.$$
We differentiate the identity $n-1$ times successively. Taking into account the equation for $F(t_1, t)$, we obtain
$$-c'F(t_1, t)AB \equiv 0, \ \ldots, \ (-1)^{n-1}c'F(t_1, t)A^{n-1}B \equiv 0.$$
Put $t = t_1$ in the identity and in the last equalities. Then
$$c'B = 0, \quad c'AB = 0, \ \ldots, \ c'A^{n-1}B = 0,$$
or in matrix notation
$$c'\bigl(B, AB, \ldots, A^{n-1}B\bigr) = 0.$$
Hence, in view of (4.24), it follows that $c = 0$, which contradicts the definition of $c$.
Conversely, let the rank in (4.24) be less than $n$. Then there exists a vector $c \ne 0$ with $c'A^kB = 0$, $k = 0, \ldots, n-1$. By the Cayley-Hamilton theorem every power $A^k$, $k \ge n$, is a linear combination of $E, A, \ldots, A^{n-1}$, so $c'A^kB = 0$ for all $k \ge 0$ and therefore $c'F(t_1, t)B = c'e^{A(t_1 - t)}B \equiv 0$. Then $W(t_0, t_1)c = 0$, that is, $\operatorname{rank} W(t_0, t_1) < n$, and by Lemma 4.2 the system is not completely controllable on $[t_0, t_1]$. Thus, the assertion is proven. The result of our investigation is the following.
Theorem 4.4 (Kalman). Condition (4.24) is necessary and sufficient for complete controllability of the stationary system (4.23) on any non-degenerate segment of time.
Example 4.3 Verify the complete controllability of the second-order differential equation
$$\ddot x + a_1\dot x + a_2 x = u, \quad u \in R$$
with constant coefficients. We introduce new variables $x_1 = x$, $x_2 = \dot x$ and represent the second-order differential equation by the equivalent system of two first-order differential equations
$$\dot x_1 = x_2, \quad \dot x_2 = -a_2x_1 - a_1x_2 + u.$$
Here $n = 2$, $r = 1$,
$$A = \begin{pmatrix} 0 & 1 \\ -a_2 & -a_1 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad (B, AB) = \begin{pmatrix} 0 & 1 \\ 1 & -a_1 \end{pmatrix}.$$
Since $\operatorname{rank}(B, AB) = 2$ for all $a_1$, $a_2$, by Theorem 4.4 the given second-order differential equation is completely controllable on any non-degenerate segment of time.
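The conclusion of Example 4.3 can be spot-checked for a few coefficient values. This is only a sketch, assuming NumPy; the helper name and the sample values of $a_1$, $a_2$ are hypothetical.

```python
import numpy as np

def kalman_matrix_second_order(a1, a2):
    """(B, AB) for x'' + a1 x' + a2 x = u written as a first-order system."""
    A = np.array([[0.0, 1.0], [-a2, -a1]])
    B = np.array([[0.0], [1.0]])
    return np.hstack([B, A @ B])

# Hypothetical sample coefficients, chosen only for illustration.
samples = [(0.0, 0.0), (3.0, -2.0), (-1.5, 10.0)]
for a1, a2 in samples:
    K = kalman_matrix_second_order(a1, a2)
    # det(B, AB) = 0 * (-a1) - 1 * 1 = -1 for every a1, a2, hence rank 2
    print(a1, a2, np.linalg.det(K), np.linalg.matrix_rank(K))
```

The determinant does not depend on $a_1$, $a_2$, which is why the rank condition holds for all coefficients.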
4.11
Geometry of a Non-controllable System
Consider a linear system
$$\dot x = A(t)x + B(t)u, \quad x(t_0) = 0, \quad u \in R^r, \tag{4.25}$$
that does not meet the criterion of complete controllability on some non-degenerate segment of time $[t_0, t_1]$. Then, by Lemma 4.2, $\operatorname{rank} W(t_0, t_1) = m < n$ and the homogeneous system of linear algebraic equations $W(t_0, t_1)c = 0$ has $n-m$ linearly independent solutions $c^1, \ldots, c^{n-m}$. For each solution $c^k$ the identity
$$(c^k)'F(t_1, \tau)B(\tau) \equiv 0, \quad t_0 \le \tau \le t_1$$
holds. Using the properties of the fundamental matrix, we represent this identity in the form
$$\bigl(F(t_1, t)'c^k\bigr)'F(t, t_1)F(t_1, \tau)B(\tau) \equiv 0, \quad t_0 \le \tau \le t \le t_1,$$
or
$$\psi^k(t)'F(t, \tau)B(\tau) \equiv 0, \quad t_0 \le \tau \le t \le t_1, \tag{4.26}$$
where $\psi^k(t) = F(t_1, t)'c^k$ is the solution of the conjugate Cauchy problem
$$\dot\psi = -A(t)'\psi, \quad \psi(t_1) = c^k. \tag{4.27}$$
Multiply identity (4.26) by an arbitrary control $u(\tau)$ on the right and integrate it with respect to $\tau$ over the segment $[t_0, t]$. Then
$$\psi^k(t)'\int_{t_0}^{t} F(t, \tau)B(\tau)u(\tau)\,d\tau \equiv 0, \quad t_0 \le t \le t_1,$$
or
$$\psi^k(t)'x(t) = 0, \quad t_0 \le t \le t_1, \quad k = 1, \ldots, n-m, \tag{4.28}$$
where
$$x(t) = F(t, t_0)\cdot 0 + \int_{t_0}^{t} F(t, \tau)B(\tau)u(\tau)\,d\tau \in Q(t).$$
Fig. 4.4 A reachability set of a linear system that is not completely controllable
As we can see from (4.28), due to the arbitrariness of the control $u(\tau)$, the set $Q(t)$ at each moment of time $t \in [t_0, t_1]$ lies in the intersection of $n-m$ planes (Fig. 4.4). The normal vectors $\psi^1(t), \ldots, \psi^{n-m}(t)$ of the planes are linearly independent due to the linear independence of the vectors $c^1, \ldots, c^{n-m}$ and the properties of solutions of the conjugate Cauchy problem (4.27). Thus, in the case of a linear system that is not completely controllable on a segment $[t_0, t_1]$, the reachability set $Q(t)$ has dimension $m = \operatorname{rank} W(t_0, t_1)$ at each moment $t \in [t_0, t_1]$.
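Relation (4.28) can be observed numerically on a small example. The sketch below, assuming NumPy, uses a hypothetical stationary pair $(A, B)$ in which the single column of $B$ is an eigenvector of $A$, approximates $W(t_0, t_1)$ by the trapezoid rule, takes a null vector $c$ of $W$, and checks that $\psi(t)'x(t)$ stays at rounding level along a trajectory driven by an arbitrary control. The matrices, the test control and the step sizes are illustrative assumptions.

```python
import numpy as np

# Hypothetical example: B is an eigenvector of A, so the pair is not controllable.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[1.0], [-1.0]])
t0, t1 = 0.0, 1.0

lam, V = np.linalg.eig(A)        # A has the real eigenvalues -1 and -2
Vinv = np.linalg.inv(V)

def F(t, tau):
    """Fundamental matrix F(t, tau) = exp(A (t - tau)) via the eigendecomposition."""
    return (V * np.exp(lam * (t - tau))) @ Vinv

# W(t0, t1) = integral of F(t1, s) B B' F(t1, s)' ds, trapezoid rule.
nodes = np.linspace(t0, t1, 2001)
h = nodes[1] - nodes[0]
vals = np.array([F(t1, s) @ B @ B.T @ F(t1, s).T for s in nodes])
W = h * (vals.sum(axis=0) - 0.5 * (vals[0] + vals[-1]))

m = np.linalg.matrix_rank(W, tol=1e-9)   # dimension of the reachability set
_, _, Vt = np.linalg.svd(W)
c = Vt[-1]                               # unit vector with W c = 0 (up to rounding)

# Euler simulation of x' = A x + B u from x(t0) = 0 with an arbitrary control.
N = 2000
step = (t1 - t0) / N
x = np.zeros(2)
worst = 0.0
for k in range(N):
    t = t0 + k * step
    psi = F(t1, t).T @ c                 # psi(t) = F(t1, t)' c, cf. (4.27)
    worst = max(worst, abs(psi @ x))     # left-hand side of (4.28)
    u = np.sin(5.0 * t) + 0.5            # arbitrary test control
    x = x + step * (A @ x + B[:, 0] * u)

print(m, worst)
```

Here $m = 1 < n = 2$, and the trajectory never leaves the line orthogonal to $\psi(t)$.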
4.12
Transformation of Non-controllable System
Consider the vector functions $\psi^1(t), \ldots, \psi^{n-m}(t)$ as rows of a matrix $\Psi(t)$ of size $(n-m) \times n$ and allocate in it a square block $\Psi_2(t)$ of order $n-m$:
$$\Psi(t) = (\Psi_1(t), \Psi_2(t)), \quad t_0 \le t \le t_1.$$
Due to the linear independence of the rows, the matrix $\Psi(t)$ has constant rank $n-m$ on the entire segment $[t_0, t_1]$. For simplicity, we assume
$$\operatorname{rank} \Psi_2(t) = n-m, \quad t_0 \le t \le t_1.$$
We then represent the phase vector $x(t)$ in the form
$$x(t) = \begin{pmatrix} y(t) \\ z(t) \end{pmatrix},$$
where the vectors $y(t)$ and $z(t)$ have dimensions $m$ and $n-m$ respectively. Then the relations (4.28) can be written in matrix form
$$(\Psi_1(t), \Psi_2(t))\begin{pmatrix} y(t) \\ z(t) \end{pmatrix} = \Psi_1(t)y(t) + \Psi_2(t)z(t) = 0. \tag{4.29}$$
From here, we find
$$z(t) = -\Psi_2^{-1}(t)\Psi_1(t)y(t). \tag{4.30}$$
By construction, $x(t)$ is a solution of the Cauchy problem (4.25), or in sub-matrix form
$$\begin{pmatrix} \dot y \\ \dot z \end{pmatrix} = \begin{pmatrix} A_{11}(t) & A_{12}(t) \\ A_{21}(t) & A_{22}(t) \end{pmatrix}\begin{pmatrix} y \\ z \end{pmatrix} + \begin{pmatrix} B_1(t) \\ B_2(t) \end{pmatrix}u(t), \quad \begin{pmatrix} y(t_0) \\ z(t_0) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
where the matrices $A(t)$, $B(t)$ are partitioned into sub-matrices conformably with the dimensions of the vectors $y$, $z$, so that the multiplications are well defined. Then we have
$$\dot y = A_{11}(t)y + A_{12}(t)z + B_1(t)u(t), \quad y(t_0) = 0,$$
$$\dot z = A_{21}(t)y + A_{22}(t)z + B_2(t)u(t), \quad z(t_0) = 0.$$
From here, by virtue of relation (4.30), we obtain for $y$ a closed system of differential equations
$$\dot y = \bigl(A_{11}(t) - A_{12}(t)\Psi_2^{-1}(t)\Psi_1(t)\bigr)y + B_1(t)u(t), \quad y(t_0) = 0. \tag{4.31}$$
Thus, although it is not completely controllable on the segment $[t_0, t_1]$, the linear system (4.25) is described by the system of algebraic equations (4.29) and the closed sub-system of differential equations (4.31) with respect to the phase coordinates. The reachability set $Q(t)$ of the system (4.25) has a dimension $m < n$ equal to the rank of the matrix $W(t_0, t_1)$ at each $t \in [t_0, t_1]$.
4.13
Controllability of Transformed System
As we see, the phase coordinates of an incompletely controllable linear system split into dependent and independent ones. The independent coordinates are described by a closed sub-system of differential equations of lower order, and the dependent coordinates are expressed as linear combinations of the independent ones by known formulas (Fig. 4.5).
Fig. 4.5 The dependence of the phase coordinates of a linear system that is not completely controllable
We show that under the assumption $\operatorname{rank} W(t_0, t_1) = m < n$ the subsystem of differential equations (4.31) is completely controllable on the segment $[t_0, t_1]$. Assume the contrary. Then by Lemma 4.2 there is a direction $d \in R^m$ for which the identity
$$d'F_1(t_1, t)B_1(t) \equiv 0, \quad t_0 \le t \le t_1,$$
holds, where $F_1(t_1, t)$ is the fundamental matrix of solutions corresponding to the subsystem (4.31). Multiply this identity by an arbitrary fixed control $u(t)$ and integrate it with respect to $t$ over the segment $[t_0, t_1]$. Using the Cauchy formula, we obtain
$$d'\int_{t_0}^{t_1} F_1(t_1, t)B_1(t)u(t)\,dt = d'y(t_1) = 0.$$
To the control $u(t)$ there corresponds the trajectory $x(t) = (y(t), z(t))$ of the linear system (4.25). Introducing $c^{n-m+1} = (d, 0) \in R^n$, we find
$$(c^{n-m+1})'x(t_1) = (d', 0')\begin{pmatrix} y(t_1) \\ z(t_1) \end{pmatrix} = d'y(t_1) = 0.$$
Then, using the Cauchy formula, we obtain
$$(c^{n-m+1})'x(t_1) = \int_{t_0}^{t_1} (c^{n-m+1})'F(t_1, t)B(t)u(t)\,dt = 0.$$
Due to the arbitrariness of the control $u(t)$, the latter equality holds if and only if
$$(c^{n-m+1})'F(t_1, t)B(t) \equiv 0, \quad t_0 \le t \le t_1.$$
Hence
$$(c^{n-m+1})'W(t_0, t_1) = \int_{t_0}^{t_1} (c^{n-m+1})'F(t_1, t)B(t)B(t)'F(t_1, t)'\,dt = 0.$$
The solutions $c^1, \ldots, c^{n-m}, c^{n-m+1}$ of the homogeneous system of linear algebraic equations $W(t_0, t_1)c = 0$ are linearly independent. Indeed, if they were linearly dependent, then for a nontrivial set of numbers $\lambda_1, \ldots, \lambda_{n-m}, \lambda_{n-m+1}$ the equality
$$\lambda_1c^1 + \ldots + \lambda_{n-m}c^{n-m} + \lambda_{n-m+1}c^{n-m+1} = 0 \tag{4.32}$$
would hold. Obviously, $\lambda_{n-m+1} \ne 0$ due to the linear independence of the vectors $c^1, \ldots, c^{n-m}$. Putting $\lambda_{n-m+1} = -1$ without loss of generality and denoting $\lambda = (\lambda_1, \ldots, \lambda_{n-m})$, we rewrite equality (4.32) in the form
$$c^{n-m+1} = \lambda_1c^1 + \ldots + \lambda_{n-m}c^{n-m} = \lambda_1\psi^1(t_1) + \ldots + \lambda_{n-m}\psi^{n-m}(t_1) = \Psi(t_1)'\lambda.$$
In sub-matrix form we have
$$\begin{pmatrix} d \\ 0 \end{pmatrix} = (\Psi_1(t_1), \Psi_2(t_1))'\lambda,$$
or
$$d = \Psi_1(t_1)'\lambda, \quad 0 = \Psi_2(t_1)'\lambda.$$
Hence, by the non-degeneracy of the block $\Psi_2(t_1)$, we find $\lambda = 0$ and therefore $d = 0$, which is impossible because $d$ is a direction. Therefore, the vectors $c^1, \ldots, c^{n-m}, c^{n-m+1}$ are linearly independent. Then $\operatorname{rank} W(t_0, t_1) < m$, which contradicts the initial assumption. Consequently, the subsystem of differential equations (4.31) is indeed completely controllable on the segment $[t_0, t_1]$.
Exercise Set
1. What will the criterion for point-to-point controllability (4.6) become if we replace the direction $c$ by $-c$?
2. Can we obtain Theorem 4.2 as a consequence of Theorem 4.1, taking a ball of sufficiently large radius as the range of control $U$?
3. Derive the criterion of controllability of a linear system from a set $X_0 \subset R^n$ to a set $X_1 \subset R^n$ on a given segment of time in the following cases: (a) $X_0$, $X_1$, $U$ are convex compact sets; (b) $X_0$, $X_1$ are linear manifolds, $U = R^r$.
4. Explore the geometry of a stationary system $\dot x = Ax + Bu$, $x(0) = 0$, $u \in R^r$ that is not completely controllable. Is it true that the reachability set $Q(t)$ of this system at any given time $t > 0$ lies in the subspace formed by the solutions of the homogeneous system of linear algebraic equations $z'\bigl(B, AB, \ldots, A^{n-1}B\bigr) = 0$?
5. Verify the complete controllability of the differential equation of $n$th order $x^{(n)} = u$, $u \in R$.
6. Let the matrix $A$ of the system $\dot x = Ax + Bu$ be given. How can we find the minimum number of columns of a matrix $B$ for which the stationary system $\dot x = Ax + Bu$ becomes completely controllable?
Chapter 5
Minimum Time Problem
Abstract We consider the two-point performance problem of transferring a controlled object from one position to another along a trajectory of a linear system in minimal time. The conditions for solvability of the problem, the optimality criteria, and the relationship with Pontryagin's maximum principle are established. The stationary performance problem is studied in detail.
5.1
Statement of the Problem
The task is to direct the controlled object from one point of the phase space to another in the minimum time. Its prototype in the calculus of variations was the brachistochrone problem, formulated by Johann Bernoulli in 1696. The conditions of the minimum time problem are
$$t_1 - t_0 \to \min, \quad \dot x = A(t)x + B(t)u, \quad x(t_0) = x_0, \quad x(t_1) = x_1, \quad u \in U, \quad t_1 > t_0. \tag{5.1}$$
Here $A(t)$, $B(t)$ are matrices of sizes $n \times n$, $n \times r$ respectively, continuous on $R$; $x_0$, $x_1$ are given points in $R^n$, $x_0 \ne x_1$; $t_0$ is a fixed moment of time; $U$ is a convex compact set in $R^r$. Let us call a triple $\tilde x(t)$, $\tilde u(t)$, $\tilde t_1$ composed of a trajectory $\tilde x(t)$, a control $\tilde u(t)$ and a moment of time $\tilde t_1$ a process if it satisfies all conditions of the problem except, perhaps, the performance index. The minimum time problem is to find among all processes $\tilde x(t)$, $\tilde u(t)$, $\tilde t_1$ the optimal one $x(t)$, $u(t)$, $t_1$, which satisfies the property $t_1 \le \tilde t_1$ (Fig. 5.1). The components $x(t)$, $u(t)$, $t_1$ of the optimal process are referred to as the optimal trajectory, the optimal control and the moment of performance, respectively. Further, without special reservations, we will assume that the conditions of the minimum time problem meet the requirement of regularity (Sect. 3.6) on each time interval $[t_0, \tilde t_1]$, $\tilde t_1 > t_0$. This is important for ensuring the solvability of the problem in the considered class of controls.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_5
Fig. 5.1 The moments $\tilde t_1$, $t_1$ at which the trajectories $\tilde x(t)$, $x(t)$ reach the point $x_1$ are different
5.2
Existence of a Solution of the Minimum Time Problem
Existence of a solution of the minimum time problem is closely related to the behavior of the functions
$$\rho(t) = \min_{\|c\| = 1} \pi(t, c), \quad t \ge t_0,$$
$$\pi(t, c) = c'F(t, t_0)x_0 + \int_{t_0}^{t} \max_{u \in U} c'F(t, \tau)B(\tau)u\,d\tau - c'x_1, \quad t \ge t_0, \quad \|c\| = 1.$$
By Theorem A.3.1, these functions are continuous in their domains. Since we assume $x_0 \ne x_1$, then
$$\rho(t_0) = \min_{\|c\| = 1} \pi(t_0, c) = \min_{\|c\| = 1} c'(x_0 - x_1) = -\|x_0 - x_1\| < 0.$$
If the inequality $\rho(t) < 0$ holds for some $t > t_0$, then by Theorem 4.1 the linear system (5.1) is not controllable from the position $(x_0, t_0)$ into $(x_1, t)$. The latter means that the reachable set $Q(t)$ corresponding to the initial values $x_0$, $t_0$ does not contain $x_1$. If, however, $\rho(t) > 0$, then $x_1 \in Q(t)$. Obviously, at the first moment $t_1$ of reaching $x_1$ the set $Q(t)$ "touches" the point $x_1$ with its boundary $\partial Q(t_1)$. According to the extreme principle, an extreme control corresponds to the point $x_1 \in \partial Q(t_1)$. Hence, an optimal control steering the trajectory of the linear system from the point $x_0$ to the point $x_1$ must be extreme, and we now justify this intuition.
Theorem 5.1 If $\rho(t) \ge 0$ at some point $t > t_0$, then an optimal process of the minimum time problem exists and the smallest root of the equation $\rho(t) = 0$ is the moment of performance.
Proof Suppose that the function $\rho(t)$ is non-negative for some $t_2 > t_0$. Then there is a root of this function on the segment $[t_0, t_2]$. Indeed, if $\rho(t_2) = 0$, then the root is $t_2$. If $\rho(t_2) > 0$, then, since $\rho(t_0) < 0$, the existence of a root is guaranteed by the well-known Cauchy theorem on intermediate values of a continuous function. As a consequence, the set $K$ of the roots of the function $\rho(t)$ on the segment $[t_0, t_2]$ is not empty. We then put $t_1 = \inf K$. By definition of the greatest lower bound, there exists a sequence $\{\tau_k\} \subset K$ that converges to $t_1$: $\lim_{k \to \infty} \tau_k = t_1$ (Fig. 5.2). By the continuity of the function $\rho(t)$, we have
$$0 = \lim_{k \to \infty} \rho(\tau_k) = \rho\Bigl(\lim_{k \to \infty} \tau_k\Bigr) = \rho(t_1).$$
Fig. 5.2 Location of roots of $\rho(t)$
Hence, Theorem 4.1 implies the point-to-point controllability of the linear system (5.1) from $(x_0, t_0)$ into $(x_1, t_1)$, that is, the existence of a process $x(t)$, $u(t)$, $t_1$ that satisfies the conditions $x(t_0) = x_0$, $x(t_1) = x_1$. If $t_0 \le t < t_1$, then $\rho(t) < 0$, and point-to-point controllability of the system (5.1) from $(x_0, t_0)$ into $(x_1, t)$ is impossible. Consequently, $t_1$ is the moment of performance and $x(t)$, $u(t)$, $t_1$ is an optimal process. Thus, the theorem is proven.
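Theorem 5.1 suggests a direct numerical procedure: evaluate $\rho(t)$ and look for its smallest root. The sketch below, assuming NumPy, does this for a hypothetical example that is not taken from the text: the double integrator $\dot x_1 = x_2$, $\dot x_2 = u$, $|u| \le 1$, with $x_0 = (1, 0)$ and $x_1 = (0, 0)$. The minimum over directions is approximated on a grid of unit vectors, the integral by the trapezoid rule, and the root by bisection, which assumes a single sign change of $\rho$ on the bracket.

```python
import numpy as np

x0 = np.array([1.0, 0.0])    # hypothetical initial state (position 1, velocity 0)
x1 = np.array([0.0, 0.0])    # hypothetical target state

# Sampled unit directions c on the circle |c| = 1.
angles = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
C = np.column_stack([np.cos(angles), np.sin(angles)])

def rho(t, nodes=401):
    """Grid approximation of rho(t) = min over |c| = 1 of pi(t, c)."""
    s = np.linspace(0.0, t, nodes)                 # s = t - tau
    # For this system c' F(t, tau) b = c_1 (t - tau) + c_2, and the maximum
    # over |u| <= 1 of (c' F b) u is the absolute value of that expression.
    integrand = np.abs(C[:, [0]] * s + C[:, [1]])
    h = s[1] - s[0]
    integral = h * (integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    Fx0 = np.array([x0[0] + t * x0[1], x0[1]])     # F(t, 0) x0
    return (C @ Fx0 + integral - C @ x1).min()

def smallest_root(lo, hi, iters=50):
    """Bisection, assuming rho(lo) < 0 <= rho(hi) and one sign change in between."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi

t1 = smallest_root(0.0, 4.0)
print(t1)   # should be close to 2 for this example
```

For this particular example the moment of performance can be found by hand (one switch at $t = 1$, arrival at $t = 2$), which gives a check on the numerical root.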
5.3
Criterion of Optimality
Theorem 5.2 For the optimality of a process $x(t)$, $u(t)$, $t_1$ in the minimum time problem, it is necessary and sufficient that the control $u(t)$ be extreme,
$$c'F(t_1, t)B(t)u(t) = \max_{u \in U} c'F(t_1, t)B(t)u, \quad t_0 \le t \le t_1, \tag{5.2}$$
where the direction $c$ satisfies the condition
$$\pi(t_1, c) = 0 \tag{5.3}$$
and the moment $t_1$ is the least root of the function $\rho(t)$.
Proof Necessity. Let $x(t)$, $u(t)$, $t_1$ be an optimal process of the minimum time problem. According to Theorem 5.1, $t_1$ is the least root of the function $\rho(t)$. Obviously, $x(t_1) = x_1$ is a boundary point of the reachability set $Q(t_1)$. Indeed, if $x_1 \in \operatorname{int} Q(t_1)$, then by the continuity of the set $Q(t)$ (Sect. 3.7) the condition $x_1 \in Q(t_1 - \delta)$ holds for a small $\delta > 0$. However, the latter contradicts the definition of $t_1$. By Theorem 3.3, the fact that $x(t_1) = x_1$ is a boundary point implies that equality (5.2) is valid for some vector $c$. From the equality $x(t_1) - x_1 = 0$, by applying the Cauchy formula and the relation (5.2), we obtain
$$0 = c'x(t_1) - c'x_1 = c'F(t_1, t_0)x_0 + \int_{t_0}^{t_1} c'F(t_1, t)B(t)u(t)\,dt - c'x_1 =$$
$$= c'F(t_1, t_0)x_0 + \int_{t_0}^{t_1} \max_{u \in U} c'F(t_1, t)B(t)u\,dt - c'x_1 = \pi(t_1, c).$$
Therefore, the direction $c$ satisfies the condition (5.3).
Sufficiency. Let $t_1$ be the least root of the function $\rho(t)$. Suppose that for some control $u(t)$ and direction $c$ the relations (5.2) and (5.3) are true. Then, by the Cauchy formula, the point
$$x(t_1) = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u(t)\,dt \in Q(t_1)$$
corresponds to the control $u(t)$. In view of (5.2) and (5.3), we obtain
$$0 = \pi(t_1, c) = c'F(t_1, t_0)x_0 + \int_{t_0}^{t_1} \max_{u \in U} c'F(t_1, t)B(t)u\,dt - c'x_1 =$$
$$= c'F(t_1, t_0)x_0 + \int_{t_0}^{t_1} c'F(t_1, t)B(t)u(t)\,dt - c'x_1 = c'x(t_1) - c'x_1.$$
From here, it follows that
$$c'x(t_1) = c'x_1. \tag{5.4}$$
Due to the extreme principle, $x(t_1)$ is a boundary and at the same time an extreme point of the set $Q(t_1)$. By Lemma 3.1,
$$c'x(t_1) > c'x, \quad x \in Q(t_1), \quad x \ne x(t_1).$$
By Theorem 5.1, $t_1$ is the moment of performance, and hence $x_1 \in Q(t_1)$. If $x_1 \ne x(t_1)$, then on the basis of the previous inequality we obtain $c'x(t_1) > c'x_1$, which contradicts (5.4). Consequently, $x(t_1) = x_1$, so $x(t)$, $u(t)$, $t_1$ is an optimal process of the minimum time problem, and the theorem is proven.
Corollary 5.1 If an optimal control exists for a minimum time problem, then it is unique up to the values at the points of discontinuity.
Proof Suppose there are two different optimal controls $u^1(t)$, $u^2(t)$ on a segment $[t_0, t_1]$. According to Theorem 5.2, $u^1(t)$, $u^2(t)$ satisfy the extreme condition (5.2) for the corresponding vectors $c^1$, $c^2$. From the extreme condition for $u^1(t)$, it follows that
$$(c^1)'F(t_1, t)B(t)\bigl[u^1(t) - u^2(t)\bigr] \ge 0, \quad t_0 \le t \le t_1. \tag{5.5}$$
The optimal trajectories that correspond to the controls $u^1(t)$, $u^2(t)$ pass through the point $x_1$ at the moment of time $t = t_1$. By the Cauchy formula,
$$x_1 = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u^1(t)\,dt = F(t_1, t_0)x_0 + \int_{t_0}^{t_1} F(t_1, t)B(t)u^2(t)\,dt.$$
From here, we obtain
$$\int_{t_0}^{t_1} F(t_1, t)B(t)\bigl[u^1(t) - u^2(t)\bigr]\,dt = 0.$$
Multiply this equality by the vector $(c^1)'$ on the left. Then
$$\int_{t_0}^{t_1} (c^1)'F(t_1, t)B(t)\bigl[u^1(t) - u^2(t)\bigr]\,dt = 0.$$
The integrand is piecewise continuous on the segment $[t_0, t_1]$ and, due to inequality (5.5), non-negative. Therefore,
$$(c^1)'F(t_1, t)B(t)\bigl[u^1(t) - u^2(t)\bigr] = 0, \quad t_0 \le t \le t_1.$$
Taking into consideration the fact that the control $u^1(t)$ is extreme, this equation is rewritten in the form
$$(c^1)'F(t_1, t)B(t)u^1(t) = (c^1)'F(t_1, t)B(t)u^2(t) = \max_{u \in U} (c^1)'F(t_1, t)B(t)u, \quad t_0 \le t \le t_1.$$
By the regularity conditions, it follows that $u^1(t) = u^2(t)$ for all $t \in [t_0, t_1]$ except, perhaps, for the break points of the controls. Thus, the assertion of Corollary 5.1 is proven.
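As an illustration of Theorems 5.1 and 5.2 and of Corollary 5.1, the following worked example treats the simplest scalar case. The system, the range of control and the notation below are chosen for illustration and do not appear in the text; regularity of this system is assumed.

```latex
% Scalar system (illustrative assumption): \dot x = u, |u| \le 1, x(0) = x_0, x(t_1) = x_1, x_0 \ne x_1.
% Here n = r = 1, A = 0, B = 1, F(t, \tau) \equiv 1, and the directions are c = \pm 1.
\begin{align*}
\pi(t, c) &= c\,x_0 + \int_0^t \max_{|u| \le 1} c\,u \, d\tau - c\,x_1
           = c\,(x_0 - x_1) + t, \\
\rho(t)   &= \min_{c = \pm 1} \pi(t, c) = t - |x_1 - x_0|, \\
t_1       &= |x_1 - x_0| \quad \text{(the smallest root of } \rho(t) = 0\text{)}, \\
\pi(t_1, c) = 0 \ &\Longleftrightarrow\ c = \operatorname{sign}(x_1 - x_0), \\
c\,u(t) = \max_{|u| \le 1} c\,u = 1 \ &\Longrightarrow\ u(t) \equiv \operatorname{sign}(x_1 - x_0), \quad 0 \le t \le t_1.
\end{align*}
```

The extreme control is constant and is determined uniquely, in agreement with Corollary 5.1, and the corresponding trajectory $x(t) = x_0 + t\,\operatorname{sign}(x_1 - x_0)$ reaches $x_1$ exactly at $t = t_1$.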
5.4
Maximum Principle for the Minimum Time Problem
We refer to the necessary conditions of optimality in the minimum time problem, established under more general conditions (without the assumption of regularity of the linear system), as the maximum principle.
Theorem 5.3 (maximum principle) If the process $x(t)$, $u(t)$, $t_1$ of the minimum time problem is optimal, then there exists a nontrivial solution $\psi(t)$ of the conjugate system of differential equations
$$\dot\psi = -A(t)'\psi, \tag{5.6}$$
such that the following conditions hold:
$$\psi(t)'B(t)u(t) = \max_{u \in U} \psi(t)'B(t)u, \quad t_0 \le t \le t_1, \tag{5.7}$$
$$\psi(t_1)'\dot x(t_1) \ge 0. \tag{5.8}$$
Under the conditions of regularity, the maximum principle is easily obtained from the extreme principle. Indeed, let the process $x(t)$, $u(t)$, $t_1$ be optimal for the minimum time problem. Then, for a sufficiently large natural $k$, the point $x_1 = x(t_1)$ does not belong to the reachability set $Q(t_1 - 1/k)$ of the linear system
$$\dot x = A(t)x + B(t)u, \quad x(t_0) = x_0, \quad u \in U. \tag{5.9}$$
Due to the regularity condition, the convex set $Q(t_1 - 1/k)$ is compact, and so it can be strictly separated from the point $x(t_1)$ by some plane with a normal $c^k$, $\|c^k\| = 1$:
$$(c^k)'x(t_1) > (c^k)'x, \quad x \in Q(t_1 - 1/k). \tag{5.10}$$
Since the sequence $\{c^k\}$ is bounded, it has a convergent subsequence. To simplify the notation, we assume without loss of generality that the sequence $\{c^k\}$ itself converges to a vector $c$. Obviously, $\|c\| = 1$. In inequality (5.10), we choose as the point $x$ the point $\tilde x(t_1 - 1/k)$ on the trajectory $\tilde x(t)$ of system (5.9) that corresponds to a fixed control $\tilde u(t)$. Then $(c^k)'x(t_1) > (c^k)'\tilde x(t_1 - 1/k)$. From here, we obtain $c'x(t_1) \ge c'\tilde x(t_1)$ as $k \to \infty$. Due to the arbitrariness of the control $\tilde u(t)$, the last inequality means
$$c'x(t_1) \ge c'x, \quad x \in Q(t_1).$$
From this inequality, by analogy with the proof of Theorem 3.3, we obtain formula (5.2), and we denote
$$\psi(t) = F(t_1, t)'c. \tag{5.11}$$
From the properties of the fundamental matrix, we conclude that the function $\psi(t)$ is nontrivial and satisfies the system of conjugate differential equations (5.6). Condition (5.2) takes the form (5.7) in the notation (5.11). To derive condition (5.8), we put $x = x(t_1 - 1/k)$ in (5.10) and rewrite this inequality in the form
$$(c^k)'\,\frac{x(t_1 - 1/k) - x(t_1)}{-1/k} > 0.$$
In the limit as $k \to \infty$, we get condition (5.8), and the theorem is proven.
5.5
Stationary Minimum Time Problem
The general theory is well illustrated and refined by the stationary minimum time problem
$$t_1 \to \min, \quad \dot x = Ax + Bu, \quad x(0) = x_0, \quad x(t_1) = x_1, \quad u \in U, \quad t_1 > 0,$$
where $A$, $B$ are constant matrices and $U$ is a polyhedron, that is, the convex hull of a finite number of points in the space $R^r$. For the stationary minimum time problem, we can specify the regularity condition, determine the structure of the optimal control and estimate the number of its switchings. We start with the regularity condition. Following [13], we say that the condition of the general position holds in the stationary minimum time problem if for all vectors $w$ parallel to any face of the polyhedron $U$ the equality
$$\operatorname{rank}\bigl(Bw, ABw, \ldots, A^{n-1}Bw\bigr) = n \tag{5.12}$$
is true.
Lemma 5.1 The condition of the general position in the stationary minimum time problem is sufficient for the regularity of the stationary system on any time interval $[0, \tilde t_1]$, $\tilde t_1 > 0$.
Proof Assume the contrary: the condition of the general position holds, but there exists a non-degenerate segment $[0, \tilde t_1]$ on which the stationary system does not meet the condition of regularity. In an equivalent form, this means that for some direction $c$ and the solution (5.11) of the conjugate Cauchy problem
$$\dot\psi = -A'\psi, \quad \psi(\tilde t_1) = c$$
the relation
$$\psi(t)'Bu(t) = \max_{u \in U} \psi(t)'Bu$$
does not determine the maximum points $u(t)$ uniquely at least on a countably infinite set $S \subset [0, \tilde t_1]$ of moments of time. Since a linear function attains its maximum on the faces of the polyhedron $U$ and the number of faces of $U$ is finite, on some face $V \subset U$ the maximum of the linear function is attained for a countably infinite set $T \subset S$ of moments of time. Picking any two distinct points $u^1$, $u^2$ from $V$, we have
$$\psi(t)'Bu^1 = \psi(t)'Bu^2 = \max_{u \in U} \psi(t)'Bu, \quad t \in T.$$
Put $w = u^2 - u^1$; then we get
$$\psi(t)'Bw = 0, \quad t \in T. \tag{5.13}$$
The solution $\psi(t)$ of a system of differential equations with constant coefficients is an analytic function, and consequently the function $\psi(t)'Bw$ is also analytic. By virtue of (5.13), it has an infinite set of roots on a bounded segment, which is possible only if it vanishes identically:
$$\psi(t)'Bw \equiv 0, \quad t \in R.$$
Differentiating this identity $n-1$ times, we obtain
$$\psi(t)'Bw = 0, \quad \psi(t)'ABw = 0, \ \ldots, \ \psi(t)'A^{n-1}Bw = 0,$$
or in vector-matrix form
$$\psi(t)'\bigl(Bw, ABw, \ldots, A^{n-1}Bw\bigr) = 0.$$
By construction $\psi(t) \ne 0$; hence from the last equality it follows that
$$\operatorname{rank}\bigl(Bw, ABw, \ldots, A^{n-1}Bw\bigr) < n,$$
which contradicts the condition of the general position, and thus the lemma is proven.
We now formulate the criterion of optimality for the stationary minimum time problem in a more specific form.
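For a cube $U = [-1, 1]^r$ the edges are parallel to the coordinate axes, so on the edge directions condition (5.12) reduces to a rank test for each column of $B$. The following sketch, assuming NumPy, performs this test; it checks only the edge directions $w = e_k$ (an assumption about which directions need to be tested), and the function names and sample matrices are illustrative.

```python
import numpy as np

def krylov_rank(A, v):
    """Rank of (v, Av, ..., A^{n-1} v) for a single vector v."""
    n = A.shape[0]
    cols = [v]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    return np.linalg.matrix_rank(np.column_stack(cols))

def general_position_on_cube_edges(A, B):
    """Check (5.12) for w = e_k, the edge directions of the cube [-1, 1]^r."""
    n = A.shape[0]
    # B e_k is the k-th column of B
    return all(krylov_rank(A, B[:, k]) == n for k in range(B.shape[1]))

# Hypothetical data: A has eigenvalues -1, -2 with eigenvector (1, -1) for -1.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B_good = np.array([[0.0, 1.0], [1.0, 1.0]])    # neither column is an eigenvector of A
B_bad = np.array([[0.0, 1.0], [1.0, -1.0]])    # second column is an eigenvector of A

print(general_position_on_cube_edges(A, B_good))
print(general_position_on_cube_edges(A, B_bad))
```

The second pair fails because a column that is an eigenvector of $A$ generates only a one-dimensional Krylov subspace.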
Theorem 5.4 Let the stationary minimum time problem satisfy the condition of the general position, and let the equality $Ax_1 + Bu^1 = 0$ hold for some inner point $u^1$ of the polyhedron $U$. Then the maximum principle is a necessary and sufficient condition for optimality, and the optimal control is a piecewise constant function that takes its values at the vertices of the polyhedron $U$.
Proof The necessity of the maximum principle follows from Theorem 5.3. We show the sufficiency of the maximum principle under the assumptions of Theorem 5.4. Let $x(t)$, $u(t)$, $t_1$ be an admissible process of the stationary minimum time problem satisfying the maximum principle
$$\psi(t)'Bu(t) = \max_{u \in U} \psi(t)'Bu, \quad 0 \le t \le t_1, \tag{5.14}$$
where $\psi(t)$ is a nontrivial solution of the conjugate system of differential equations
$$\dot\psi = -A'\psi. \tag{5.15}$$
Due to the homogeneity of the conjugate system, without loss of generality we can consider the vector $c = \psi(t_1)$ as a direction. Substituting the solution (5.11) of system (5.15) into (5.14) shows the extremality of the control $u(t)$. By the extreme principle, for the point $x(t_1) = x_1$ we obtain
$$c'x_1 \ge c'x, \quad x \in Q(t_1). \tag{5.16}$$
Assume that the process $x(t)$, $u(t)$, $t_1$ is not optimal, i.e., there exists an admissible process $\tilde x(t)$, $\tilde u(t)$, $\tilde t_1$ of the stationary minimum time problem with a better moment of performance $\tilde t_1 < t_1$. Let us choose a small $\varepsilon > 0$ and define the function
$$v(t) = \begin{cases} \tilde u(t), & 0 \le t < \tilde t_1, \\ u^1 + \varepsilon B'F(t_1, t)'c, & \tilde t_1 \le t \le t_1. \end{cases}$$
Since $u^1$ is an inner point of the set $U$, for a sufficiently small $\varepsilon > 0$ the condition $v(t) \in U$ holds for all $t \in [0, t_1]$. The function $v(t)$ is extended beyond the segment $[0, t_1]$ by constant values, preserving continuity at its endpoints, and we regard it as a control. Consider the solution $y(t)$ of the Cauchy problem
$$\dot y = Ay + Bv(t), \quad y(0) = x_0$$
corresponding to $v(t)$ on the time segment $[0, t_1]$. By definition, $v(t) = \tilde u(t)$ on the half-interval $[0, \tilde t_1)$ and therefore $y(t) = \tilde x(t)$ for all $t \in [0, \tilde t_1]$. In particular, $y(\tilde t_1) = \tilde x(\tilde t_1) = x_1$. By the Cauchy formula we have
$$y(t) = F(t, \tilde t_1)x_1 + \int_{\tilde t_1}^{t} F(t, \tau)B\bigl[u^1 + \varepsilon B'F(t_1, \tau)'c\bigr]\,d\tau = z(t) + \varepsilon\int_{\tilde t_1}^{t} F(t, \tau)BB'F(t_1, \tau)'c\,d\tau, \quad \tilde t_1 \le t \le t_1, \tag{5.17}$$
where
$$z(t) = F(t, \tilde t_1)x_1 + \int_{\tilde t_1}^{t} F(t, \tau)Bu^1\,d\tau.$$
Utilizing the equality $Ax_1 + Bu^1 = 0$ and the property $F_\tau(t, \tau) = -F(t, \tau)A$ of the fundamental matrix, we obtain
$$z(t) = F(t, \tilde t_1)x_1 + \int_{\tilde t_1}^{t} \bigl(-F(t, \tau)A\bigr)x_1\,d\tau = F(t, \tilde t_1)x_1 + \Bigl(\int_{\tilde t_1}^{t} F_\tau(t, \tau)\,d\tau\Bigr)x_1 = x_1.$$
In view of this, the formula (5.17) at $t = t_1$ takes the form
$$y(t_1) = x_1 + \varepsilon\int_{\tilde t_1}^{t_1} F(t_1, t)BB'F(t_1, t)'c\,dt.$$
Multiply this equality by the vector $c'$ on the left. Then
$$c'y(t_1) = c'x_1 + \varepsilon\int_{\tilde t_1}^{t_1} c'F(t_1, t)BB'F(t_1, t)'c\,dt = c'x_1 + \varepsilon\int_{\tilde t_1}^{t_1} \|c'F(t_1, t)B\|^2\,dt.$$
Due to the condition of the general position and Lemma 5.1, the function $\|c'F(t_1, t)B\|^2$ is nontrivial on the segment $[\tilde t_1, t_1]$. Therefore, from the last equality we obtain $c'y(t_1) > c'x_1$ for the point $y(t_1) \in Q(t_1)$, which contradicts the inequality (5.16). Consequently, the admissible process $x(t)$, $u(t)$, $t_1$ that satisfies the maximum principle is optimal. According to the condition of the general position, formula (5.14) uniquely defines the optimal control $u(t)$ at all points of the segment $[0, t_1]$ except, possibly, for a finite number of points of discontinuity. On each interval of continuity, the control $u(t)$ coincides with one of the vertices of the polyhedron $U$ as the maximum point of a linear form on $U$, and the theorem is proven.
Let us consider a useful estimate for the number of switchings of an extreme control. We start with the simple case of a stationary system
$$\dot x = Ax + bu, \quad |u| \le 1 \tag{5.18}$$
with a scalar control $u$ and the range $U = [-1, 1]$. The extreme controls for the system (5.18) have the form
$$u(t) = \arg\max_{|u| \le 1} \psi(t)'bu = \operatorname{sign} \psi(t)'b, \quad t \in R, \tag{5.19}$$
where $\psi(t)$ is a nontrivial solution of the conjugate system (5.15). The condition of the general position
$$\operatorname{rank}\bigl(b, Ab, \ldots, A^{n-1}b\bigr) = n \tag{5.20}$$
guarantees that the control (5.19) is uniquely defined on any non-degenerate segment of $R$ except, possibly, for a finite number of roots of the switching function $\psi(t)'b$.
Lemma 5.2 Let the stationary system (5.18) satisfy the condition of the general position (5.20) and let the matrix $A$ have real eigenvalues. Then each extreme control has no more than $n$ intervals of constancy.
Proof We shall estimate the number of real roots of the switching function $\psi(t)'b$ in the formula (5.19). According to the conditions of the lemma, the matrix $-A'$ has real eigenvalues, and each coordinate of the solution $\psi(t)$ of (5.15) has the form
$$P_{k_1-1}(t)e^{\lambda_1t} + \ldots + P_{k_m-1}(t)e^{\lambda_mt}, \tag{5.21}$$
where $\lambda_1, \ldots, \lambda_m$ are the pairwise distinct eigenvalues of $-A'$ with corresponding multiplicities $k_1, \ldots, k_m$, $k_1 + \ldots + k_m = n$, and $P_k(t)$ are polynomials of degree at most $k$. The switching function $\psi(t)'b$, as a linear combination of the coordinates of the solution $\psi(t)$, also has the form (5.21). Therefore, to prove the lemma it is sufficient to verify that the number of real roots of the function
$$\varphi_0(t) = P_{k_1-1}(t)e^{\lambda_1t} + \ldots + P_{k_m-1}(t)e^{\lambda_mt}$$
does not exceed $n-1$. Assume that this is not true and the function $\varphi_0(t)$ has no fewer than $n$ real roots. Then the function
$$\varphi_0(t)e^{-\lambda_1t} = P_{k_1-1}(t) + P_{k_2-1}(t)e^{(\lambda_2-\lambda_1)t} + \ldots + P_{k_m-1}(t)e^{(\lambda_m-\lambda_1)t}$$
also has no fewer than $n$ real roots. By Rolle's theorem, between two adjacent roots of a smooth function there is at least one root of its derivative. Successively differentiating the function $\varphi_0(t)e^{-\lambda_1t}$ and applying Rolle's theorem, we conclude that its $k_1$-th derivative is of the form
$$\varphi_1(t) = P_{k_2-1}(t)e^{(\lambda_2-\lambda_1)t} + \ldots + P_{k_m-1}(t)e^{(\lambda_m-\lambda_1)t}$$
(with new polynomials of the same degrees, for which we keep the same notation) and has no fewer than $n - k_1$ real roots. We take the function $\varphi_1(t)$ as the function $\varphi_0(t)$ and repeat the above arguments. After $m-2$ repetitions, we find that the function
$$\varphi_{m-1}(t) = P_{k_m-1}(t)e^{(\lambda_m-\lambda_{m-1})t}$$
has no fewer than $n - k_1 - \ldots - k_{m-1} = k_m$ real roots, although it actually has no more than $k_m - 1$. This contradiction shows that the function $\psi(t)'b$ has no more than $n-1$ real roots. Then, according to formula (5.19), the extreme control has no more than $n$ intervals of constancy, and the lemma is proven.
With the aid of Lemma 5.2 we can obtain a more general result.
Theorem 5.5 Suppose that the conditions of Theorem 5.4 hold for a stationary minimum time problem. Let, further, all eigenvalues of the matrix $A$ be real and the polyhedron $U = [-1, 1]^r$ be a cube of dimension $r$. If the optimal control exists, then each of its coordinate functions has no more than $n$ intervals of constancy.
Proof Assume that an optimal process $x(t)$, $u(t)$, $t_1$ of the stationary minimum time problem exists. Under the conditions of Theorem 5.4, for optimality of the control $u(t)$ it is necessary and sufficient that
$$\psi(t)'Bu(t) = \max_{u \in [-1, 1]^r} \psi(t)'Bu, \quad 0 \le t \le t_1, \tag{5.22}$$
where $\psi(t)$ is some nontrivial solution of the conjugate system (5.15). Denote the columns of the matrix $B$ as $b^1, \ldots, b^r$ and write equality (5.22) in the form
$$\sum_{k=1}^{r} \psi(t)'b^ku_k(t) = \max_{u \in [-1, 1]^r} \sum_{k=1}^{r} \psi(t)'b^ku_k = \sum_{k=1}^{r} \max_{u_k \in [-1, 1]} \psi(t)'b^ku_k.$$
From here we obtain
$$u_k(t) = \operatorname{sign} \psi(t)'b^k, \quad 0 \le t \le t_1, \quad k = 1, \ldots, r. \tag{5.23}$$
Therefore, each coordinate function $u_k(t)$ of the optimal control is extreme for the stationary system
$$\dot y = Ay + b^kv, \quad |v| \le 1. \tag{5.24}$$
The matrix $A$ of the coefficients of this system has real eigenvalues. Furthermore, from the condition of the general position (5.12) for vectors $w$ that are directed along the $k$-th coordinate axis and are parallel to the corresponding edge of the cube $U = [-1, 1]^r$, we obtain
$$\operatorname{rank}\bigl(b^k, Ab^k, \ldots, A^{n-1}b^k\bigr) = n.$$
Consequently, system (5.24) satisfies the condition of the general position for every $k = 1, \ldots, r$. According to Lemma 5.2, the control (5.23), which is extreme for the system (5.24), has no more than $n$ intervals of constancy. Thus, the theorem is proven.
Exercise Set
1. Determine whether the direction $c$ is the minimum point of the function $\pi(t_1, z)$ on the sphere $\|z\| = 1$ under the conditions of Theorem 5.2.
2. Establish the extremality of the optimal control in the two-point minimum time problem by using the theorem on the existence of a supporting plane to the reachability set at the point $x_1$ at the moment of performance.
3. Determine under what conditions $x_1 \in Q(t)$ for $t > t_1$ if the reachability set $Q(t)$ of a linear system "touches" the point $x_1$ for the first time at the moment $t_1 > t_0$.
4. Determine the conditions under which $x(t) = x_1$ for $t > t_1$ if the trajectory $x(t)$ reaches the point $x_1$ at the moment of performance $t = t_1$.
5. Suppose that in the minimum time problem the range of control $U$ is a ball with a positive radius centered at the origin. Determine whether it is possible to make $t_1$ the moment of performance by decreasing the radius of the ball if $\rho(t_1) > 0$ at some moment $t_1 > 0$.
6. Find a lower bound for the moment of performance in a stationary minimum time problem. Hint: estimate the norm of the solutions of the stationary system, using the inequality $\frac{d}{dt}\|x\| \le \|\dot x\|$.
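The bound of Lemma 5.2 and formula (5.19) can be illustrated numerically. The sketch below, assuming NumPy, takes a hypothetical third-order system in companion form with the real eigenvalues $-1, -2, -3$, an arbitrarily chosen terminal value $\psi(t_1) = c$, samples the switching function $\psi(t)'b$ on a grid and counts the sign changes of the extreme control. The matrix, the vectors $b$, $c$ and the horizon $t_1$ are illustrative assumptions.

```python
import numpy as np

# Hypothetical system: characteristic polynomial (s + 1)(s + 2)(s + 3).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-6.0, -11.0, -6.0]])
b = np.array([0.0, 0.0, 1.0])
c = np.array([1.0, -4.0, 2.5])    # arbitrary nonzero value of psi(t1), for illustration
t1 = 5.0

lam, V = np.linalg.eig(A.T)       # eigen-decomposition of A'
Vinv = np.linalg.inv(V)

def psi(t):
    """psi(t) = exp(A' (t1 - t)) c, the solution of psi' = -A' psi with psi(t1) = c."""
    return np.real((V * np.exp(lam * (t1 - t))) @ Vinv @ c)

ts = np.linspace(0.0, t1, 5001)
switching = np.array([psi(t) @ b for t in ts])   # switching function psi(t)' b
u = np.sign(switching)                            # extreme control (5.19)

sign_changes = int(np.sum(u[1:] * u[:-1] < 0))
intervals = sign_changes + 1
print(sign_changes, intervals)    # Lemma 5.2 predicts at most n - 1 = 2 switchings
```

Sampling on a grid can miss two switchings that fall between neighbouring nodes, so the count is a lower estimate; it cannot exceed the bound of the lemma unless rounding produces spurious sign changes near a tangential zero.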
Chapter 6
Synthesis of the Optimal System Performance
Abstract The concept of the synthesized control and synthesis for optimal performance system is introduced. The reverse motion method is described. Examples of applying this method for the synthesis of stationary second-order systems are given.
6.1 General Scheme to Apply the Maximum Principle

The purpose of this section is to show how the maximum principle is used to synthesize an optimal control in the minimum time problem. Consider the two-point minimum time problem formulated in Sect. 5.1. According to Theorem 5.3, an optimal process of the minimum time problem satisfies the conditions
$$\dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0, \quad x(t_1) = x_1, \tag{6.1}$$
$$\dot{\psi} = -A(t)'\psi, \tag{6.2}$$
$$u = \arg\max_{v \in U} \psi' B(t) v = v(\psi, t), \tag{6.3}$$
$$\psi'\dot{x}\,\big|_{t = t_1} \ge 0 \tag{6.4}$$
with a nontrivial function $\psi(t)$.

Let us find out how these conditions can be satisfied. Select an arbitrary vector $c \in R^n$ and find the solution $\psi(t, c)$ of the conjugate system of differential equations (6.2) with the initial condition $\psi(t_0) = c$. Substituting $\psi(t, c)$ into formula (6.3), we obtain the known function
$$u(t, c) = v(\psi(t, c), t).$$
Find the solution $x(t, c)$ of the Cauchy problem
$$\dot{x} = A(t)x + B(t)u(t, c), \quad x(t_0) = x_0.$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_6
We require that at some moment $t = t_1$ this solution satisfies the condition
$$x(t_1, c) = x_1.$$
Solving this system of $n$ equations for the $n$ unknown coordinates of the vector $c$, we obtain a solution $c(t_1)$. To determine the unknown $t_1$, we use condition (6.4) of the maximum principle
$$\psi(t_1, c(t_1))'\,\dot{x}(t_1, c(t_1)) \ge 0.$$
Suppose this inequality has a solution $\tilde{t}_1 > t_0$ with $\tilde{c} = c(\tilde{t}_1) \ne 0$. Then the functions
$$\tilde{x}(t) = x(t, \tilde{c}), \quad \tilde{u}(t) = u(t, \tilde{c}), \quad \tilde{\psi}(t) = \psi(t, \tilde{c}),$$
which satisfy the maximum principle on the segment $[t_0, \tilde{t}_1]$, become known. If the function $\tilde{u}(t)$ is piecewise continuous on this segment, then the constructed process $\tilde{x}(t), \tilde{u}(t), \tilde{t}_1$ meets all necessary conditions of optimality. The optimal process of the minimum time problem, if it exists, is among such processes.

The above scheme for constructing candidates for the optimal process is attractive for its clarity, and it contains enough conditions to determine such candidates. However, it is difficult to apply in practice, because it requires the analytical integration of systems of differential equations, the solution of extremal problems in parametric form, and the solution of systems of nonlinear equations. In this regard, numerical methods that take into account additional information about the initial value of the conjugate system (Sect. 5.3) seem more promising.

Referring the reader to [4, 15] for details, we dwell on a modification of the general scheme of applying the maximum principle that is known as the method of reverse motion. The basic idea of this method is to satisfy the maximum principle in reverse time. First, we construct the pieces of extreme trajectories that lead to the terminal point $x_1$. We then construct the pieces of extreme trajectories that lead to the already constructed trajectories, and so on. The process continues until we obtain a trajectory that starts from the initial point $x_0$.
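For comparison, the direct scheme described at the start of this section can be tried on the simplest stationary system $\dot{x}_1 = x_2$, $\dot{x}_2 = u$, $|u| \le 1$, with the terminal point $x_1 = 0$. The Python sketch below is only an illustration and not part of the original text: the function names, the step count, and the trial pair $c$, $t_1$ are assumptions made for the example. For this system, (6.2) gives $\psi_1(t) = c_1$, $\psi_2(t) = c_2 - c_1 t$, and (6.3) gives $u(t, c) = \operatorname{sign} \psi_2(t, c)$.

```python
def psi(c, t):
    """Solution of the conjugate system psi1' = 0, psi2' = -psi1 with psi(0) = c."""
    return c[0], c[1] - c[0] * t

def control(c, t):
    """Extreme control (6.3): maximizes psi2 * v over |v| <= 1 (ties broken as +1)."""
    return 1.0 if psi(c, t)[1] >= 0 else -1.0

def state(c, x0, t1, steps=2000):
    """Integrate x1' = x2, x2' = u(t, c) from x(0) = x0 up to time t1."""
    h = t1 / steps
    x1, x2 = x0
    for i in range(steps):
        u = control(c, i * h)            # control frozen on each small step
        x1 += x2 * h + 0.5 * u * h * h   # exact update for a constant control
        x2 += u * h
    return x1, x2

# Assumed data: x0 = (1, 0); the pair c = (-1, -1), t1 = 2 is a trial guess to be checked.
c, x0, t1 = (-1.0, -1.0), (1.0, 0.0), 2.0
x_end = state(c, x0, t1)                 # should be close to the terminal point (0, 0)
p_end, u_end = psi(c, t1), control(c, t1)
# Condition (6.4): psi(t1)' x_dot(t1) >= 0, where x_dot = (x2, u).
transversality = p_end[0] * x_end[1] + p_end[1] * u_end
print(x_end, transversality)
```

In a real application the pair $c$, $t_1$ is not known in advance and would have to be found by solving $x(t_1, c) = x_1$ numerically, which is exactly the difficulty noted above.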
We then demonstrate the method of reverse motion using a simple example.
6.2 Control of Acceleration of a Material Point

Consider the minimum time problem
$$t_1 \to \min, \quad \dot{x}_1 = x_2, \quad \dot{x}_2 = u, \quad |u| \le 1,$$
$$x_1(0) = x_{10}, \quad x_2(0) = x_{20}, \quad x_1(t_1) = 0, \quad x_2(t_1) = 0.$$
Fig. 6.1 Initial position $x_{10}$ and initial velocity $x_{20}$ of the material point
In physical terms, the differential equations describe the straight-line motion of a material point of unit mass, without friction or air resistance, under the action of a bounded force $u$. The variable $x_1 = x$ is the distance from the material point to the origin, and the variable $x_2 = \dot{x}$ is its speed (Fig. 6.1). The motion begins at the moment $t = 0$ from the initial position $x_{10}$ with the initial velocity $x_{20}$. The force must bring the material point to the origin with zero speed in the shortest possible time. The system of differential equations in this example is equivalent to the equation $\ddot{x} = u$, which explains the title of Sect. 6.2.

We now turn to the solution of the example. In the notation of the general stationary minimum time problem, here
$$n = 2, \ r = 1, \ t_0 = 0, \ U = [-1, 1], \ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \ B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \ x_0 = \begin{pmatrix} x_{10} \\ x_{20} \end{pmatrix}, \ x_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
From the characteristic equation
$$|A - \lambda E| = \begin{vmatrix} -\lambda & 1 \\ 0 & -\lambda \end{vmatrix} = \lambda^2 = 0$$
we find the real eigenvalue $\lambda = 0$ of multiplicity 2 of the matrix $A$. The condition of the general position
$$\operatorname{rank}(B, AB) = \operatorname{rank}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = 2$$
is satisfied. By Lemma 5.2, the extreme control takes the values $-1$ or $+1$ and has no more than two intervals of constancy. When $u = \pm 1$, the equations of motion of the phase point have the form $\dot{x}_1 = x_2$, $\dot{x}_2 = u$, or $\dfrac{dx_1}{dx_2} = u x_2$ (since $1/u = u$ for $u = \pm 1$). From here, we obtain
$$x_1 = -\frac{x_2^2}{2} + c_2 \ (u = -1), \quad x_1 = \frac{x_2^2}{2} + c_1 \ (u = +1), \tag{6.5}$$
where $c_1$, $c_2$ are the constants of integration. The directions of movement of the phase point are shown in Fig. 6.2. We distinguish the trajectories AO and OB that lead to the origin (Fig. 6.3). If the initial point $x_0$ lies on the arc AO or OB, then an extreme trajectory is found. If not, we select from the family (6.5) the trajectories that cross the arcs AO and OB (Fig. 6.4). As a result, we obtain a family of trajectories that lead to the origin and fill the phase plane entirely. Each trajectory satisfies the maximum principle. By Theorem 5.4, all of them are optimal (the interior point $u_1 = 0$ of the range of control satisfies the condition $Ax_1 + Bu_1 = 0$ of the theorem for $x_1 = 0$).

Fig. 6.2 The trajectories of movement of the phase point under constant controls (panels: u = -1, u = +1)
Fig. 6.3 Movements of the phase point in origin
Fig. 6.4 Movements of phase point to the switching curve AOB

The optimal movement of the phase point $x$ to the origin follows this rule. At the initial point $x_0$ we take the control
$$u(x_0) = -1, \text{ if } x_0 \text{ is above AOB or on AO}, \qquad u(x_0) = +1, \text{ if } x_0 \text{ is below AOB or on OB},$$
and keep this value until the phase point $x$ reaches the curve AOB or the origin, respectively. At the moment when $x$ reaches the curve AOB, we switch the control from $-1$ to $+1$ or vice versa and keep it until the phase point reaches the origin.

The curve AOB, on which the control is switched, is called a switching curve. It defines the optimal control unambiguously. Indeed, the control at the initial point $x_0$ is determined by the value $u(x_0)$. Replacing the point $x_0$ by $x$, we obtain the optimal control $u(x)$ defined at each point $x = (x_1, x_2)$ of the phase plane. The optimal trajectory is then found by solving the Cauchy problem
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = u(x), \quad x_1(0) = x_{10}, \quad x_2(0) = x_{20}.$$
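The synthesis of Sect. 6.2 can be checked numerically. The sketch below is an illustration under stated assumptions and not part of the original text: from (6.5), the arcs through the origin are taken to be $x_1 = -x_2^2/2$, $x_2 \ge 0$ (arc AO, $u = -1$) and $x_1 = x_2^2/2$, $x_2 \le 0$ (arc OB, $u = +1$), so that the switching curve is $x_1 = -x_2 |x_2|/2$. The function names are hypothetical.

```python
import math

def feedback(x1, x2):
    """Synthesized control u(x) for x1' = x2, x2' = u, |u| <= 1."""
    s = x1 + 0.5 * x2 * abs(x2)       # s = 0 on the switching curve AOB
    if s > 0:
        return -1.0                   # above AOB
    if s < 0:
        return 1.0                    # below AOB
    if x2 > 0:
        return -1.0                   # on arc AO
    if x2 < 0:
        return 1.0                    # on arc OB
    return 0.0                        # at the origin

def flow(x1, x2, u, t):
    """Exact state after time t under the constant control u."""
    return x1 + x2 * t + 0.5 * u * t * t, x2 + u * t

def optimal_times(x10, x20):
    """Switching moment and moment of performance of the two-arc trajectory."""
    u = feedback(x10, x20)
    if u == 0.0:
        return 0.0, 0.0
    root = math.sqrt(max(0.0, 0.5 * x20 * x20 - u * x10))
    t_switch = -u * x20 + root        # moment when the first arc meets AOB
    return t_switch, t_switch + root  # the second arc lasts |x2| at the switch

t_s, t_1 = optimal_times(1.0, 0.0)
u0 = feedback(1.0, 0.0)
xs = flow(1.0, 0.0, u0, t_s)              # state at the switching moment
xf = flow(xs[0], xs[1], -u0, t_1 - t_s)   # state at the moment of performance
print(t_s, t_1, xs, xf)
```

The switching moment comes from substituting the first arc into the equation of the opposite branch of AOB, which gives a quadratic equation in $t$. A start on the curve itself yields a second arc of zero length.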
6.3 Concept of Optimal Control Synthesis

Previously, we introduced a control as a function of time; such controls are referred to as programming controls. In Sects. 6.2 and 4.8, controls of a new type were constructed that are functions of the phase coordinates of a controlled system; these are referred to as synthesized or feedback controls. In general, a synthesized control is a function of the phase variable $x$ and the time variable $t$.

Depending on the type of controls used, we distinguish problems of programming control from problems of synthesis control (feedback control problems). In a problem of synthesis control we seek an optimal control as a function of the phase state and time, and in a problem of programming control as a function of time only. It is clear that the first problem is more general and includes the second as a particular case.

A synthesis control problem is more important for technical applications. With the help of a synthesized control, we can organize the movement of a controlled object on the basis of feedback or, in other words, in a closed loop: the action of the control $u$ on the object is produced by a regulator $u(x, t)$ according to the current state $x$ of the object and the current moment of time $t$. The pair "object of control – regulator" (as well as "mathematical model – synthesized control") forms a self-controlled system that operates independently of the initial position of the object (Fig. 6.5). The scheme of programming control in Fig. 6.6 is open. A programming control is rigidly tied to the initial position of the object: a control that is time-optimal for one initial position may not even be admissible for another.

Fig. 6.5 Feedback control scheme
Fig. 6.6 Programming control scheme
6.4 Examples of Synthesis of Optimal Systems Performance

Consider the stationary minimum time problem
$$t_1 \to \min, \quad \dot{x}_1 = x_2, \quad \dot{x}_2 = -a_2 x_1 - a_1 x_2 + u, \quad |u| \le 1, \tag{6.6}$$
$$x_1(0) = x_{10}, \quad x_2(0) = x_{20}, \quad x_1(t_1) = 0, \quad x_2(t_1) = 0.$$
Here
$$n = 2, \ r = 1, \ t_0 = 0, \ U = [-1, 1], \ A = \begin{pmatrix} 0 & 1 \\ -a_2 & -a_1 \end{pmatrix}, \ B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \ x_0 = \begin{pmatrix} x_{10} \\ x_{20} \end{pmatrix}, \ x_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
When $a_1 = a_2 = 0$, the solution of the problem is given in Sect. 6.2. We verify the condition of the general position (Sect. 5.5), setting the vector $w$, which is parallel to the one-dimensional face of the polyhedron $U$, equal to 1. Then
$$\operatorname{rank}(Bw, ABw) = \operatorname{rank}\begin{pmatrix} 0 & 1 \\ 1 & -a_1 \end{pmatrix} = 2.$$
Consequently, the condition of the general position holds for any $a_1$, $a_2$. The phase portrait (the picture of synthesis) depends largely on the eigenvalues of the matrix $A$, that is, on the roots of the characteristic equation
$$|A - \lambda E| = \begin{vmatrix} -\lambda & 1 \\ -a_2 & -a_1 - \lambda \end{vmatrix} = \lambda^2 + a_1 \lambda + a_2 = 0.$$
Let us consider two cases.
6.4.1 Eigenvalues of Matrix A Are Real and Distinct

When $a_1^2 - 4a_2 > 0$, the characteristic equation has two real and distinct roots $\lambda_1$, $\lambda_2$. For definiteness, let $\lambda_2 < \lambda_1$. To simplify the differential equations (6.6), we introduce new coordinates $y_1$, $y_2$ by
$$x_1 = y_1 + y_2, \quad x_2 = \lambda_1 y_1 + \lambda_2 y_2 \tag{6.7}$$
(the columns of the transition matrix are the eigenvectors of the matrix $A$ corresponding to $\lambda_1$, $\lambda_2$). In the new coordinates, the conditions of problem (6.6) take the form
$$t_1 \to \min, \quad \dot{y}_1 = \lambda_1 y_1 + bu, \quad \dot{y}_2 = \lambda_2 y_2 - bu, \quad |u| \le 1,$$
$$y_1(0) = y_{10}, \quad y_2(0) = y_{20}, \quad y_1(t_1) = 0, \quad y_2(t_1) = 0, \tag{6.8}$$
where $b = 1/(\lambda_1 - \lambda_2) > 0$, and $y_{10}$, $y_{20}$ are obtained from $x_{10}$, $x_{20}$ by inverting the transform (6.7). Note that the transition to the new coordinates changes neither the time nor the controls. Therefore, the solution of the original problem is obtained from the solution of (6.8) by formulas (6.7).

Problem (6.8) satisfies the conditions of Theorem 5.4 and Lemma 5.2. Hence, the optimal control takes the values $-1$, $+1$ and has no more than one switch. On the intervals of constancy of the control $u = \pm 1$, the differential equations (6.8) can be written as
$$\frac{d}{dt}\left(y_1 + \frac{bu}{\lambda_1}\right) = \lambda_1 \left(y_1 + \frac{bu}{\lambda_1}\right), \quad \frac{d}{dt}\left(y_2 - \frac{bu}{\lambda_2}\right) = \lambda_2 \left(y_2 - \frac{bu}{\lambda_2}\right). \tag{6.9}$$
Separating the variables in (6.9) and integrating, we have
$$y_1 = c_1 e^{\lambda_1 t} - \frac{bu}{\lambda_1}, \quad y_2 = c_2 e^{\lambda_2 t} + \frac{bu}{\lambda_2},$$
where $c_1$, $c_2$ are arbitrary constants. For $u = -1$ and $u = +1$, we obtain two families of solutions
$$y_1 = c_1 e^{\lambda_1 t} + \frac{b}{\lambda_1}, \quad y_2 = c_2 e^{\lambda_2 t} - \frac{b}{\lambda_2} \quad (u = -1);$$
$$y_1 = c_1 e^{\lambda_1 t} - \frac{b}{\lambda_1}, \quad y_2 = c_2 e^{\lambda_2 t} + \frac{b}{\lambda_2} \quad (u = +1). \tag{6.10}$$
Case $\lambda_2 < \lambda_1 < 0$. Figure 6.7 shows the location of the solutions (6.10) for different values of the arbitrary constants. Each family of solutions has a stable node
$$\Phi_{-1} = \left(\frac{b}{\lambda_1}, -\frac{b}{\lambda_2}\right) \ (u = -1), \quad \Phi_{+1} = \left(-\frac{b}{\lambda_1}, \frac{b}{\lambda_2}\right) \ (u = +1).$$
Following the method of reverse motion, we choose from the family (6.10) the trajectories AO and OB leading to the origin (Fig. 6.8). We then find the trajectories that lead to the curves AO and OB (Fig. 6.9). The constructed trajectories fill the phase plane entirely, and for each initial point $y_0$ there is a trajectory that is composed of trajectories (6.10) and leads to the origin. As noted above, such a trajectory is optimal, and the optimal synthesized control has the form
$$u(y) = -1, \text{ if } y \text{ is above AOB or on OB}, \qquad u(y) = +1, \text{ if } y \text{ is below AOB or on AO},$$
and it is fully defined by the switching curve AOB. If necessary, formulas (6.10) can be used to write the equations of the switching curve in the coordinates $y_1$, $y_2$, and the curve is then returned to the original coordinates $x_1$, $x_2$ by the transform (6.7).

Fig. 6.7 Family of trajectories (6.10) for $\lambda_2 < \lambda_1 < 0$ and constant controls u = -1 (up) and u = +1 (down)
Fig. 6.8 Movement of the phase point by trajectories AO, OB to origin under constant controls u = +1, u = -1 respectively
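The last remark can be made concrete. The sketch below is an illustration and not part of the original text: it parametrizes the arcs of the switching curve by the reverse time $s \ge 0$, using the members of the family (6.10) that pass through the origin at $t = 0$, and maps them to $x_1$, $x_2$ by (6.7). The function names and the values $a_1 = 3$, $a_2 = 2$ are assumptions chosen for the example.

```python
import math

def to_canonical(a1, a2):
    """Eigenvalues lam2 < lam1 of A and b = 1/(lam1 - lam2) when a1^2 - 4 a2 > 0."""
    disc = a1 * a1 - 4.0 * a2
    if disc <= 0:
        raise ValueError("eigenvalues are not real and distinct")
    root = math.sqrt(disc)
    lam1, lam2 = (-a1 + root) / 2.0, (-a1 - root) / 2.0
    return lam1, lam2, 1.0 / (lam1 - lam2)

def arc_into_origin(lam1, lam2, b, u, s):
    """Point y(-s), s >= 0, of the trajectory (6.10) reaching the origin at t = 0."""
    y1 = b * u / lam1 * (math.exp(-lam1 * s) - 1.0)
    y2 = -b * u / lam2 * (math.exp(-lam2 * s) - 1.0)
    return y1, y2

def to_original(lam1, lam2, y1, y2):
    """Transform (6.7) from y1, y2 to x1, x2."""
    return y1 + y2, lam1 * y1 + lam2 * y2

def flow_y(lam1, lam2, b, u, y1, y2, t):
    """Exact solution of (6.8) after time t under the constant control u."""
    e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
    return (e1 * (y1 + b * u / lam1) - b * u / lam1,
            e2 * (y2 - b * u / lam2) + b * u / lam2)

lam1, lam2, b = to_canonical(3.0, 2.0)           # lam1 = -1, lam2 = -2, b = 1
y_a = arc_into_origin(lam1, lam2, b, 1.0, 0.5)   # a point of arc AO (u = +1)
back = flow_y(lam1, lam2, b, 1.0, *y_a, 0.5)     # forward motion returns to the origin
print(y_a, to_original(lam1, lam2, *y_a), back)
```

Sweeping $s$ over $[0, \infty)$ with $u = +1$ and $u = -1$ traces the arcs AO and OB, respectively.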
Fig. 6.9 Phase portrait of the optimal system performance in the case $\lambda_2 < \lambda_1 < 0$
Fig. 6.10 Family of trajectories (6.10) for $\lambda_2 < 0 < \lambda_1$ and constant controls u = -1 (up) and u = +1 (down)
Case $\lambda_2 < 0 < \lambda_1$. The basic steps of constructing the optimal trajectories are shown in Figs. 6.10, 6.11, and 6.12. In this case, the synthesized control is defined in the open strip $|y_1| < b/\lambda_1$, and the curve AOB is the switching curve. If the initial point lies outside this strip, then the system cannot be steered into the origin. Indeed, let us fix an arbitrary process $y_1(t)$, $y_2(t)$, $u(t)$ that satisfies the differential equations and the initial conditions of problem (6.8). Then the function $z(t) = \lambda_1 y_1(t) - b$ is the solution of the Cauchy problem
$$\dot{z} = \lambda_1 z + \lambda_1 b\,[1 + u(t)], \quad z(0) = z_0 = \lambda_1 y_{10} - b$$
and, consequently, has the form
$$z(t) = e^{\lambda_1 t} z_0 + \lambda_1 b \int_0^t e^{\lambda_1 (t - \tau)} [1 + u(\tau)]\, d\tau.$$
If $z_0 \ge 0$, it follows that $z(t) \ge 0$ for $t \ge 0$, because $1 + u(\tau) \ge 0$. In other words, if the initial point $y_0$ lies to the right of the strip $|y_1| < b/\lambda_1$, then every trajectory $y(t) = (y_1(t), y_2(t))$ starting from $y_0$ stays to the right of this strip. For $y_{10} \le -b/\lambda_1$ the argument is similar.

Fig. 6.11 Movement of the phase point by trajectories AO, OB to the origin under constant controls u = +1, u = -1 respectively
Fig. 6.12 Phase portrait of the optimal system performance in the case $\lambda_2 < 0 < \lambda_1$

Case $0 < \lambda_2 < \lambda_1$ is investigated in the same way as the previous cases. The location of the optimal trajectories is shown in Fig. 6.13. The synthesis of the optimal control is possible only within a bounded region of the phase plane containing the origin.

Fig. 6.13 The phase portrait of the optimal system in the case $0 < \lambda_2 < \lambda_1$
6.4.2 The Eigenvalues of Matrix A Are Complex

Finally, we consider the case where the differential equations (6.6) have the form
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1 + u \tag{6.11}$$
($a_1 = 0$, $a_2 = \omega^2$, $\omega > 0$). Then the matrix of coefficients
$$A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}$$
of the system (6.11) has the complex eigenvalues $\lambda_1 = i\omega$, $\lambda_2 = -i\omega$. We use the maximum principle to synthesize the optimal control (the conditions of Lemma 5.2 are not satisfied). We write the conjugate system of differential equations $\dot{\psi} = -A'\psi$ or, in coordinates,
$$\dot{\psi}_1 = \omega^2 \psi_2, \quad \dot{\psi}_2 = -\psi_1.$$
The general solution of the conjugate system, which depends on the arbitrary constants $r$, $\varphi$, has the form
$$\psi_1(t) = -r\omega \cos(\omega t + \varphi), \quad \psi_2(t) = r \sin(\omega t + \varphi). \tag{6.12}$$
The point $\psi(t)$ with coordinates (6.12) rotates clockwise at the constant angular velocity $\omega$, describing the ellipse $(\psi_1 / r\omega)^2 + (\psi_2 / r)^2 = 1$. From the condition of the maximum principle
$$\psi(t)' B u = (\psi_1(t), \psi_2(t)) \begin{pmatrix} 0 \\ 1 \end{pmatrix} u = \psi_2(t)\, u \to \max, \quad |u| \le 1,$$
we find the extreme control
$$u(t) = \operatorname{sign} \psi_2(t) = \operatorname{sign}\, r \sin(\omega t + \varphi). \tag{6.13}$$
Formula (6.13) shows that an extreme control is a piecewise constant function with the values $-1$, $+1$, and that the length of each interval of constancy of $u(t)$ does not exceed $\pi/\omega$. The solutions of the differential equations (6.11) on the intervals of constancy of the control $u(t) = u$ have the form
$$x_1(t) = r_1 \omega^{-2} \cos(\omega t + \varphi_1) + \omega^{-2} u, \quad x_2(t) = -r_1 \omega^{-1} \sin(\omega t + \varphi_1), \tag{6.14}$$
where $r_1$, $\varphi_1$ are the constants of integration. Therefore, the phase point rotates uniformly clockwise, describing an ellipse centered at the point $(u/\omega^2, 0)$ (Fig. 6.14), and it completes one rotation in the time $2\pi/\omega$.

Fig. 6.14 Movement of the phase point under constant controls (u = -1, u = +1)
Fig. 6.15 Movement of phase point into origin for the time $\pi/\omega$

We construct the trajectories $A_1O$ and $OB_1$ leading to the origin. Since the maximum length of an interval of constancy of the control (6.13) is $\pi/\omega$, the phase point (6.14) runs through half of an ellipse during this time (Fig. 6.15). The parts of the trajectories (6.14) that terminate on the curve $A_1OB_1$ are constructed in the same manner, and their initial points form the semi-ellipses $A_2A_1$ and $B_1B_2$ (Fig. 6.16). The construction of curvilinear quadrangles filled by parts of trajectories (6.14) continues without limit until they fill the entire phase plane. As a result, the switching curve $\dots A_2A_1OB_1B_2 \dots$ is determined (Fig. 6.17). It divides the phase plane into two parts and is formed by the semi-ellipses
$$\omega^4 \left(x_1 \pm \frac{k}{\omega^2}\right)^2 + \omega^2 x_2^2 = 1, \quad k = 1, 3, 5, \dots$$

Fig. 6.16 Parts of the extreme trajectories that terminate on curve $A_1OB_1$ and form two curvilinear quadrangles
Fig. 6.17 Switching curve for the control with complex eigenvalues of matrix A

The conditions of Theorem 5.4 are satisfied in this case, and therefore the maximum principle is sufficient for the optimality of each constructed trajectory. The optimal synthesized control $u(x)$ at each point $x$ of the phase plane is equal to
$$u(x) = -1, \text{ if } x \text{ is above } \dots A_2A_1OB_1B_2 \dots \text{ or on } A_1O,$$
$$u(x) = +1, \text{ if } x \text{ is below } \dots A_2A_1OB_1B_2 \dots \text{ or on } OB_1.$$
The farther the initial point $x_0$ is from the origin, the more switches the optimal control has (Fig. 6.18). The phase point $x$ moves to the origin along alternating semi-ellipses centered at the points $(\omega^{-2}, 0)$, $(-\omega^{-2}, 0)$, and the transition from one semi-ellipse to another occurs on the switching curve.

Fig. 6.18 Number of switches of control increases with the distance from the starting point to the origin

Exercise Set
1. Determine a synthesized control in problem (6.6) when the terminal point $x_1 = (0, 0)$ is replaced by an arbitrary point of the phase plane.
2. Find the time of performance in Exercise 1 as a function of the initial point. Is this function always continuous?
3. Determine a synthesized control in the two-point minimum time problem if
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \ B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \ x_0 = \begin{pmatrix} x_{10} \\ x_{20} \end{pmatrix}, \ x_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \ t_0 = 0, \ U = [-1, 1] \times [-1, 1].$$
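Returning to Sect. 6.4.2, the geometry of the switching curve can be checked numerically. The sketch below is an illustration and not part of the original text. It assumes that every semi-ellipse of the switching curve is a translate of the arcs $A_1O$ and $OB_1$ obtained from (6.14), and that on the arcs beyond $A_1O$ and $OB_1$ the control keeps the sign of the corresponding side. The function names and the value $\omega = 2$ are chosen only for the example.

```python
import math

def flow(x1, x2, u, t, w):
    """Exact solution of x1' = x2, x2' = -w^2 x1 + u for constant u (cf. (6.14))."""
    z1 = x1 - u / w**2                  # shift the ellipse center to the origin
    c, s = math.cos(w * t), math.sin(w * t)
    return z1 * c + x2 / w * s + u / w**2, -w * z1 * s + x2 * c

def curve(x1, w):
    """Ordinate x2 of the switching curve ...A2 A1 O B1 B2... at abscissa x1."""
    if x1 < 0:
        return -curve(-x1, w)           # the curve is symmetric about the origin
    k = 2 * math.floor(w**2 * x1 / 2) + 1    # odd index of the semi-ellipse
    return -math.sqrt(max(0.0, 1.0 - (w**2 * x1 - k) ** 2)) / w

def feedback(x1, x2, w):
    """Synthesized control: -1 above the switching curve, +1 below it."""
    d = x2 - curve(x1, w)
    if d != 0:
        return -1.0 if d > 0 else 1.0
    if x1 == 0:
        return 0.0                      # origin: the target is reached
    return -1.0 if x1 < 0 else 1.0      # on the curve (assumed rule, see lead-in)

w = 2.0
b1 = (2 / w**2, 0.0)                                  # the point B1
mid = flow(b1[0], b1[1], 1.0, math.pi / (2 * w), w)   # quarter turn under u = +1
end = flow(b1[0], b1[1], 1.0, math.pi / w, w)         # half turn, time pi / w
print(mid, curve(mid[0], w), end)
```

The quarter-turn point lies on the arc $OB_1$ of the switching curve, and the half turn of duration $\pi/\omega$ brings $B_1$ to the origin, in agreement with Fig. 6.15.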
Chapter 7
The Observability Problem
Abstract The problem of observability – the possibility to define and calculate the position of a controlled object at a given time from observable data – is explored. We establish the criteria of observability for homogeneous, non-homogeneous, and stationary observability systems. The relationship between observability and controllability is shown.
7.1 Statement of the Problem

In automatic control theory, problems of observability arise in connection with the implementation of synthesized controls. To determine the value of a synthesized control $u(x, t)$ at some moment of time, it is necessary to know the state of the controlled object at the same moment. In real systems of automatic control, the state vectors are generally inaccessible to direct measurement and can only be evaluated by measuring other variables related to these vectors over some preceding time interval. The observability problem consists of investigating whether the state of the controlled object at a given time can be restored from the available information and of providing a way to recover it.

We consider the problem in the following setting. We are given a mathematical model of the controlled object
$$\dot{x} = A(t)x + B(t)u(t) \tag{7.1}$$
and the results of the observation
$$y(t) = C(t)x(t), \quad t_0 \le t < t_1 \tag{7.2}$$
of the unknown trajectory $x(t)$ of its movement. Here $A(t)$, $B(t)$, $C(t)$ are matrices of sizes $n \times n$, $n \times r$, $m \times n$, respectively, continuous on $R$, $m < n$, and $u(t) = u(x(t), t)$, $t_0 \le t < t_1$, is a known program realization of the synthesized control $u(x, t)$ along the unknown trajectory $x(t)$, $t_0 \le t < t_1$. It is necessary to determine whether the vector $x(t_1) = x_1$ can be recovered from the known data $A(t)$, $B(t)$, $C(t)$, $u(t)$, $y(t)$, $t_0$, $t_1$ and, if so, to specify a method for determining $x_1$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_7

A triple $x(t)$, $y(t)$, $x_1$ that satisfies conditions (7.1), (7.2) of the observability problem and the additional condition $x(t_1) = x_1$ is said to be a process. In the observability problem, each process is uniquely defined by the vector $x_1$.

In applications, relation (7.2) is treated as a description of an observing or measuring device that is able to record linear combinations of the phase variables during some time interval. The values of the state variables are regarded as unknown and can be found if we complement the system of differential equations (7.1) with a condition $x(t_1) = x_1$ such that the solution of the corresponding Cauchy problem satisfies the identity (7.2). If the observability problem (7.1), (7.2) is solved and the unknown vector $x_1$ is found, then the value of the synthesized control $u(x, t)$ for the position $x = x_1$, $t = t_1$ of the controlled object can be determined. As $t_1$ changes continuously, we thereby solve the problem of the program implementation $u(t_1) = u(x(t_1), t_1)$ of the synthesized control.

Let us start with the simpler homogeneous observability problem
$$\dot{x} = A(t)x, \tag{7.3}$$
$$y(t) = C(t)x(t), \quad t_0 \le t < t_1, \tag{7.4}$$
which corresponds to $u(t) \equiv 0$. The concept of a process carries over to the homogeneous observability problem with natural changes.
7.2 Criterion of Observability

Following Kalman [9], we say that the homogeneous system (7.3), (7.4) is observable in a direction $q$ ($q$-observable) if there exists a continuous vector function $z(t) : [t_0, t_1] \to R^m$ such that for any process $x(t)$, $y(t)$, $x_1$ of the homogeneous problem the equality
$$x_1' q = \int_{t_0}^{t_1} y(t)' z(t)\, dt \tag{7.5}$$
holds. In other words, in a $q$-observable system the projection of each vector $x_1$ on the direction $q$ can be restored from the observations of the corresponding process by one and the same linear operation.

We proceed to derive the observability criterion. Let the system (7.3), (7.4) be observable in some direction $q$, so that there exists a vector function $z(t)$ satisfying (7.5). We pick an arbitrary fixed vector $x_1 \in R^n$ and, using the Cauchy formula, write the solution $x(t) = F(t, t_1)x_1$ of the differential equations (7.3) with the condition $x(t_1) = x_1$. The corresponding observations (7.4) then have the form
$$y(t) = C(t)F(t, t_1)x_1.$$
Substituting $y(t)$ into (7.5), after obvious transformations we have
$$x_1' \left( q - \int_{t_0}^{t_1} F(t, t_1)' C(t)' z(t)\, dt \right) = 0.$$
Since $x_1$ is arbitrary, we get
$$q = \int_{t_0}^{t_1} F(t, t_1)' C(t)' z(t)\, dt. \tag{7.6}$$
Therefore, the $q$-observability of the homogeneous system requires the solvability of the system of integral equations (7.6) with respect to the unknown function $z(t)$. The converse is also true: the solvability of the system (7.6) with respect to $z(t)$ is sufficient for the $q$-observability of the homogeneous system, which is easily verified by carrying out the computations in reverse order. In accordance with Sect. 4.3, the criterion of solvability of the system of integral equations (7.6) is the solvability of the system of linear algebraic equations
$$W_1(t_0, t_1)\, p = q, \tag{7.7}$$
where
$$W_1(t_0, t_1) = \int_{t_0}^{t_1} F(t, t_1)' C(t)' C(t) F(t, t_1)\, dt \tag{7.8}$$
is the matrix of coefficients. (In one direction this is immediate: if $p$ solves (7.7), then $z(t) = C(t)F(t, t_1)p$ solves (7.6).) As a result, we obtain the following conclusion.

Theorem 7.1 (criterion of observability) The homogeneous system (7.3), (7.4) is $q$-observable if and only if the system of linear algebraic equations (7.7) with the matrix of coefficients (7.8) has a solution.
7.3 Observability in Homogeneous System

Suppose that the homogeneous system (7.3), (7.4) is completely observable, that is, observable in all directions $q$. Then, by Theorem 7.1, the system of linear algebraic equations (7.7) is solvable for any direction $q$. Consequently, the matrix (7.8) of the system has an inverse $W_1^{-1}(t_0, t_1)$, and by the definition of the inverse matrix
$$W_1^{-1}(t_0, t_1)\, W_1(t_0, t_1) = E.$$
We multiply this identity on the right by the unknown vector $x_1$. In view of formula (7.8), we obtain
$$x_1 = W_1^{-1}(t_0, t_1) \int_{t_0}^{t_1} F(t, t_1)' C(t)' C(t) F(t, t_1) x_1\, dt.$$
Under the integral sign we recognize the process
$$x(t) = F(t, t_1)x_1, \quad y(t) = C(t)F(t, t_1)x_1, \quad x_1$$
of the homogeneous problem and rewrite the above formula as
$$x_1 = W_1^{-1}(t_0, t_1) \int_{t_0}^{t_1} F(t, t_1)' C(t)' y(t)\, dt.$$
Thus the vector $x_1$ is expressed in terms of known quantities. To determine $x_1$, we perform the following operations:
1. calculate the matrix $W_1(t_0, t_1)$,
2. determine the vector $h = \int_{t_0}^{t_1} F(t, t_1)' C(t)' y(t)\, dt$,
3. solve the system of linear algebraic equations
$$W_1(t_0, t_1)\, x_1 = h. \tag{7.9}$$
The computation of the coefficients of system (7.9) can be reduced to solving the Cauchy problems
$$\dot{V} = -A(t)' V - V A(t) + C(t)' C(t), \quad V(t_0) = 0,$$
$$\dot{\psi} = -A(t)' \psi + C(t)' y(t), \quad \psi(t_0) = 0$$
for the matrix and vector functions
$$V(t) = W_1(t_0, t), \quad \psi(t) = \int_{t_0}^{t} F(\tau, t)' C(\tau)' y(\tau)\, d\tau.$$
Then, in equations (7.9),
$$W_1(t_0, t_1) = V(t_1), \quad h = \psi(t_1).$$
7.4 Observability in Nonhomogeneous System

We return to the nonhomogeneous observability problem (7.1), (7.2) with $u(t) \not\equiv 0$. Let $x(t)$, $y(t)$, $x_1$ be an arbitrary process and $\bar{x}(t)$, $\bar{y}(t)$, $0$ be the basic process of the nonhomogeneous problem, that is,
$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(t_1) = x_1, \quad y(t) = C(t)x(t), \quad t_0 \le t < t_1;$$
$$\dot{\bar{x}}(t) = A(t)\bar{x}(t) + B(t)u(t), \quad \bar{x}(t_1) = 0, \quad \bar{y}(t) = C(t)\bar{x}(t), \quad t_0 \le t < t_1.$$
Put
$$x(t) = \bar{x}(t) + \Delta x(t), \quad y(t) = \bar{y}(t) + \Delta y(t), \quad x_1 = 0 + \Delta x_1.$$
Then the increments $\Delta x(t)$, $\Delta y(t)$, $\Delta x_1$ form a process of the homogeneous observability problem
$$\frac{d}{dt}(\Delta x) = A(t)\Delta x, \quad \Delta y(t) = C(t)\Delta x(t), \quad t_0 \le t < t_1, \tag{7.10}$$
which is analogous to (7.3), (7.4). Therefore, all results of Sect. 7.3 are applicable in this case; that is, the criterion of $q$-observability and the procedure for recovering the vector $\Delta x_1 = x_1$ remain valid for the nonhomogeneous observability system. For the observability problem (7.10), the system of algebraic equations (7.9) and the Cauchy problems for calculating its coefficients take the form
$$V(t_1)\Delta x_1 = \psi(t_1), \tag{7.11}$$
$$\dot{V} = -A(t)' V - V A(t) + C(t)' C(t), \quad V(t_0) = 0, \qquad \dot{\psi} = -A(t)' \psi + C(t)' \Delta y(t), \quad \psi(t_0) = 0. \tag{7.12}$$
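Example 7.1 Consider the nonhomogeneous observability problem
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = 2, \quad t^2 + 2t = x_1(t) + x_2(t), \quad 0 \le t < t_1.$$
Here
$$n = 2, \ r = 1, \ m = 1, \ u(t) = 2, \ y(t) = t^2 + 2t, \ t_0 = 0, \ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \ B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \ C = (1, 1).$$
The basic process $\bar{x}(t)$, $\bar{y}(t)$, $0$ is defined by the conditions
$$\dot{\bar{x}}_1 = \bar{x}_2, \quad \dot{\bar{x}}_2 = 2, \quad \bar{x}_1(t_1) = 0, \quad \bar{x}_2(t_1) = 0, \quad \bar{y}(t) = \bar{x}_1(t) + \bar{x}_2(t), \quad 0 \le t < t_1.$$
From here, we get
$$\bar{x}_1(t) = (t - t_1)^2, \quad \bar{x}_2(t) = 2(t - t_1), \quad \bar{y}(t) = 2(t - t_1) + (t - t_1)^2.$$
Consequently,
$$\Delta y(t) = y(t) - \bar{y}(t) = t^2 + 2t - 2(t - t_1) - (t - t_1)^2.$$
We find the solutions of the Cauchy problems (7.12):
$$V(t) = t \begin{pmatrix} 1 & -\dfrac{t}{2} + 1 \\ -\dfrac{t}{2} + 1 & \dfrac{t^2}{3} - t + 1 \end{pmatrix}, \quad \psi(t) = t_1 \begin{pmatrix} t^2 + (2 - t_1)t \\ -\dfrac{t^3}{3} + \dfrac{t_1 t^2}{2} + (2 - t_1)t \end{pmatrix}.$$
After division by $t_1$, the system of linear algebraic equations (7.11) takes the form
$$\Delta x_1 - \left(\frac{t_1}{2} - 1\right)\Delta x_2 = 2t_1,$$
$$-\left(\frac{t_1}{2} - 1\right)\Delta x_1 + \left(\frac{t_1^2}{3} - t_1 + 1\right)\Delta x_2 = \frac{t_1^3}{6} - t_1^2 + 2t_1$$
and has the solution
$$\Delta x_1 = x_1(t_1) = t_1^2, \quad \Delta x_2 = x_2(t_1) = 2t_1.$$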
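The computations of Example 7.1 can be reproduced numerically. The sketch below is illustrative and not part of the original text. It assumes the closed-form fundamental matrix $F(t, t_1) = \begin{pmatrix} 1 & t - t_1 \\ 0 & 1 \end{pmatrix}$ for the matrix $A$ of the example, so that $F(t, t_1)'C' = (1, \ t - t_1 + 1)'$, and it evaluates the integrals of Sect. 7.3 with a hand-written Simpson rule instead of solving (7.12). All function names are hypothetical.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule for a scalar function on [a, b] (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def recover_increment(dy, t1, t0=0.0):
    """Solve W1(t0, t1) dx = h for the system of Example 7.1."""
    g = lambda t: (1.0, t - t1 + 1.0)             # column F(t, t1)' C'
    W = [[simpson(lambda t: g(t)[i] * g(t)[j], t0, t1) for j in range(2)]
         for i in range(2)]                       # matrix (7.8)
    h = [simpson(lambda t: g(t)[i] * dy(t), t0, t1) for i in range(2)]
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]   # nonzero for this example
    return ((h[0] * W[1][1] - W[0][1] * h[1]) / det,
            (W[0][0] * h[1] - W[1][0] * h[0]) / det)

t1 = 1.5
y = lambda t: t**2 + 2 * t                        # observed output
y_bar = lambda t: 2 * (t - t1) + (t - t1) ** 2    # output of the basic process
dx = recover_increment(lambda t: y(t) - y_bar(t), t1)
print(dx)   # to be compared with (t1**2, 2 * t1)
```

For systems without a closed-form fundamental matrix, the Cauchy problems (7.12) would be integrated numerically instead.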
7.5 Observability of an Initial State

The method given above can be used to find the unknown initial state $x(t_0) = x_0$ of the system of linear differential equations
$$\dot{x} = A(t)x \tag{7.13}$$
by using the results of observations
$$y(t) = C(t)x(t), \quad t_0 < t \le t_1 \tag{7.14}$$
of its unknown trajectory $x(t)$. A triple $x(t)$, $y(t)$, $x_0$ satisfying (7.13), (7.14) and the condition $x(t_0) = x_0$ is said to be a process. We refer to the system (7.13), (7.14) as observable in a direction $q$ ($q$-observable) if there exists a continuous vector function $z(t) : [t_0, t_1] \to R^m$ such that for any process $x(t)$, $y(t)$, $x_0$ the equality
$$x_0' q = \int_{t_0}^{t_1} y(t)' z(t)\, dt$$
holds. Repeating the arguments of Sect. 7.2, we arrive at the following conclusions.
1. For the $q$-observability of the system (7.13), (7.14) by the initial states, it is necessary and sufficient that the system of linear algebraic equations
$$W_2(t_0, t_1)\, p = q$$
with the matrix of coefficients
$$W_2(t_0, t_1) = \int_{t_0}^{t_1} F(t, t_0)' C(t)' C(t) F(t, t_0)\, dt$$
be solvable.
2. The observability of the system (7.13), (7.14) in all directions (total observability) is equivalent to the condition $\operatorname{rank} W_2(t_0, t_1) = n$.
3. If the system (7.13), (7.14) is totally observable, then the desired initial state $x_0$ is found from the system of linear algebraic equations
$$W_2(t_0, t_1)\, x_0 = \int_{t_0}^{t_1} F(t, t_0)' C(t)' y(t)\, dt.$$
Everything said earlier about calculating the coefficients of such a system applies in this case with natural changes.
7.6 Relation Between Controllability and Observability

We determine the relation between the total observability of the system
$$\dot{x} = -A(t)' x, \quad y(t) = B(t)' x(t), \quad t_0 < t \le t_1 \tag{7.15}$$
by the initial states and the total controllability of the linear system
$$\dot{x} = A(t)x + B(t)u, \quad u \in R^r \tag{7.16}$$
on the segment $[t_0, t_1]$. By definition, the fundamental matrix $F(t, \tau)$ of the linear system (7.16) is the solution of the Cauchy problem
$$F_\tau(t, \tau) = -F(t, \tau)A(\tau), \quad F(t, t) = E.$$
Transposing these matrix relations and interchanging $t$ and $\tau$, we obtain
$$F_t(\tau, t)' = -A(t)' F(\tau, t)', \quad F(\tau, \tau)' = E.$$
We see from here that $F(\tau, t)'$ is a fundamental matrix of the system of differential equations (7.15). By Sect. 7.5, the condition $\operatorname{rank} W_2(t_0, t_1) = n$ with the matrix
$$W_2(t_0, t_1) = \int_{t_0}^{t_1} F(t_0, t) B(t) B(t)' F(t_0, t)'\, dt \tag{7.17}$$
is the criterion for the total observability of the system (7.15) by the initial states. The rank of the matrix (7.17) does not change if we multiply it by the non-singular matrices $F(t_1, t_0)$ and $F(t_1, t_0)'$ on the left and on the right, respectively. Due to the properties of the fundamental matrix and notation (4.13), we obtain
$$F(t_1, t_0)\, W_2(t_0, t_1)\, F(t_1, t_0)' = \int_{t_0}^{t_1} [F(t_1, t_0) F(t_0, t)]\, B(t) B(t)'\, [F(t_1, t_0) F(t_0, t)]'\, dt = \int_{t_0}^{t_1} F(t_1, t) B(t) B(t)' F(t_1, t)'\, dt = W(t_0, t_1).$$
Then the criterion for the total observability of the system (7.15) can be written in the equivalent form $\operatorname{rank} W(t_0, t_1) = n$. By Lemma 4.2, it coincides with the criterion of the total controllability of the system (7.16) on the segment $[t_0, t_1]$. We summarize the result.

Theorem 7.2 (duality) For the total observability of the system (7.15) by the initial states, it is necessary and sufficient that the system (7.16) be totally controllable on the segment $[t_0, t_1]$.
7.7 Total Observability of a Stationary System

We apply the duality theorem to the stationary observability system
$$\dot{x} = Ax, \quad y(t) = Cx(t), \quad t_0 < t \le t_1 \tag{7.18}$$
with constant matrices $A$, $C$. According to Theorem 7.2, the total observability of the system (7.18) by the initial states is equivalent to the total controllability of the system
$$\dot{x} = -A'x - C'u, \quad u \in R^m \tag{7.19}$$
on the segment $[t_0, t_1]$. By Theorem 4.4, the criterion for the total controllability of the system (7.19) has the form
$$\operatorname{rank}\left(C', A'C', \dots, (A')^{n-1} C'\right) = n \tag{7.20}$$
(here some columns of the matrix have been multiplied by $-1$, which does not change its rank). Condition (7.20) is an algebraic criterion for the total observability of the system (7.18) by the initial states.

Exercise Set
1. Apply the least squares method to solve the homogeneous observability problem (7.3), (7.4). Hint: Use the squared deviation
$$J(x_1) = \int_{t_0}^{t_1} \| y(t) - C(t)F(t, t_1)x_1 \|^2\, dt$$
of the observations $y(t)$ from the processes $F(t, t_1)x_1$, $C(t)F(t, t_1)x_1$, $x_1$ of this problem.
2. Describe all the linearly independent directions in which the homogeneous system (7.3), (7.4) is observable and not observable.
3. Using the duality between observability and controllability, determine the geometric properties of a system that is not totally observable.
4. Given a matrix $A$, how can a matrix $C$ with a minimal number of rows be constructed so that the stationary system becomes totally observable?
Chapter 8
Identification Problem
Abstract We introduce the identification problem: whether the unknown parameters of a controlled object can be determined and calculated from observed data. We establish criteria for identifiability and demonstrate their application to the restoration of unknown parameters.
8.1
Statement of the Problem
We are given a mathematical model
$$\dot{x} = A(t)x + B(t)w, \quad x(t_0) = 0 \tag{8.1}$$
of the movement of an object and the results $y(t)$ of the observation of its trajectory $x(t)$:
$$y(t) = C(t)x(t), \quad t_0 \le t \le t_1. \tag{8.2}$$
Here $A(t)$, $B(t)$, $C(t)$ are matrices continuous on $R$ of sizes $n \times n$, $n \times r$, $m \times n$ respectively, $m < n$, $y(t)$ is a vector function of dimension $m$ continuous on $[t_0, t_1]$, and $w \in R^r$ is an unknown vector of parameters. It is required (1) to determine under what conditions on $A(t)$, $B(t)$, $C(t)$, $y(t)$, $x_0$, $t_0$, $t_1$ there exists a vector $w$ for which the corresponding solution $x(t)$ of the Cauchy problem (8.1) satisfies the equality (8.2), and (2) if such a vector $w$ exists, to specify a procedure for determining it. In applications, the vector $w$ and the function $y(t)$ are treated as the input and the output of the model (8.1). In these terms, the problem of identification consists of finding the input of the model that produces a given output.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_8
8.2
Criterion of Identifiability
Let us refer to a triple $x(t)$, $y(t)$, $w$ satisfying the conditions (8.1), (8.2) of the identification problem as a process. Each process is uniquely determined by the vector of parameters $w$. We say that the system (8.1), (8.2) is identifiable in a direction $q$ ($q$-identifiable) if there exists a continuous vector function $z(t) : [t_0, t_1] \to R^m$ such that for any process $x(t)$, $y(t)$, $w$ the following equality holds:
$$w'q = \int_{t_0}^{t_1} y(t)'z(t)\,dt. \tag{8.3}$$
In other words, in a $q$-identifiable system we can restore the projection of each vector $w$ on the direction $q$ by applying one and the same linear operation to the output $y(t)$ of the process corresponding to $w$. We now derive the identifiability criterion. Suppose the system (8.1), (8.2) is identifiable in some direction $q$. Then a continuous function $z(t)$ satisfying the equality (8.3) exists for any vector $w \in R^r$. Using the conditions (8.1), (8.2) and the Cauchy formula, we obtain
$$y(t) = C(t)\int_{t_0}^{t} F(t, \tau)B(\tau)w\,d\tau. \tag{8.4}$$
Substitute the function (8.4) into the equality (8.3). Since the vector $w$ is arbitrary, we have
$$q = \int_{t_0}^{t_1} \left( \int_{t_0}^{t} B(\tau)'F(t, \tau)'\,d\tau \right) C(t)'z(t)\,dt. \tag{8.5}$$
As a consequence, $z(t)$ is a solution of the system of integral equations (8.5). The converse is also true. Indeed, suppose that the function $z(t)$ satisfies (8.5). We multiply this equation by an arbitrary vector $w'$ on the left. Then the relation (8.3) with the function $y(t)$ of the form (8.4) is true, that is, the system (8.1), (8.2) is identifiable in the direction $q$. Thus, the solvability of the system of integral equations (8.5) is necessary and sufficient for the $q$-identifiability of the system (8.1), (8.2). Applying the results of Sect. 4.3, we arrive at the following theorem.
Theorem 8.1 (criterion of identifiability) The system (8.1), (8.2) is identifiable in the direction $q$ if and only if the system of linear algebraic equations
$$W_3(t_0, t_1)p = q \tag{8.6}$$
with the matrix of coefficients
$$W_3(t_0, t_1) = \int_{t_0}^{t_1} \left( \int_{t_0}^{t} B(\tau)'F(t, \tau)'\,d\tau \right) C(t)'C(t) \left( \int_{t_0}^{t} F(t, \tau)B(\tau)\,d\tau \right) dt \tag{8.7}$$
has a solution.
The computation of the matrix $W_3(t_0, t_1)$ can be reduced to the solution of the matrix Cauchy problem
$$\dot{V} = A(t)V + B(t), \quad V(t_0) = 0, \qquad \dot{W} = V'C(t)'C(t)V, \quad W(t_0) = 0$$
for the functions
$$V(t) = \int_{t_0}^{t} F(t, \tau)B(\tau)\,d\tau, \qquad W(t) = \int_{t_0}^{t} V(\tau)'C(\tau)'C(\tau)V(\tau)\,d\tau = W_3(t_0, t).$$
8.3
Restoring the Parameter Vector
Suppose the system (8.1), (8.2) is identifiable in any direction $q$. Then the matrix of coefficients (8.7) of the equations (8.6) has the maximal rank $r$ and, as a consequence, the inverse matrix $W_3^{-1}(t_0, t_1)$ exists. By definition,
$$E = W_3^{-1}(t_0, t_1)W_3(t_0, t_1).$$
We multiply this identity on the right by the sought-for vector $w$. Using the notation (8.7), (8.4), we obtain
$$w = W_3^{-1}(t_0, t_1) \int_{t_0}^{t_1} \left( \int_{t_0}^{t} B(\tau)'F(t, \tau)'\,d\tau \right) C(t)'y(t)\,dt.$$
This formula expresses the vector of parameters $w$ via the known data of the identification problem.
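The restoration formula can be tried on a small example. The sketch below is illustrative only and rests on several assumptions that are not in the text: constant matrices A, B, C chosen for the example, an explicit Euler scheme for the matrix Cauchy problem for V, left Riemann sums for the integrals, and synthetic observations y generated from the same discretized model with a known vector `w_true`. All function names are hypothetical.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def solve(W, g):
    """Solve W p = g by Gaussian elimination with partial pivoting."""
    n = len(g)
    M = [row[:] + [g[i]] for i, row in enumerate(W)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            factor = M[i][c] / M[c][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def output_sensitivities(A, B, C, t0, t1, steps):
    """Euler grid values of C V(t_k), where V' = A V + B, V(t0) = 0."""
    n, r = len(B), len(B[0])
    h = (t1 - t0) / steps
    V = [[0.0] * r for _ in range(n)]
    grid = []
    for _ in range(steps):
        grid.append(matmul(C, V))
        AV = matmul(A, V)
        V = [[V[i][j] + h * (AV[i][j] + B[i][j]) for j in range(r)] for i in range(n)]
    return h, grid

def restore_parameters(h, grid, ys):
    """Riemann sums for W3 and for the integral of V'C'y, then w = W3^{-1} g."""
    r = len(grid[0][0])
    W = [[0.0] * r for _ in range(r)]
    g = [0.0] * r
    for CV, y in zip(grid, ys):
        for i in range(r):
            g[i] += h * sum(row[i] * y_s for row, y_s in zip(CV, y))
            for j in range(r):
                W[i][j] += h * sum(row[i] * row[j] for row in CV)
    return solve(W, g)

# Assumed example data: n = 2, r = 2, m = 1.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
C = [[1.0, 0.0]]
w_true = [2.0, -1.0]

h, grid = output_sensitivities(A, B, C, 0.0, 1.0, steps=2000)
ys = [[sum(c * w for c, w in zip(row, w_true)) for row in CV] for CV in grid]  # y = C V w
w_hat = restore_parameters(h, grid, ys)
print(w_hat)
```

Because the observations are produced by the same discretization that is used for the restoration, the recovered vector agrees with `w_true` up to rounding; with measured data the accuracy would also depend on the step size and on the conditioning of the matrix W3.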
8.4
Total Identification of Stationary System
The criterion for the identifiability of a stationary system
$$\dot{x} = Ax + Bw, \quad x(t_0) = 0, \quad y(t) = Cx(t), \quad t_0 \le t \le t_1 \tag{8.8}$$
can be expressed directly in terms of the matrices $A$, $B$, $C$. The idea is the following: if the system (8.8) is identifiable not in all directions (partial identifiability), then the rank of the matrix
$$W_3(t_0, t_1) = \int_{t_0}^{t_1} \left( \int_{t_0}^{t} B'F(t, \tau)'\,d\tau \right) C'C \left( \int_{t_0}^{t} F(t, \tau)B\,d\tau \right) dt \tag{8.9}$$
is less than $r$. Then the system of homogeneous linear algebraic equations $W_3(t_0, t_1)p = 0$ has a solution $p \ne 0$ and
$$p'W_3(t_0, t_1)p = \int_{t_0}^{t_1} \left\| \int_{t_0}^{t} p'B'F(t, \tau)'C'\,d\tau \right\|^2 dt = 0. \tag{8.10}$$
Hence
$$\int_{t_0}^{t} p'B'F(t, \tau)'C'\,d\tau \equiv 0, \quad t_0 \le t \le t_1. \tag{8.11}$$
Differentiating the identity (8.11) successively $n$ times and setting $t = t_0$ in each result, we have
$$p'B'C' = 0, \quad p'B'A'C' = 0, \quad \dots, \quad p'B'(A')^{n-1}C' = 0. \tag{8.12}$$
Equalities (8.12) mean that the rows of the $r \times mn$ matrix
$$D = \left(B'C', B'A'C', \dots, B'(A')^{n-1}C'\right)$$
are linearly dependent and $\operatorname{rank} D < r$. Thus, the condition $\operatorname{rank} D < r$ is necessary for the partial identifiability of the system (8.8). Let us show the sufficiency of this condition. Suppose $\operatorname{rank} D < r$. Then the rows of the matrix $D$ are linearly dependent, so there exists a vector $p \ne 0$ for which the relations (8.12) hold. By the Cayley-Hamilton theorem, the matrix $A'$ satisfies its characteristic equation $|A' - \lambda E| = 0$, which we write in the form
$$\lambda^n + a_1\lambda^{n-1} + \dots + a_{n-1}\lambda + a_n = 0.$$
That is
$$(A')^n + a_1(A')^{n-1} + \dots + a_{n-1}A' + a_nE = 0.$$
We multiply the last equality on the left by the matrix $p'B'(A')^k$ and on the right by $C'$. Taking successively $k = 0, 1, \dots$ and using (8.12), we obtain
$$p'B'(A')^{n+k}C' = 0, \quad k = 0, 1, \dots \tag{8.13}$$
By construction, the function
$$t \to \int_{t_0}^{t} p'B'F(t, \tau)'C'\,d\tau$$
is analytic on $R$, that is, it can be represented by a convergent power series. Since its value at the point $t = t_0$ and, due to (8.12), (8.13), all its derivatives at this point are equal to zero, it is identically equal to zero on $R$. Then the relations (8.11), (8.10) are valid, i.e., the rows of the matrix (8.9) are linearly dependent and its rank is less than $r$. Reversing the criterion of partial identifiability, we obtain the criterion of total identifiability of a stationary system, that is, of identifiability in all directions.
Theorem 8.2 The stationary system (8.8) is totally identifiable if and only if
$$\operatorname{rank}\left(B'C', B'A'C', \dots, B'(A')^{n-1}C'\right) = r.$$
Exercise Set
1. What changes will occur in the procedure for recovering the parameter vector if we set $x(t_0) = x_0 \ne 0$ in the identification problem (8.1), (8.2)?
2. For $B = E$, the criterion of total identifiability in Theorem 8.2 coincides with the criterion of total observability (7.20). What does this mean?
3. Provide an example of a non-identifiable system.
4. How will the criterion of identifiability change if we replace the initial condition $x(t_0) = 0$ with the condition $x(t_0) = x_0 \ne 0$ in the identification problem (8.1), (8.2) and regard $w$, $x_0$ as sought-for vectors? Hint: introduce a new phase vector $z = w$ and consider the problem of observability of the initial state in the system
$$\dot{x} = A(t)x + B(t)z, \quad \dot{z} = 0, \quad x(t_0) = x_0, \quad z(t_0) = w, \quad y(t) = C(t)x(t), \quad t_0 \le t \le t_1.$$
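The rank condition of Theorem 8.2 is easy to check mechanically. The following sketch is an illustration, not the authors' procedure: it builds the matrix (B'C', B'A'C', ..., B'(A')^{n-1}C') for two assumed example systems and computes its rank in exact rational arithmetic. The helper names and the example matrices are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def identifiability_matrix(A, B, C):
    """The r x mn matrix (B'C', B'A'C', ..., B'(A')^{n-1}C')."""
    Bt, At = transpose(B), transpose(A)
    block = transpose(C)                 # (A')^k C', starting with k = 0
    blocks = []
    for _ in range(len(A)):
        blocks.append(matmul(Bt, block))
        block = matmul(At, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(len(Bt))]

A = [[0, 1], [0, 0]]
B = [[1, 0], [0, 1]]                     # r = 2 unknown parameters

rank_first = rank(identifiability_matrix(A, B, [[1, 0]]))   # observe x1
rank_second = rank(identifiability_matrix(A, B, [[0, 1]]))  # observe x2
print(rank_first, rank_second)
```

With the first output the rank equals r = 2, so the example system is totally identifiable. With the second output the rank drops to 1: the parameter w1 does not influence x2, so its value cannot be restored from y.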
Part III
Control of Nonlinear Systems
Chapter 9
Types of Optimal Control Problems
Abstract We describe the specific elements of optimal control problems: objective functions, mathematical model, constraints. We introduce the necessary terminology. We distinguish several classes of problems: the Simplest problem, the Two-point minimum time problem, the General problem with movable ends of the integral curve, the Problem with intermediate states, and the Common problem of optimal control.
9.1
General Characteristics
A formulation of an optimal control problem includes a control objective, a mathematical model of the controlled object, constraints and a description of a class of controls.
The control objective is a formally expressed requirement on the behavior of the controlled object. An objective of control can be, for example, to transfer the controlled object from one position to another in a finite amount of time, or to keep the trajectory of motion within given limits. Often the objective of control is to optimize (maximize or minimize) an objective functional, that is, a numerical characteristic defined on a set of processes. The values of the objective functional characterize the "quality" of processes. By optimizing the objective functional, we single out the processes of the best quality among all admissible ones.
As was already mentioned, a mathematical model of a controlled object is a law that transforms controls into trajectories of the object. It can be given by a system of ordinary differential equations, partial differential equations, integral equations, recurrence relations, or in other ways.
Constraints are additional conditions on processes that arise from the physical meaning of the control problem. Requirements related to the safe operation of a controlled object lead to phase constraints on the state vector or to mixed constraints on the state vector and the control simultaneously. In particular, the initial conditions for differential equations can be regarded as the simplest phase constraints.
The class of controls is defined by specifying the analytical properties and the range of the control variables. For example, the previously used class of controls $K(R \to U)$ consists of piecewise continuous functions $u(t) : R \to R^r$ with values in a compact set $U \subset R^r$. In addition, optimal control theory can use more general classes of summable or measurable controls, dictated by the physical meaning of the problem or by the wish to ensure the solvability of the problem. A wider class of controls increases the chance that an optimal control exists. However, the expansion of the class of controls requires a more sophisticated mathematical apparatus and finer details of the theory of functions, functional analysis and differential equations. Thus, in this course we restrict the class of controls to piecewise continuous functions.
9.2
Objective Functionals
In optimal control theory, we traditionally consider three types of objective functionals defined on the processes $x(t)$, $u(t)$ of a system of differential equations
$$\dot{x} = f(x, u, t), \quad x(t_0) = x_0. \tag{9.1}$$
The terminal functional (Mayer functional)
$$J_1 = \Phi(x(t_1), t_1) \tag{9.2}$$
is defined by a scalar function $\Phi(x, t)$ on the ends $(x(t_1), t_1)$ of the integral curves $(x(t), t)$, where the moment of time $t_1 > t_0$ may or may not be fixed in advance.
The integral functional (Lagrange functional) is given by a scalar function $F(x, u, t)$ in the form of a definite integral
$$J_2 = \int_{t_0}^{t_1} F(x(t), u(t), t)\,dt. \tag{9.3}$$
The analytical properties of the function $F(x, u, t)$ are usually assumed to be the same as those of the right-hand sides of (9.1). Then the composite function $t \to F(x(t), u(t), t)$ is piecewise continuous on the segment $[t_0, t_1]$, and the existence of the integral (9.3) is guaranteed by the appropriate theorem of mathematical analysis.
The Mayer-Bolza functional
$$J_3 = \Phi(x(t_1), t_1) + \int_{t_0}^{t_1} F(x(t), u(t), t)\,dt \tag{9.4}$$
is the sum of the functionals (9.2), (9.3).
If the function $\Phi(x, t)$ belongs to the class $C^1(R^n \times R \to R)$ and $\Phi(x_0, t_0) = 0$, then the terminal functional can be easily transformed into an integral one. Indeed, according to the Leibniz-Newton formula,
$$\Phi(x(t_1), t_1) = \int_{t_0}^{t_1} d\Phi(x(t), t) = \int_{t_0}^{t_1} \left[\Phi_x(x(t), t)'\dot{x}(t) + \Phi_t(x(t), t)\right] dt = \int_{t_0}^{t_1} \left[\Phi_x(x(t), t)'f(x(t), u(t), t) + \Phi_t(x(t), t)\right] dt.$$
The integral on the right-hand side is a Lagrange functional with the generating function
$$F(x, u, t) = \Phi_x(x, t)'f(x, u, t) + \Phi_t(x, t).$$
The reverse transition from the integral functional to the terminal functional is carried out by extending the phase space, that is, by introducing an additional phase variable $x_{n+1}$ according to the formulas
$$\dot{x}_{n+1} = F(x, u, t), \quad x_{n+1}(t_0) = 0.$$
Appending these relations to the conditions (9.1), we obtain an extended system of differential equations and initial conditions
$$\dot{x} = f(x, u, t), \quad \dot{x}_{n+1} = F(x, u, t), \quad x(t_0) = x_0, \quad x_{n+1}(t_0) = 0.$$
If $x(t)$, $u(t)$ is a process of the system (9.1), then the triple
$$x(t), \quad x_{n+1}(t) = \int_{t_0}^{t} F(x(\tau), u(\tau), \tau)\,d\tau, \quad u(t)$$
is a process of the extended system. Setting $t = t_1$, we have
$$\int_{t_0}^{t_1} F(x(t), u(t), t)\,dt = x_{n+1}(t_1).$$
Consequently, the integral functional in the system (9.1) coincides with the terminal functional in the extended system with the generating function $\Phi(x, x_{n+1}) = x_{n+1}$. The above methods are used to transform objective functionals and constraints to a terminal or an integral form. It is important that, when constructing a theory, we may therefore restrict ourselves to functionals of one type. The results for the other types of functionals are then obtained by means of the above transformations.
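The extension of the phase space is easy to reproduce numerically. The sketch below is only an illustration under assumed data: the scalar system, the integrand, the constant control, the classical Runge-Kutta scheme and all function names are choices made for the example and do not come from the text. The extra coordinate of the extended system is integrated together with the state, and its final value is compared with the integral functional computed in closed form.

```python
import math

def rk4_step(rhs, t, z, h):
    """One classical Runge-Kutta step for z' = rhs(t, z)."""
    k1 = rhs(t, z)
    k2 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
    k3 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
    k4 = rhs(t + h, [zi + h * ki for zi, ki in zip(z, k3)])
    return [zi + h / 6 * (a + 2 * b + 2 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]

def f(x, u, t):          # assumed right-hand side of (9.1)
    return -x + u

def F(x, u, t):          # assumed integrand of the Lagrange functional (9.3)
    return x * x + u * u

def u(t):                # assumed control
    return 1.0

def extended_rhs(t, z):
    """Right-hand side of the extended system for z = (x, x_{n+1})."""
    x, _ = z
    return [f(x, u(t), t), F(x, u(t), t)]

t0, t1, steps = 0.0, 1.0, 1000
h = (t1 - t0) / steps
z = [0.0, 0.0]                          # x(t0) = 0, x_{n+1}(t0) = 0
for k in range(steps):
    z = rk4_step(extended_rhs, t0 + k * h, z, h)

terminal_value = z[1]                   # terminal functional x_{n+1}(t1)
# Closed form of the integral for x(t) = 1 - exp(-t), u = 1 on [0, 1].
exact_integral = 2 / math.e + (1 - math.exp(-2)) / 2
print(terminal_value, exact_integral)
```

The two printed numbers agree to within the integration error, which is the statement that the Lagrange functional of the original system equals the Mayer functional of the extended one.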
9.3
Constraints on the Ends of a Trajectory. Terminology
Let $t_0 < t_1$ be moments of time that may or may not be fixed in advance, and let $x(t)$, $u(t)$ be an arbitrary process of the system (9.1). The points $x(t_0)$ and $x(t_1)$ are referred to as the left and right ends of the trajectory $x(t)$, and the pairs $(x(t_0), t_0)$, $(x(t_1), t_1)$ are referred to as the left and right ends of the integral curve $(x(t), t)$. The most general constraint on the ends of an integral curve has the form
$$(x(t_0), x(t_1), t_0, t_1) \in G,$$
where $G$ is a given subset of the Cartesian product $R^n \times R^n \times R \times R$. If this inclusion defines the points $x(t_0)$, $x(t_1)$ uniquely (not uniquely), we speak about fixed (mobile) ends of a trajectory. We apply the same terms to the ends $(x(t_0), t_0)$, $(x(t_1), t_1)$ of an integral curve and to the moments of time $t_0$, $t_1$. An end of a trajectory on which no restrictions are imposed is referred to as a free end of the trajectory. Optimal control problems may combine the requirements on the ends of integral curves in different ways. For example, the left end of an integral curve can be fixed and the right end free, while the moments $t_0$, $t_1$ are fixed or mobile. The ends of a trajectory can also be fixed while the moments of time $t_0$, $t_1$ are mobile, and so on. We now consider several types of problems that will be the subject of our further study.
9.4
The Simplest Problem
The Simplest problem of optimal control (S-problem) consists of minimizing a terminal functional on the set of processes $x(t)$, $u(t)$ of a controlled system with a fixed left end and a free right end of the trajectory, and with fixed moments of time $t_0$, $t_1$. This problem has the form
$$J = \Phi(x(t_1)) \to \min, \quad \dot{x} = f(x, u, t), \quad x(t_0) = x_0, \quad u \in U, \quad t \in [t_0, t_1],$$
where the scalar function $\Phi(x)$ belongs to the class $C^1(R^n \to R)$. For the function $f$, the range of control $U$ and the class of controls, the agreements made earlier in Sects. 2.3 and 2.2 remain valid. The S-problem thus contains the objective of control, the mathematical model of the controlled object, the phase constraint in the form of the initial condition and the restrictions on the control vector.
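To make the ingredients of the S-problem concrete, here is a deliberately crude sketch. Everything in it is assumed for illustration: the scalar system x' = -x + u, the set U = [-1, 1], the function Φ(x) = x², the Euler scheme, and the restriction to controls that are constant on each half of the time interval with values on a coarse grid. Exhaustive search over such a small family is not a method of the theory; it only shows how a control determines the right end x(t1) and hence the value of J.

```python
import itertools

def endpoint(control_values, x0=1.0, t0=0.0, t1=1.0, steps_per_piece=200):
    """Euler approximation of x(t1) for x' = -x + u with a piecewise constant control."""
    h = (t1 - t0) / (len(control_values) * steps_per_piece)
    x = x0
    for u in control_values:
        for _ in range(steps_per_piece):
            x += h * (-x + u)
    return x

def phi(x):
    """Assumed terminal cost."""
    return x * x

grid = [-1.0, -0.5, 0.0, 0.5, 1.0]                  # sample points of U = [-1, 1]
candidates = list(itertools.product(grid, repeat=2))
best = min(candidates, key=lambda c: phi(endpoint(c)))
print(best, phi(endpoint(best)))
```

The computed right ends `endpoint(c)` form a finite sample of the reachability set of this system, and the search picks the sampled point with the smallest value of Φ.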
9.5
Two-Point Minimum Time Problem
The two-point minimum time problem (M-problem) is an optimal control problem with fixed ends of the trajectory and mobile moments of time:
$$J = t_1 - t_0 \to \min, \quad \dot{x} = f(x, u, t), \quad x(t_0) = x_0, \quad x(t_1) = x_1, \quad u \in U, \quad t_0 \le t_1.$$
Here $x_0$, $x_1$ are given points of the space $R^n$. The problem is thus to minimize the time of transition from the point $x_0$ to the point $x_1$ along a trajectory of the system of differential equations of the controlled object by an appropriate choice of the control and of the moments of time $t_0$, $t_1$. The solution of the problem is trivial when $x_0 = x_1$. Leaving this case aside, we assume that $x_0 \ne x_1$.
9.6
General Optimal Control Problem
The General optimal control problem (G-problem) has mobile ends of the integral curve:
$$J_0 = \Phi_0(x(t_0), x(t_1), t_0, t_1) \to \min,$$
$$J_i = \Phi_i(x(t_0), x(t_1), t_0, t_1) \begin{cases} \le 0, & i = 1, \dots, m_0, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases}$$
$$\dot{x} = f(x, u, t), \quad u \in U, \quad t_0 \le t_1.$$
Here $\Phi_0, \dots, \Phi_m$ are given functions of the class $C^1(R^n \times R^n \times R \times R \to R)$, $m_0$ is a nonnegative integer, and $m$ is a natural number. If $m_0 = 0$ or $m_0 = m$, then the G-problem has only equality constraints $J_i = 0$, $i = 1, \dots, m$, or only inequality constraints $J_i \le 0$, $i = 1, \dots, m$, respectively. A process is a quadruple $x(t)$, $u(t)$, $t_0$, $t_1$ that satisfies all conditions of the G-problem except, possibly, the first one. A process $x(t)$, $u(t)$, $t_0$, $t_1$ is said to be optimal if for any other process $\tilde{x}(t)$, $\tilde{u}(t)$, $\tilde{t}_0$, $\tilde{t}_1$ the following inequality is true:
$$\Phi_0(x(t_0), x(t_1), t_0, t_1) \le \Phi_0(\tilde{x}(\tilde{t}_0), \tilde{x}(\tilde{t}_1), \tilde{t}_0, \tilde{t}_1).$$
The G-problem consists of determining an optimal process.
9.7
Problem with Intermediate States
The problem with intermediate states (IS-problem) is a generalization of the General optimal control problem. In the notation of Sect. 9.2, it has the form
$$J_0 = \Phi_0(x(\mathbf{t}), \mathbf{t}) \to \min,$$
$$J_i = \Phi_i(x(\mathbf{t}), \mathbf{t}) \begin{cases} \le 0, & i = 1, \dots, m_0, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases}$$
$$\dot{x} = f(x, u, t), \quad u \in U, \quad \mathbf{t} \in T.$$
Here $\mathbf{t} = (t_0, \dots, t_s)$ is the vector of intermediate moments of time $t_0, \dots, t_s$, $x(\mathbf{t}) = (x(t_0), \dots, x(t_s))$ is the matrix of the corresponding states $x(t_0), \dots, x(t_s)$, $\Phi_0, \dots, \Phi_m$ are functions of the class $C^1(R^{n(s+1)} \times R^{s+1} \to R)$, $m_0$ is a non-negative integer, $m$ is a natural number, and $T$ is a given subset of the space $R^{s+1}$. As before, we assume that if $m_0 = 0$ the problem contains only equality constraints $J_i = 0$, $i = 1, \dots, m$, and if $m_0 = m$ it contains only inequality constraints $J_i \le 0$, $i = 1, \dots, m$. The concepts of a process $x(t)$, $u(t)$, $\mathbf{t}$ and of an optimal process are similar to those in Sect. 9.6.
9.8
Common Problem of Optimal Control
The common optimal control problem (C-problem) has the form
$$J = \Phi(x(t_1)) + \int_{t_0}^{t_1} F(x(t), u(t), t)\,dt \to \min,$$
$$\dot{x} = f(x, u, t), \quad x(t_0) = x_0, \quad x(t_1) \in G, \quad (x(t), u(t)) \in V(t), \quad t \in [t_0, t_1].$$
It is a problem with fixed time, a fixed left end and a mobile right end of the trajectory, and mixed constraints on the control and the phase state of the controlled object. Here the scalar functions $\Phi(x)$ and $F(x, u, t)$ are continuous on the sets $R^n$ and $R^n \times R^r \times R$ respectively, the vector function $f(x, u, t)$ meets the requirements of Sect. 2.3, the sets $G \subset R^n$ and $V(t) \subset R^n \times R^r$ for $t \in [t_0, t_1]$ are given, and the moments of time $t_0 < t_1$ and the point $x_0 \in R^n$ are fixed. Understanding a control $u(t)$ and a solution $x(t)$ of the system $\dot{x} = f(x, u(t), t)$ in the previous sense, we call the pair $x(t)$, $u(t)$ a process of the C-problem if it satisfies all of its conditions except, possibly, the first one. The C-problem consists of finding an optimal process $x(t)$, $u(t)$, that is, a process with the minimum value of the objective functional $J$.
Chapter 10
Small Increments of a Trajectory
Abstract With the aid of «small» variations of a fixed basis process we construct and describe the family of «close» processes by means of linear approximation. We clarify the concept of small variations and close processes.
10.1
Statement of a Problem
We consider a system of differential equations $\dot{x} = f(x, u, t)$ that satisfies the assumptions of Sect. 2.3. Suppose that there are two processes $x(t)$, $u(t)$ and $\tilde{x}(t)$, $\tilde{u}(t)$ of this system, corresponding to the initial conditions $x(t_0) = x_0$ and $\tilde{x}(\tilde{t}_0) = \tilde{x}_0$ and defined on a common segment of time $I = [\tau_0, \tau_1]$ (Fig. 10.1). Assuming that the points $t_0$, $\tilde{t}_0$, $t_1$, $\tilde{t}_1$ lie on the segment $I$, we address the following questions: (1) in what sense must the increments $\Delta u(t) = \tilde{u}(t) - u(t)$, $\Delta x_0 = \tilde{x}_0 - x_0$, $\Delta t_0 = \tilde{t}_0 - t_0$ of the control and of the initial values be small for the increment $\Delta x(t) = \tilde{x}(t) - x(t)$ of the trajectory to be uniformly small on the segment $I$; (2) what are the principal terms of the increment $\Delta x(t)$; (3) what is the relation between the points $x(t_1)$ and $\tilde{x}(\tilde{t}_1)$ at the close moments of time $t_1$ and $\tilde{t}_1 = t_1 + \Delta t_1$.
10.2
Evaluation of the Increment of Trajectory
Fig. 10.1 Explanation of the notation
By assumption, the vector identities
$$x(t) = x_0 + \int_{t_0}^{t} f(x(\tau), u(\tau), \tau)\,d\tau, \quad \tilde{x}(t) = \tilde{x}_0 + \int_{\tilde{t}_0}^{t} f(\tilde{x}(\tau), \tilde{u}(\tau), \tau)\,d\tau, \quad t \in I$$
hold for the processes under consideration. In equivalent coordinate form, we have
$$x_i(t) = x_{i0} + \int_{t_0}^{t} f_i(x(\tau), u(\tau), \tau)\,d\tau, \quad \tilde{x}_i(t) = \tilde{x}_{i0} + \int_{\tilde{t}_0}^{t} f_i(\tilde{x}(\tau), \tilde{u}(\tau), \tau)\,d\tau, \quad t \in I, \quad i = 1, \dots, n. \tag{10.1}$$
Subtracting the first identity (10.1) from the second, we obtain
$$\Delta x_i(t) = \Delta x_{i0} + \int_{\tilde{t}_0}^{t_0} f_i(\tilde{x}(\tau), \tilde{u}(\tau), \tau)\,d\tau + \int_{t_0}^{t} \left[ f_i(x(\tau) + \Delta x(\tau), \tilde{u}(\tau), \tau) - f_i(x(\tau), u(\tau), \tau) \right] d\tau.$$
We then use the Lagrange mean value formula to separate, inside the square brackets, the terms that are linear in $\Delta x(\tau)$. Putting for brevity
$$\Delta_{\tilde{u}} f_i(x, u, t) = f_i(x, \tilde{u}, t) - f_i(x, u, t), \quad i = 1, \dots, n,$$
we can write
$$\Delta x_i(t) = \Delta x_{i0} + \int_{\tilde{t}_0}^{t_0} f_i(\tilde{x}(\tau), \tilde{u}(\tau), \tau)\,d\tau + \int_{t_0}^{t} \Delta_{\tilde{u}(\tau)} f_i(x(\tau), u(\tau), \tau)\,d\tau + \int_{t_0}^{t} \sum_{j=1}^{n} f_{ix_j}(y^i(\tau), \tilde{u}(\tau), \tau)\Delta x_j(\tau)\,d\tau, \tag{10.2}$$
where
$$y^i(\tau) = x(\tau) + \alpha_i(\tau)\Delta x(\tau), \quad 0 \le \alpha_i(\tau) \le 1, \quad i = 1, \dots, n.$$
We now estimate the right-hand side of (10.2) from above. The continuous functions $x(t)$, $\tilde{x}(t)$ are bounded on $I$, so their values belong to a closed ball $B \subset R^n$ of finite radius. Similarly, the continuous functions $|f_i(x, u, t)|$, $|f_{ix_j}(x, u, t)|$ are bounded by a common constant $M > 0$ on the compact set $B \times U \times I$:
$$|f_i(x, u, t)| \le M, \quad |f_{ix_j}(x, u, t)| \le M, \quad (x, u, t) \in B \times U \times I, \quad i, j = 1, \dots, n.$$
In particular,
$$|f_i(\tilde{x}(\tau), \tilde{u}(\tau), \tau)| \le M, \quad |f_{ix_j}(y^i(\tau), \tilde{u}(\tau), \tau)| \le M, \quad \tau \in I, \quad i, j = 1, \dots, n.$$
Applying these estimates in (10.2), we obtain
$$|\Delta x_i(t)| \le |\Delta x_{i0}| + \left| \int_{\tilde{t}_0}^{t_0} |f_i(\tilde{x}(\tau), \tilde{u}(\tau), \tau)|\,d\tau \right| + \int_I |\Delta_{\tilde{u}(\tau)} f_i(x(\tau), u(\tau), \tau)|\,d\tau + \int_{t_0}^{t} \sum_{j=1}^{n} |f_{ix_j}(y^i(\tau), \tilde{u}(\tau), \tau)|\,|\Delta x_j(\tau)|\,d\tau$$
$$\le |\Delta x_{i0}| + M|\Delta t_0| + \int_I |\Delta_{\tilde{u}(\tau)} f_i(x(\tau), u(\tau), \tau)|\,d\tau + M \int_{\tau_0}^{t} \sum_{j=1}^{n} |\Delta x_j(\tau)|\,d\tau, \quad i = 1, \dots, n.$$
We then strengthen the right-hand side of this inequality by replacing the moduli of the coordinates of the vectors by their Euclidean norms. Then
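$$|\Delta x_i(t)| \le \|\Delta x_0\| + M|\Delta t_0| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + Mn \int_{\tau_0}^{t} \|\Delta x(\tau)\|\,d\tau, \quad i = 1, \dots, n,$$
where, for brevity, we put
$$\int_I \|\Delta_{\tilde{u}} f\|\,d\tau = \int_I \|\Delta_{\tilde{u}(\tau)} f(x(\tau), u(\tau), \tau)\|\,d\tau.$$
It follows from the obtained estimate that
$$\|\Delta x(t)\| = \left( \sum_{i=1}^{n} |\Delta x_i(t)|^2 \right)^{1/2} \le \left( \sum_{i=1}^{n} \left( \|\Delta x_0\| + M|\Delta t_0| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + Mn \int_{\tau_0}^{t} \|\Delta x(\tau)\|\,d\tau \right)^2 \right)^{1/2} \le n^{1/2} \left( \|\Delta x_0\| + M|\Delta t_0| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + Mn \int_{\tau_0}^{t} \|\Delta x(\tau)\|\,d\tau \right)$$
or
$$\|\Delta x(t)\| \le (1 + M)n^{1/2} \left( \|\Delta x_0\| + |\Delta t_0| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau \right) + Mn^{3/2} \int_{\tau_0}^{t} \|\Delta x(\tau)\|\,d\tau, \quad t \in I. \tag{10.3}$$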
Lemma 10.1 If a continuous non-negative function $z(t)$ satisfies the integral inequality
$$z(t) \le a + b \int_{\tau_0}^{t} z(\tau)\,d\tau, \quad t \ge \tau_0 \tag{10.4}$$
with constant coefficients $a > 0$, $b \ge 0$, then
$$z(t) \le ae^{b(t - \tau_0)}, \quad t \ge \tau_0.$$
Proof Suppose that the function $z(t)$ satisfies the conditions of the lemma. Let
$$y(t) = a + b \int_{\tau_0}^{t} z(\tau)\,d\tau, \quad t \ge \tau_0.$$
On its domain, the function $y(t)$ is positive, has the continuous derivative $\dot{y}(t) = bz(t)$ and, in view of condition (10.4), satisfies the inequality
$$\dot{y}(t) = bz(t) \le by(t).$$
Hence, by means of successive transformations, we find
$$\frac{\dot{y}(t)}{y(t)} \le b, \quad \frac{d}{dt}\ln y(t) \le b, \quad \int_{\tau_0}^{t} \frac{d}{d\tau}\ln y(\tau)\,d\tau \le \int_{\tau_0}^{t} b\,d\tau,$$
$$\ln y(t) - \ln y(\tau_0) \le b(t - \tau_0), \quad y(t) \le ae^{b(t - \tau_0)}.$$
The statement of the lemma is proven, since $z(t) \le y(t)$ by condition (10.4).
We return to the analysis of the inequality (10.3). Applying Lemma 10.1 to (10.3), we obtain
$$\|\Delta x(t)\| \le K \left( \|\Delta x_0\| + |\Delta t_0| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau \right), \quad t \in I, \tag{10.5}$$
where the constant
$$K = (1 + M)n^{1/2}e^{Mn^{3/2}(\tau_1 - \tau_0)} \tag{10.6}$$
depends on the function $f$ and the compact set $B \times U \times I$.
The estimate (10.5) with the constant (10.6) answers the first of the three questions above. A uniform smallness of the increment $\Delta x(t)$ on the segment $I$ is ensured by small increments $\Delta x_0$, $\Delta t_0$ of the initial values and by an increment $\Delta u(t)$ of the control for which the integral
$$\int_I \|\Delta_{\tilde{u}} f\|\,d\tau = \int_I \|f(x(\tau), u(\tau) + \Delta u(\tau), \tau) - f(x(\tau), u(\tau), \tau)\|\,d\tau$$
is small. The latter does not necessarily mean that the increment of the control is uniformly small. The integral is also small if the increments of the control are large on short time intervals (Fig. 10.2).
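The last remark can be checked on a toy example. The sketch below is illustrative and uses assumed data only: the scalar system x' = -x + u with x(0) = 1, the base control u = 0, a perturbed control equal to 1 on a short interval of length eps and to 0 elsewhere, and an explicit Euler scheme. The function name `max_deviation` is hypothetical. For this particular system the deviation of the trajectories never exceeds the pulse length, although the increment of the control equals 1 on the pulse.

```python
def max_deviation(eps, t_end=1.0, steps=10000):
    """Largest |x~(t) - x(t)| on the grid for a control pulse of length eps."""
    h = t_end / steps
    start = steps // 2                      # the pulse starts in the middle of the segment
    width = max(1, round(eps / h))          # number of grid steps covered by the pulse
    x, x_pert, worst = 1.0, 1.0, 0.0
    for k in range(steps):
        du = 1.0 if start <= k < start + width else 0.0   # increment of the control
        x += h * (-x)                       # base process, u = 0
        x_pert += h * (-x_pert + du)        # perturbed process, u~ = du
        worst = max(worst, abs(x_pert - x))
    return worst

for eps in (0.1, 0.01):
    print(eps, max_deviation(eps))
```

Shortening the pulse by a factor of ten reduces the maximal deviation of the trajectory by about the same factor, while the maximal increment of the control stays equal to 1.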
Fig. 10.2 The large increments for control on short time intervals cause uniformly small increments in the trajectory
10.3
Representation of Small Increments of Trajectory
We proceed with the analysis of the solution $\Delta x(t)$ of the Cauchy problem
$$\Delta\dot{x} = f(x(t) + \Delta x, \tilde{u}(t), t) - f(x(t), u(t), t), \quad t \in I, \qquad \Delta x(t_0) = \Delta x_0 + \tilde{x}(t_0) - \tilde{x}(t_0 + \Delta t_0) \tag{10.7}$$
under the assumptions and in the notation of Sect. 10.1. Here $\Delta\dot{x} = \dot{\tilde{x}} - \dot{x}$, and the initial condition $\Delta x(t_0) = \tilde{x}(t_0) - x(t_0) = \tilde{x}(t_0) - x_0$ is transformed by adding the zero term
$$0 = \tilde{x}(\tilde{t}_0) - \tilde{x}(\tilde{t}_0) = \tilde{x}_0 - \tilde{x}(\tilde{t}_0) = x_0 + \Delta x_0 - \tilde{x}(t_0 + \Delta t_0).$$
Our goal is to isolate the principal terms of the increment $\Delta x(t)$, in a sense that will be made precise below. An important role here is played by the system of variational equations
$$\delta\dot{x} = f_x(x(t), u(t), t)\delta x + \Delta_{\tilde{u}(t)} f(x(t), u(t), t), \quad t \in I, \qquad \delta x(t_0) = \Delta x_0 - \dot{\tilde{x}}(t_0)\Delta t_0, \tag{10.8}$$
which is obtained by linearization of the conditions (10.7). The coefficient matrix $f_x(x(t), u(t), t)$ of the variational equations is composed of the partial derivatives $f_{ix_j}(x(t), u(t), t)$, $i, j = 1, \dots, n$. We denote by $F(t, \tau)$ the fundamental matrix of solutions of the homogeneous matrix system of differential equations
$$F_\tau(t, \tau) = -F(t, \tau)f_x(x(\tau), u(\tau), \tau), \quad F(t, t) = E, \tag{10.9}$$
and write the solution of the system (10.8) by using the Cauchy formula
$$\delta x(t) = F(t, t_0)\left[\Delta x_0 - \Delta t_0\dot{\tilde{x}}(t_0)\right] + \int_{t_0}^{t} F(t, \tau)\Delta_{\tilde{u}(\tau)} f(x(\tau), u(\tau), \tau)\,d\tau, \quad t \in I. \tag{10.10}$$
Since the increment $\Delta x(t)$ satisfies (10.7), the following identity holds:
$$\int_{t_0}^{t} F(t, \tau)\Delta\dot{x}(\tau)\,d\tau = \int_{t_0}^{t} F(t, \tau)\left[ f(x(\tau) + \Delta x(\tau), \tilde{u}(\tau), \tau) - f(x(\tau), u(\tau), \tau) \right] d\tau, \quad t \in I.$$
We integrate by parts to evaluate the integral on the left-hand side of this equality. Then
$$\int_{t_0}^{t} F(t, \tau)\Delta\dot{x}(\tau)\,d\tau = \Delta x(t) - F(t, t_0)\Delta x(t_0) - \int_{t_0}^{t} F_\tau(t, \tau)\Delta x(\tau)\,d\tau.$$
Using the initial condition (10.7), we have
$$\Delta x(t) = F(t, t_0)\left[\Delta x_0 + \tilde{x}(t_0) - \tilde{x}(t_0 + \Delta t_0)\right] + \int_{t_0}^{t} F_\tau(t, \tau)\Delta x(\tau)\,d\tau + \int_{t_0}^{t} F(t, \tau)\left[ f(x(\tau) + \Delta x(\tau), \tilde{u}(\tau), \tau) - f(x(\tau), u(\tau), \tau) \right] d\tau, \quad t \in I.$$
By the Taylor formula,
$$\tilde{x}(t_0 + \Delta t_0) = \tilde{x}(t_0) + \Delta t_0\dot{\tilde{x}}(t_0) + o(|\Delta t_0|),$$
$$f(x(\tau) + \Delta x(\tau), \tilde{u}(\tau), \tau) = f(x(\tau), \tilde{u}(\tau), \tau) + f_x(x(\tau), \tilde{u}(\tau), \tau)\Delta x(\tau) + o(\|\Delta x(\tau)\|).$$
Here $o(\varepsilon)$ is a small value of an order higher than $\varepsilon$: $\|o(\varepsilon)\|/\varepsilon \to 0$ when $\varepsilon \to 0$. In view of the equation (10.9) for the fundamental matrix and the formula (10.10), we obtain
$$\Delta x(t) = \delta x(t) + r(t), \tag{10.11}$$
where
$$r(t) = -F(t, t_0)o(|\Delta t_0|) + \int_{t_0}^{t} F(t, \tau)o(\|\Delta x(\tau)\|)\,d\tau + \int_{t_0}^{t} F(t, \tau)\Delta_{\tilde{u}(\tau)} f_x(x(\tau), u(\tau), \tau)\Delta x(\tau)\,d\tau, \quad t \in I.$$
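We show that for any $t \in I$ the value $r(t)$ in (10.11) is of an order of smallness higher than
$$h = \|\Delta x_0\| + |\Delta t_0| + |\Delta t_1| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\,d\tau. \tag{10.12}$$
Since the fundamental matrix $F(t, \tau)$ is continuous on the compact set $I \times I$, its norm is bounded by a constant $C > 0$:
$$\|F(t, \tau)\| \le C, \quad (t, \tau) \in I \times I.$$
From here and from (10.11), by the known properties of a norm, it follows that
$$\|r(t)\| \le \|F(t, t_0)\|\,\|o(|\Delta t_0|)\| + \int_{t_0}^{t} \|F(t, \tau)\|\,\|o(\|\Delta x(\tau)\|)\|\,d\tau + \int_{t_0}^{t} \|F(t, \tau)\|\,\|\Delta_{\tilde{u}(\tau)} f_x(x(\tau), u(\tau), \tau)\|\,\|\Delta x(\tau)\|\,d\tau$$
$$\le C \left( \|o(|\Delta t_0|)\| + \int_{t_0}^{t} \|o(\|\Delta x(\tau)\|)\|\,d\tau + \int_{t_0}^{t} \|\Delta_{\tilde{u}(\tau)} f_x\|\,\|\Delta x(\tau)\|\,d\tau \right).$$
Dividing this inequality by $h > 0$ and strengthening it by means of the estimate (10.5), we obtain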
$$\frac{\|r(t)\|}{h} \le C \left( \frac{\|o(|\Delta t_0|)\|}{|\Delta t_0|}\,\frac{|\Delta t_0|}{h} + \int_{t_0}^{t} \frac{\|o(\|\Delta x(\tau)\|)\|}{\|\Delta x(\tau)\|}\,\frac{\|\Delta x(\tau)\|}{h}\,d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\,\frac{\|\Delta x(\tau)\|}{h}\,d\tau \right) \le C \left( \frac{\|o(|\Delta t_0|)\|}{|\Delta t_0|} + K \int_I \frac{\|o(\|\Delta x(\tau)\|)\|}{\|\Delta x(\tau)\|}\,d\tau + Kh \right).$$
By (10.5), the norm $\|\Delta x(t)\|$ tends to zero uniformly on the segment $I$ when $h \to 0$. Therefore, it follows from the last estimate that $\|r(t)\|/h \to 0$ uniformly in $t \in I$ when $h \to 0$. In other words, the increment $\Delta x(t)$ can be represented as the sum
$$\Delta x(t) = \delta x(t) + o(h) \tag{10.13}$$
of the principal term $\delta x(t)$ of the order $h$ and the remainder $o(h)$ of an order higher than $h$, uniformly in $t \in I$. The function $\delta x(t)$ and the value $h$ are defined by the formulas (10.10) and (10.12).
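The representation (10.13) can be observed numerically. The sketch below is an illustration on assumed data, not a computation from the text: the scalar system x' = -x³ + u on [0, 1] with x(0) = 1 and base control u = 0, a perturbation with Δx0 = eps, Δu = eps and Δt0 = Δt1 = 0, and a classical Runge-Kutta integrator. For this example the quantity (10.12) equals h = 2·eps, because f_x does not depend on u. The function names are hypothetical.

```python
def rk4_step(rhs, z, h):
    """One classical Runge-Kutta step for the autonomous system z' = rhs(z)."""
    k1 = rhs(z)
    k2 = rhs([zi + h / 2 * ki for zi, ki in zip(z, k1)])
    k3 = rhs([zi + h / 2 * ki for zi, ki in zip(z, k2)])
    k4 = rhs([zi + h * ki for zi, ki in zip(z, k3)])
    return [zi + h / 6 * (a + 2 * b + 2 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]

def remainder_ratio(eps, t_end=1.0, steps=1000):
    """|Delta x(t1) - delta x(t1)| / h for the assumed example."""
    def rhs(z):
        x, x_pert, dx = z
        return [-x ** 3,                    # base process, u = 0
                -x_pert ** 3 + eps,         # perturbed process, u~ = eps
                -3 * x ** 2 * dx + eps]     # variational equation (10.8)
    step = t_end / steps
    z = [1.0, 1.0 + eps, eps]               # x0, x0 + Delta x0, delta x(t0) = Delta x0
    for _ in range(steps):
        z = rk4_step(rhs, z, step)
    x, x_pert, dx = z
    h = 2 * eps                             # value of (10.12) for this example
    return abs(x_pert - x - dx) / h

ratios = [remainder_ratio(eps) for eps in (0.1, 0.01, 0.001)]
print(ratios)
```

The printed ratios decrease together with eps, in agreement with the statement that the remainder is of an order higher than h.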
10.4
Relation of the Ends of Trajectories
Let us determine the relation between the points $x(t_1)$ and $\tilde{x}(\tilde{t}_1)$. Using the equality (10.13), we obtain
$$\tilde{x}(\tilde{t}_1) = x(\tilde{t}_1) + \Delta x(\tilde{t}_1) = x(t_1 + \Delta t_1) + \delta x(t_1 + \Delta t_1) + o(h).$$
Suppose that the functions $x(t)$, $\delta x(t)$ are differentiable at the point $t_1$. Then, by Taylor's formula, we have
$$\tilde{x}(\tilde{t}_1) = x(t_1) + \delta x(t_1) + \Delta t_1\left[\dot{x}(t_1) + \delta\dot{x}(t_1)\right] + o(h)$$
or, in detailed form, using the relations (10.10), (10.12),
$$\tilde{x}(\tilde{t}_1) = x(t_1) + F(t_1, t_0)\left[\Delta x_0 - \Delta t_0 f(x(t_0), \tilde{u}(t_0), t_0)\right] + \Delta t_1 f(x(t_1), \tilde{u}(t_1), t_1) + \int_{t_0}^{t_1} F(t_1, t)\Delta_{\tilde{u}(t)} f(x(t), u(t), t)\,dt + o(h),$$
$$h = \|\Delta x_0\| + |\Delta t_0| + |\Delta t_1| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\,d\tau. \tag{10.14}$$
The formula (10.14) answers the last question. In terms of a linear approximation, it describes the relation between the ends of two trajectories of a controlled system at close moments of time directly through the increments of the initial values, the increment of the control and the fundamental matrix (10.9) of the variational equations.
Chapter 11
The Simplest Problem of Optimal Control
Abstract We establish the maximum principle (necessary optimality conditions) in the Simplest problem of optimal control. We discuss related issues: the continuity of the Hamiltonian, the boundary value problem, sufficiency, and the application to linear systems.
11.1
Simplest-Problem. Functional Increment Formula
As mentioned in Sect. 9.4, the Simplest problem of optimal control (S-problem)
\[ J = \Phi(x(t_1)) \to \min, \quad \dot{x} = f(x, u, t), \quad x(t_0) = x_0, \quad u \in U, \quad t \in [t_0, t_1] \]
consists of the minimization of a terminal functional on the set of processes $x(t)$, $u(t)$ of a controlled system with a fixed left end $x_0$ of the trajectory, a free right end $x(t_1)$, and fixed end points of time $t_0$, $t_1$. The scalar function $\Phi(x)$ belongs to the class $C^1(R^n \to R)$. The function $f$, the range of control $U$ and the class of piecewise continuous controls meet the standard assumptions of Sects. 2.3 and 2.2.
If we disregard the minimization of the objective functional, the remaining conditions of the Simplest problem give a rule for calculating the points $x(t_1)$ on the trajectories of processes $x(t)$, $u(t)$. The set of all these points in the space $R^n$ forms the reachability set $Q(t_1)$ of a nonlinear system of differential equations (Fig. 11.1). In this sense, the S-problem can be regarded as the extremal problem
\[ \Phi(x) \to \min, \quad x \in Q(t_1) \]
with an implicitly defined set $Q(t_1)$ of admissible points. In general, the reachability sets of nonlinear systems are not convex. Unlike the case of linear systems, it is complicated to construct the maximum projection of the reachability set of a nonlinear system onto a selected direction; this is comparable in difficulty with the solution of the S-problem itself. For nonlinear systems, special methods of investigation are required.
Fig. 11.1 Reachability set $Q(t_1)$ and level lines $\Phi(x) = \mathrm{const}$ of the objective function in the S-problem
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_11
Consider the increment $\Delta J = \Phi(\tilde{x}(t_1)) - \Phi(x(t_1))$ of the objective functional $J$ on the processes
\[ x(t),\ u(t); \quad \tilde{x}(t) = x(t) + \Delta x(t), \quad \tilde{u}(t) = u(t) + \Delta u(t) \tag{11.1} \]
of the Simplest problem. Using Taylor's formula, we single out the part of the increment of the functional that is linear in $\Delta x(t_1)$:
\[ \Delta J = \Phi(x(t_1) + \Delta x(t_1)) - \Phi(x(t_1)) = \Phi_x(x(t_1))' \Delta x(t_1) + o(\|\Delta x(t_1)\|). \]
Putting $\Delta x_0 = 0$, $\Delta t_0 = \Delta t_1 = 0$ in formula (10.14), we have
\[ \Delta x(t_1) = \int_{t_0}^{t_1} F(t_1, t)\,\Delta_{\tilde{u}(t)} f(x(t), u(t), t)\,dt + o(h), \quad h = \int_{t_0}^{t_1} \|\Delta_{\tilde{u}} f\|\,dt + \int_{t_0}^{t_1} \|\Delta_{\tilde{u}} f_x\|\,dt. \]
According to definition (10.9), the fundamental matrix $F(t_1, t)$ satisfies the matrix equation in variations
\[ F_t(t_1, t) = -F(t_1, t) f_x(x(t), u(t), t), \quad F(t_1, t_1) = E. \tag{11.2} \]
From the general estimate (10.5) of the increment of a trajectory, it follows that
\[ \|\Delta x(t_1)\| \le K \int_{t_0}^{t_1} \|\Delta_{\tilde{u}} f\|\,dt \le K h. \]
Substitute $\Delta x(t_1)$ into the formula for the increment of the functional. In view of the estimate of $\|\Delta x(t_1)\|$ and the representation of $h$, we obtain
\[ \Delta J = \Phi_x(x(t_1))' \int_{t_0}^{t_1} F(t_1, t)\,\Delta_{\tilde{u}(t)} f(x(t), u(t), t)\,dt + o(h). \tag{11.3} \]
It is convenient to rewrite expression (11.3) in Hamiltonian form. We introduce the Hamiltonian (Hamilton function)
\[ H(\psi, x, u, t) = \psi' f(x, u, t) \tag{11.4} \]
and the conjugate function
\[ \psi(t) = -F(t_1, t)' \Phi_x(x(t_1)). \tag{11.5} \]
Due to relations (11.2), the conjugate function is the solution of the conjugate Cauchy problem
\[ \dot{\psi} = -f_x(x(t), u(t), t)' \psi, \quad \psi(t_1) = -\Phi_x(x(t_1)). \]
In notation (11.4), (11.5), formula (11.3) and the conjugate Cauchy problem can be written as
\[ \Delta J = -\int_{t_0}^{t_1} \Delta_{\tilde{u}(t)} H(\psi(t), x(t), u(t), t)\,dt + o(h), \quad h = \int_{t_0}^{t_1} \|\Delta_{\tilde{u}} f\|\,dt + \int_{t_0}^{t_1} \|\Delta_{\tilde{u}} f_x\|\,dt, \tag{11.6} \]
\[ \dot{\psi} = -H_x(\psi, x(t), u(t), t), \quad \psi(t_1) = -\Phi_x(x(t_1)). \tag{11.7} \]
Thus, the increment of the objective functional on processes (11.1) is given by formula (11.6), in which the Hamiltonian and the conjugate function are defined by relations (11.4) and (11.7).
11.2 Maximum Principle for the Simplest Problem
Let us evaluate the increment of the objective functional on processes (11.1) of the S-problem, regarding the first process as optimal and the second as the process corresponding to a needle variation of the optimal control
\[ \tilde{u}(t) = v, \ t \in [\tau, \tau + \varepsilon); \quad \tilde{u}(t) = u(t), \ t \notin [\tau, \tau + \varepsilon) \]
with parameters $\varepsilon > 0$, $\tau \in [t_0, t_1)$, $v \in U$ (Fig. 11.2).
Fig. 11.2 Needle variation of optimal control u(t)
Then by formula (11.6) we have
\[ 0 \le \Delta J = -\int_{\tau}^{\tau + \varepsilon} \Delta_v H(\psi(t), x(t), u(t), t)\,dt + o(h), \quad h = \int_{\tau}^{\tau + \varepsilon} \|\Delta_v f\|\,dt + \int_{\tau}^{\tau + \varepsilon} \|\Delta_v f_x\|\,dt. \]
We apply the mean value theorem to the integrals to obtain
\[ 0 \le -\varepsilon\,\Delta_v H(\psi(\tau), x(\tau), u(\tau), \tau) + o(\varepsilon). \]
We then divide this inequality by $\varepsilon > 0$. In the limit $\varepsilon \to 0$ we obtain the inequality
\[ \Delta_v H(\psi(\tau), x(\tau), u(\tau), \tau) \le 0, \tag{11.8} \]
which is valid for all $\tau \in [t_0, t_1)$ and $v \in U$. By continuity, inequality (11.8) holds for $\tau = t_1$ as well. We now summarize.
Theorem 11.1 (maximum principle for the S-problem) If $x(t)$, $u(t)$ is an optimal process of the S-problem, then the maximum condition for the Hamiltonian
\[ H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t) \]
holds at every moment $t \in [t_0, t_1]$, where $\psi(t)$ is the corresponding solution of the conjugate Cauchy problem (11.7).
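The increment formula (11.6) and the needle variation can be checked numerically. The following is a minimal sketch, not the authors' method: it assumes a hypothetical scalar system $\dot{x} = -x + u$, $x(0) = 1$, $\Phi(x) = x^2$, the base control $u \equiv 0$, and explicit Euler integration on a uniform grid. All function names and numerical values are illustrative assumptions.

```python
import numpy as np

# Hypothetical test problem (an assumption, not taken from the text):
# x' = -x + u, x(0) = 1, Phi(x) = x^2, base control u = 0 on [0, 1].
f = lambda x, u, t: -x + u
f_x = lambda x, u, t: -1.0
phi = lambda x: x ** 2
phi_x = lambda x: 2.0 * x

def simulate(u, x0, t):
    """Explicit Euler integration of x' = f(x, u(t), t) on the grid t."""
    x = np.empty_like(t)
    x[0] = x0
    for k in range(len(t) - 1):
        x[k + 1] = x[k] + (t[k + 1] - t[k]) * f(x[k], u(t[k]), t[k])
    return x

def adjoint(u, x, t):
    """Euler sweep backward in time for psi' = -f_x psi, psi(t1) = -Phi_x(x(t1))."""
    psi = np.empty_like(t)
    psi[-1] = -phi_x(x[-1])
    for k in range(len(t) - 1, 0, -1):
        psi[k - 1] = psi[k] + (t[k] - t[k - 1]) * f_x(x[k], u(t[k]), t[k]) * psi[k]
    return psi

t = np.linspace(0.0, 1.0, 20001)
u = lambda s: 0.0
tau, eps, v = 0.5, 0.01, 1.0                      # needle parameters
u_needle = lambda s: v if tau <= s < tau + eps else u(s)

x = simulate(u, 1.0, t)
psi = adjoint(u, x, t)

# Actual increment of the functional on the varied process.
dJ_actual = phi(simulate(u_needle, 1.0, t)[-1]) - phi(x[-1])

# Principal part from (11.6): minus the integral of the Hamiltonian increment.
dH = np.array([psi[k] * (f(x[k], u_needle(t[k]), t[k]) - f(x[k], u(t[k]), t[k]))
               for k in range(len(t) - 1)])
dJ_pred = -np.sum(dH * np.diff(t))

print(dJ_actual, dJ_pred)
```

In this sketch the two printed values should differ only by a quantity of order $\varepsilon^2$, in agreement with the remainder $o(h)$. Since $\psi(t) < 0$ here, $\Delta_v H = \psi v \le 0$ for $v \ge 0$, so with the assumed range $U = [0, 1]$ the base control satisfies inequality (11.8).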
11.3 Boundary Value Problem of the Maximum Principle
Let us refer to a process satisfying the maximum principle as an extreme process. According to Theorem 11.1, an extreme process of the S-problem satisfies the conditions
\[ \dot{x} = f(x, u, t), \quad \dot{\psi} = -H_x(\psi, x, u, t), \quad x(t_0) = x_0, \quad \psi(t_1) = -\Phi_x(x(t_1)), \quad u = u(\psi, x, t) = \arg\max_{v \in U} H(\psi, x, v, t). \]
By eliminating the control vector $u$ from these conditions, we obtain the boundary value problem of the maximum principle
\[ \dot{x} = f(x, u(\psi, x, t), t), \quad \dot{\psi} = -H_x(\psi, x, u(\psi, x, t), t), \quad x(t_0) = x_0, \quad \psi(t_1) = -\Phi_x(x(t_1)). \]
The solutions $x = x(t, c)$, $\psi = \psi(t, c)$ of the original and the conjugate systems of differential equations depend on the vector $c = (c_1, \ldots, c_{2n})$ of constants of integration. To find the vector $c$, we have a system of $2n$ boundary conditions
\[ x(t_0, c) = x_0, \quad \psi(t_1, c) = -\Phi_x(x(t_1, c)). \]
Solving this system of equations, we find a vector $c = \bar{c}$ and functions
\[ x(t) = x(t, \bar{c}), \quad \psi(t) = \psi(t, \bar{c}), \quad u(t) = u(\psi(t), x(t), t) \]
that satisfy all conditions of the maximum principle. Thus, the maximum principle contains the required number of conditions to single out the extreme processes, which are candidates for optimality.
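The boundary value problem above can be solved numerically by shooting. The sketch below is only an illustration under stated assumptions: the problem $\dot{x}_1 = u$, $\dot{x}_2 = u^2$, $x(0) = 0$, $\Phi(x) = x_2 + (x_1 - 1)^2$, $U = [-2, 2]$, $t \in [0, 1]$ is hypothetical, as are the function name `shoot` and the bisection bracket. Here $H = \psi_1 u + \psi_2 u^2$, the conjugate equations give $\dot{\psi}_1 = \dot{\psi}_2 = 0$ with $\psi_2 \equiv -1$, so the only unknown constant is $c = \psi_1$.

```python
def shoot(c, n_steps=1000):
    """Integrate the boundary value problem forward with the guess psi1 = c.

    Returns the residual of the terminal condition psi1(1) = -Phi_x1(x(1))
    and the terminal state x1(1). Explicit Euler is used for x1.
    """
    dt = 1.0 / n_steps
    x1, psi1, psi2 = 0.0, c, -1.0
    for _ in range(n_steps):
        # u = argmax over [-2, 2] of H = psi1 * u + psi2 * u^2 (concave in u)
        u = max(-2.0, min(2.0, psi1 / (-2.0 * psi2)))
        x1 += dt * u
        # psi1' = -H_x1 = 0 and psi2' = -H_x2 = 0, so both stay constant
    residual = psi1 + 2.0 * (x1 - 1.0)   # psi1(1) + Phi_x1(x(1))
    return residual, x1

# Bisection on the scalar residual; the bracket [0, 4] is an assumption
# that holds for this example because the residual is increasing in c.
lo, hi = 0.0, 4.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if shoot(mid)[0] > 0.0:
        hi = mid
    else:
        lo = mid
c_bar = 0.5 * (lo + hi)
x1_end = shoot(c_bar)[1]
print(c_bar, x1_end)
```

For this example the boundary conditions can also be solved by hand: $c = -2(c/2 - 1)$ gives $c = 1$, $u \equiv 1/2$, $x_1(1) = 1/2$, which the shooting iteration should reproduce.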
11.4 Continuity of the Hamiltonian
Lemma 11.1 If a triple of functions $\psi(t)$, $x(t)$, $u(t)$ satisfies the maximum principle, then the function $t \to H(\psi(t), x(t), u(t), t)$ is continuous on the segment $[t_0, t_1]$. If, in addition, the derivative $f_t$ exists, then the equality
\[ \frac{d}{dt} H(\psi(t), x(t), u(t), t) = \frac{\partial}{\partial t} H(\psi(t), x(t), u(t), t) \tag{11.9} \]
holds at every point of continuity of the control $u(t)$.
Proof We arbitrarily choose two close points $t$, $t + \Delta t$ of the segment $[t_0, t_1]$ and evaluate the increment
\[ \Delta H = H(\psi(t + \Delta t), x(t + \Delta t), u(t + \Delta t), t + \Delta t) - H(\psi(t), x(t), u(t), t) \]
under the assumptions of the lemma. From the maximum condition for the Hamiltonian, we have
\[ H(\psi(t), x(t), u(t), t) \ge H(\psi(t), x(t), u(t + \Delta t), t), \]
\[ H(\psi(t + \Delta t), x(t + \Delta t), u(t + \Delta t), t + \Delta t) \ge H(\psi(t + \Delta t), x(t + \Delta t), u(t), t + \Delta t), \]
so the following two-sided estimate holds:
\[ \begin{aligned} & H(\psi(t + \Delta t), x(t + \Delta t), u(t), t + \Delta t) - H(\psi(t), x(t), u(t), t) \le \Delta H \\ & \quad \le H(\psi(t + \Delta t), x(t + \Delta t), u(t + \Delta t), t + \Delta t) - H(\psi(t), x(t), u(t + \Delta t), t). \end{aligned} \]
Due to the continuity of the functions $H(\psi, x, u, t)$, $\psi(t)$, $x(t)$ and the piecewise continuity of the control, it follows from this estimate that $\Delta H \to 0$ as $\Delta t \to 0$, both for $\Delta t > 0$ and for $\Delta t < 0$. Since the point $t \in [t_0, t_1]$ is arbitrary, the function $H(\psi(t), x(t), u(t), t)$ is continuous over the entire segment $[t_0, t_1]$.
We then verify the second assertion of the lemma by singling out, in the extreme members of the estimate of $\Delta H$, the terms that are linear in $\Delta t$. Taking into account the differentiability of the functions $\psi(t)$, $x(t)$ and the original and conjugate differential equations, we get
\[ \Delta t\, H_t(\psi(t), x(t), u(t), t) + o(|\Delta t|) \le \Delta H \le \Delta t\, H_t(\psi(t), x(t), u(t + \Delta t), t) + o(|\Delta t|). \]
We divide this inequality by $\Delta t > 0$ and let $\Delta t \to 0$. By virtue of the assumed continuity of the control at the point $t$, we have
\[ H_t(\psi(t), x(t), u(t), t) \le \lim_{\Delta t \to 0,\ \Delta t > 0} \frac{\Delta H}{\Delta t} \le H_t(\psi(t), x(t), u(t), t). \]
For $\Delta t < 0$, we likewise obtain the same value for the limit of $\Delta H / \Delta t$ as $\Delta t \to 0$, $\Delta t < 0$, which gives equality (11.9).
[...] $\lambda \to 0$, in the limit we obtain the required property of a convex function
\[ \Phi_x(x)' \Delta x \le \Phi(x + \Delta x) - \Phi(x). \tag{11.10} \]
Theorem 11.2 The maximum principle in a linearly convex S-problem is a criterion of optimality.
Proof The necessity of the maximum principle follows from Theorem 11.1. We verify its sufficiency. Let the process $x(t)$, $u(t)$ be extreme, that is,
\[ \begin{aligned} & \dot{x}(t) = A(t) x(t) + b(u(t), t), \quad x(t_0) = x_0, \\ & \dot{\psi}(t) = -A(t)' \psi(t), \quad \psi(t_1) = -\Phi_x(x(t_1)), \\ & \psi(t)' b(u(t), t) \ge \psi(t)' b(u, t), \quad u \in U, \ t \in [t_0, t_1]. \end{aligned} \tag{11.11} \]
Here we take into account the relations
\[ H(\psi, x, u, t) = \psi' [A(t) x + b(u, t)], \quad H_x(\psi, x, u, t) = A(t)' \psi. \]
From (11.11), using the properties of the fundamental matrix $F(t, \tau)$ of the homogeneous system $\dot{x} = A(t) x$ and the Cauchy formula, we obtain
\[ \psi(t) = -F(t_1, t)' \Phi_x(x(t_1)). \tag{11.12} \]
Then the last inequality in (11.11) becomes
\[ \Phi_x(x(t_1))' F(t_1, t)\,\Delta_u b(u(t), t) \ge 0, \quad u \in U, \ t \in [t_0, t_1]. \tag{11.13} \]
Denote by $\tilde{x}(t) = x(t) + \Delta x(t)$, $\tilde{u}(t) = u(t) + \Delta u(t)$ an arbitrary fixed process of the linearly convex problem. By the conditions of the problem, the increment $\Delta x(t)$ is the solution of the Cauchy problem
\[ \Delta\dot{x} = A(t) \Delta x + \Delta_{\tilde{u}(t)} b(u(t), t), \quad \Delta x(t_0) = 0. \]
Hence, we use the Cauchy formula to find
\[ \Delta x(t_1) = \int_{t_0}^{t_1} F(t_1, t)\,\Delta_{\tilde{u}(t)} b(u(t), t)\,dt. \tag{11.14} \]
We integrate inequality (11.13) with $u = \tilde{u}(t)$ over the segment $[t_0, t_1]$. In light of (11.14), we have
\[ \Phi_x(x(t_1))' \Delta x(t_1) \ge 0. \]
From here, by property (11.10) of a convex function, it follows that
\[ 0 \le \Phi_x(x(t_1))' \Delta x(t_1) \le \Phi(x(t_1) + \Delta x(t_1)) - \Phi(x(t_1)), \]
or $0 \le \Phi(\tilde{x}(t_1)) - \Phi(x(t_1))$. Therefore, the process $x(t)$, $u(t)$ is optimal, and the theorem is proved.
11.6 Applying the Maximum Principle to the Linear Problem
A linear problem is a particular case of the linearly convex problem:
\[ J = c' x(t_1) \to \min, \quad \dot{x} = A(t) x + B(t) u, \quad x(t_0) = x_0, \quad u \in [-1, 1]^r, \quad t \in [t_0, t_1]. \]
Here the objective function and the right-hand sides of the differential equations are linear, and the domain $U$ is an $r$-dimensional cube. The maximum principle here is an optimality criterion. It allows us to solve the linear problem completely. Indeed, in view of (11.11), the sought-for optimal process $x(t)$, $u(t)$ is defined by the conditions
\[ \begin{aligned} & \dot{x}(t) = A(t) x(t) + B(t) u(t), \quad x(t_0) = x_0, \\ & \dot{\psi}(t) = -A(t)' \psi(t), \quad \psi(t_1) = -c, \\ & \psi(t)' B(t) u(t) = \max_{u \in [-1, 1]^r} \psi(t)' B(t) u, \quad t \in [t_0, t_1]. \end{aligned} \tag{11.15} \]
From (11.15), the conjugate Cauchy problem can be solved immediately, so we regard the function $\psi(t)$ as known. Then
\[ p(t) = B(t)' \psi(t) \tag{11.16} \]
becomes a known function, and the last condition of (11.15) takes the form of a simple parametric linear problem
\[ p(t)' u(t) = \max_{u \in [-1, 1]^r} p(t)' u. \]
Rewriting it in coordinate form,
\[ \sum_{k=1}^{r} p_k(t) u_k(t) = \max_{u \in [-1, 1]^r} \sum_{k=1}^{r} p_k(t) u_k = \sum_{k=1}^{r} \max_{u_k \in [-1, 1]} p_k(t) u_k, \]
we find the solution
\[ u_k(t) = \operatorname{sign} p_k(t), \quad t_0 \le t \le t_1, \quad k = 1, \ldots, r. \tag{11.17} \]
By definition, the coordinate $u_k(t)$ is equal to $-1$ if $p_k(t) < 0$ and to $+1$ if $p_k(t) > 0$. When $p_k(t) = 0$, the value $u_k(t) \in [-1, 1]$ can be chosen arbitrarily. We use this freedom to redefine solution (11.17) by continuity from the right at all points $t \in [t_0, t_1]$. If the result is a piecewise constant function, then it is the optimal control. The optimal trajectory is then determined as the solution of the first Cauchy problem in (11.15).
In the theory of automatic control, $u_k(t)$ is called a relay control, and function (11.16) is referred to as the switching function. The relay control takes its values at the vertices of the cube $[-1, 1]^r$; which vertex is taken is determined by the values of the switching function (Fig. 11.3).
Fig. 11.3 Case $r = 2$. If the values of the switching function lie in one coordinate quadrant, the value of the optimal relay control is the vertex of the square that is located in the same quadrant
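The coordinate-wise rule (11.17) can be sketched in a few lines. The function name `relay_control`, the tie value used when $p_k = 0$, and the sample vector `p` are illustrative assumptions. The brute-force check relies on the fact that a linear function on a cube attains its maximum at a vertex.

```python
import itertools
import numpy as np

def relay_control(p, tie_value=1.0):
    """Return u with u_k = sign p_k.

    Where p_k = 0 any value in [-1, 1] is admissible; tie_value is a
    hypothetical placeholder for that free choice.
    """
    p = np.asarray(p, dtype=float)
    u = np.sign(p)
    u[p == 0.0] = tie_value
    return u

p = np.array([2.0, -3.0, 0.5])        # sample value of the switching function
u = relay_control(p)

# Brute-force maximum of p'u over the 2^r vertices of the cube [-1, 1]^r.
best = max(float(p @ np.array(vertex))
           for vertex in itertools.product([-1.0, 1.0], repeat=len(p)))
print(u, float(p @ u), best)
```

In this example the relay control picks the vertex $(1, -1, 1)$, and the value $p'u$ coincides with the maximum found by enumeration.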
11.7 Solution of the Mass-Spring Example
We illustrate the above with the problem
\[ -x_1(t_1) \to \min, \quad \dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1 + b u, \quad x_1(0) = 0, \quad x_2(0) = 0, \quad u \in [-1, 1], \quad t \in [0, t_1]. \]
In comparison with the general linear optimal control problem, here $n = 2$, $r = 1$, $t_0 = 0$, $\omega > 0$, $b > 0$,
\[ c = \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \quad A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ b \end{pmatrix}, \quad x_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]
The physical meaning of the problem and the notation are given in Sects. 1.1 and 2.6. We proceed to the solution. The conjugate Cauchy problem takes the form
\[ \begin{pmatrix} \dot{\psi}_1 \\ \dot{\psi}_2 \end{pmatrix} = \begin{pmatrix} 0 & \omega^2 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}, \quad \begin{pmatrix} \psi_1(t_1) \\ \psi_2(t_1) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \]
(see (11.15)) or, in coordinate form,
\[ \dot{\psi}_1 = \omega^2 \psi_2, \quad \dot{\psi}_2 = -\psi_1, \quad \psi_1(t_1) = 1, \quad \psi_2(t_1) = 0. \]
Hence, we have
\[ \psi_1(t) = \cos \omega(t - t_1), \quad \psi_2(t) = -\omega^{-1} \sin \omega(t - t_1). \]
By (11.16), (11.17), we find the switching function
\[ p(t) = (0, b) \begin{pmatrix} \cos \omega(t - t_1) \\ -\omega^{-1} \sin \omega(t - t_1) \end{pmatrix} = \omega^{-1} b \sin \omega(t_1 - t) \]
and the optimal control
\[ u(t) = \operatorname{sign} \sin \omega(t_1 - t), \quad t \in [0, t_1]. \]
According to this formula, the optimal control is a piecewise constant function that takes the values $-1$ or $+1$ and switches abruptly at the points $t_k = t_1 - k\pi\omega^{-1}$, $k = 0, \ldots, m$, where $m$ is the largest integer $\le \pi^{-1}\omega t_1$. According to Sect. 2.6, on the intervals where the optimal control is constant, the phase point moves along ellipses centered at $(-\omega^{-2} b, 0)$ for $u(t) = -1$ and at $(\omega^{-2} b, 0)$ for $u(t) = +1$. During the time period $\pi\omega^{-1}$ the phase point traverses half of an ellipse.
The optimal control completely determines the action of the optimal alternating force $F(t) = F_0 u(t)$ on the load with a spring. The farthest displacement of the load to the right is produced by a force of maximum magnitude $|F(t)| = F_0$ that periodically reverses its direction of action. The optimal trajectory of the load for $t_1 = 3\pi\omega^{-1}$ is shown in Fig. 11.4.
Fig. 11.4 The optimal trajectory in the mass-spring example for $t_1 = 3\pi\omega^{-1}$. The optimal alternating force takes the values $F_0$, $-F_0$, $F_0$ on three consecutive time intervals of equal length $\pi\omega^{-1}$
Exercise Set
1. Determine how the solution of the mass-spring example changes if the initial position and the initial velocity of the mass are not zero.
In the following exercises we discuss only the Simplest problem of optimal control.
2. What form does the maximum principle take if we replace the terminal objective functional by the Lagrange functional or the Mayer-Bolza functional?
3. Give an example of a problem in which an optimal control does not exist.
4. True or false: closedness and boundedness of the reachability set are sufficient for the existence of an optimal control.
5. Suppose that time-independent parameters play the role of control. Derive the necessary conditions of optimality by using the formula for the increment of the objective functional and additional assumptions on the initial data of the problem.
6. Under what additional assumptions on the conditions of the problem can the formula for the increment of the objective functional be written as
\[ \Delta J = -\int_{t_0}^{t_1} H_u(\psi(t), x(t), u(t), t)' \Delta u(t)\,dt + o(\|\Delta u\|), \quad \|\Delta u\| = \left( \int_{t_0}^{t_1} \Delta u(t)' \Delta u(t)\,dt \right)^{1/2} ? \tag{11.18} \]
7. Suppose that the increment of the objective functional has the representation (11.18) and the range of control is convex. What form does the necessary condition of optimality have in this case? Hint: use the convexity of the range of control when constructing a uniformly small variation of the control.
8. Does it follow from (11.18) that the gradient $J_u(u)$ of the objective functional $J(u)$ at the control $u(t)$ is
\[ J_u(u) = -H_u(\psi(t), x(t), u(t), t), \quad t_0 \le t \le t_1 ? \]
9. What operations must be performed to calculate $J_u(u)$?
10. Let a control $u(t)$ be such that for any small $\alpha > 0$
\[ \tilde{u}(t) = u(t) - \alpha H_u(\psi(t), x(t), u(t), t) \in U, \quad t_0 \le t \le t_1. \]
When is $J(\tilde{u}) < J(u)$?
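Returning to the mass-spring example of Sect. 11.7, the relay control can be checked by direct simulation. The sketch below is an illustration only: it assumes the hypothetical values $\omega = 1$, $b = 1$, $t_1 = 3\pi\omega^{-1}$, a classical fourth-order Runge-Kutta scheme, and a control that is sampled at the midpoint of each step and held constant over the step. The function name `simulate` is likewise an assumption.

```python
import numpy as np

def simulate(control, omega=1.0, b=1.0, t1=3.0 * np.pi, n_steps=3000):
    """RK4 integration of x1' = x2, x2' = -omega^2 x1 + b u from rest.

    The control is sampled at the midpoint of each step and held constant,
    so switching instants that fall on grid nodes are handled cleanly.
    """
    dt = t1 / n_steps
    x = np.zeros(2)

    def rhs(state, u):
        return np.array([state[1], -omega ** 2 * state[0] + b * u])

    for k in range(n_steps):
        u = control((k + 0.5) * dt)
        k1 = rhs(x, u)
        k2 = rhs(x + 0.5 * dt * k1, u)
        k3 = rhs(x + 0.5 * dt * k2, u)
        k4 = rhs(x + dt * k3, u)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

omega, t1 = 1.0, 3.0 * np.pi
relay = lambda t: np.sign(np.sin(omega * (t1 - t)))   # u(t) = sign sin omega (t1 - t)
constant = lambda t: 1.0                              # comparison: constant force

x_relay = simulate(relay)
x_const = simulate(constant)
print(x_relay[0], x_const[0])
```

Under these assumptions the relay control moves the load to $x_1(t_1) = 6 b \omega^{-2}$, three half-ellipses of growing size, whereas the constant control $u \equiv 1$ reaches only $2 b \omega^{-2}$.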
Chapter 12
General Optimal Control Problem
Abstract The necessary optimality conditions in the General problem of optimal control are derived in several stages. First, for an optimal process we construct a parametric family of "close" varied processes. The requirement that the varied processes be admissible leads to a finite-dimensional auxiliary problem of nonlinear programming that depends on the parameters of variation. The analysis of the auxiliary problem and the passage to the limit in the parameters of variation give the required necessary optimality conditions in the form of Pontryagin's maximum principle. We consider the use of the maximum principle in various particular cases of the General problem.
12.1 General Problem. Functional Increment Formula
We have taken the first step in the study of necessary and sufficient conditions of optimality and have become acquainted with the maximum principle for the Simplest problem. It would not be an exaggeration to say that the maximum principle in this problem reflects, in pure form, the constraint on the control. Now we take the next step and determine how restrictions on the ends of the integral curve are reflected in the necessary conditions of optimality. As will be seen later, they lead to additional conditions, namely the conditions of transversality. The object of our attention is the General optimal control problem (G-problem) in the formulation and under the assumptions of Sect. 9.6. The unknowns of the problem are the processes $x(t)$, $u(t)$, $t_0$, $t_1$. Putting, for the sake of brevity, $\mathbf{t} = (t_0, t_1)$, $x(\mathbf{t}) = (x(t_0), x(t_1))$, we write the G-problem in the form
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_12
\[ \begin{aligned} & J_0 = \Phi_0(x(\mathbf{t}), \mathbf{t}) \to \min, \\ & J_i = \Phi_i(x(\mathbf{t}), \mathbf{t}) \begin{cases} \le 0, & i = 1, \ldots, m_0, \\ = 0, & i = m_0 + 1, \ldots, m, \end{cases} \\ & \dot{x} = f(x, u, t), \quad u \in U, \quad t_0 \le t_1. \end{aligned} \]
Let us direct our efforts to obtaining the necessary conditions of optimality. For a fixed $i \in \{0, \ldots, m\}$ we evaluate the increment $\Delta J_i$ of the functional $J_i$ on two processes
\[ x(t),\ u(t),\ t_0,\ t_1; \quad \tilde{x}(t) = x(t) + \Delta x(t), \ \tilde{u}(t) = u(t) + \Delta u(t), \ \tilde{t}_0 = t_0 + \Delta t_0, \ \tilde{t}_1 = t_1 + \Delta t_1 \]
with the left ends of the trajectories
\[ x(t_0) = x_0, \quad \tilde{x}(\tilde{t}_0) = \tilde{x}_0 = x_0 + \Delta x_0. \]
We make the following assumptions: (1) the trajectories $x(t)$, $\tilde{x}(t)$ are defined on an interval $I$ containing the points $t_0$, $\tilde{t}_0$, $t_1$, $\tilde{t}_1$, and (2) the function $x(t)$ and its derivative $\dot{x}(t)$ are continuous at the points $t_0$, $t_1$. Then
\[ \begin{aligned} \Delta J_i & = \Phi_i(\tilde{x}(\tilde{t}_0), \tilde{x}(\tilde{t}_1), \tilde{t}_0, \tilde{t}_1) - \Phi_i(x(t_0), x(t_1), t_0, t_1) \\ & = \Phi_i(x(t_0) + \Delta x_0, x(t_1) + [\tilde{x}(\tilde{t}_1) - x(t_1)], t_0 + \Delta t_0, t_1 + \Delta t_1) - \Phi_i(x(t_0), x(t_1), t_0, t_1). \end{aligned} \]
We use the Taylor expansion to single out the linear terms of the increment of the functional. Denoting the partial gradients of the function $\Phi_i(x_0, x_1, t_0, t_1)$ by $\Phi_{ix_0}(x_0, x_1, t_0, t_1)$, $\Phi_{ix_1}(x_0, x_1, t_0, t_1)$, we obtain
\[ \begin{aligned} \Delta J_i = {} & \Phi_{ix_0}(x(\mathbf{t}), \mathbf{t})' \Delta x_0 + \Phi_{ix_1}(x(\mathbf{t}), \mathbf{t})' [\tilde{x}(\tilde{t}_1) - x(t_1)] + \Phi_{it_0}(x(\mathbf{t}), \mathbf{t}) \Delta t_0 + \Phi_{it_1}(x(\mathbf{t}), \mathbf{t}) \Delta t_1 \\ & + o(\|\Delta x_0\| + |\Delta t_0| + |\Delta t_1| + \|\tilde{x}(\tilde{t}_1) - x(t_1)\|). \end{aligned} \]
We proceed further as in the derivation of the formula for the increment of the objective functional in the S-problem. We use formula (10.14) and introduce the Hamiltonian $H(\psi, x, u, t) = \psi' f(x, u, t)$ and the function
\[ \psi^i(t) = -F(t_1, t)' \Phi_{ix_1}(x(\mathbf{t}), \mathbf{t}), \]
where $F(t_1, t)$ is the fundamental matrix of solutions of the equation in variations
\[ F_t(t_1, t) = -F(t_1, t) f_x(x(t), u(t), t), \quad F(t_1, t_1) = E. \]
After grouping the terms, the increment $\Delta J_i$ takes the form
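\[ \begin{aligned} \Delta J_i = {} & [\Phi_{ix_0}(x(\mathbf{t}), \mathbf{t}) - \psi^i(t_0)]' \Delta x_0 + [\Phi_{it_0}(x(\mathbf{t}), \mathbf{t}) + H(\psi^i(t_0), x(t_0), \tilde{u}(t_0), t_0)] \Delta t_0 \\ & + [\Phi_{it_1}(x(\mathbf{t}), \mathbf{t}) - H(\psi^i(t_1), x(t_1), \tilde{u}(t_1), t_1)] \Delta t_1 - \int_{t_0}^{t_1} \Delta_{\tilde{u}(t)} H(\psi^i(t), x(t), u(t), t)\,dt + o(h), \\ h = {} & \|\Delta x_0\| + |\Delta t_0| + |\Delta t_1| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\,d\tau, \quad i = 0, \ldots, m. \end{aligned} \tag{12.1} \]
12.2 Variation of the Process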
We construct a family of varied processes $\tilde{x}(t)$, $\tilde{u}(t)$, $\tilde{t}_0$, $\tilde{t}_1$ with special increments $\Delta x_0$, $\Delta t_0$, $\Delta t_1$, $\Delta u(t)$ depending on the parameters of variation $\varepsilon_j$, $\delta x_0^j$, $\delta t_0^j$, $\delta t_1^j$, $\tau_j$, $v^j$, $j = 1, \ldots, s$. Put
\[ \begin{aligned} & \Delta x_0 = \sum_{j=1}^{s} \varepsilon_j \delta x_0^j, \quad \Delta t_0 = \sum_{j=1}^{s} \varepsilon_j \delta t_0^j, \quad \Delta t_1 = \sum_{j=1}^{s} \varepsilon_j \delta t_1^j, \\ & \Delta u(t) = v^j - u(t), \ t \in [\tau_j, \tau_j + \varepsilon_j), \ j = 1, \ldots, s, \quad \Delta u(t) = 0, \ t \notin \bigcup_{j=1}^{s} [\tau_j, \tau_j + \varepsilon_j). \end{aligned} \tag{12.2} \]
Here $\varepsilon_j$ are small nonnegative parameters; $\delta x_0^j$ are arbitrary vectors from the ball $B_n = \{x \in R^n : \|x\| \le 1\}$; $\delta t_0^j$, $\delta t_1^j$ are any numbers from the segment $B_1 = [-1, 1]$; $\tau_j$ are arbitrary points of continuity of the control $u(t)$, $t_0 < \tau_1 < \cdots < \tau_s < t_1$; $v^j$ are arbitrary points of the range of control $U$. From the parameters of variation we form the vectors
\[ \varepsilon = (\varepsilon_1, \ldots, \varepsilon_s), \quad \pi^j = (\delta x_0^j, \delta t_0^j, \delta t_1^j, \tau_j, v^j), \quad j = 1, \ldots, s. \]
By definition, the vector $\varepsilon$ lies in the nonnegative orthant of the space $R^s$, and $\Pi_s = \{\pi^1, \ldots, \pi^s\}$ is a finite subset of the Cartesian product
\[ \Pi = B_n \times B_1 \times B_1 \times [t_0, t_1] \times U \subset R^{n+r+3}. \]
In accordance with the special form of $\Delta u(t)$, the control $\tilde{u}(t) = u(t) + \Delta u(t)$ takes the values $v^1, \ldots, v^s$ on the half-intervals $[\tau_1, \tau_1 + \varepsilon_1), \ldots, [\tau_s, \tau_s + \varepsilon_s)$, respectively, and the values $u(t)$ at the remaining points of $R$. The parameters $\varepsilon_1, \ldots, \varepsilon_s$ specify the durations of the perturbations of the control $u(t)$ (Fig. 12.1). We estimate the remainder of the increment $\Delta J_i$ in formula (12.1), using the Cauchy-Bunyakovskii inequality ($|a'b| \le \|a\| \|b\|$ for all $a, b \in R^n$).
Fig. 12.1 Graph of a needle variation $\tilde{u}(t)$ (solid lines) of control $u(t)$ (dashed line) when $s = 2$
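The control part of the variation (12.2) shown in Fig. 12.1 can be sketched as a small helper. The function name `needle_variation` and the sample parameters are illustrative assumptions; the sketch also assumes that the durations are small enough for the half-intervals not to overlap.

```python
def needle_variation(u, taus, eps, vs):
    """Return the varied control of (12.2).

    The result equals vs[j] on [taus[j], taus[j] + eps[j]) and u(t) elsewhere.
    The half-intervals are assumed to be disjoint.
    """
    def u_tilde(t):
        for tau, duration, v in zip(taus, eps, vs):
            if tau <= t < tau + duration:
                return v
        return u(t)
    return u_tilde

# Hypothetical data with s = 2 needles on top of the base control u = 0.
u = lambda t: 0.0
u_tilde = needle_variation(u, taus=[0.2, 0.6], eps=[0.05, 0.1], vs=[1.0, -1.0])
print([u_tilde(t) for t in (0.1, 0.22, 0.4, 0.65, 0.9)])
```

Setting a duration $\varepsilon_j$ to zero switches the corresponding needle off, which is why $\varepsilon = 0$ reproduces the original control.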
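\[ \begin{aligned} \|\Delta x_0\| & = \Big\| \sum_{j=1}^{s} \varepsilon_j \delta x_0^j \Big\| \le \sum_{j=1}^{s} \varepsilon_j \|\delta x_0^j\| \le \sum_{j=1}^{s} \varepsilon_j \le s^{1/2} \|\varepsilon\|, \\ |\Delta t_0| & = \Big| \sum_{j=1}^{s} \varepsilon_j \delta t_0^j \Big| \le \sum_{j=1}^{s} \varepsilon_j |\delta t_0^j| \le \sum_{j=1}^{s} \varepsilon_j \le s^{1/2} \|\varepsilon\|, \\ |\Delta t_1| & = \Big| \sum_{j=1}^{s} \varepsilon_j \delta t_1^j \Big| \le \sum_{j=1}^{s} \varepsilon_j |\delta t_1^j| \le \sum_{j=1}^{s} \varepsilon_j \le s^{1/2} \|\varepsilon\|. \end{aligned} \]
Further, in the notation of Sect. 10.2, we have
\[ \int_I \|\Delta_{\tilde{u}} f\|\,d\tau = \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \|\Delta_{v^j} f\|\,d\tau = \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \Big( \sum_{i=1}^{n} |\Delta_{v^j} f_i|^2 \Big)^{1/2} d\tau \le \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \Big( \sum_{i=1}^{n} |2M|^2 \Big)^{1/2} d\tau = 2M n^{1/2} \sum_{j=1}^{s} \varepsilon_j \le 2M (ns)^{1/2} \|\varepsilon\|. \]
Similarly,
\[ \int_I \|\Delta_{\tilde{u}} f_x\|\,d\tau = \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \|\Delta_{v^j} f_x\|\,d\tau = \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \Big( \sum_{i=1}^{n} \sum_{k=1}^{n} |\Delta_{v^j} f_{ix_k}|^2 \Big)^{1/2} d\tau \le \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \Big( \sum_{i=1}^{n} \sum_{k=1}^{n} |2M|^2 \Big)^{1/2} d\tau = 2Mn \sum_{j=1}^{s} \varepsilon_j \le 2Mn s^{1/2} \|\varepsilon\|. \]
Due to the last estimates, we obtain
\[ h = \|\Delta x_0\| + |\Delta t_0| + |\Delta t_1| + \int_I \|\Delta_{\tilde{u}} f\|\,d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\,d\tau \le \big( 3 s^{1/2} + 2M (ns)^{1/2} + 2Mn s^{1/2} \big) \|\varepsilon\| \le (3 + 4Mn) s^{1/2} \|\varepsilon\|. \]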
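Consequently, $h \to 0$ as $\|\varepsilon\| \to 0$ and
\[ \frac{|o(h)|}{\|\varepsilon\|} = \frac{|o(h)|}{h} \cdot \frac{h}{\|\varepsilon\|} \le \frac{|o(h)|}{h} (3 + 4Mn) s^{1/2}. \]
It follows that $o(h)$ is a quantity of higher order of smallness than $\|\varepsilon\|$. Taking this into account and substituting the increments (12.2) into (12.1), we obtain
\[ \begin{aligned} \Delta J_i = {} & [\Phi_{ix_0}(x(\mathbf{t}), \mathbf{t}) - \psi^i(t_0)]' \sum_{j=1}^{s} \varepsilon_j \delta x_0^j + [\Phi_{it_0}(x(\mathbf{t}), \mathbf{t}) + H(\psi^i(t_0), x(t_0), u(t_0), t_0)] \sum_{j=1}^{s} \varepsilon_j \delta t_0^j \\ & + [\Phi_{it_1}(x(\mathbf{t}), \mathbf{t}) - H(\psi^i(t_1), x(t_1), u(t_1), t_1)] \sum_{j=1}^{s} \varepsilon_j \delta t_1^j \\ & - \sum_{j=1}^{s} \int_{\tau_j}^{\tau_j + \varepsilon_j} \Delta_{v^j} H(\psi^i(\tau), x(\tau), u(\tau), \tau)\,d\tau + o(\|\varepsilon\|). \end{aligned} \tag{12.3} \]
Since each function $\tau \to \Delta_{v^j} H(\psi^i(\tau), x(\tau), u(\tau), \tau)$ is continuous in a small neighborhood of the point $\tau_j$, by the mean value theorem for integrals we have
\[ \int_{\tau_j}^{\tau_j + \varepsilon_j} \Delta_{v^j} H(\psi^i(\tau), x(\tau), u(\tau), \tau)\,d\tau = \varepsilon_j \Delta_{v^j} H(\psi^i(\tau_j), x(\tau_j), u(\tau_j), \tau_j) + o(\|\varepsilon\|). \]
Using the last equality, the increment (12.3) of the functional $J_i$ on the special variation of the process $x(t)$, $u(t)$, $t_0$, $t_1$ with parameters $\varepsilon$, $\Pi_s$ can be written in the form
\[ \Delta J_i(\varepsilon, \Pi_s) = \sum_{j=1}^{s} a_{ij} \varepsilon_j + o(\|\varepsilon\|), \tag{12.4} \]
where
\[ \begin{aligned} a_{ij} = {} & [\Phi_{ix_0}(x(\mathbf{t}), \mathbf{t}) - \psi^i(t_0)]' \delta x_0^j + [\Phi_{it_0}(x(\mathbf{t}), \mathbf{t}) + H(\psi^i(t_0), x(t_0), u(t_0), t_0)] \delta t_0^j \\ & + [\Phi_{it_1}(x(\mathbf{t}), \mathbf{t}) - H(\psi^i(t_1), x(t_1), u(t_1), t_1)] \delta t_1^j - \Delta_{v^j} H(\psi^i(\tau_j), x(\tau_j), u(\tau_j), \tau_j), \\ & i = 0, \ldots, m, \quad j = 1, \ldots, s. \end{aligned} \tag{12.5} \]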
12.3 Necessary Conditions for Optimality
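Let $x(t)$, $u(t)$, $t_0$, $t_1$ be an optimal process of the G-problem. Without loss of generality, we assume that the first $m_1 \ge 0$ inequality constraints are active on the optimal process:
\[ \Phi_i(x(\mathbf{t}), \mathbf{t}) = 0, \ i = 1, \ldots, m_1, \quad \Phi_i(x(\mathbf{t}), \mathbf{t}) < 0, \ i = m_1 + 1, \ldots, m_0. \]
Then, for small $\varepsilon \ge 0$ and any set $\Pi_s$ of variation parameters, the inequality $\Delta J_0(\varepsilon, \Pi_s) \ge 0$ holds if
\[ \Delta J_i(\varepsilon, \Pi_s) \le 0, \ i = 1, \ldots, m_1, \quad \Delta J_i(\varepsilon, \Pi_s) = 0, \ i = m_0 + 1, \ldots, m. \]
In other words, $\varepsilon = 0$ is a local minimum point in the nonlinear programming problem (NP-problem)
\[ \begin{aligned} & \Delta J_0(\varepsilon, \Pi_s) = \sum_{j=1}^{s} a_{0j} \varepsilon_j + o(\|\varepsilon\|) \to \min, \\ & \Delta J_i(\varepsilon, \Pi_s) = \sum_{j=1}^{s} a_{ij} \varepsilon_j + o(\|\varepsilon\|) \le 0, \quad i = 1, \ldots, m_1, \\ & \Delta J_i(\varepsilon, \Pi_s) = \sum_{j=1}^{s} a_{ij} \varepsilon_j + o(\|\varepsilon\|) = 0, \quad i = m_0 + 1, \ldots, m, \\ & \varepsilon \ge 0. \end{aligned} \]
This conclusion is the basis for obtaining the necessary conditions of optimality. We represent the NP-problem in the compact vector-matrix form
\[ a_0' \varepsilon + o(\|\varepsilon\|) \to \min, \quad A_1 \varepsilon + o(\|\varepsilon\|) \le 0, \quad A_2 \varepsilon + o(\|\varepsilon\|) = 0, \quad \varepsilon \ge 0, \tag{12.6} \]
where
\[ a_0 = \begin{pmatrix} a_{01} \\ \vdots \\ a_{0s} \end{pmatrix}, \quad A_1 = \begin{pmatrix} a_{11} & \ldots & a_{1s} \\ \ldots & \ldots & \ldots \\ a_{m_1 1} & \ldots & a_{m_1 s} \end{pmatrix}, \quad A_2 = \begin{pmatrix} a_{m_0+1,1} & \ldots & a_{m_0+1,s} \\ \ldots & \ldots & \ldots \\ a_{m1} & \ldots & a_{ms} \end{pmatrix}. \]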
By the conditions of problem (12.6), we compose the system of linear algebraic equations
\[ a_0' y + y_0 = 0, \quad A_1 y + y_1 = 0, \quad A_2 y = 0 \tag{12.7} \]
with unknowns $y \in R^s$, $y_0 \in R$, $y_1 \in R^{m_1}$ and the matrix
\[ A = \begin{pmatrix} a_0' \\ A_1 \\ A_2 \end{pmatrix}, \]
depending on the set of variations $\Pi_s$.
Theorem 12.1 If $x(t)$, $u(t)$, $t_0$, $t_1$ is an optimal process of the G-problem, then for any set $\Pi_s \subset \Pi$ the following two conditions are incompatible: (a) the rows of the matrix $A$ are linearly independent, (b) the system of equations (12.7) has a solution $y > 0$, $y_0 > 0$, $y_1 \ge 0$.
Proof We argue by contradiction. Suppose there is a set of variation parameters $\Pi_s \subset \Pi$ of the optimal process $x(t)$, $u(t)$, $t_0$, $t_1$ that satisfies conditions (a) and (b) of Theorem 12.1: the rows of the corresponding matrix $A$ are linearly independent and the system of equations (12.7) has a solution $y > 0$, $y_0 > 0$, $y_1 \ge 0$. Since $y > 0$ and the system of equations (12.7) is linear and homogeneous, we may assume $\|y\| = 1$. Let us show that for small $\alpha > 0$ the system of nonlinear equations
\[ a_0' \varepsilon + \alpha y_0 + o(\|\varepsilon\|) = 0, \quad A_1 \varepsilon + \alpha y_1 + o(\|\varepsilon\|) = 0, \quad A_2 \varepsilon + o(\|\varepsilon\|) = 0 \tag{12.8} \]
has a solution $\varepsilon(\alpha) = \alpha (y + A' z(\alpha)) \ge 0$ with a function $z(\alpha) = o(\alpha)/\alpha$. Substituting $\varepsilon(\alpha)$ into (12.8) and using equalities (12.7), we obtain the system of equations for the required function $z = z(\alpha)$:
\[ A A' z + \frac{1}{\alpha}\, o(\alpha \|y + A' z\|) = 0. \tag{12.9} \]
Due to the linear independence of the rows of the matrix $A$, the square matrix $B = (A A')^{-1}$ of order $q = m + m_1 - m_0 + 1$ exists; therefore, the system of equations (12.9) can be written in the equivalent form
\[ z = -\frac{1}{\alpha}\, B\, o(\alpha \|y + A' z\|). \tag{12.10} \]
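Since $y > 0$, for small $\beta > 0$ the inequality $y + A' z \ge 0$ is true for $z \in R^q$, $\|z\| \le \beta$. By virtue of the definition of $o(\|\varepsilon\|)$, for any $\mu$ with $0 < \mu \le \beta / ((1 + \beta \|A'\|) \|B\|)$ there exists $\nu > 0$ such that $\|o(\|\varepsilon\|)\| \le \mu \|\varepsilon\|$ for $\varepsilon \ge 0$, $\|\varepsilon\| \le \nu$. Put
\[ \alpha_0 = \frac{\nu}{1 + \beta \|A'\|}. \]
If $\|z\| \le \beta$, we have
\[ \|y + A' z\| \le \|y\| + \|A' z\| \le 1 + \|A'\| \|z\| \le 1 + \beta \|A'\| \]
and, when $\alpha \in [0, \alpha_0]$,
\[ \alpha \|y + A' z\| \le \alpha (1 + \beta \|A'\|) = \alpha \frac{\nu}{\alpha_0} \le \nu. \]
Employing the choice of the numbers $\mu$ and $\nu$, we get
\[ \Big\| \frac{1}{\alpha} B\, o(\alpha \|y + A' z\|) \Big\| \le \frac{1}{\alpha} \|B\| \|o(\alpha \|y + A' z\|)\| \le \frac{1}{\alpha} \|B\| \mu \alpha \|y + A' z\| \le \mu (1 + \beta \|A'\|) \|B\| \le \beta. \]
Thus, if $\|z\| \le \beta$ and $\alpha \in [0, \alpha_0]$, then
\[ \Big\| \frac{1}{\alpha} B\, o(\alpha \|y + A' z\|) \Big\| \le \beta. \tag{12.11} \]
For each fixed $\alpha \in [0, \alpha_0]$ we define a sequence of vectors $\{z^k(\alpha)\}$ using the recursive formula
\[ z^{k+1}(\alpha) = -\frac{1}{\alpha} B\, o(\alpha \|y + A' z^k(\alpha)\|), \quad k = 0, 1, \ldots, \quad z^0(\alpha) = 0. \tag{12.12} \]
Obviously, $0 = \|z^0(\alpha)\| \le \beta$. If the inequality $\|z^k(\alpha)\| \le \beta$ is satisfied for some $k \ge 1$, then using formula (12.12) and estimate (12.11) we obtain
\[ \|z^{k+1}(\alpha)\| = \Big\| \frac{1}{\alpha} B\, o(\alpha \|y + A' z^k(\alpha)\|) \Big\| \le \beta. \tag{12.13} \]
Therefore, the sequence $\{z^k(\alpha)\}$ is bounded and has a converging subsequence. Without loss of generality, we assume that the sequence $\{z^k(\alpha)\}$ converges to the limit $z(\alpha)$ pointwise for $\alpha \in [0, \alpha_0]$. Since the norm is continuous, inequality (12.13) implies
\[ \|z(\alpha)\| \le \beta, \quad \alpha \in [0, \alpha_0], \]
and, based on estimate (12.11),
\[ \Big\| \frac{1}{\alpha} B\, o(\alpha \|y + A' z(\alpha)\|) \Big\| \le \beta, \quad \alpha \in [0, \alpha_0]. \]
Using the last two inequalities, we have
\[ \Big\| z(\alpha) + \frac{1}{\alpha} B\, o(\alpha \|y + A' z(\alpha)\|) \Big\| \le \|z(\alpha)\| + \Big\| \frac{1}{\alpha} B\, o(\alpha \|y + A' z(\alpha)\|) \Big\| \le 2\beta. \]
In view of the arbitrariness of a small $\beta > 0$, the latter is possible only if
\[ z(\alpha) = -\frac{1}{\alpha} B\, o(\alpha \|y + A' z(\alpha)\|), \quad \alpha \in [0, \alpha_0]. \]
Hence, $z(0) = 0$ and $z(\alpha) \to 0$ as $\alpha \to 0$. So, the system of nonlinear equations (12.8) has the solution
\[ \varepsilon(\alpha) = \alpha (y + A' z(\alpha)) \ge 0, \quad \alpha \in [0, \alpha_0]. \]
For a small $\alpha > 0$, the point $\varepsilon(\alpha)$ satisfies the constraints of the NP-problem (12.6), and the objective function at this point has a negative value. The latter contradicts the assumption that $\varepsilon = 0$ is a local minimum point. The theorem is proved.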
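The recursion (12.12) used in the proof can be tried on a toy instance. The sketch below is an illustration under stated assumptions only: the matrix $A$ consists of the single row $a_0' = (1, -2)$, the vector $y = (1, 1)$ with $y_0 = 1$ solves $a_0' y + y_0 = 0$ (it is not normalized), and the remainder is taken to be the explicit function $r(\varepsilon) = \|\varepsilon\|^2$, which is $o(\|\varepsilon\|)$. None of these data come from the text.

```python
import numpy as np

# Hypothetical data: one row a0' = (1, -2), y = (1, 1), y0 = 1, so a0'y + y0 = 0.
a0 = np.array([1.0, -2.0])
A = a0[None, :]                       # 1 x 2 matrix with linearly independent rows
y, y0 = np.array([1.0, 1.0]), 1.0
B = np.linalg.inv(A @ A.T)            # B = (A A')^{-1}

def r(eps):
    """Assumed remainder term, r(eps) = ||eps||^2 = o(||eps||)."""
    return np.array([eps @ eps])

def solve_z(alpha, n_iter=50):
    """Recursion (12.12): z_{k+1} = -(1/alpha) B r(alpha (y + A' z_k)), z_0 = 0."""
    z = np.zeros(1)
    for _ in range(n_iter):
        z = -(B @ r(alpha * (y + A.T @ z))) / alpha
    return z

alpha = 0.1
z = solve_z(alpha)
eps = alpha * (y + A.T @ z)                   # candidate solution of (12.8)
residual = A @ eps + alpha * y0 + r(eps)      # left-hand side of (12.8)
objective = float(a0 @ eps + r(eps)[0])       # value of the perturbed objective
print(z, eps, residual, objective)
```

In this instance the iteration is a contraction, the resulting $\varepsilon(\alpha)$ is positive, and the objective takes the negative value $-\alpha y_0$, which mirrors the contradiction reached at the end of the proof.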
12.4 Lagrange Multiplier Rule
By Theorem 12.1, if a process of the G-problem is optimal, then for any set $\Pi_s \subset \Pi$ one of the following conditions holds: (A) the rows of the matrix $A$ are linearly dependent, (B) the system of equations (12.7) has no solution $y > 0$, $y_0 > 0$, $y_1 \ge 0$. Let us show that (A) is not a necessary optimality condition. In the simple G-problem
\[ x_2(1) \to \min, \quad x_1(1) - 1 = 0, \quad \dot{x}_1 = u_1, \quad \dot{x}_2 = u_2, \quad x_1(0) = 0, \quad x_2(0) = 0, \quad |u_1| \le 1, \quad |u_2| \le 1, \quad 0 \le t \le 1 \]
the pair $x(t) = (t, -t)$, $u(t) = (1, -1)$ is the optimal process. Using formula (12.5), we obtain
\[ A = \begin{pmatrix} v_2^1 + 1 & \cdots & v_2^s + 1 \\ v_1^1 - 1 & \cdots & v_1^s - 1 \end{pmatrix}. \]
Obviously, the rows of this matrix are not linearly dependent for every choice of the variation parameters $|v_i^j| \le 1$, $i = 1, 2$, $j = 1, \ldots, s$. Consequently, condition (A) is not satisfied on the optimal process.
In case (B) the optimal process has a characteristic property: for any set $\Pi_s \subset \Pi$ the convex cone
\[ C = \left\{ \begin{pmatrix} a_0' y + y_0 \\ A_1 y + y_1 \\ A_2 y \end{pmatrix} : y > 0, \ y_0 > 0, \ y_1 \ge 0 \right\} \subset R^q \]
does not contain the point $0 \in R^q$. To perform further analysis and use the separation theorem for convex sets, we introduce sequences of points $\{y_{0k}\} \subset R$, $\{y^k\} \subset R^s$:
\[ y_{0k} \to 0, \quad y^k \to 0, \quad y_{0k} > 0, \quad y^k > 0, \quad k = 1, 2, \ldots \]
and the corresponding sequence $\{C_k\}$ of closed convex cones in $R^q$
\[ C_k = \left\{ \begin{pmatrix} a_0' (y^k + \delta y) + (y_{0k} + \delta y_0) \\ A_1 (y^k + \delta y) + \delta y_1 \\ A_2 (y^k + \delta y) \end{pmatrix} : \delta y \ge 0, \ \delta y_0 \ge 0, \ \delta y_1 \ge 0 \right\}, \quad k = 1, 2, \ldots \]
By construction, $C_k \subset C$ and $0 \notin C_k$ for any fixed $k = 1, 2, \ldots$. We apply the separation theorem to each pair of closed convex sets $\{0\}$, $C_k$. According to Theorem A.2.1, there exists a nonzero vector $\lambda^k \in R^q$ with the property
\[ (\lambda^k)' 0 < (\lambda^k)' z, \quad z \in C_k. \]
Without loss of generality, we assume $\|\lambda^k\| = 1$. Let us select a converging subsequence from $\{\lambda^k\}$. For simplicity, we assume that the sequence itself converges to the vector
\[ \lambda = (\lambda_0, \lambda^1, \lambda^2), \quad \|\lambda\| = 1, \quad \lambda^1 = (\lambda_1, \ldots, \lambda_{m_1}), \quad \lambda^2 = (\lambda_{m_0+1}, \ldots, \lambda_m). \]
From the last inequality, as $k \to \infty$, we obtain
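\[ 0 \le (\lambda_0 a_0 + A_1' \lambda^1 + A_2' \lambda^2)' \delta y + \lambda_0 \delta y_0 + (\lambda^1)' \delta y_1, \quad \delta y \ge 0, \ \delta y_0 \ge 0, \ \delta y_1 \ge 0. \]
Hence, due to the arbitrariness of the choice of $\delta y$, $\delta y_0$, $\delta y_1$, we have
\[ \lambda_0 \ge 0, \quad \lambda^1 \ge 0, \quad \lambda_0 a_0 + A_1' \lambda^1 + A_2' \lambda^2 \ge 0. \tag{12.14} \]
Putting
\[ \tilde{\lambda}^s = (\tilde{\lambda}_0^s, \ldots, \tilde{\lambda}_m^s) = (\lambda_0, \ldots, \lambda_{m_1}, \underbrace{0, \ldots, 0}_{m_0 - m_1}, \lambda_{m_0+1}, \ldots, \lambda_m) \]
and using the notation in (12.5), we rewrite the last inequality in (12.14) in expanded form:
\[ \begin{aligned} N(\tilde{\lambda}^s, \pi^j) = \sum_{i=0}^{m} \tilde{\lambda}_i^s a_{ij} = {} & \Big[ \sum_{i=0}^{m} \tilde{\lambda}_i^s \Phi_{ix_0}(x(\mathbf{t}), \mathbf{t}) - \sum_{i=0}^{m} \tilde{\lambda}_i^s \psi^i(t_0) \Big]' \delta x_0^j \\ & + \Big[ \sum_{i=0}^{m} \tilde{\lambda}_i^s \Phi_{it_0}(x(\mathbf{t}), \mathbf{t}) + H\Big( \sum_{i=0}^{m} \tilde{\lambda}_i^s \psi^i(t_0), x(t_0), u(t_0), t_0 \Big) \Big] \delta t_0^j \\ & + \Big[ \sum_{i=0}^{m} \tilde{\lambda}_i^s \Phi_{it_1}(x(\mathbf{t}), \mathbf{t}) - H\Big( \sum_{i=0}^{m} \tilde{\lambda}_i^s \psi^i(t_1), x(t_1), u(t_1), t_1 \Big) \Big] \delta t_1^j \\ & - \Delta_{v^j} H\Big( \sum_{i=0}^{m} \tilde{\lambda}_i^s \psi^i(\tau_j), x(\tau_j), u(\tau_j), \tau_j \Big) \ge 0, \quad j = 1, \ldots, s. \end{aligned} \tag{12.15} \]
We formulate the following conclusion from Theorem 12.1.
Corollary 12.1 For an optimal process $x(t)$, $u(t)$, $t_0$, $t_1$ of the G-problem and any set $\Pi_s \subset \Pi$ there exists a vector $\tilde{\lambda}^s = (\tilde{\lambda}_0^s, \ldots, \tilde{\lambda}_m^s)$ satisfying the following conditions:
\[ \begin{aligned} & \tilde{\lambda}^s \ne 0, \quad \tilde{\lambda}_i^s \ge 0, \ i = 0, \ldots, m_0, \\ & \tilde{\lambda}_i^s \Phi_i(x(\mathbf{t}), \mathbf{t}) = 0, \ i = 1, \ldots, m_0, \\ & N(\tilde{\lambda}^s, \pi^j) \ge 0, \ j = 1, \ldots, s. \end{aligned} \tag{12.16} \]
The vector $\tilde{\lambda}^s$ with properties (12.16) is called a Lagrange vector, and its coordinates are called Lagrange multipliers. Corollary 12.1 is the rule of Lagrange multipliers, and the equalities $\tilde{\lambda}_i^s \Phi_i(x(\mathbf{t}), \mathbf{t}) = 0$, $i = 1, \ldots, m_0$, are the complementary slackness (complementarity) conditions. They mean that $\tilde{\lambda}_i^s = 0$ if $\Phi_i(x(\mathbf{t}), \mathbf{t}) < 0$, $i \in \{1, \ldots, m_0\}$.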
In this form the rule of Lagrange multipliers is still inconvenient to apply, for two reasons. First, by construction, the Lagrange multipliers depend on the parameters of variation. Second, the last condition in (12.16) contains a number of important consequences that are not yet stated explicitly.
12.5
Universal Lagrange Multipliers
We continue the analysis of Corollary 12.1. Our efforts will be focused on constructing universal Lagrange multipliers that do not depend on the parameters of variation and on deriving the consequences of the last inequalities in (12.16).

The idea for obtaining the universal Lagrange multipliers is rather simple. We consider a sequence of finite subsets $\{\Pi_s\}$ of the set
\[
\Pi = B^n \times B^1 \times B^1 \times [t_0, t_1] \times U.
\]
Then we associate with the sequence $\{\Pi_s\}$ a bounded sequence of Lagrange vectors $\{\tilde\lambda^s\}$. If the points of the set $\Pi_s$ become "evenly" distributed over the set $\Pi$ as $s$ increases, then the limit of a convergent subsequence of $\{\tilde\lambda^s\}$ will have the required universal property, that is, it will not depend on the choice of the parameters of variation.

We now turn to the formal constructions. For any $\rho > 0$, the compact set $\Pi$ has a finite $\rho$-net $\hat\Pi$. By definition of a $\rho$-net, for any point $\pi \in \Pi$ there is a point $\hat\pi \in \hat\Pi$ with the property $\|\hat\pi - \pi\| < \rho$. Let us refer to a sequence of sets $\{\Pi_s\}$ as a right sequence if for any $\rho > 0$ there exists a natural number $s(\rho)$ such that $\Pi_{s(\rho)}$ is a $\rho$-net of $\Pi$. Thus, if a sequence $\{\Pi_s\}$ is right then, starting with some number $s(\rho)$, the finite sets $\Pi_{s(\rho)} \subset \Pi_{s(\rho)+1} \subset \dots$ form $\rho$-nets of the set $\Pi$.

We fix a right sequence $\{\Pi_s\}$ and a corresponding sequence $\{\tilde\lambda^s\}$ of Lagrange vectors. The conditions (12.16) determine each vector $\tilde\lambda^s$ only up to a positive factor. Therefore, we can take $\|\tilde\lambda^s\| = 1$. The sequence $\{\tilde\lambda^s\}$ is bounded and, consequently, has a convergent subsequence. Without loss of generality, we assume that the sequence $\{\tilde\lambda^s\}$ itself converges to $\lambda$. Obviously, $\|\lambda\| = 1$.

We choose an arbitrary point $\pi = (\delta x_0, \delta t_0, \delta t_1, \tau, v) \in \Pi$ such that $\tau$ is a point of continuity of the control $u(t)$. Since $\tilde\lambda^s \to \lambda$ and the sequence $\{\Pi_s\}$ is right, there exists a natural number $s(\rho)$ for which the following three conditions hold simultaneously:
\[
\big\|\tilde\lambda^{s(\rho)} - \lambda\big\| < \rho,\quad \Pi_{s(\rho)} \text{ is a } \rho\text{-net of } \Pi,\quad \big\|\pi^{k(\rho)} - \pi\big\| < \rho
\]
for some $\pi^{k(\rho)} \in \Pi_{s(\rho)}$. We put $s = s(\rho)$, $j = k(\rho)$ in the last inequality (12.16). Then
\[
N\big(\tilde\lambda^{s(\rho)}, \pi^{k(\rho)}\big) \ge 0.
\]
The function $N$ is continuous in a small neighborhood of the point $(\lambda, \pi)$. Therefore, in the limit $\rho \to 0$ we obtain $N(\lambda, \pi) \ge 0$ for all such points $\pi$. Hence, by the piecewise continuity of the function $N$ in $\tau$, it follows that $N(\lambda, \pi) \ge 0$ for any $\pi \in \Pi$. Taking the limit in $s$ in the first three relations (12.16), we obtain a complete set of necessary conditions of optimality for the optimal process $x(t)$, $u(t)$, $t_0$, $t_1$ of the G-problem:
\[
\begin{aligned}
&\lambda \ne 0,\quad \lambda_i \ge 0,\ i = 0, \dots, m_0,\quad \lambda_i \Phi_i(\mathbf{x}(\mathbf{t}), \mathbf{t}) = 0,\ i = 1, \dots, m_0, \\
&N(\lambda, \pi) \ge 0,\ \pi \in \Pi,
\end{aligned} \tag{12.17}
\]
where the Lagrange vector $\lambda = (\lambda_0, \dots, \lambda_m)$ no longer depends on the parameters of variation $\pi \in \Pi$.
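The notion of a right sequence can be illustrated on a toy case. The sketch below assumes a hypothetical one-dimensional compact set $[0, 1]$ in place of $\Pi$ and uses nested dyadic grids as the finite sets $\Pi_s$; the function names are illustrative and not part of the text. It checks that the grids are nested and finds the first index $s(\rho)$ at which the grid becomes a $\rho$-net.

```python
from fractions import Fraction


def dyadic_grid(s):
    """Finite set Pi_s = {k / 2**s : k = 0, ..., 2**s} inside [0, 1]."""
    return [Fraction(k, 2 ** s) for k in range(2 ** s + 1)]


def sampled_covering_radius(grid, samples=1000):
    """Largest distance from the sample points i/samples to the nearest grid point."""
    points = [Fraction(i, samples) for i in range(samples + 1)]
    return max(min(abs(p - g) for g in grid) for p in points)


def first_net_index(rho):
    """Smallest s for which the dyadic grid Pi_s is a rho-net of [0, 1].

    The points of [0, 1] farthest from Pi_s are the midpoints between
    neighbouring grid points, at distance 2**-(s + 1), so the strict
    inequality 2**-(s + 1) < rho is required.
    """
    s = 0
    while Fraction(1, 2 ** (s + 1)) >= rho:
        s += 1
    return s


rho = Fraction(1, 10)
s_rho = first_net_index(rho)
nested = set(dyadic_grid(s_rho)) <= set(dyadic_grid(s_rho + 1))
print(s_rho, nested, sampled_covering_radius(dyadic_grid(s_rho)) < rho)
```

Because the grids are nested, every later grid is also a $\rho$-net, which is the property used when passing to the limit in (12.16).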
12.6
Maximum Principle for the General Problem
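Taking into account formula (12.15), we write the last condition of (12.17) in expanded form
\[
\begin{aligned}
&[L_{x_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) - \psi(t_0, \lambda)]' \delta x_0 + [L_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) + H(\psi(t_0, \lambda), x(t_0), u(t_0), t_0)]\, \delta t_0 \\
&\quad + [L_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) - H(\psi(t_1, \lambda), x(t_1), u(t_1), t_1)]\, \delta t_1 - \Delta_v H(\psi(\tau, \lambda), x(\tau), u(\tau), \tau) \ge 0, \\
&(\delta x_0, \delta t_0, \delta t_1, \tau, v) \in B^n \times B^1 \times B^1 \times [t_0, t_1] \times U,
\end{aligned}
\]
where $\psi(t, \lambda) = \sum_{i=0}^m \lambda_i \psi^i(t)$, the function $H$ has the standard form (11.4) and
\[
L(\lambda, x_0, x_1, t_0, t_1) = \sum_{i=0}^m \lambda_i \Phi_i(x_0, x_1, t_0, t_1).
\]
Hence, by virtue of the arbitrariness of $\delta x_0$, $\delta t_0$, $\delta t_1$, $\tau$, $v$, it follows that
\[
\begin{aligned}
&L_{x_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) - \psi(t_0, \lambda) = 0, \\
&L_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) + H(\psi(t_0, \lambda), x(t_0), u(t_0), t_0) = 0, \\
&L_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) - H(\psi(t_1, \lambda), x(t_1), u(t_1), t_1) = 0, \\
&\Delta_v H(\psi(\tau, \lambda), x(\tau), u(\tau), \tau) \le 0,\quad v \in U,\ \tau \in [t_0, t_1].
\end{aligned} \tag{12.18}
\]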
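Notice that the second equation in (12.18) admits another representation:
\[
\begin{aligned}
0 &= L_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) + H(\psi(t_0, \lambda), x(t_0), u(t_0), t_0) \\
&= L_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) + \psi(t_0, \lambda)' f(x(t_0), u(t_0), t_0) \\
&= L_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) + L_{x_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t})' \dot{x}(t_0) \equiv \dot{L}_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}).
\end{aligned} \tag{12.19}
\]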
The inequality in (12.18) indicates that on the set $U$ the Hamiltonian $H(\psi(t, \lambda), x(t), u, t)$ attains its maximum value at the optimal control $u(t)$ at any point $t \in [t_0, t_1]$:
\[
H(\psi(t, \lambda), x(t), u(t), t) = \max_{u \in U} H(\psi(t, \lambda), x(t), u, t).
\]
By definition, the function $\psi(t, \lambda) = \sum_{i=0}^m \lambda_i \psi^i(t)$ is a linear combination of the functions
\[
\psi^i(t) = -F(t_1, t)' \Phi_{ix_1}(\mathbf{x}(\mathbf{t}), \mathbf{t}),\quad i = 0, \dots, m,
\]
where $F(t_1, t)$ is a fundamental matrix of solutions of the variational equation
\[
F_t(t_1, t) = -F(t_1, t) f_x(x(t), u(t), t),\quad F(t_1, t_1) = E.
\]
Multiplying this equality by the vectors $-\lambda_i \Phi_{ix_1}(\mathbf{x}(\mathbf{t}), \mathbf{t})'$ on the left and summing the results over $i = 0, \dots, m$, we make certain that the function $\psi(t, \lambda)$ satisfies the conjugate Cauchy problem
\[
\dot\psi(t, \lambda) = -f_x(x(t), u(t), t)' \psi(t, \lambda),\quad \psi(t_1, \lambda) = -\sum_{i=0}^m \lambda_i \Phi_{ix_1}(\mathbf{x}(\mathbf{t}), \mathbf{t})
\]
or, in the Hamiltonian form,
\[
\dot\psi(t, \lambda) = -H_x(\psi(t, \lambda), x(t), u(t), t),\quad \psi(t_1, \lambda) = -L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}). \tag{12.20}
\]
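Using the initial condition (12.20), we can represent the third condition in (12.18) in a form analogous to (12.19):
\[
\begin{aligned}
0 &= L_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) - H(\psi(t_1, \lambda), x(t_1), u(t_1), t_1) \\
&= L_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) - \psi(t_1, \lambda)' f(x(t_1), u(t_1), t_1) \\
&= L_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) + L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t})' \dot{x}(t_1) \equiv \dot{L}_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}).
\end{aligned}
\]
To summarize.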
Theorem 12.2 (maximum principle for G-problem) Let $x(t)$, $u(t)$, $t_0$, $t_1$ be an optimal process of the G-problem. Then there exist a vector $\lambda = (\lambda_0, \dots, \lambda_m)$ and a continuous solution $\psi(t)$ of the conjugate system of differential equations
\[
\dot\psi = -H_x(\psi, x(t), u(t), t)
\]
satisfying the conditions:
1. non-triviality, non-negativity and complementary slackness
\[
\lambda \ne 0,\quad \lambda_i \ge 0,\ i = 0, \dots, m_0,\quad \lambda_i \Phi_i(\mathbf{x}(\mathbf{t}), \mathbf{t}) = 0,\ i = 1, \dots, m_0;
\]
2. transversality
\[
\psi(t_0) = L_{x_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}),\quad \psi(t_1) = -L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}),\quad \dot{L}_{t_0}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) = 0,\quad \dot{L}_{t_1}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) = 0;
\]
3. maximum of the Hamiltonian
\[
H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t),\quad t \in [t_0, t_1].
\]
Here
\[
L(\lambda, x_0, x_1, t_0, t_1) = \sum_{i=0}^m \lambda_i \Phi_i(x_0, x_1, t_0, t_1),\quad H(\psi, x, u, t) = \sum_{j=1}^n \psi_j f_j(x, u, t).
\]
12.7
Comments
Theorem 12.2 represents a fairly general necessary condition of optimality. From this theorem, for example, the maximum principle for the minimum time problem and the Euler-Lagrange equation of the calculus of variations follow. The features of the maximum principle noted previously for the S-problem hold for the G-problem as well; in particular, the Hamiltonian remains continuous and differentiable with respect to time along an extreme process. Determining the extreme processes of the G-problem is also reduced to the solution of a boundary value problem for the system of original and conjugate differential equations. At the same time, there are some distinctions. The unknown Lagrange multipliers and the complementary slackness conditions appear in the boundary value problem. In essence, these conditions exclude from the boundary value problem the inequality constraints that are inactive on the optimal process. Since it is not known a priori
which of the constraints are inactive, the complementary slackness conditions lead to trying different possible combinations of active and inactive constraints.

In conclusion, we present a useful technique that simplifies recording the transversality conditions. If we identify the arguments $x_0$, $x_1$ of the function $L(\lambda, x_0, x_1, t_0, t_1)$ with the phase states $x(t_0)$, $x(t_1)$ and represent this function in the form $L(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) = L(x(t_0), x(t_1), t_0, t_1)$, then
\[
L_{x_k}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) = L_{x(t_k)}(x(t_0), x(t_1), t_0, t_1),\quad \dot{L}_{t_k}(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) = \frac{d}{dt_k} L(x(t_0), x(t_1), t_0, t_1),\quad k = 0, 1.
\]
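The remark about trying combinations of active and inactive constraints can be made concrete on a finite-dimensional analogue. The sketch below is not the G-problem itself: it assumes a hypothetical toy problem, minimize $\sum_i (x_i - c_i)^2$ subject to $x_i - b_i \le 0$, and all names in it are illustrative. For each guess of the active set it solves the stationarity equations with complementary slackness built in, then keeps only the guesses that give a feasible point and non-negative multipliers.

```python
from itertools import combinations


def stationary_candidate(center, bounds, active):
    """Solve the Lagrange conditions of the toy problem
        minimize sum_i (x_i - center_i)**2  subject to  x_i - bounds_i <= 0
    under the guess that exactly the constraints in `active` are equalities.
    Stationarity reads 2 * (x_i - center_i) + lam_i = 0, with lam_i = 0
    for every constraint guessed to be inactive.
    """
    x, lam = [], []
    for i, (c, b) in enumerate(zip(center, bounds)):
        if i in active:
            x.append(b)                # guessed active: x_i = b_i
            lam.append(2.0 * (c - b))  # multiplier from stationarity
        else:
            x.append(c)                # guessed inactive: lam_i = 0
            lam.append(0.0)
    return x, lam


def consistent_active_sets(center, bounds):
    """Return every guess that yields a feasible x and non-negative multipliers."""
    n = len(center)
    found = []
    for size in range(n + 1):
        for active in combinations(range(n), size):
            x, lam = stationary_candidate(center, bounds, set(active))
            feasible = all(xi <= bi for xi, bi in zip(x, bounds))
            nonnegative = all(li >= 0.0 for li in lam)
            if feasible and nonnegative:
                found.append((active, x, lam))
    return found


result = consistent_active_sets([2.0, 1.0], [1.0, 3.0])
print(result)
```

For these hypothetical data only the guess "first constraint active, second inactive" survives; the other three combinations fail either feasibility or the sign condition on the multipliers.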
12.8
Sufficiency of the Maximum Principle
A general problem is said to be linearly-convex if (1) the functions $\Phi_i(x_0, x_1)$ are convex in the argument $(x_0, x_1)$ for $i = 0, \dots, m_0$ and affine for $i = m_0 + 1, \dots, m$, (2) the system of equations has the form $\dot{x} = A(t)x + b(u, t)$, (3) the times $t_0$, $t_1$ are fixed. Recall that an affine function is the sum of a linear function and a constant.

Theorem 12.3 If a general problem is linearly-convex, then each process that satisfies the maximum principle with multiplier $\lambda_0 > 0$ is optimal.

Proof Let $x(t)$, $u(t)$ be some process of the G-problem that satisfies the maximum principle (Theorem 12.2) with its corresponding Lagrange vector $\lambda$ ($\lambda_0 > 0$) and solution $\psi(t)$ of the conjugate system of differential equations $\dot\psi = -A(t)'\psi$. Let, further, $\tilde{x}(t) = x(t) + \Delta x(t)$, $\tilde{u}(t)$ be any fixed process of the G-problem. According to the maximum principle, we have
\[
\begin{aligned}
&\psi(t_0) = L_{x_0}(\lambda, \mathbf{x}(\mathbf{t})),\quad \psi(t_1) = -L_{x_1}(\lambda, \mathbf{x}(\mathbf{t})), \\
&\psi(t)' [b(\tilde{u}(t), t) - b(u(t), t)] \le 0,\quad t_0 \le t \le t_1.
\end{aligned} \tag{12.21}
\]
Using the fundamental matrix $F(t_1, t)$ of the variational equation
\[
F_t(t_1, t) = -F(t_1, t) A(t),\quad F(t_1, t_1) = E,
\]
we write the solution of the conjugate system with the initial condition $\psi(t_1) = -L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}))$ by the Cauchy formula
\[
\psi(t) = -F(t_1, t)' L_{x_1}(\lambda, \mathbf{x}(\mathbf{t})). \tag{12.22}
\]
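Substitute the solution (12.22) into (12.21) and integrate the resulting inequality with respect to $t$ over the segment $[t_0, t_1]$. Then
\[
-L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}))' \int_{t_0}^{t_1} F(t_1, t) [b(\tilde{u}(t), t) - b(u(t), t)]\, dt \le 0.
\]
By definition the increment $\Delta x(t)$ is a solution of the Cauchy problem
\[
\Delta\dot{x} = A(t) \Delta x + b(\tilde{u}(t), t) - b(u(t), t),\quad \Delta x|_{t = t_0} = \Delta x(t_0).
\]
From here, by the Cauchy formula,
\[
\Delta x(t_1) = F(t_1, t_0) \Delta x(t_0) + \int_{t_0}^{t_1} F(t_1, t) [b(\tilde{u}(t), t) - b(u(t), t)]\, dt.
\]
Consequently, the previous inequality takes the form
\[
\begin{aligned}
0 &\le L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}))' [\Delta x(t_1) - F(t_1, t_0) \Delta x(t_0)] \\
&= L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}))' \Delta x(t_1) + \psi(t_0)' \Delta x(t_0) \\
&= L_{x_0}(\lambda, \mathbf{x}(\mathbf{t}))' \Delta x(t_0) + L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}))' \Delta x(t_1).
\end{aligned}
\]
Here, we have additionally used formula (12.22) and the first transversality condition in (12.21). According to the maximum principle, a multiplier $\lambda_i \ge 0$ corresponds to every convex function $\Phi_i$, $i = 0, \dots, m_0$. Therefore, in view of the affinity of the other functions $\Phi_i$, $i = m_0 + 1, \dots, m$, the function $L(\lambda, x_0, x_1) = \sum_{i=0}^m \lambda_i \Phi_i(x_0, x_1)$ is convex in the argument $(x_0, x_1)$. By the property (11.10) of a convex function,
\[
L_{x_0}(\lambda, x_0, x_1)' \Delta x_0 + L_{x_1}(\lambda, x_0, x_1)' \Delta x_1 \le L(\lambda, x_0 + \Delta x_0, x_1 + \Delta x_1) - L(\lambda, x_0, x_1).
\]
From here and from the previous inequality, we obtain
\[
0 \le L_{x_0}(\lambda, \mathbf{x}(\mathbf{t}))' \Delta x(t_0) + L_{x_1}(\lambda, \mathbf{x}(\mathbf{t}))' \Delta x(t_1) \le L(\lambda, \tilde{\mathbf{x}}(\mathbf{t})) - L(\lambda, \mathbf{x}(\mathbf{t})).
\]
In further detail,
\[
0 \le \lambda_0 [\Phi_0(\tilde{\mathbf{x}}(\mathbf{t})) - \Phi_0(\mathbf{x}(\mathbf{t}))] + \sum_{i=1}^{m_0} [\lambda_i \Phi_i(\tilde{\mathbf{x}}(\mathbf{t})) - \lambda_i \Phi_i(\mathbf{x}(\mathbf{t}))] + \sum_{i=m_0+1}^{m} \lambda_i [\Phi_i(\tilde{\mathbf{x}}(\mathbf{t})) - \Phi_i(\mathbf{x}(\mathbf{t}))].
\]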
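Since the processes $x(t)$, $u(t)$ and $\tilde{x}(t)$, $\tilde{u}(t)$ satisfy the functional constraints of the G-problem and the Lagrange multipliers for the process $x(t)$, $u(t)$ also meet the conditions of non-negativity and complementary slackness, in the last inequality we have
\[
\begin{aligned}
&\lambda_i \Phi_i(\tilde{\mathbf{x}}(\mathbf{t})) \le 0,\quad \lambda_i \Phi_i(\mathbf{x}(\mathbf{t})) = 0,\quad i = 1, \dots, m_0, \\
&\Phi_i(\tilde{\mathbf{x}}(\mathbf{t})) = \Phi_i(\mathbf{x}(\mathbf{t})) = 0,\quad i = m_0 + 1, \dots, m.
\end{aligned}
\]
Consequently,
\[
0 \le \lambda_0 [\Phi_0(\tilde{\mathbf{x}}(\mathbf{t})) - \Phi_0(\mathbf{x}(\mathbf{t}))].
\]
Hence, we obtain $\Phi_0(\mathbf{x}(\mathbf{t})) \le \Phi_0(\tilde{\mathbf{x}}(\mathbf{t}))$ for $\lambda_0 > 0$, which implies the optimality of the process $x(t)$, $u(t)$, and the theorem is proven.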
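Theorem 12.3 can be checked numerically on a small linearly-convex problem. The sketch below assumes a hypothetical example that does not appear in the text: $\dot{x} = u$, $|u| \le 1$, $x(0) = 0$, $t \in [0, 1]$, $\Phi_0 = (x(1) - 2)^2 \to \min$. For the candidate $u \equiv 1$ the conjugate variable is constant, $\psi = -\partial \Phi_0 / \partial x(1) = 2$ with $\lambda_0 = 1$, so $H = \psi u$ is maximized at $u = 1$ and the maximum principle holds. The code compares the candidate with randomly generated piecewise-constant controls; the function names are illustrative.

```python
import random


def terminal_state(controls):
    """x(1) for x' = u, x(0) = 0 and a control that is constant on equal
    subintervals of [0, 1] with the given values."""
    return sum(controls) / len(controls)


def terminal_cost(controls):
    """Objective Phi_0 = (x(1) - 2)**2 of the hypothetical toy problem."""
    return (terminal_state(controls) - 2.0) ** 2


def costate(controls):
    """Constant conjugate variable psi = -dPhi_0/dx(1), taken with lambda_0 = 1."""
    return -2.0 * (terminal_state(controls) - 2.0)


def smallest_gap(candidate, trials=1000, seed=0):
    """Smallest value of cost(random control) - cost(candidate) over the trials."""
    rng = random.Random(seed)
    base = terminal_cost(candidate)
    gaps = []
    for _ in range(trials):
        other = [rng.uniform(-1.0, 1.0) for _ in candidate]
        gaps.append(terminal_cost(other) - base)
    return min(gaps)


candidate = [1.0] * 50
psi = costate(candidate)
# H = psi * u is maximal over |u| <= 1 at u = 1 exactly when psi > 0.
satisfies_maximum_condition = psi > 0 and all(u == 1.0 for u in candidate)
print(psi, satisfies_maximum_condition, smallest_gap(candidate))
```

The smallest gap stays non-negative, in agreement with the theorem: no sampled admissible control gives a smaller value of the objective than the process satisfying the maximum principle.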
12.9
Maximum Principle for Minimum Time Problem
We apply Theorem 12.2 to a particular case of the general problem
\[
t_1 - t_0 \to \min,\quad \dot{x} = f(x, u, t),\quad x(t_0) - x_0 = 0,\quad x(t_1) - x_1 = 0,\quad u \in U,\quad t_0 < t_1
\]
with functional equality constraints and given $x_0$, $x_1$. This is familiar to us as the two-point minimum time problem (M-problem) with fixed ends $x_0$, $x_1$ of a trajectory and non-fixed moments of time $t_0$, $t_1$. Let $x(t)$, $u(t)$, $t_0$, $t_1$ be some process of the M-problem. Put
\[
\begin{aligned}
&\lambda = (\lambda_0, \lambda^1, \lambda^2),\quad \lambda^1, \lambda^2 \in R^n,\quad H(\psi, x, u, t) = \psi' f(x, u, t), \\
&L(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) = \lambda_0 (t_1 - t_0) + (\lambda^1)' (x(t_0) - x_0) + (\lambda^2)' (x(t_1) - x_1).
\end{aligned}
\]
By Theorem 12.2, the optimality of the process $x(t)$, $u(t)$, $t_0$, $t_1$ requires the existence of a vector $\lambda$ and a continuous solution $\psi(t)$ of the conjugate system of differential equations $\dot\psi = -H_x(\psi, x(t), u(t), t)$ that satisfy the following conditions:
1. $\lambda \ne 0$, $\lambda_0 \ge 0$;
2. $-\lambda_0 + (\lambda^1)' \dot{x}(t_0) = 0$, $\lambda_0 + (\lambda^2)' \dot{x}(t_1) = 0$, $\psi(t_0) = \lambda^1$, $\psi(t_1) = -\lambda^2$;
3. $H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t)$, $t \in [t_0, t_1]$.
If we assume that $\lambda^1 = 0$, then from condition (2) and the conjugate system it follows that $\psi(t) \equiv 0$, $\lambda^2 = 0$, $\lambda_0 = 0$, i.e., $\lambda = 0$, which contradicts condition (1). Consequently, $\lambda^1 \ne 0$ and $\psi(t) \ne 0$, and the Lagrange multipliers
\[
\lambda_0 = \psi(t_0)' \dot{x}(t_0) = \psi(t_1)' \dot{x}(t_1) \ge 0,\quad \lambda^1 = \psi(t_0) \ne 0,\quad \lambda^2 = -\psi(t_1) \ne 0
\]
correspond to conditions (1) and (2). Thus, as a corollary of Theorem 12.2, we obtain the following.

Theorem 12.4 (maximum principle for M-problem) If $x(t)$, $u(t)$, $t_0$, $t_1$ is an optimal process of the M-problem, then there exists a non-trivial continuous solution $\psi(t)$ of the conjugate system of differential equations $\dot\psi = -H_x(\psi, x(t), u(t), t)$ such that
\[
\begin{aligned}
&H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t),\quad t \in [t_0, t_1], \\
&H(\psi(t_0), x(t_0), u(t_0), t_0) = H(\psi(t_1), x(t_1), u(t_1), t_1) \ge 0,
\end{aligned}
\]
where $H(\psi, x, u, t) = \psi' f(x, u, t)$.

In the formulation of the maximum principle, there are two additional relations to determine the unknowns $t_0$, $t_1$.

Example 12.1 We illustrate the application of Theorem 12.4 to the minimum time problem
\[
t_1 - t_0 \to \min,\quad \dot{x}_1 = x_2,\quad \dot{x}_2 = u,\quad x_1(t_0) = x_2(t_0) = 0,\quad x_1(t_1) = x_2(t_1) = 2,\quad |u| \le 2.
\]
We check the process $x_1(t) = t^2/2$, $x_2(t) = t$, $u(t) = 1$, $t_0 = 0$, $t_1 = 2$ for optimality (Question: Is it optimal for the given problem?). Construct the Hamiltonian $H(\psi, x, u) = \psi_1 x_2 + \psi_2 u$ and the conjugate system of differential equations
\[
\dot\psi_1 = 0,\quad \dot\psi_2 = -\psi_1.
\]
Integrating the conjugate equations, we obtain $\psi_1(t) = c_1$, $\psi_2(t) = -c_1 t + c_2$, where $c_1$, $c_2$ are arbitrary constants. The conditions of the maximum principle
\[
\begin{aligned}
&\psi(t) = (c_1, -c_1 t + c_2) \ne 0; \\
&\Delta_u H(\psi(t), x(t), u(t)) = (-c_1 t + c_2)(u - 1) \le 0,\quad |u| \le 2,\ 0 \le t \le 2; \\
&H(\psi(0), x(0), u(0)) = H(\psi(2), x(2), u(2)) = c_2 \ge 0
\end{aligned}
\]
are not satisfied for the process under consideration. Indeed, the factor $u - 1$ takes both signs on the segment $|u| \le 2$, so the inequality forces $-c_1 t + c_2 = 0$ for all $t \in [0, 2]$; it then follows that $c_1 = c_2 = 0$ and $\psi(t) \equiv 0$. Hence, the process in question is not optimal.
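The conclusion of Example 12.1 can be cross-checked directly by exhibiting an admissible process that reaches the target sooner. The sketch below assumes a control with one switch, $u = +2$ for a time $a$ followed by $u = -2$ for a time $b$; this particular control is an illustrative choice and is not claimed here to be the optimal one. The durations are found from the endpoint conditions $x_1 = x_2 = 2$, which reduce to $a - b = 1$ and $2b^2 + 4b - 1 = 0$.

```python
import math


def bang_bang_endpoint(a, b):
    """State (x1, x2) reached from (0, 0) under x1' = x2, x2' = u
    with u = +2 for a time a followed by u = -2 for a time b."""
    x1_mid, x2_mid = a * a, 2.0 * a       # state at the end of the first arc
    x1 = x1_mid + x2_mid * b - b * b      # second arc with u = -2
    x2 = x2_mid - 2.0 * b
    return x1, x2


# Positive root of 2*b**2 + 4*b - 1 = 0, then a = b + 1.
b = math.sqrt(6.0) / 2.0 - 1.0
a = b + 1.0
x1, x2 = bang_bang_endpoint(a, b)
total_time = a + b
print(x1, x2, total_time)
```

The total time equals $\sqrt{6} - 1$, about 1.449, which is less than the duration 2 of the process with $u \equiv 1$. This agrees with the verdict obtained from the maximum principle.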
12.10
Maximum Principle and Euler-Lagrange Equation
Consider the simplest problem of variational calculus
\[
J = \int_{t_0}^{t_1} F(x(t), \dot{x}(t), t)\, dt \to \min,\quad x(t_0) = x_0,\quad x(t_1) = x_1 \tag{12.23}
\]
in which the minimum of the integral is sought on a set of functions $x(t)$ from the class $C^2([t_0, t_1] \to R)$ with fixed ends. The numbers $t_0$, $t_1$, $x_0$, $x_1$ and the function $F(x, \dot{x}, t)$ from the class $C^2(R \times R \times [t_0, t_1] \to R)$ are regarded as given. We assume that the problem (12.23) has a solution $x(t)$ and that there exists a bounded interval $V \subset R$ containing all values of the derivative $\dot{x}(t)$, $t_0 \le t \le t_1$.

We can easily write the problem (12.23) as a G-problem if we put the control $u = \dot{x}$ and the phase variables $x_1 = x$, $x_2 = \int_{t_0}^{t} F(x(\tau), \dot{x}(\tau), \tau)\, d\tau$ for the function $x = x(t)$. Using this notation, the problem (12.23) takes the form
\[
\begin{aligned}
&J = x_2(t_1) \to \min,\quad x_1(t_0) - x_0 = 0,\quad x_2(t_0) = 0,\quad x_1(t_1) - x_1 = 0, \\
&\dot{x}_1 = u,\quad \dot{x}_2 = F(x_1, u, t),\quad u \in U,\quad t_0 \le t \le t_1,
\end{aligned} \tag{12.24}
\]
where $U$ is the closure of $V$. Obviously, the triple of functions
\[
x_1(t) = x(t),\quad x_2(t) = \int_{t_0}^{t} F(x(\tau), \dot{x}(\tau), \tau)\, d\tau,\quad u(t) = \dot{x}(t) \tag{12.25}
\]
is a process of the G-problem (12.24), and we write the necessary conditions of optimality for it. We form the functions
\[
\begin{aligned}
H(\psi, x, u) &= \psi_1 u + \psi_2 F(x_1, u, t), \\
L(\lambda, \mathbf{x}(\mathbf{t}), \mathbf{t}) &= \lambda_0 x_2(t_1) + \lambda_1 (x_1(t_0) - x_0) + \lambda_2 x_2(t_0) + \lambda_3 (x_1(t_1) - x_1).
\end{aligned}
\]
By Theorem 12.2, the optimality of the process (12.25) requires the existence of a vector $\lambda = (\lambda_0, \lambda_1, \lambda_2, \lambda_3) \ne 0$, $\lambda_0 \ge 0$, and a continuous solution $\psi(t) = (\psi_1(t), \psi_2(t))$ of the conjugate system
\[
\dot\psi_1 = -F_x(x(t), \dot{x}(t), t)\, \psi_2,\quad \dot\psi_2 = 0,
\]
satisfying the transversality conditions
\[
\psi_1(t_0) = \lambda_1,\quad \psi_2(t_0) = \lambda_2,\quad \psi_1(t_1) = -\lambda_3,\quad \psi_2(t_1) = -\lambda_0
\]
and the condition of stationarity of the Hamiltonian
\[
H_u(\psi(t), x(t), u(t)) = \psi_1(t) + \psi_2(t) F_{\dot{x}}(x(t), \dot{x}(t), t) = 0,\quad t \in [t_0, t_1].
\]
In these conditions, we take into account that the ends $t_0$, $t_1$ of the time segment are fixed, as well as the assumption concerning the range of control $U$. We use the conjugate equations and the transversality conditions to obtain
\[
\psi_1(t) = \lambda_1 + \lambda_0 \int_{t_0}^{t} F_x(x(\tau), \dot{x}(\tau), \tau)\, d\tau,\quad \psi_2(t) = -\lambda_0 = \lambda_2.
\]
If $\lambda_0 = 0$, then $\psi_2(t) = -\lambda_0 = \lambda_2 = 0$, and from the condition of stationarity of the Hamiltonian it follows that $\psi_1(t) = \lambda_1 = 0$. Consequently, $\psi_1(t_1) = -\lambda_3 = 0$ and then $\lambda = 0$, which contradicts the maximum principle. Therefore, without loss of generality, we assume $\lambda_0 = 1$. As a result, we determine all the Lagrange multipliers
\[
\lambda_0 = 1,\quad \lambda_1 = F_{\dot{x}}(x(t_0), \dot{x}(t_0), t_0),\quad \lambda_2 = -1,\quad \lambda_3 = -F_{\dot{x}}(x(t_1), \dot{x}(t_1), t_1)
\]
from the conditions of transversality and stationarity, and the condition of stationarity of the Hamiltonian takes the form
\[
\lambda_1 + \int_{t_0}^{t} F_x(x(\tau), \dot{x}(\tau), \tau)\, d\tau - F_{\dot{x}}(x(t), \dot{x}(t), t) = 0.
\]
We differentiate the last equation with respect to $t$ to obtain the Euler-Lagrange differential equation
\[
F_x(x(t), \dot{x}(t), t) - \frac{d}{dt} F_{\dot{x}}(x(t), \dot{x}(t), t) = 0
\]
for the sought-for function $x(t)$. So, in order for a function $x(t)$ to be a solution of the simplest problem of variational calculus (12.23), it is necessary for it to satisfy the Euler-Lagrange equation.

The Euler-Lagrange equation is derived from the maximum principle under the assumption that all values of the derivative of the sought-for function are located in the interior of the range of control $U$. For control problems, this situation is not typical, since the values of the optimal control may belong to the boundary of $U$. For this reason, the maximum principle is, in general, a more general necessary condition of optimality.
Example 12.2 Consider smooth curves $x = x(t)$ passing through the points (0,0), (1,1) of the coordinate plane. We want to find out which of them has the shortest length $S$. Writing down the requirements in the form
\[
S = \int_0^1 \left(1 + \dot{x}^2(t)\right)^{1/2} dt \to \min,\quad x(0) = 0,\quad x(1) = 1, \tag{12.26}
\]
we obtain the simplest problem of variational calculus (12.23) with the function $F(x, \dot{x}, t) = (1 + \dot{x}^2)^{1/2}$. Compute the derivatives
\[
F_x = 0,\quad F_{\dot{x}} = \dot{x} \left(1 + \dot{x}^2\right)^{-1/2},\quad dF_{\dot{x}}/dt = \ddot{x} \left(1 + \dot{x}^2\right)^{-3/2},
\]
and write the Euler-Lagrange equation
\[
-\ddot{x} \left(1 + \dot{x}^2\right)^{-3/2} = 0 \iff \ddot{x} = 0.
\]
Its general solution is $x = c_1 t + c_2$, where $c_1$, $c_2$ are arbitrary constants. The boundary conditions of (12.26) give $c_1 = 1$, $c_2 = 0$ and, as a consequence, the particular solution $x = t$. Therefore, the function $x(t) = t$, $0 \le t \le 1$, meets the necessary conditions for the extremum. Its graph is a straight line segment with ends (0,0) and (1,1).

We show that the necessary condition of the extremum for problem (12.26) is at the same time a sufficient condition. Indeed, by analogy with (12.24), problem (12.26) can be represented as a linearly-convex G-problem
\[
\begin{aligned}
&J = x_2(1) \to \min,\quad x_1(0) = 0,\quad x_2(0) = 0,\quad x_1(1) - 1 = 0, \\
&\dot{x}_1 = u,\quad \dot{x}_2 = \left(1 + u^2\right)^{1/2},\quad u \in U.
\end{aligned} \tag{12.27}
\]
By construction, the triple of functions
\[
x_1(t) = x(t) = t,\quad x_2(t) = \int_0^t \left(1 + \dot{x}^2(\tau)\right)^{1/2} d\tau = 2^{1/2} t,\quad u(t) = \dot{x}(t) = 1 \tag{12.28}
\]
forms a process of the problem (12.27) and satisfies the maximum principle with factor $\lambda_0 = 1$. Then by Theorem 12.3, the process (12.28) is optimal. Therefore, the function $x(t) = t$ is the solution of (12.26).

Thus, among all curves connecting two given points in the plane, the straight line has the shortest length. Of course, this conclusion holds for the Euclidean metric embedded in the formula used to calculate the length of the curve.
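The result of Example 12.2 is easy to test numerically. The sketch below assumes a hypothetical one-parameter family of comparison curves $x(t) = t + a \sin(\pi t)$, which pass through (0,0) and (1,1) for every $a$; the family and the function name are illustrative choices. The length of each curve is approximated by the length of an inscribed polyline.

```python
import math


def curve_length(a, n=2000):
    """Polyline approximation of the length of x(t) = t + a*sin(pi*t), 0 <= t <= 1,
    built on n equal steps in t."""
    total = 0.0
    prev_t, prev_x = 0.0, 0.0
    for k in range(1, n + 1):
        t = k / n
        x = t + a * math.sin(math.pi * t)
        total += math.hypot(t - prev_t, x - prev_x)
        prev_t, prev_x = t, x
    return total


straight = curve_length(0.0)
perturbed = {a: curve_length(a) for a in (-0.3, -0.05, 0.05, 0.3)}
print(straight, math.sqrt(2.0))
print(perturbed)
```

For $a = 0$ the computed length agrees with $2^{1/2}$ up to rounding, the value of $x_2(1)$ in (12.28), and every perturbed curve in the family is longer.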
12.11
Maximum Principle and Optimality of a Process
We clarify the following important issues: the sense of the minimum in a general problem and the type of necessary condition that the maximum principle represents. Let us start from the beginning. In the problem of minimizing a functional $J(z)$ on a set of elements $z \in D$, we distinguish a global (absolute) and a local (relative) minimum. If for some element $z \in D$ the inequality $J(z) \le J(\tilde{z})$ holds for all elements $\tilde{z} \in D$, then $z$ is referred to as a global minimum point, and we speak of a global minimum of $J$ on $D$. If the inequality $J(z) \le J(\tilde{z})$ holds only for those elements $\tilde{z} \in D$ that are located in a small neighborhood of $z$, then $z$ is referred to as a local minimum point, and we speak of a local minimum of $J$ on $D$. Obviously, a global minimum is at the same time a local one but, generally speaking, not vice versa.

With regard to the G-problem, a process of the problem, the set of all processes and the objective functional play the roles of $z$, $D$, $J(z)$, respectively. We defined the optimal processes in the global sense. In the derivation of the necessary conditions for the minimum we have considered the processes $\tilde{z} = (\tilde{x}(t), \tilde{u}(t), \tilde{t}_0, \tilde{t}_1)$ that are close to the optimal $z = (x(t), u(t), t_0, t_1)$ in the sense of smallness of
\[
h(\tilde{z}, z) = \|\tilde{x}(t_0) - x(t_0)\| + |\tilde{t}_0 - t_0| + |\tilde{t}_1 - t_1| + \int_I \|\Delta_{\tilde{u}} f\|\, d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\, d\tau.
\]
As a consequence, the maximum principle for the general problem is a necessary condition for a local minimum. Since a necessary condition for a local minimum is also valid for the global minimum, the maximum principle is a necessary condition for the global minimum as well.

Exercise Set
1. What changes in the transversality conditions will occur if we fix the ends of the trajectories and the end points of time?
2. Derive Theorem 11.1 (the maximum principle for the S-problem) from Theorem 12.2 (the maximum principle for the G-problem).
3. How will the maximum principle be formulated if we replace the terminal functionals in the G-problem by the Mayer-Bolza functionals (or Lagrange functionals) accordingly?
4. Let the right-hand sides of the differential equations depend smoothly on parameters, that is, on controls that are constant in time. Derive the necessary conditions of optimality for them. Hint: regard the parameters as additional phase variables whose derivatives with respect to time are zero. Use the maximum principle.
5. Let us define a reachability set $Q(t_1)$, $t_1 \ge t_0$, for the M-problem with fixed $x_0$, $t_0$ as the set of all points of the phase space that can be reached at the moment of time $t_1$ by trajectories of the system of differential equations issuing from the point $x(t_0) = x_0$ under different controls. How can the minimum time problem be formulated in terms of the distance between the set $Q(t_1)$ and the point $x_1$? Can we write it as a general problem (G-problem)?
6. Relate the existence of an optimal control in the M-problem in the class of piecewise continuous controls to the closedness of the reachability set. Give an example of a problem with a non-closed reachability set.
Chapter 13
Problem with Intermediate States
Abstract The maximum principle in a problem with intermediate phase constraints is proved. The application of the results to the optimal control problem with discontinuous right-hand sides of differential equations is shown.
13.1
Problem with Intermediate State. Functional Increment Formula
We are sufficiently prepared to obtain the maximum principle for the problem with intermediate phase states (IS-problem). This is useful for two reasons. First, it is interesting to find out what the intermediate phase states add to the necessary optimality conditions. Second, on the basis of the IS-problem, one can study optimal control problems for composite, discrete, discontinuous and other complex systems. The technique for deriving the maximum principle for the IS-problem remains the same, so we only outline the main stages of the derivation.

For simplicity, we restrict ourselves to a particular case of the problem with one intermediate phase state (IS1-problem)
\[
\begin{aligned}
&J_0 = \Phi_0(x(t_0), x(t_1), x(t_2), t_0, t_1, t_2) \to \min, \\
&J_i = \Phi_i(x(t_0), x(t_1), x(t_2), t_0, t_1, t_2) \begin{cases} \le 0, & i = 1, \dots, m_0, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases} \\
&\dot{x} = f(x, u, t),\quad u \in U,\quad t_0 < t_1 < t_2.
\end{aligned}
\]
As before, we assign the functions $\Phi_i(x_0, x_1, x_2, t_0, t_1, t_2)$, $i = 0, \dots, m$, to the class $C^1(R^n \times R^n \times R^n \times R \times R \times R \to R)$ and use the notations
\[
\mathbf{t} = (t_0, t_1, t_2),\quad \mathbf{x}(\mathbf{t}) = (x(t_0), x(t_1), x(t_2)).
\]
A triple $x(t)$, $u(t)$, $\mathbf{t}$ that satisfies all conditions of the problem, except possibly the first one, is called a process. The process $x(t)$, $u(t)$, $\mathbf{t}$ is regarded as optimal if
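\[
\Phi_0(\mathbf{x}(\mathbf{t}), \mathbf{t}) \le \Phi_0(\tilde{\mathbf{x}}(\tilde{\mathbf{t}}), \tilde{\mathbf{t}})
\]
for any other process $\tilde{x}(t)$, $\tilde{u}(t)$, $\tilde{\mathbf{t}}$. Our immediate goal is to obtain the necessary optimality conditions for the IS1-problem.

Consider two processes $x(t)$, $u(t)$, $\mathbf{t}$ and $\tilde{x}(t)$, $\tilde{u}(t)$, $\tilde{\mathbf{t}}$ of the IS1-problem that satisfy the initial conditions $x(t_0) = x_0$, $\tilde{x}(\tilde{t}_0) = \tilde{x}_0$. Let us assume that the trajectories $x(t)$, $\tilde{x}(t)$ are defined on a common time interval $I$ containing the points $t_0$, $t_2$, $\tilde{t}_0$, $\tilde{t}_2$ and that the control $u(t)$ is continuous at the point $t_1$. Later, the assumption about the continuity of the control at this point will be removed. We put
\[
\tilde{x}_0 = x_0 + \Delta x_0,\quad \tilde{t}_0 = t_0 + \Delta t_0,\quad \tilde{t}_1 = t_1 + \Delta t_1,\quad \tilde{t}_2 = t_2 + \Delta t_2.
\]
We will be interested in the increment $\Delta J_i = \Phi_i(\tilde{\mathbf{x}}(\tilde{\mathbf{t}}), \tilde{\mathbf{t}}) - \Phi_i(\mathbf{x}(\mathbf{t}), \mathbf{t})$ of the functional $J_i$ at a fixed $i \in \{0, \dots, m\}$. In detail,
\[
\begin{aligned}
\Delta J_i &= \Phi_i\big(x(t_0) + \Delta x_0,\ x(t_1) + [\tilde{x}(\tilde{t}_1) - x(t_1)],\ x(t_2) + [\tilde{x}(\tilde{t}_2) - x(t_2)],\ t_0 + \Delta t_0,\ t_1 + \Delta t_1,\ t_2 + \Delta t_2\big) \\
&\quad - \Phi_i(x(t_0), x(t_1), x(t_2), t_0, t_1, t_2).
\end{aligned}
\]
Let us select the terms of the increment of the functional that are linear with respect to the increments of the arguments. Using the Taylor series expansion, we obtain
\[
\begin{aligned}
\Delta J_i &= \Phi_{ix_0}(\mathbf{x}(\mathbf{t}), \mathbf{t})' \Delta x_0 + \sum_{k=1}^{2} \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})' [\tilde{x}(\tilde{t}_k) - x(t_k)] + \sum_{k=0}^{2} \Phi_{it_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) \Delta t_k \\
&\quad + o\Big(\|\Delta x_0\| + \sum_{k=0}^{2} |\Delta t_k| + \sum_{k=1}^{2} \|\tilde{x}(\tilde{t}_k) - x(t_k)\|\Big).
\end{aligned}
\]
By formula (10.14), we have
\[
\begin{aligned}
\tilde{x}(\tilde{t}_k) - x(t_k) &= F(t_k, t_0) [\Delta x_0 - \Delta t_0 f(x(t_0), \tilde{u}(t_0), t_0)] + \int_{t_0}^{t_k} F(t_k, t) \Delta_{\tilde{u}(t)} f(x(t), u(t), t)\, dt \\
&\quad + \Delta t_k f(x(t_k), \tilde{u}(t_k), t_k) + o\Big(\|\Delta x_0\| + |\Delta t_0| + |\Delta t_k| + \int_I \|\Delta_{\tilde{u}} f\|\, d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\, d\tau\Big),\quad k = 1, 2.
\end{aligned}
\]
Here $F(t_k, t)$ is a fundamental matrix of solutions of the variational equation
\[
F_t(t_k, t) = -F(t_k, t) f_x(x(t), u(t), t),\quad F(t_k, t_k) = E,\quad k = 1, 2. \tag{13.1}
\]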
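After obvious transformations the increment of the functional takes the form
\[
\begin{aligned}
\Delta J_i &= \Big[\Phi_{ix_0}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + \sum_{k=1}^{2} F(t_k, t_0)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})\Big]' \Delta x_0 \\
&\quad + \Big[\Phi_{it_0}(\mathbf{x}(\mathbf{t}), \mathbf{t}) - \sum_{k=1}^{2} \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})' F(t_k, t_0) f(x(t_0), \tilde{u}(t_0), t_0)\Big] \Delta t_0 \\
&\quad + \sum_{k=1}^{2} \big[\Phi_{it_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})' f(x(t_k), \tilde{u}(t_k), t_k)\big] \Delta t_k \\
&\quad + \sum_{k=1}^{2} \int_{t_0}^{t_k} \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})' F(t_k, t) \Delta_{\tilde{u}(t)} f(x(t), u(t), t)\, dt \\
&\quad + o\Big(\|\Delta x_0\| + \sum_{k=0}^{2} |\Delta t_k| + \int_I \|\Delta_{\tilde{u}} f\|\, d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\, d\tau\Big).
\end{aligned}
\]
Let us give the last expression a Hamiltonian form with the function $H(\psi, x, u, t) = \psi' f(x, u, t)$. We take
\[
F(t_k, t) = 0,\ t > t_k,\ k = 1, 2;\qquad \psi^i(t) = -\sum_{k=1}^{2} F(t_k, t)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}),\quad t_0 \le t \le t_2. \tag{13.2}
\]
Then
\[
\begin{aligned}
\Delta J_i &= \big[\Phi_{ix_0}(\mathbf{x}(\mathbf{t}), \mathbf{t}) - \psi^i(t_0)\big]' \Delta x_0 + \big[\Phi_{it_0}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + H(\psi^i(t_0), x(t_0), \tilde{u}(t_0), t_0)\big] \Delta t_0 \\
&\quad + \sum_{k=1}^{2} \big[\Phi_{it_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})' f(x(t_k), \tilde{u}(t_k), t_k)\big] \Delta t_k \\
&\quad - \int_{t_0}^{t_2} \Delta_{\tilde{u}(t)} H(\psi^i(t), x(t), u(t), t)\, dt \\
&\quad + o\Big(\|\Delta x_0\| + \sum_{k=0}^{2} |\Delta t_k| + \int_I \|\Delta_{\tilde{u}} f\|\, d\tau + \int_I \|\Delta_{\tilde{u}} f_x\|\, d\tau\Big).
\end{aligned} \tag{13.3}
\]
We show that the function $\psi^i(t)$ satisfies the conjugate Cauchy problem and the jump condition at $t = t_1$:
\[
\begin{aligned}
&\dot\psi^i = -H_x(\psi^i, x(t), u(t), t),\quad \psi^i(t_2) = -\Phi_{ix_2}(\mathbf{x}(\mathbf{t}), \mathbf{t}), \\
&\psi^i(t_1 - 0) = \psi^i(t_1 + 0) - \Phi_{ix_1}(\mathbf{x}(\mathbf{t}), \mathbf{t}).
\end{aligned} \tag{13.4}
\]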
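Indeed, using the relations (13.2), (13.1) we have
\[
\begin{aligned}
\dot\psi^i(t) &= -\sum_{k=1}^{2} F_t(t_k, t)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) = f_x(x(t), u(t), t)' \Big(\sum_{k=1}^{2} F(t_k, t)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})\Big) \\
&= -f_x(x(t), u(t), t)' \psi^i(t) = -H_x(\psi^i(t), x(t), u(t), t)
\end{aligned}
\]
for $t \ne t_0, t_1, t_2$, and for $t = t_2$
\[
\psi^i(t_2) = -\sum_{k=1}^{2} F(t_k, t_2)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) = -F(t_1, t_2)' \Phi_{ix_1}(\mathbf{x}(\mathbf{t}), \mathbf{t}) - F(t_2, t_2)' \Phi_{ix_2}(\mathbf{x}(\mathbf{t}), \mathbf{t}) = -\Phi_{ix_2}(\mathbf{x}(\mathbf{t}), \mathbf{t}).
\]
Further, for a small $\varepsilon > 0$ we can write
\[
\begin{aligned}
\psi^i(t_1 - \varepsilon) - \psi^i(t_1 + \varepsilon) &= -\sum_{k=1}^{2} F(t_k, t_1 - \varepsilon)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + \sum_{k=1}^{2} F(t_k, t_1 + \varepsilon)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) \\
&= -\sum_{k=1}^{2} F(t_k, t_1 - \varepsilon)' \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + F(t_2, t_1 + \varepsilon)' \Phi_{ix_2}(\mathbf{x}(\mathbf{t}), \mathbf{t}).
\end{aligned}
\]
From here, when $\varepsilon \to 0$, we get the jump condition (13.4).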
13.2
Preliminary Necessary Conditions of Optimality
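The necessary conditions of optimality for the IS1-problem are established in the same way as for the G-problem. By analogy with item (12.2), we introduce the special increments
\[
\begin{aligned}
&\Delta x_0 = \sum_{j=1}^{s} \varepsilon_j\, \delta x_0^j,\quad \Delta t_k = \sum_{j=1}^{s} \varepsilon_j\, \delta t_k^j,\quad k = 0, 1, 2, \\
&\Delta u(t) = v^j - u(t),\quad t \in [\tau_j, \tau_j + \varepsilon_j),\quad j = 1, \dots, s, \\
&\Delta u(t) = 0,\quad t \notin \bigcup_{j=1}^{s} [\tau_j, \tau_j + \varepsilon_j)
\end{aligned} \tag{13.5}
\]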
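depending on the parameters of variation $\varepsilon_j$, $\delta x_0^j$, $\delta t_0^j$, $\delta t_1^j$, $\delta t_2^j$, $\tau_j$, $v^j$, $j = 1, \dots, s$. Here, as before, $\varepsilon_j$ are small non-negative parameters; $\delta x_0^j$ are arbitrary vectors from the ball $B^n = \{x \in R^n : \|x\| \le 1\}$; $\delta t_0^j$, $\delta t_1^j$, $\delta t_2^j$ are arbitrary numbers from the segment $B^1 = [-1, 1]$; $\tau_j$ are arbitrary points of continuity of the control $u(t)$ that differ from $t_1$, $t_0 < \tau_1 < \dots < \tau_s < t_2$; $v^j$ are arbitrary points of the range of control $U$. From the parameters of variation, we form the vectors
\[
\varepsilon = (\varepsilon_1, \dots, \varepsilon_s) \ge 0,\quad \pi^j = \big(\delta x_0^j, \delta t_0^j, \delta t_1^j, \delta t_2^j, \tau_j, v^j\big),\quad j = 1, \dots, s
\]
and the finite subset $\Pi_s \subset \Pi$, where
\[
\Pi_s = \{\pi^1, \dots, \pi^s\},\quad \Pi = B^n \times B^1 \times B^1 \times B^1 \times [t_0, t_2] \times U \subset R^{n+r+4}.
\]
For the special increments (13.5), formula (13.3) can be represented as
\[
\begin{aligned}
\Delta J_i(\varepsilon, \Pi_s) &= \sum_{j=1}^{s} a_{ij} \varepsilon_j + o(\|\varepsilon\|),\quad i = 0, \dots, m, \\
a_{ij} &= \big[\Phi_{ix_0}(\mathbf{x}(\mathbf{t}), \mathbf{t}) - \psi^i(t_0)\big]' \delta x_0^j + \big[\Phi_{it_0}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + H(\psi^i(t_0), x(t_0), u(t_0), t_0)\big] \delta t_0^j \\
&\quad + \sum_{k=1}^{2} \big[\Phi_{it_k}(\mathbf{x}(\mathbf{t}), \mathbf{t}) + \Phi_{ix_k}(\mathbf{x}(\mathbf{t}), \mathbf{t})' f(x(t_k), u(t_k), t_k)\big] \delta t_k^j \\
&\quad - \Delta_{v^j} H(\psi^i(\tau_j), x(\tau_j), u(\tau_j), \tau_j),\quad i = 0, \dots, m,\ j = 1, \dots, s.
\end{aligned} \tag{13.6}
\]
Assume that the process $x(t)$, $u(t)$, $\mathbf{t}$ is optimal. Having renumbered the functional constraints, if necessary, we can consider the first $m_1$ inequality constraints of the IS1-problem as active:
\[
\Phi_i(\mathbf{x}(\mathbf{t}), \mathbf{t}) = 0,\ i = 1, \dots, m_1,\qquad \Phi_i(\mathbf{x}(\mathbf{t}), \mathbf{t}) < 0,\ i = m_1 + 1, \dots, m_0.
\]
By virtue of the optimality of the process $x(t)$, $u(t)$, $\mathbf{t}$, the point $\varepsilon = 0$ will be a solution to the nonlinear programming problem (NP1-problem)
\[
\begin{aligned}
&\Delta J_0(\varepsilon, \Pi_s) = \sum_{j=1}^{s} a_{0j} \varepsilon_j + o(\|\varepsilon\|) \to \min, \\
&\Delta J_i(\varepsilon, \Pi_s) = \sum_{j=1}^{s} a_{ij} \varepsilon_j + o(\|\varepsilon\|) \begin{cases} \le 0, & i = 1, \dots, m_1, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases} \\
&\varepsilon_j \ge 0,\quad j = 1, \dots, s.
\end{aligned}
\]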
Applying Theorem 12.1 to the NP1-problem, we obtain the primary necessary condition of optimality: for an optimal process x(t), u(t), t, for any ΠS ⊂ Π, there s s s is a Lagrange vector e λ ¼ eλ0 , . . . , eλm satisfying the conditions s s e λi 0, i ¼ 0, . . . , m0 , λ 6¼ 0, e s e λi Φi ðxðtÞ, tÞ ¼ 0, i ¼ 1, . . . , m0 , s
λ , π j 0, j ¼ 1, . . . , s N1 e
with the function
m s X s e λ ,πj ¼ λi aij : N1 e
ð13:7Þ
ð13:8Þ
i¼0
of By construction, the Lagrange vector eλ depends on a set Πs of n sparameters o e variation. Following the Sect. 12.5, we refer a sequence of vectors λ converging s
to a vector λ 6¼ 0 to the right sequence of sets {Πs}. Taking in relations (13.7) the limit by s, we obtain λ 6¼ 0, λi 0, i ¼ 0, . . . , m0 , λi Φi ðxðtÞ, tÞ ¼ 0, i ¼ 1, . . . , m0 , N 1 ðλ, π Þ 0, π 2 Π:
ð13:9Þ
Here π ¼ (δx0, δt0, δt1, δt2, τ, v) is any point of a set Πs, whose coordinate τ is different from t1, t2 and the break points of control u(t) on the segment [t0, t2]. The function N1(λ, π) is obtained s by substituting the expressions (13.6) into the formula (13.8) and replacing e λ , π j by (λ, π). As a result, we have N 1 ðλ, π Þ ¼½Lx0 ðxðtÞ, tÞ ψ ðt 0 , λÞ0 δx0 þ ½Lt0 ðλ, xðtÞ, tÞ þ H ðψ ðt 0 , λÞ, xðt 0 Þ, uðt 0 Þ, t 0 Þδt 0 þ
2 X
Ltk ðλ, xðtÞ, tÞ þ Lxk ðλ, xðtÞ, tÞ0 f ðxðt k Þ, uðt k Þ, t k Þ Δt k
k¼1
Δv H ðψ ðτ, λÞ, xðτÞ, uðτÞ, τÞ, where ψ ðt, λÞ ¼
m X i¼0
m X λi ψ i ðt Þ, L λ, x0 , x1 , x2 , t 0 , t 1 , t 2 ¼ λi Φi x0 , x1 , x2 , t 0 , t 1 , t 2 : i¼0
Multiplying each equality (13.4) by $\lambda_i$ and summing over $i = 0, \dots, m$, we verify that the function $\psi(t, \lambda)$ is a solution to the Cauchy problem
\[ \dot\psi = -H_x(\psi, x(t), u(t), t), \quad \psi(t_2) = -L_{x_2}(\lambda, x(\mathbf{t}), \mathbf{t}), \quad \psi(t_1 - 0) = \psi(t_1 + 0) - L_{x_1}(\lambda, x(\mathbf{t}), \mathbf{t}). \tag{13.10} \]
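Since $\pi \in \Pi$ is arbitrary and the function $N_1(\lambda, \pi)$ is piecewise continuous in $\tau$, the last inequality in (13.9) implies the transversality conditions
\[ L_{x_0}(\lambda, x(\mathbf{t}), \mathbf{t}) - \psi(t_0, \lambda) = 0, \qquad L_{t_0}(\lambda, x(\mathbf{t}), \mathbf{t}) + H(\psi(t_0, \lambda), x(t_0), u(t_0), t_0) = 0, \]
\[ L_{t_k}(\lambda, x(\mathbf{t}), \mathbf{t}) + L_{x_k}(\lambda, x(\mathbf{t}), \mathbf{t})' f(x(t_k), u(t_k), t_k) = 0, \quad k = 1, 2, \]
and the condition for the maximum of the Hamiltonian
\[ \Delta_v H(\psi(\tau, \lambda), x(\tau), u(\tau), \tau) \le 0, \quad v \in U, \; \tau \in [t_0, t_2]. \tag{13.11} \]
Taking into account the conditions (13.10), the transversality conditions can be written in the form
\[ \psi(t_0, \lambda) = L_{x_0}(\lambda, x(\mathbf{t}), \mathbf{t}), \qquad \dot L_{t_k}(\lambda, x(\mathbf{t}), \mathbf{t}) = 0, \quad k = 0, 1, 2. \tag{13.12} \]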
Conclusion 13.1 If the process $x(t), u(t), \mathbf{t}$ of the IS1-problem is optimal and the control $u(t)$ is continuous at the point $t_1$, then there exist a vector $\lambda$ and a solution $\psi(t, \lambda)$ of the conjugate system (13.10), continuous for $t \ne t_1$, for which conditions (13.9), (13.11), (13.12) hold.
13.3 Maximum Principle for the Problem with an Intermediate State
Let us strengthen Conclusion 13.1 by dropping the assumption that the optimal control $u(t)$ is continuous at the point $t_1$. To do this, we associate with the IS1-problem a new "two-stage" IS2-problem
\[ J_0 = \Phi_0(x^1(t_0), x^1(t_1), x^2(t_2), t_0, t_1, t_2) \to \min, \]
\[ J_i = \Phi_i(x^1(t_0), x^1(t_1), x^2(t_2), t_0, t_1, t_2) \; \begin{cases} \le 0, & i = 1, \dots, m_0, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases} \]
\[ x^1(t_1) - x^2(t_1) = 0, \]
\[ \dot x^1 = f(x^1, u^1, t), \; u^1 \in U, \qquad \dot x^2 = f(x^2, u^2, t), \; u^2 \in U, \qquad t_0 < t_1 < t_2. \]
A correspondence can be established between the processes of the IS1- and IS2-problems. Namely, to each process
\[ x(t), \; u(t), \; t_0, t_1, t_2 \tag{13.13} \]
of the IS1-problem there corresponds the process
\[ \begin{aligned} & x^1(t) = x(t), \; t_0 \le t \le t_1, \qquad x^2(t) = x(t), \; t_1 \le t \le t_2; \\ & u^1(t) = u(t), \; t_0 \le t < t_1, \qquad u^2(t) = u(t), \; t_1 \le t \le t_2; \qquad t_0, t_1, t_2 \end{aligned} \tag{13.14} \]
of the IS2-problem. Conversely, to each process
\[ x^1(t), \; x^2(t), \; u^1(t), \; u^2(t), \; t_0, t_1, t_2 \tag{13.15} \]
of the IS2-problem there corresponds the process
\[ \begin{aligned} & x(t) = x^1(t), \; t_0 \le t \le t_1, \qquad x(t) = x^2(t), \; t_1 \le t \le t_2; \\ & u(t) = u^1(t), \; t_0 \le t < t_1, \qquad u(t) = u^2(t), \; t_1 \le t \le t_2; \qquad t_0, t_1, t_2 \end{aligned} \tag{13.16} \]
of the IS1-problem (Fig. 13.1).

Fig. 13.1 Trajectory x(t) consists of pieces of trajectories x^1(t), x^2(t)

Due to this correspondence between the processes of the IS1- and IS2-problems, an optimal process of one problem goes over to an optimal process of the other. Indeed, let, for example, the process (13.13) be optimal in the IS1-problem while the corresponding process (13.14) is not optimal in the IS2-problem. Then the IS2-problem contains a process (13.15) with a smaller value of the objective functional. Constructing from it the process (13.16) of the IS1-problem with the same value of the objective functional, we obtain a process better than the optimal one in this problem, which is impossible. The same reasoning can be carried out for an optimal process (13.15) of the IS2-problem.

What does this reasoning give? Quite a lot, as it turns out. We can now take the optimal process (13.13) of the IS1-problem with a possible discontinuity of the control $u(t)$ at the point $t_1$, construct from it the optimal process (13.14) of the IS2-problem with controls $u^1(t)$, $u^2(t)$ continuous at the point $t_1$, and write down Conclusion 13.1 for it in the form of the maximum principle.
Let us implement this idea. Let the process (13.13) of the IS1-problem be optimal, where the optimal control $u(t)$ may have a discontinuity at the point $t_1$. We construct the optimal process (13.14) of the IS2-problem from the process (13.13). Since the values of the controls $u^1(t)$ and $u^2(t)$ outside the corresponding intervals $(t_0, t_1)$ and $(t_1, t_2)$ do not affect the admissibility and optimality of the process (13.14), we extend these controls by continuity to $R$, setting
\[ u^k(t) = u^k(t_{k-1} + 0), \; t < t_{k-1}, \qquad u^k(t) = u^k(t_k - 0), \; t > t_k, \qquad k = 1, 2. \]
We continue the function $x^k(t)$ from the segment $[t_{k-1}, t_k]$ to a maximal interval as a non-extendable solution of the system of differential equations
\[ \dot x^k = f(x^k, u^k(t), t), \quad k = 1, 2. \]
Then the extended controls $u^1(t)$, $u^2(t)$ are continuous at the points $t_0, t_1, t_2$, and each extended function $x^k(t)$, $k = 1, 2$, has a continuous derivative at the points $t_{k-1}, t_k$. In order not to complicate the notation, from now on the process (13.15) of the IS2-problem means this extended optimal process. Let us write down the conclusions of the previous section for it. We keep the previous notation for the IS1-problem
\[ \lambda = (\lambda_0, \dots, \lambda_m), \quad H(\psi, x, u, t) = \psi' f(x, u, t), \quad L(\lambda, x_0, x_1, x_2, t_0, t_1, t_2) = \sum_{i=0}^{m} \lambda_i \Phi_i(x_0, x_1, x_2, t_0, t_1, t_2) \tag{13.17} \]
and introduce similar notation for the IS2-problem
\[ (\lambda, \hat\lambda) = (\lambda_0, \dots, \lambda_m, \lambda_{m+1}, \dots, \lambda_{m+n}), \qquad \hat H = H(\psi^1, x^1, u^1, t) + H(\psi^2, x^2, u^2, t), \]
\[ \hat L = L(\lambda, x^1(t_0), x^1(t_1), x^2(t_2), t_0, t_1, t_2) + \hat\lambda'[x^1(t_1) - x^2(t_1)]. \]
The phase vector in the IS2-problem is the pair of vectors $x^1, x^2$ with terminal and intermediate states
\[ \begin{pmatrix} x^1(t_0) \\ x^2(t_0) \end{pmatrix}, \quad \begin{pmatrix} x^1(t_1) \\ x^2(t_1) \end{pmatrix}, \quad \begin{pmatrix} x^1(t_2) \\ x^2(t_2) \end{pmatrix}, \]
and the conjugate vector is the pair of vectors $\psi^1, \psi^2$. According to Conclusion 13.1, if the process (13.15) of the IS2-problem is optimal, then there exist a vector $(\lambda, \hat\lambda)$ and a solution $\psi^1(t), \psi^2(t)$, continuous for $t \ne t_1$, of the conjugate system
\[ \dot\psi^1 = -H_x(\psi^1, x^1(t), u^1(t), t), \qquad \dot\psi^2 = -H_x(\psi^2, x^2(t), u^2(t), t) \]
with the jump condition
\[ \psi^1(t_1 - 0) = \psi^1(t_1 + 0) - L_{x_1} - \hat\lambda, \qquad \psi^2(t_1 - 0) = \psi^2(t_1 + 0) + \hat\lambda, \]
satisfying the requirements

1. nontriviality, nonnegativity and complementary slackness
\[ (\lambda, \hat\lambda) \ne 0, \quad \lambda_i \ge 0, \; i = 0, \dots, m_0, \quad \lambda_i \Phi_i(x^1(t_0), x^1(t_1), x^2(t_2), t_0, t_1, t_2) = 0, \; i = 1, \dots, m_0; \]
2. transversality
\[ \psi^1(t_0) = L_{x_0}, \quad \psi^2(t_0) = 0, \quad \psi^1(t_2) = 0, \quad \psi^2(t_2) = -L_{x_2}, \]
\[ \dot L_{t_0} = 0, \quad \dot L_{t_1} + \hat\lambda'[\dot x^1(t_1) - \dot x^2(t_1)] = 0, \quad \dot L_{t_2} = 0; \]
3. maximum of the Hamiltonian
\[ \Delta_{v^1} H(\psi^1(t), x^1(t), u^1(t), t) + \Delta_{v^2} H(\psi^2(t), x^2(t), u^2(t), t) \le 0, \quad v^1, v^2 \in U, \; t \in [t_0, t_2] \]

(the derivatives of the function $L$ in the transversality conditions are calculated at the point $(\lambda, x^1(t_0), x^1(t_1), x^2(t_2), t_0, t_1, t_2)$).

We proceed to the analysis of these conditions. For $\lambda = 0$, the conjugate equations, the jump condition and the transversality conditions imply $\hat\lambda = 0$, which contradicts the nontriviality of the vector $(\lambda, \hat\lambda)$. Therefore $\lambda \ne 0$. Due to the transversality conditions and the linearity of the conjugate equations, we have
\[ \psi^1(t) = 0, \; t_1 < t \le t_2, \qquad \psi^2(t) = 0, \; t_0 \le t < t_1. \tag{13.18} \]
Hence $\psi^1(t_1 + 0) = 0$ and $\psi^2(t_1 - 0) = 0$. Eliminating the vector $\hat\lambda = -\psi^2(t_1 + 0)$ from the jump condition and from the transversality condition for $t_1$, we obtain
\[ \psi^1(t_1 - 0) = \psi^2(t_1 + 0) - L_{x_1}, \qquad \dot L_{t_1} - \psi^2(t_1 + 0)'[\dot x^1(t_1) - \dot x^2(t_1)] = 0. \]
Since $\dot L_{t_1} = L_{t_1} + L_{x_1}'\dot x^1(t_1)$, the last two equalities give
\[ \begin{aligned} 0 &= \dot L_{t_1} - \psi^2(t_1 + 0)'[\dot x^1(t_1) - \dot x^2(t_1)] \\ &= L_{t_1} + L_{x_1}'\dot x^1(t_1) - \psi^2(t_1 + 0)'[\dot x^1(t_1) - \dot x^2(t_1)] \\ &= L_{t_1} + [L_{x_1} - \psi^2(t_1 + 0)]'\dot x^1(t_1) + \psi^2(t_1 + 0)'\dot x^2(t_1) \\ &= L_{t_1} - \psi^1(t_1 - 0)'\dot x^1(t_1) + \psi^2(t_1 + 0)'\dot x^2(t_1) \\ &= L_{t_1} - H(\psi^1(t_1 - 0), x^1(t_1), u^1(t_1), t_1) + H(\psi^2(t_1 + 0), x^2(t_1), u^2(t_1), t_1). \end{aligned} \]
By virtue of equalities (13.18), the condition for the maximum of the Hamiltonian takes the form
\[ \Delta_{v^1} H(\psi^1(t), x^1(t), u^1(t), t) \le 0, \; v^1 \in U, \; t \in [t_0, t_1], \qquad \Delta_{v^2} H(\psi^2(t), x^2(t), u^2(t), t) \le 0, \; v^2 \in U, \; t \in [t_1, t_2]. \]
It is convenient to formulate the final conclusions in terms of the conjugate function
\[ \psi(t) = \psi^1(t), \; t_0 \le t < t_1; \qquad \psi(t) = \psi^2(t), \; t_1 < t \le t_2 \]
and the optimal process (13.16) of the IS1-problem corresponding to the extended optimal process (13.15) of the IS2-problem. Taking into account a possible discontinuity of the optimal control $u(t)$ at the point $t_1$ and the method of extending the process (13.15), we put
\[ \dot x^1(t_0) = \dot x(t_0), \quad \dot x^1(t_1) = \dot x(t_1 - 0), \quad \dot x^2(t_1) = \dot x(t_1 + 0), \quad \dot x^2(t_2) = \dot x(t_2). \]

Theorem 13.1 (maximum principle for the problem with an intermediate state) If $x(t), u(t), \mathbf{t}$, $\mathbf{t} = (t_0, t_1, t_2)$, is an optimal process of the IS1-problem, then there exist a vector $\lambda = (\lambda_0, \dots, \lambda_m)$ and a solution $\psi(t)$, continuous for $t \ne t_1$, of the conjugate system
\[ \dot\psi = -H_x(\psi, x(t), u(t), t) \]
with the jump condition
\[ \psi(t_1 - 0) = \psi(t_1 + 0) - L_{x_1}(\lambda, x(\mathbf{t}), \mathbf{t}), \]
\[ H(\psi(t_1 - 0), x(t_1), u(t_1 - 0), t_1) = H(\psi(t_1 + 0), x(t_1), u(t_1 + 0), t_1) + L_{t_1}(\lambda, x(\mathbf{t}), \mathbf{t}), \]
satisfying the requirements

1. nontriviality, nonnegativity and complementary slackness
\[ \lambda \ne 0, \quad \lambda_i \ge 0, \; i = 0, \dots, m_0, \quad \lambda_i \Phi_i(x(\mathbf{t}), \mathbf{t}) = 0, \; i = 1, \dots, m_0; \]
2. transversality
\[ \psi(t_0) = L_{x_0}(\lambda, x(\mathbf{t}), \mathbf{t}), \quad \psi(t_2) = -L_{x_2}(\lambda, x(\mathbf{t}), \mathbf{t}), \quad \dot L_{t_0}(\lambda, x(\mathbf{t}), \mathbf{t}) = 0, \quad \dot L_{t_2}(\lambda, x(\mathbf{t}), \mathbf{t}) = 0; \]
3. maximum of the Hamiltonian
\[ H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t), \quad t \in [t_0, t_2] \]

with the functions $H$, $L$ of the form (13.17).
Remark 13.1 If the control $u(t)$ is continuous at the point $t_1$, then the jump condition takes the form
\[ \psi(t_1 - 0) = \psi(t_1 + 0) - L_{x_1}(\lambda, x(\mathbf{t}), \mathbf{t}), \qquad \dot L_{t_1}(\lambda, x(\mathbf{t}), \mathbf{t}) = 0. \]
As can be seen from the above formulation, the additional phase state $x(t_1)$ causes a jump in the conjugate function at the intermediate time $t_1$. The same picture is observed when several intermediate phase states are included in the conditions of the G-problem, while the transversality conditions for the ends of the trajectory and the terminal points of time remain unchanged. Theorem 13.1 includes, in particular, the maximum principle for the G-problem: if the conditions of the IS1-problem do not depend on $x(t_1)$ and $t_1$, then the jump condition turns into the condition of continuity of the conjugate function and the Hamiltonian.
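The jump condition can be tried on a toy two-stage problem. The sketch below is only an illustration under assumptions that are not in the text: the times $t_0 = 0$, $t_1 = 1$, $t_2 = 2$ are fixed (so the time transversality conditions are not used), the state is $(x, y)$ with $\dot x = u$, $\dot y = u^2/2$, $x(0) = y(0) = 0$, $U = [-2, 2]$, $\lambda_0 = 1$, and the objective is $J_0 = y(2) + (x(1) - 1)^2/2$. Here $H = \psi_x u + \psi_y u^2/2$ does not depend on the state, so $\psi$ is constant on each stage, $\psi_y \equiv -1$, $\psi_x(t) = 0$ for $t > 1$, and the jump condition gives $\psi_x(1 - 0) = \psi_x(1 + 0) - (x(1) - 1)$. The code compares the controls predicted by these conditions with a brute-force search over stage-wise constant controls (all function names are hypothetical).

```python
def cost(ua, ub):
    """J0 for constant controls ua on [0, 1) and ub on [1, 2]."""
    x1 = ua * 1.0                      # x(1) = integral of ua over [0, 1]
    y2 = 0.5 * ua ** 2 + 0.5 * ub ** 2  # y(2) = integral of u^2 / 2 over [0, 2]
    return y2 + 0.5 * (x1 - 1.0) ** 2

def predicted_by_jump_condition():
    """Controls obtained from the maximum and jump conditions (lambda_0 = 1)."""
    psi_x_after = 0.0                  # psi_x(2) = 0 and psi_x is constant on (1, 2]
    ub = psi_x_after                   # maximizer of psi_x*u - u^2/2 is u = psi_x
    ua = 1.0 / 2.0                     # solves ua = psi_x(1-0) = 1 - x(1) = 1 - ua
    psi_x_before = psi_x_after - (ua * 1.0 - 1.0)  # jump by -L_{x1} = -(x(1) - 1)
    return ua, ub, psi_x_before, psi_x_after

def grid_minimizer(n=81, lo=-2.0, hi=2.0):
    """Brute-force minimizer of the cost over a grid of constant stage controls."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    best = min((cost(a, b), a, b) for a in grid for b in grid)
    return best[1], best[2]

if __name__ == "__main__":
    print(predicted_by_jump_condition())
    print(grid_minimizer())
```

In this example the conjugate variable $\psi_x$ drops from $1/2$ to $0$ at $t_1$, which is exactly the jump by $L_{x_1} = x(1) - 1 = -1/2$.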
13.4 Discontinuous Systems
By discontinuous systems we mean systems of ordinary differential equations with a piecewise continuous right-hand side. Such systems arise in the control of dynamic processes in a nonhomogeneous environment with different physical properties, for example, in the design of composite or multilayer materials with specified characteristics. Discontinuous systems appear naturally in the theory of optimal control when synthesizing optimal systems, and in the theory of automatic control when the stability of controlled systems is ensured by introducing discontinuous feedbacks.
Some non-smooth control systems can be described in terms of discontinuous systems. For example, the differential equation $\dot x = |x| + u$ is equivalent to an equation with a "discontinuous" right-hand side
\[ \dot x = \begin{cases} -x + u, & x < 0, \\ x + u, & x \ge 0. \end{cases} \]
The theory of discontinuous systems is a whole direction in optimal control, interesting for its approaches, methods and applications. Here we give just one result that illustrates its specifics. Consider a generalization of the general optimal control problem (D-problem)
\[ J_0 = \Phi_0(x(t_0), x(t_1), t_0, t_1) \to \min, \]
\[ J_i = \Phi_i(x(t_0), x(t_1), t_0, t_1) \; \begin{cases} \le 0, & i = 1, \dots, m_0, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases} \]
\[ \dot x = \begin{cases} f^1(x, u, t), & p(x, t) < 0, \\ f^2(x, u, t), & p(x, t) > 0, \end{cases} \qquad u \in U, \quad t_0 < t_1, \]
in which the right-hand side of the system of differential equations has a finite discontinuity on the surface
\[ P = \{(x, t) \in R^n \times R : p(x, t) = 0\}. \]
We take the analytical properties of the functions $f^1(x, u, t)$, $f^2(x, u, t)$ to be the same as those of the function $f(x, u, t)$. We assume that the function $p(x, t)$ belongs to the class $C^1(R^n \times R \to R)$ and that for each pair $(x, t) \in P$, $u \in U$ the following sewing conditions are satisfied:
\[ \dot p^1(x, u, t) = p_x(x, t)' f^1(x, u, t) + p_t(x, t) > 0, \qquad \dot p^2(x, u, t) = p_x(x, t)' f^2(x, u, t) + p_t(x, t) > 0. \tag{13.19} \]
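As a small illustration of the sewing conditions (13.19), the following Python sketch takes the scalar equation $\dot x = |x| + u$ from the example above with the surface $p(x, t) = x$ (the choice of $p$, the sample control sets and all function names are assumptions made for this sketch). On $P$ we have $x = 0$, so both one-sided derivatives equal $u$, and the conditions hold exactly when every admissible control value is positive. For a constant control $u > 0$ and a start $x_0 < 0$, the first branch gives $x(t) = u + (x_0 - u)e^{-t}$, so the surface is reached at $\tau = \ln((u - x_0)/u)$; the code reproduces this time with a simple explicit Euler loop.

```python
import math

def f1(x, u, t):
    """Right-hand side used where p(x, t) < 0, i.e. x < 0."""
    return -x + u

def f2(x, u, t):
    """Right-hand side used where p(x, t) > 0, i.e. x > 0."""
    return x + u

def p(x, t):
    """Switching function; the discontinuity surface is x = 0."""
    return x

def p_dot(f, x, u, t):
    """p_x * f + p_t for p(x, t) = x, where p_x = 1 and p_t = 0."""
    return 1.0 * f(x, u, t) + 0.0

def sewing_holds(u_values, t=0.0):
    """Check (13.19) at the surface point x = 0 for a finite sample of controls."""
    return all(p_dot(f1, 0.0, u, t) > 0 and p_dot(f2, 0.0, u, t) > 0
               for u in u_values)

def crossing_time(x0, u, dt=1e-4, t_max=10.0):
    """Integrate the first branch by explicit Euler until p changes sign."""
    x, t = x0, 0.0
    while p(x, t) < 0 and t < t_max:
        x += dt * f1(x, u, t)
        t += dt
    return t

if __name__ == "__main__":
    print(sewing_holds([0.5, 1.0, 1.5]))   # all controls positive
    print(sewing_holds([-1.0, 1.0]))       # a negative control violates (13.19)
    print(crossing_time(-1.0, 1.0), math.log(2.0))
```

The Euler step is first order, so the computed crossing time agrees with $\ln 2$ only up to an error of the order of `dt`.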
Let us explain the meaning of conditions (13.19). Suppose that a trajectory $x^k(t)$ of the system of equations $\dot x = f^k(x, u(t), t)$, $k = 1, 2$, corresponds to some control $u(t)$ and initial values $(\xi, \tau) \in P$. Then $p(x^k(\tau), \tau) = p(\xi, \tau) = 0$ and, by virtue of conditions (13.19), we have
\[ \frac{d}{dt}\, p(x^k(t), t)\Big|_{t = \tau \pm 0} = \dot p^k(x^k(\tau), u(\tau \pm 0), \tau) > 0. \]
Consequently, the function $t \mapsto p(x^k(t), t)$ increases in a neighborhood of the point $t = \tau$ and changes sign from minus to plus, i.e. the integral curve $(x^k(t), t)$ intersects the surface $P$ at the point $(\xi, \tau)$ without one-sided tangencies (Fig. 13.2).

Fig. 13.2 Integral curves (x^k(t), t), k = 1, 2 intersect the surface P without one-sided tangencies

Let the trajectories $x^k(t)$ be defined, for some $\delta > 0$, on a common interval $(\tau - \delta, \tau + \delta)$. We compose a continuous function $x(t)$ from them, setting
\[ x(t) = x^1(t), \; \tau - \delta < t < \tau, \qquad x(t) = x^2(t), \; \tau \le t < \tau + \delta \]
(Fig. 13.2). By construction $x(\tau) = \xi$. The function $x(t)$ satisfies the conditions $p(x, t) < 0$, $\dot x = f^1(x, u(t), t)$ on the interval $(\tau - \delta, \tau)$ and the conditions $p(x, t) > 0$, $\dot x = f^2(x, u(t), t)$ on the interval $(\tau, \tau + \delta)$. We regard the function $x(t)$ as a solution of the system of differential equations
\[ \dot x = \begin{cases} f^1(x, u(t), t), & p(x, t) < 0, \\ f^2(x, u(t), t), & p(x, t) > 0 \end{cases} \tag{13.20} \]
with discontinuous right-hand side, corresponding to the control $u(t)$. Thus, if the sewing conditions (13.19) are satisfied, then for any fixed control exactly one continuous integral curve of the system (13.20) passes through each point of the surface $P$, and it has no one-sided tangencies with $P$ (Fig. 13.3).

Fig. 13.3 Disposition of integral curves of a discontinuous system relative to the rupture surface when sewing conditions are met

We proceed to the derivation of the necessary conditions for optimality in the D-problem. Consider the processes
\[ x(t), \; u(t), \; t_0, t_1 \tag{13.21} \]
of the D-problem which at times $t_0, \tau, t_1$, $t_0 < \tau < t_1$, satisfy the conditions
\[ p(x(t_0), t_0) < 0, \qquad p(x(\tau), \tau) = 0, \qquad p(x(t_1), t_1) > 0, \]
and suppose that the optimal process is among them. Let us connect the D-problem with the following two-stage problem (T-problem)
\[ J_0 = \Phi_0(x^1(t_0), x^2(t_1), t_0, t_1) \to \min, \]
\[ J_i = \Phi_i(x^1(t_0), x^2(t_1), t_0, t_1) \; \begin{cases} \le 0, & i = 1, \dots, m_0, \\ = 0, & i = m_0 + 1, \dots, m, \end{cases} \]
\[ x^1(\tau) - x^2(\tau) = 0, \qquad p(x^1(\tau), \tau) = 0, \]
\[ \dot x^1 = f^1(x^1, u^1, t), \quad \dot x^2 = f^2(x^2, u^2, t), \quad u^1, u^2 \in U, \quad t_0 < \tau < t_1. \]
In the T-problem, the trajectories of two controlled systems are joined by continuity on the surface $P$ at a non-fixed time $\tau$. The same conditions as in the D-problem are imposed on the left end of the integral curve of the first system and on the right end of the integral curve of the second system. The process (13.21) of the D-problem corresponds to the process
\[ x^1(t), \; x^2(t), \; u^1(t), \; u^2(t), \; t_0, \tau, t_1 \tag{13.22} \]
of the T-problem and conversely. The correspondence formulas are similar to (13.13)–(13.16) if we replace $t_1$ by $\tau$ and $t_2$ by $t_1$. By virtue of this correspondence, the optimal process (13.22) of the T-problem corresponds to the optimal process (13.21) of the D-problem. We extend the optimal process in the manner indicated in Sect. 13.3, continuing the pair $x^1(t), u^1(t)$ to the exterior of the interval $(t_0, \tau)$ and the pair $x^2(t), u^2(t)$ to the exterior of the interval $(\tau, t_1)$. The values of the functionals $J_0, \dots, J_m$ on the extended process do not change, so we retain the notation (13.22) for it.

Let us write down the necessary optimality conditions for the extended optimal process (13.22) of the T-problem (Theorem 13.1 and Remark 13.1). We put
\[ \begin{aligned} & \lambda = (\lambda_0, \dots, \lambda_m), \quad \lambda^1 = (\lambda_{m+1}, \dots, \lambda_{m+n}), \quad \mu \in R, \\ & H^k(\psi, x, u, t) = \psi' f^k(x, u, t), \; k = 1, 2, \qquad L(\lambda, x_0, x_1, t_0, t_1) = \sum_{i=0}^{m} \lambda_i \Phi_i(x_0, x_1, t_0, t_1), \\ & \tilde H = H^1(\psi^1, x^1, u^1, t) + H^2(\psi^2, x^2, u^2, t), \\ & \tilde L = L(\lambda, x^1(t_0), x^2(t_1), t_0, t_1) + \lambda^{1\prime}[x^1(\tau) - x^2(\tau)] + \mu\, p(x^1(\tau), \tau). \end{aligned} \tag{13.23} \]
According to the maximum principle, there exist vectors $\lambda$, $\lambda^1$, a number $\mu$ and solutions $\psi^1(t)$, $\psi^2(t)$, continuous for $t \ne \tau$, of the conjugate system of differential equations
\[ \dot\psi^1 = -H^1_x(\psi^1, x^1(t), u^1(t), t), \qquad \dot\psi^2 = -H^2_x(\psi^2, x^2(t), u^2(t), t) \]
with the jump condition
\[ \psi^1(\tau - 0) = \psi^1(\tau + 0) - \lambda^1 - \mu\, p_x(x^1(\tau), \tau), \qquad \psi^2(\tau - 0) = \psi^2(\tau + 0) + \lambda^1, \]
\[ \lambda^{1\prime}[\dot x^1(\tau) - \dot x^2(\tau)] + \mu\, \dot p^1(x^1(\tau), u^1(\tau), \tau) = 0, \]
satisfying the requirements

1. nontriviality, nonnegativity and complementary slackness
\[ (\lambda, \lambda^1, \mu) \ne 0, \quad \lambda_i \ge 0, \; i = 0, \dots, m_0, \quad \lambda_i \Phi_i(x^1(t_0), x^2(t_1), t_0, t_1) = 0, \; i = 1, \dots, m_0; \]
2. transversality
\[ \psi^1(t_0) = L_{x_0}, \quad \psi^1(t_1) = 0, \quad \psi^2(t_0) = 0, \quad \psi^2(t_1) = -L_{x_1}, \quad \dot L_{t_0} = 0, \quad \dot L_{t_1} = 0; \]
3. maximum of the Hamiltonian
\[ \Delta_{v^1} H^1(\psi^1(t), x^1(t), u^1(t), t) + \Delta_{v^2} H^2(\psi^2(t), x^2(t), u^2(t), t) \le 0, \quad v^1, v^2 \in U, \; t \in [t_0, t_1] \]

(the derivatives of the function $L$ in the transversality conditions are calculated at the point $(\lambda, x^1(t_0), x^2(t_1), t_0, t_1)$).

We analyze these conditions. For $\lambda = 0$, the conjugate equations, the jump condition and the transversality conditions imply $\lambda^1 = 0$, $\mu = 0$, which contradicts the requirement $(\lambda, \lambda^1, \mu) \ne 0$. Therefore $\lambda \ne 0$. Due to the transversality conditions and the linearity of the conjugate equations, we have
\[ \psi^1(t) = 0, \; \tau < t \le t_1; \qquad \psi^2(t) = 0, \; t_0 \le t < \tau. \tag{13.24} \]
Then, in the jump condition for the conjugate functions,
\[ \psi^1(\tau + 0) = 0, \qquad \psi^2(\tau - 0) = 0, \qquad \psi^2(\tau + 0) = -\lambda^1. \]
Eliminating the vector $\lambda^1$ from the jump condition and from the transversality condition for $\tau$, we obtain
\[ \psi^1(\tau - 0) = \psi^2(\tau + 0) - \mu\, p_x(x^1(\tau), \tau), \qquad -\psi^2(\tau + 0)'[\dot x^1(\tau) - \dot x^2(\tau)] + \mu\, \dot p^1(x^1(\tau), u^1(\tau), \tau) = 0. \]
By virtue of equalities (13.24), the condition for the maximum of the Hamiltonian takes the form
\[ \Delta_{v^1} H^1(\psi^1(t), x^1(t), u^1(t), t) \le 0, \; v^1 \in U, \; t \in [t_0, \tau], \qquad \Delta_{v^2} H^2(\psi^2(t), x^2(t), u^2(t), t) \le 0, \; v^2 \in U, \; t \in [\tau, t_1]. \]
It is convenient to formulate the final conclusions in terms of the conjugate function
\[ \psi(t) = \begin{cases} \psi^1(t), & t_0 \le t < \tau, \\ \psi^2(t), & \tau < t \le t_1, \end{cases} \]
the Hamiltonian
\[ H(\psi, x, u, t) = \begin{cases} \psi' f^1(x, u, t), & p(x, t) < 0, \\ \psi' f^2(x, u, t), & p(x, t) > 0 \end{cases} \tag{13.25} \]
of the discontinuous system (13.20) and the optimal process (13.21) of the D-problem corresponding to the extended optimal process (13.22) of the T-problem.

Theorem 13.2 (maximum principle for the D-problem) Let the D-problem satisfy the sewing conditions (13.19) and let $x(t), u(t), t_0, t_1$ be an optimal process of this problem whose integral curve intersects the discontinuity surface $p(x, t) = 0$ at a time $\tau \in (t_0, t_1)$. Then there exist a vector $\lambda$ and a solution $\psi(t)$, continuous for $t \ne \tau$, of the conjugate system
\[ \dot\psi = -H_x(\psi, x(t), u(t), t) \]
with the jump condition
\[ \psi(\tau - 0) = \psi(\tau + 0) - \mu\, p_x(x(\tau), \tau), \qquad \mu = \frac{\psi(\tau + 0)'[f^1(x(\tau), u(\tau), \tau) - f^2(x(\tau), u(\tau), \tau)]}{\dot p^1(x(\tau), u(\tau), \tau)} \tag{13.26} \]
for which the following conditions hold:

1. nontriviality, nonnegativity and complementary slackness
\[ \lambda \ne 0, \quad \lambda_i \ge 0, \; i = 0, \dots, m_0, \quad \lambda_i \Phi_i(x(t_0), x(t_1), t_0, t_1) = 0, \; i = 1, \dots, m_0; \]
2. transversality
\[ \psi(t_0) = L_{x_0}(\lambda, x(t_0), x(t_1), t_0, t_1), \quad \psi(t_1) = -L_{x_1}(\lambda, x(t_0), x(t_1), t_0, t_1), \]
\[ \dot L_{t_0}(\lambda, x(t_0), x(t_1), t_0, t_1) = 0, \quad \dot L_{t_1}(\lambda, x(t_0), x(t_1), t_0, t_1) = 0; \]
3. maximum of the Hamiltonian
\[ H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t), \quad t \in [t_0, t_1] \]
with the functions $L$, $H$ of the form (13.23), (13.25).

The statement is proved by the previous reasoning. As can be seen from Theorem 13.2, the discontinuity of the right-hand side of the differential equations causes a jump in the solution of the conjugate system at the moment when the integral curve is sewn on the discontinuity surface. The jump occurs in the direction of the gradient $p_x$. The existence of a factor $\mu$ in the jump condition was pointed out in the fundamental monograph [13]. The formula (13.26) for the calculation of $\mu$ was obtained in [5].

Theorem 13.2 was derived under rather stringent assumptions: the sewing conditions must be satisfied over the entire discontinuity surface for any control. This is due to the method of proof, namely the reduction of the original problem to a problem with an intermediate condition. If we use the method of increments and the implicit function theorem directly, then we can get rid of this assumption [2] and require the sewing conditions only at the point where the optimal trajectory intersects the discontinuity surface. We also note that other possible cases of mutual position of the optimal trajectory and the discontinuity surface have remained outside our consideration. The corresponding results can be found in the monograph [2].

Exercise Set

1. Show the sufficiency of the maximum principle in the linear convex IS1-problem with a fixed intermediate time.

2. Let us call two (or more) systems of differential equations a composite system if they are given on adjacent time intervals and have conditions for the conjugation of trajectories at the ends of the intervals. For example, the initial values of the next system are some functions of the right end of the trajectory of the previous system. Formulate a two-stage two-point performance problem with the condition of continuous conjugation of trajectories at some non-fixed time $t_1 \in (t_0, t_2)$. Apply the maximum principle for the IS1-problem to this problem.

3. The problem of optimal control of a discrete system
\[ \Phi(x(N)) \to \min, \quad x(k+1) = f_k(x(k), u(k)), \; k = 1, \dots, N-1, \quad x(1) = 0, \quad u(k) \in [0, 1]^r, \; k = 1, \dots, N-1 \]
can be reduced to an IS1-problem
\[ \Phi(x^N(2)) \to \min, \quad x^{k+1}(1) - f_k(x^k(1), y^k(1)) = 0, \; k = 1, \dots, N-1, \quad x^1(0) = 0, \]
\[ y^k(1) \in [0, 1]^r, \; k = 1, \dots, N-1, \qquad \dot x^k = 0, \; \dot y^k = 0, \; k = 1, \dots, N \]
with moments of time $t_0 = 0$, $t_1 = 1$, $t_2 = 2$ and phase states $x^k(t)$, $y^k(t)$, $k = 1, \dots, N$. Apply Theorem 13.1 to the reduced problem. Check whether there is a recursive connection between the Lagrange multipliers corresponding to the equality constraints. Formulate the necessary conditions of optimality in terms of the initial problem.

4. Let the objective functional, the constraints and the system of differential equations of the IS1-problem include a vector of parameters $w$ with values in a given set $W \subset R^q$. Find the increment of the functional corresponding to the parameters $w$ and $\tilde w = w + \Delta w$. To ensure the condition $\tilde w \in W$, the increment $\Delta w$ can be chosen from the conical approximation $K(w, W)$ of the set $W$ at the point $w$. By definition, for any finite set of vectors $\delta w^1, \dots, \delta w^s$ from $K(w, W)$ there exists a vector function $o(\|\varepsilon\|)$ such that
\[ w + \sum_{j=1}^{s} \varepsilon_j\, \delta w^j + o(\|\varepsilon\|) \in W \]
for all vectors $\varepsilon = (\varepsilon_1, \dots, \varepsilon_s) \ge 0$ of small norm. The further scheme for obtaining the maximum principle remains unchanged. A necessary condition for the optimality of the parameter $w$ is the inequality $g'\delta w \ge 0$ for all $\delta w \in K(w, W)$, where $g$ is a vector expressed in terms of the problem data.
Chapter 14 Extremals Field Theory

Abstract We introduce the concept of a "field of extremals" and prove the main statement of the chapter on the optimality of extremals: the sufficiency of the maximum principle. We also consider the application of extremals field theory to the problem of constructing invariant systems.

14.1 Specifying of the Problem
The subject of our attention is the sufficiency of the maximum principle. We have already touched on it in relation to linear-convex optimal control problems, where the proof of sufficiency used the specifics of the linear-convex problem. Now we get acquainted with another technique [7] for inverting the necessary optimality conditions, using the concepts of an extremal and a field of extremals, and an exact formula for large increments of a functional. In presenting the theory, we deviate somewhat from the original [7]: instead of the concept of L-continuity of the field of extremals we use the simpler concept of the Lipschitz property of this field. Along the way, we will consider [6] the important applied problem of invariance, that is, the independence of a functional from perturbations acting on the motion of a controlled object.

It is convenient to present field theory in the particular case of the general problem (PG-problem)
\[ J_0 = \Phi_0(x(t_1), t_1) \to \min, \qquad J_i = \Phi_i(x(t_1), t_1) = 0, \; i = 1, \dots, m, \]
\[ \dot x = f(x, u, t), \quad x(t_0) = x_0, \quad u \in U, \quad t_0 \le t_1 \]
with a fixed left end of the integral curve. In this problem it is required to transfer the controlled object from the initial state $(x_0, t_0)$ to the terminal manifold
\[ M = \{(x, t) \in R^n \times R : \Phi_i(x, t) = 0, \; i = 1, \dots, m\}, \quad m \le n, \]
along an integral curve of the system of differential equations with the minimum value of the objective functional. The unknowns are the control, the trajectory and the moment of time $t_1$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7_14
14.2 Field of Extremals

We call a process that satisfies the maximum principle for the PG-problem an extremal process, and its integral curve an extremal. According to the maximum principle, for an extremal process $x(t), u(t), t_0, t_1$ there exist a vector $\lambda = (\lambda_0, \dots, \lambda_m) \ne 0$ and a continuous solution $\psi(t)$ of the conjugate system
\[ \dot\psi = -H_x(\psi, x(t), u(t), t) \]
satisfying the conditions
\[ \psi(t_1) = -L_{x_1}(\lambda, x(t_1), t_1), \qquad \dot L_{t_1}(\lambda, x(t_1), t_1) = 0, \]
\[ H(\psi(t), x(t), u(t), t) = \max_{u \in U} H(\psi(t), x(t), u, t), \quad t \in [t_0, t_1] \]
with the functions
\[ H(\psi, x, u, t) = \psi' f(x, u, t), \qquad L(\lambda, x_1, t_1) = \sum_{i=0}^{m} \lambda_i \Phi_i(x_1, t_1). \]
Since $(x(t_0), t_0) = (x_0, t_0)$, we say that the extremal comes out of, or emanates from, the point $(x_0, t_0)$. If in the conditions of the PG-problem we replace the fixed values $(x_0, t_0)$ by parameters $\xi, \tau$, then we can consider the extremals emanating from different points of the space $R^n \times R$ (Fig. 14.1). The set $E$ of all points $(\xi, \tau)$ from which extremals come out is called the field of extremals. Obviously, at least one extremal passes through each point of $E$, each extremal lies entirely in the field $E$, and $M \subset E$.

Fig. 14.1 Formation of the field of extremals E

With a point $(\xi, \tau) \in E$ we associate an extremal process $x(t; \xi, \tau), u(t; \xi, \tau), \tau, t_1(\xi, \tau)$ with the corresponding Lagrange vector $\lambda(\xi, \tau)$ and the solution $\psi(t; \xi, \tau)$ of the conjugate system of differential equations. We define on the set $E$ the supporting functions of the field
\[ \begin{aligned} & V(\xi, \tau) = \Phi_0(x(t_1(\xi, \tau); \xi, \tau), t_1(\xi, \tau)), \qquad \hat u(\xi, \tau) = u(\tau; \xi, \tau), \\ & \hat\psi(\xi, \tau) = \psi(\tau; \xi, \tau), \qquad \hat H(\xi, \tau) = H(\hat\psi(\xi, \tau), \xi, \hat u(\xi, \tau), \tau). \end{aligned} \tag{14.1} \]
Consider two extremal processes
\[ x(t; \xi, \tau), \; u(t; \xi, \tau), \; \tau, \; t_1(\xi, \tau), \qquad x(t; \tilde\xi, \tilde\tau), \; u(t; \tilde\xi, \tilde\tau), \; \tilde\tau, \; t_1(\tilde\xi, \tilde\tau), \tag{14.2} \]
whose extremals emanate from the points $(\xi, \tau)$ and $(\tilde\xi, \tilde\tau)$, which are close to each other. If necessary, by virtue of the maximum principle, we extend the processes (14.2) to the common time interval
\[ I(\xi, \tau; \tilde\xi, \tilde\tau) = \bigl[\min\{\tau, \tilde\tau\}, \; \max\{t_1(\xi, \tau), t_1(\tilde\xi, \tilde\tau)\}\bigr] \]
and introduce the following characteristic of the proximity of these processes:
\[ \begin{aligned} h(\tilde\xi, \tilde\tau; \xi, \tau) ={}& \|\tilde\xi - \xi\| + |\tilde\tau - \tau| + |t_1(\tilde\xi, \tilde\tau) - t_1(\xi, \tau)| \\ &+ \int_{I(\xi, \tau; \tilde\xi, \tilde\tau)} \|\Delta_{u(t; \tilde\xi, \tilde\tau)} f(x(t; \xi, \tau), u(t; \xi, \tau), t)\|\, dt + \int_{I(\xi, \tau; \tilde\xi, \tilde\tau)} \|\Delta_{u(t; \tilde\xi, \tilde\tau)} f_x(x(t; \xi, \tau), u(t; \xi, \tau), t)\|\, dt. \end{aligned} \]
The field of extremals is said to be Lipschitz if there is a constant $C > 0$ such that for any points $(\tilde\xi, \tilde\tau), (\xi, \tau) \in E$ the inequality
\[ h(\tilde\xi, \tilde\tau; \xi, \tau) < C\bigl(\|\tilde\xi - \xi\| + |\tilde\tau - \tau|\bigr) \]
holds, and normal if the Lagrange multiplier $\lambda_0(\xi, \tau) = 1$ for all $(\xi, \tau) \in E$.
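To make the supporting functions (14.1) concrete, here is a minimal Python sketch for a scalar example that is not taken from the text: $\dot x = ax + u$, $|u| \le 1$, $J_0 = c\,x(t_1)$ with $c > 0$, and the terminal manifold $t = T$ (so $\Phi_1(x, t) = t - T$). The constants `A`, `C`, `T` and all function names are assumptions of the sketch. With $\lambda_0 = 1$ the conjugate system is $\dot\psi = -a\psi$, $\psi(T) = -c$, so $\psi < 0$ and the Hamiltonian $\psi(ax + u)$ is maximal at $u = -1$; every point $(\xi, \tau)$ with $\tau \le T$ therefore belongs to the field. The code evaluates $V$, $\hat u$, $\hat\psi$, $\hat H$ in closed form and checks $V$ against a direct Runge-Kutta integration of the extremal.

```python
import math

# Hypothetical problem data: x' = A*x + u, |u| <= 1, J0 = C*x(T), manifold t = T.
A, C, T = 0.5, 2.0, 1.0

def u_hat(xi, tau):
    """Extremal control at the start point; psi < 0 for C > 0, so u = -1."""
    return -1.0 if C > 0 else 1.0

def psi_hat(xi, tau):
    """psi(tau; xi, tau) for psi' = -A*psi with psi(T) = -C (lambda_0 = 1)."""
    return -C * math.exp(A * (T - tau))

def x_extremal(t, xi, tau):
    """Closed-form extremal trajectory x(t; xi, tau) with constant control."""
    u = u_hat(xi, tau)
    e = math.exp(A * (t - tau))
    return e * xi + u * (e - 1.0) / A

def V(xi, tau):
    """Supporting function V: objective value on the extremal from (xi, tau)."""
    return C * x_extremal(T, xi, tau)

def H_hat(xi, tau):
    """Hamiltonian evaluated on the field at the point (xi, tau)."""
    return psi_hat(xi, tau) * (A * xi + u_hat(xi, tau))

def V_by_integration(xi, tau, n=2000):
    """Same V, obtained by classical RK4 integration of the extremal."""
    h = (T - tau) / n
    x, u = xi, u_hat(xi, tau)
    f = lambda x: A * x + u
    for _ in range(n):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return C * x

if __name__ == "__main__":
    print(V(0.3, 0.2), V_by_integration(0.3, 0.2))
    print(u_hat(0.3, 0.2), psi_hat(0.3, 0.2), H_hat(0.3, 0.2))
```

In this example $\lambda_0 = 1$ at every point, so the field is normal in the sense of the definition above.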
14.3 Exact Formula for Large Increments of a Functional

We return to the PG-problem with fixed initial values $x_0, t_0$ and consider its extremal process $x(t), u(t), t_0, t_1$ and an arbitrary process $\tilde x(t), \tilde u(t), t_0, \tilde t_1$ whose integral curve lies entirely in $E$ (Fig. 14.2). Let us determine the increment
\[ \Delta J_0 = \Phi_0(\tilde x(\tilde t_1), \tilde t_1) - \Phi_0(x(t_1), t_1) \]
of the objective functional on these processes. We assume that the field of extremals $E$ is Lipschitz, normal, and open, so that extremals can be continued locally. By the definition of the supporting function (14.1), we have $\Phi_0(\tilde x(\tilde t_1), \tilde t_1) = V(\tilde x(\tilde t_1), \tilde t_1)$ and $\Phi_0(x(t_1), t_1) = V(x(t_0), t_0) = V(\tilde x(t_0), t_0)$. Therefore, the increment $\Delta J_0$ is equal to the increment of the supporting function of the field at the ends of the integral curve $(\tilde x(t), t)$:
\[ \Delta J_0 = V(\tilde x(\tilde t_1), \tilde t_1) - V(\tilde x(t_0), t_0). \]
Divide the segment $[t_0, \tilde t_1]$ by the points $t_0 = \tau_0 < \tau_1 < \dots < \tau_{s+1} = \tilde t_1$ into $s + 1$ partial segments $[\tau_k, \tau_{k+1}]$ of equal length $\Delta\tau = \tau_{k+1} - \tau_k$, $k = 0, \dots, s$. Then we can write
\[ \Delta J_0 = V(\tilde x(\tilde t_1), \tilde t_1) - V(\tilde x(t_0), t_0) = \sum_{k=0}^{s} [V(\tilde x(\tau_{k+1}), \tau_{k+1}) - V(\tilde x(\tau_k), \tau_k)]. \tag{14.3} \]
Let us label the extremals emanating from the points $(\tilde x(\tau_k), \tau_k)$, and the functions connected with them, by the index $k$:
\[ \begin{aligned} & \lambda^k = \lambda(\tilde x(\tau_k), \tau_k), \quad \psi^k(t) = \psi(t; \tilde x(\tau_k), \tau_k), \quad x^k(t) = x(t; \tilde x(\tau_k), \tau_k), \\ & u^k(t) = u(t; \tilde x(\tau_k), \tau_k), \quad t_{1k} = t_1(\tilde x(\tau_k), \tau_k), \quad k = 0, \dots, s \end{aligned} \tag{14.4} \]
(Fig. 14.3).

Fig. 14.2 Integral curves of the extremal and arbitrary process in E emanating from a point (x_0, t_0)

Fig. 14.3 Extremals in E for determining the increment of the objective functional

Since the field of extremals is normal and the functions $\Phi_1, \dots, \Phi_m$ are equal to zero on the terminal manifold $M$, we have
\[ V(\tilde x(\tau_k), \tau_k) = L(\lambda^k, x^k(t_{1k}), t_{1k}), \qquad V(\tilde x(\tau_{k+1}), \tau_{k+1}) = L(\lambda^k, x^{k+1}(t_{1,k+1}), t_{1,k+1}). \tag{14.5} \]
Therefore, from (14.3), (14.5) we have
\[ \Delta J_0 = \sum_{k=0}^{s} \Delta L_k, \qquad \Delta L_k = L(\lambda^k, x^{k+1}(t_{1,k+1}), t_{1,k+1}) - L(\lambda^k, x^k(t_{1k}), t_{1k}), \quad k = 0, \dots, s. \tag{14.6} \]
We pick out the terms of the first order of smallness in the increments $\Delta L_k$ using formula (12.1). In this formula we take
\[ \Phi_i(x_0, x_1, t_0, t_1) = L(\lambda^k, x_1, t_1), \quad (x(t), u(t), t_0, t_1) = (x^k(t), u^k(t), \tau_k, t_{1k}), \quad (\tilde x(t), \tilde u(t), \tilde t_0, \tilde t_1) = (x^{k+1}(t), u^{k+1}(t), \tau_{k+1}, t_{1,k+1}). \]
Then
\[ \begin{aligned} \Delta L_k ={}& -\psi^k(\tau_k)'[\tilde x(\tau_{k+1}) - \tilde x(\tau_k)] + H(\psi^k(\tau_k), x^k(\tau_k), u^{k+1}(\tau_k), \tau_k)\,\Delta\tau \\ &+ [L_{t_1}(\lambda^k, x^k(t_{1k}), t_{1k}) - H(\psi^k(t_{1k}), x^k(t_{1k}), u^{k+1}(t_{1k}), t_{1k})]\,(t_{1,k+1} - t_{1k}) \\ &- \int_{\tau_k}^{t_{1k}} \Delta_{u^{k+1}(t)} H(\psi^k(t), x^k(t), u^k(t), t)\, dt + o(h_k), \end{aligned} \tag{14.7} \]
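where
\[ \begin{aligned} h_k ={}& \|\tilde x(\tau_{k+1}) - \tilde x(\tau_k)\| + \Delta\tau + |t_{1,k+1} - t_{1k}| + \int_{I_k} \|\Delta_{u^{k+1}} f(x^k(t), u^k(t), t)\|\, dt \\ &+ \int_{I_k} \|\Delta_{u^{k+1}} f_x(x^k(t), u^k(t), t)\|\, dt, \qquad I_k = [\min\{\tau_k, \tau_{k+1}\}, \max\{t_{1k}, t_{1,k+1}\}]. \end{aligned} \tag{14.8} \]
Changing the limits of integration in the integral in (14.7),
\[ \int_{\tau_k}^{t_{1k}} \Delta_{u^{k+1}(t)} H\, dt = \int_{\tau_k}^{\tau_{k+1}} \Delta_{u^{k+1}(t)} H\, dt + \int_{\tau_{k+1}}^{t_{1,k+1}} \Delta_{u^{k+1}(t)} H\, dt + \int_{t_{1,k+1}}^{t_{1k}} \Delta_{u^{k+1}(t)} H\, dt, \]
and applying the mean value theorem to the compensating integrals, we obtain
\[ \begin{aligned} \Delta L_k ={}& -\psi^k(\tau_k)'[\tilde x(\tau_{k+1}) - \tilde x(\tau_k)] + H(\psi^k(\tau_k), x^k(\tau_k), u^k(\tau_k), \tau_k)\,\Delta\tau \\ &+ [L_{t_1}(\lambda^k, x^k(t_{1k}), t_{1k}) - H(\psi^k(t_{1k}), x^k(t_{1k}), u^k(t_{1k}), t_{1k})]\,(t_{1,k+1} - t_{1k}) \\ &- \int_{\tau_{k+1}}^{t_{1,k+1}} \Delta_{u^{k+1}(t)} H(\psi^k(t), x^k(t), u^k(t), t)\, dt + o(h_k). \end{aligned} \]
Due to the extremality of the process $x^k(t), u^k(t), \tau_k, t_{1k}$ (the bracket multiplying $t_{1,k+1} - t_{1k}$ vanishes by the transversality condition, and $\Delta_{u^{k+1}(t)} H \le 0$ by the maximum condition), this implies
\[ \begin{aligned} \Delta L_k &\ge -\psi^k(\tau_k)'[\tilde x(\tau_{k+1}) - \tilde x(\tau_k)] + H(\psi^k(\tau_k), x^k(\tau_k), u^k(\tau_k), \tau_k)\,\Delta\tau + o(h_k) \\ &= \bigl[-\psi^k(\tau_k)'\dot{\tilde x}(\tau_k) + H(\psi^k(\tau_k), x^k(\tau_k), u^k(\tau_k), \tau_k)\bigr]\Delta\tau + o(h_k). \end{aligned} \tag{14.9} \]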
From (14.6) and (14.9) we find a lower bound for the increment of the objective functional ΔJ 0
s X
"
s X ¼ ψ k ðτk Þ0ex_ ðτk Þ þ H ψ k ðτk Þ, xk ðτk Þ, uk ðτk Þ, τk Þ Δτ þ oðhk Þ:
k¼0
k¼0
ð14:10Þ In the same way, we estimate the increment ΔJ 0 ¼
s X L λkþ1 , xk ðt 1k Þ, t 1k L λkþ1 , xkþ1 ðt 1,kþ1 Þ, t 1,kþ1 k¼0
14.3
Exact Formula for Large Increments of a Functional
191
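considering now $x^{k+1}(t), u^{k+1}(t), \tau_{k+1}, t_{1,k+1}$ as the basic process. We get

$$\Delta J_0 \le -\sum_{k=0}^{s} \bigl\{-\psi^{k+1}(\tau_{k+1})'\,[\tilde x(\tau_k) - \tilde x(\tau_{k+1})] + H\bigl(\psi^{k+1}(\tau_{k+1}), x^{k+1}(\tau_{k+1}), u^{k+1}(\tau_{k+1}), \tau_{k+1}\bigr)(-\Delta\tau)\bigr\} + \sum_{k=0}^{s} o(h_k)$$

or, after obvious transformations,

$$\Delta J_0 \le \sum_{k=1}^{s+1} \bigl[-\psi^k(\tau_k)'\,\dot{\tilde x}(\tau_k) + H\bigl(\psi^k(\tau_k), x^k(\tau_k), u^k(\tau_k), \tau_k\bigr)\bigr]\Delta\tau + \sum_{k=0}^{s} o(h_k). \tag{14.11}$$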
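Inequalities (14.10), (14.11) give lower and upper bounds for the increment of the objective functional. In the notation (14.1) we can write the common terms of these bounds in the form

$$\begin{aligned} -\psi^k(\tau_k)'\,\dot{\tilde x}(\tau_k) + H\bigl(\psi^k(\tau_k), x^k(\tau_k), u^k(\tau_k), \tau_k\bigr) & = -H\bigl(\hat\psi(\tilde x(\tau_k), \tau_k), \tilde x(\tau_k), \tilde u(\tau_k), \tau_k\bigr) + H\bigl(\hat\psi(\tilde x(\tau_k), \tau_k), \tilde x(\tau_k), \hat u(\tilde x(\tau_k), \tau_k), \tau_k\bigr) \\ & = -\Delta_{\tilde u(\tau_k)} H\bigl(\hat\psi(\tilde x(\tau_k), \tau_k), \tilde x(\tau_k), \hat u(\tilde x(\tau_k), \tau_k), \tau_k\bigr). \end{aligned}$$

Therefore

$$-\sum_{k=0}^{s} \Delta_{\tilde u(\tau_k)} H\bigl(\hat\psi(\tilde x(\tau_k), \tau_k), \tilde x(\tau_k), \hat u(\tilde x(\tau_k), \tau_k), \tau_k\bigr)\Delta\tau + \sum_{k=0}^{s} o(h_k) \le \Delta J_0 \le -\sum_{k=1}^{s+1} \Delta_{\tilde u(\tau_k)} H\bigl(\hat\psi(\tilde x(\tau_k), \tau_k), \tilde x(\tau_k), \hat u(\tilde x(\tau_k), \tau_k), \tau_k\bigr)\Delta\tau + \sum_{k=0}^{s} o(h_k). \tag{14.12}$$

We evaluate the remainder $\sigma = \sum_{k=0}^{s} o(h_k)$ in (14.12). For the piecewise continuous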
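function $\dot{\tilde x}(t)$ on the interval $[t_0, \tilde t_1]$ the following estimate holds:

$$\|\tilde x(\tau_{k+1}) - \tilde x(\tau_k)\| = \Bigl\|\int_{\tau_k}^{\tau_{k+1}} f(\tilde x(t), \tilde u(t), t)\,dt\Bigr\| \le \int_{\tau_k}^{\tau_{k+1}} \|f(\tilde x(t), \tilde u(t), t)\|\,dt \le D\,\Delta\tau, \quad k = 0, \ldots, s,$$

$$D = \max_{t_0 \le t \le \tilde t_1} \|f(\tilde x(t), \tilde u(t), t)\|.$$

At the same time, by virtue of (14.8) and the Lipschitz property of the field of extremals, we have

$$h_k \le C\bigl(\|\tilde x(\tau_{k+1}) - \tilde x(\tau_k)\| + \Delta\tau\bigr) \le C(1 + D)\Delta\tau.$$

Therefore $h_k \to 0$ as $\Delta\tau \to 0$. The estimate

$$\frac{|o(h_k)|}{\Delta\tau} = \frac{|o(h_k)|}{h_k}\,\frac{h_k}{\Delta\tau} \le \frac{|o(h_k)|}{h_k}\,C(1 + D)$$

shows that $|o(h_k)|$ is of higher order of smallness than $\Delta\tau$, that is, there is an $o(\Delta\tau)$ such that $|o(h_k)| \le o(\Delta\tau)$ for any $k = 0, \ldots, s$. Then

$$|\sigma| = \Bigl|\sum_{k=0}^{s} o(h_k)\Bigr| \le \sum_{k=0}^{s} |o(h_k)| \le \sum_{k=0}^{s} o(\Delta\tau) = (s + 1)\,o(\Delta\tau) = (s + 1)\Delta\tau\,\frac{o(\Delta\tau)}{\Delta\tau} = (\tilde t_1 - t_0)\,\frac{o(\Delta\tau)}{\Delta\tau}$$

and $\sigma \to 0$ as $\Delta\tau \to 0$. From inequalities (14.12), in the limit as $\Delta\tau \to 0$, we obtain an exact formula for large increments of the objective functional:

$$\Delta J_0 = \Phi_0(\tilde x(\tilde t_1), \tilde t_1) - \Phi_0(x(t_1), t_1) = -\int_{t_0}^{\tilde t_1} \Delta_{\tilde u(t)} H\bigl(\hat\psi(\tilde x(t), t), \tilde x(t), \hat u(\tilde x(t), t), t\bigr)\,dt. \tag{14.13}$$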
As one can see, the increment of the functional is expressed by the curvilinear integral of the function $\Delta_{\tilde u(t)} H(\hat\psi(x, t), x, \hat u(x, t), t)$ along the trajectory $\tilde x([t_0, \tilde t_1]) = \{\tilde x(t) : t \in [t_0, \tilde t_1]\}$. In short notation,

$$\Delta J_0 = -\int_{\tilde x[t_0, \tilde t_1]} \Delta_{\tilde u(t)} H\bigl(\hat\psi(x, t), x, \hat u(x, t), t\bigr)\,dt.$$
The formula for the increment of the functional includes all elements of the process $\tilde x(t), \tilde u(t), t_0, \tilde t_1$. The components of the extreme process $x(t), u(t), t_0, t_1$ enter it implicitly, through the synthesized control $\hat u(x, t)$ and the initial values $x_0, t_0$. Indeed, if the extremal $(x(t), t)$ outgoing from the point $(x_0, t_0)$ is unique, then

$$\hat u(x(t), t) = u(t; x(t), t) = u(t).$$

Along with the control $\hat u(x, t)$, the formula for the functional increment includes the functions $\hat\psi(x, t)$ and $\hat H(x, t)$. Determining these functions in some domain requires the construction of extremals emanating from the points of the field of extremals, that is, solving the corresponding boundary-value problems of the maximum principle for the PG-problem. Obviously, the accuracy of the formula and the ability to calculate large increments of the functional come at the cost of constructing a field of extremals.

Example 14.1 Consider a particular case of the PG-problem

$$t_1 \to \min, \quad e'x(t_1) = 0, \quad \dot x = u, \quad x(\tau) = \xi, \quad u \in [-1, 1]^n, \quad \tau \le t_1$$

with the vector $e = (1, \ldots, 1) \in R^n$ and parameters $\xi \in R^n$, $\tau \in R$, $e'\xi \le 0$. Let us construct a field of extremals. We put $H = \psi'u$, $L = \lambda_0 t_1 + \lambda_1 e'x_1$ and write out the conditions of the maximum principle:

$$\lambda = (\lambda_0, \lambda_1) \ne 0, \ \lambda_0 \ge 0; \quad \dot\psi = 0, \ \psi(t_1) = -\lambda_1 e; \quad \lambda_0 + \lambda_1 e'u|_{t = t_1} = 0; \quad u = (\operatorname{sign}\psi_1, \ldots, \operatorname{sign}\psi_n).$$

From here we find

$$\lambda_1 \ne 0, \quad \psi = -\lambda_1 e, \quad u = -(\operatorname{sign}\lambda_1)e, \quad \lambda_0 = n|\lambda_1| > 0.$$

According to the conditions of the problem, the extremal trajectories outgoing from all points of the half-space $e'x \le 0$ must fall on the plane $e'x = 0$. For the extreme control $u = -(\operatorname{sign}\lambda_1)e$, this is possible only when $\lambda_1 < 0$. Taking $\lambda_1 = -\nu$ ($\nu = 1/n$), we get $\lambda_0 = n|\lambda_1| = 1$. As a result, all elements of the extreme process are obtained from the conditions of the problem and the maximum principle:

$$x(t; \xi, \tau) = \xi + (t - \tau)e, \quad u(t; \xi, \tau) = e, \quad \tau, \quad t_1(\xi, \tau) = \tau - \nu e'\xi. \tag{14.14}$$
In the phase space, the extremal trajectories are line segments connecting the points of the half-space $e'x \le 0$ with their orthogonal projections onto the plane $e'x = 0$ (Fig. 14.4). The extremal field E is formed by all admissible pairs of parameters $(\xi, \tau)$.

Fig. 14.4 Extremal trajectories in Example 14.1 for n = 2
Over the extreme process, we determine the functions of the field

$$V(\xi, \tau) = \tau - \nu e'\xi, \quad \hat\psi(\xi, \tau) = \nu e, \quad \hat u(\xi, \tau) = e, \quad \hat H(\xi, \tau) = 1.$$

It is easy to see that the field of extremals is Lipschitz and normal. Therefore, formula (14.13) for large increments of the functional is valid. For illustration, let us use it to calculate the increment of the objective functional between the process

$$\tilde x(t) = (\nu t - 1)e, \quad \tilde u(t) = \nu e, \quad 0, \quad \tilde t_1 = n$$

and the extreme process (14.14) with the corresponding parameters $\xi = -e$, $\tau = 0$ and $t_1(-e, 0) = 1$. By formula (14.13), we obtain

$$\Delta J_0 = -\int_0^n \bigl(\nu^2 e'e - 1\bigr)\,dt = -(\nu - 1)n = n - 1.$$
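The calculation in Example 14.1 can be checked numerically. The following is a minimal sketch, assuming the sign convention $\Delta_v H(\psi, x, u, t) = H(\psi, x, v, t) - H(\psi, x, u, t)$ and the minus sign in front of the integral in (14.13); the function names are illustrative and do not come from the book.

```python
def hamiltonian(psi, u):
    """H = psi' u for the system x' = u of Example 14.1."""
    return sum(p * v for p, v in zip(psi, u))


def increment_by_formula(n, steps=1000):
    """Evaluate -integral_0^n Delta_{u~} H dt with a simple Riemann sum."""
    nu = 1.0 / n
    psi_hat = [nu] * n   # field function psi_hat = nu * e
    u_hat = [1.0] * n    # synthesized control u_hat = e
    u_tilde = [nu] * n   # control of the compared process, u~ = nu * e
    t_end = float(n)     # terminal time of the compared process
    h = t_end / steps
    total = 0.0
    for _ in range(steps):
        # The integrand does not depend on t here; the loop mirrors the integral.
        delta_h = hamiltonian(psi_hat, u_tilde) - hamiltonian(psi_hat, u_hat)
        total += delta_h * h
    return -total


def increment_direct(n):
    """Difference of terminal times: n for the compared process, 1 for the extremal."""
    return float(n) - 1.0


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, increment_by_formula(n), increment_direct(n))
```

Both functions should agree up to rounding, which reflects that the formula gives the increment exactly and not only to first order.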
14.4
Sufficiency of the Maximum Principle
We prove the main statement of the section on the optimality of extremals.

Theorem 14.1 (sufficiency of the maximum principle for the PG-problem) The extremal emanating from any point $(x_0, t_0)$ of an open, normal and Lipschitz field of extremals E provides a global minimum of the objective functional of the PG-problem.

Proof Consider an extremal process $x(t), u(t), t_0, t_1$ and any process $\tilde x(t), \tilde u(t), t_0, \tilde t_1$ of the PG-problem whose integral curves come out from an arbitrary fixed point $(x_0, t_0) \in E$ and lie entirely in E. Processes with integral curves that are not entirely in E can be excluded from consideration, since they do not satisfy the maximum principle and therefore are not optimal. By formula (14.13), the increment of the objective functional on these processes is equal to

$$\Phi_0(\tilde x(\tilde t_1), \tilde t_1) - \Phi_0(x(t_1), t_1) = -\int_{t_0}^{\tilde t_1} \Delta_{\tilde u(\tau)} H\bigl(\hat\psi(\tilde x(\tau), \tau), \tilde x(\tau), \hat u(\tilde x(\tau), \tau), \tau\bigr)\,d\tau. \tag{14.15}$$

For each $\tau \in [t_0, \tilde t_1]$, the extremal process $x(t; \tilde x(\tau), \tau)$, $u(t; \tilde x(\tau), \tau)$, $\tau$, $t_1(\tilde x(\tau), \tau)$ with the corresponding Lagrange vector $\lambda(\tilde x(\tau), \tau)$ and conjugate function $\psi(t; \tilde x(\tau), \tau)$ satisfies the maximum condition for the Hamiltonian, therefore

$$\Delta_{\tilde u(t)} H\bigl(\psi(t; \tilde x(\tau), \tau), x(t; \tilde x(\tau), \tau), u(t; \tilde x(\tau), \tau), t\bigr) \le 0, \quad t \in [\tau, t_1(\tilde x(\tau), \tau)].$$

From here, at $t = \tau$ we get

$$\Delta_{\tilde u(\tau)} H\bigl(\psi(\tau; \tilde x(\tau), \tau), x(\tau; \tilde x(\tau), \tau), u(\tau; \tilde x(\tau), \tau), \tau\bigr) = \Delta_{\tilde u(\tau)} H\bigl(\hat\psi(\tilde x(\tau), \tau), \tilde x(\tau), \hat u(\tilde x(\tau), \tau), \tau\bigr) \le 0.$$

Since the last inequality is true for all $\tau \in [t_0, \tilde t_1]$, formula (14.15) gives $\Phi_0(\tilde x(\tilde t_1), \tilde t_1) - \Phi_0(x(t_1), t_1) \ge 0$. Consequently, the objective functional reaches a global minimum on the process $x(t), u(t), t_0, t_1$. The statement is proven.
14.5
Invariance of the Systems
In many applications of optimal control, it is necessary to take into account the action of random perturbations on the controlled object. For example, in the problem of soft landing of an aircraft, it is required to ensure zero vertical speed of the aircraft at the moment of landing, regardless of the state of the atmosphere. In this and other similar cases, the goal of control is to ensure the constancy (invariance) of the given functionals on the set of perturbed processes.

Example 14.2 Consider a model example illustrating the above discussion. Given the problem

$$J_0 = (x_1(1) - x_2(1))^2 \to \mathrm{inv}, \quad \dot x_1 = x_2 + 2u + v, \quad \dot x_2 = u, \quad x_1(0) = 0, \quad x_2(0) = 0.$$

Here u is a control and $v = v(t)$ is a piecewise continuous perturbation on R. The symbol $J_0 \to \mathrm{inv}$ means that we are considering the invariance problem. Let us find out whether it is possible to provide the global minimum $J_0 = 0$ in the problem for any perturbations. Let us take $u = -x_2 - v$ in the differential equations. Then $\dot x_1 = \dot x_2 = -x_2 - v$. From here, by integrating and taking into account the initial conditions, we obtain $x_1(1) - x_2(1) = 0$. Hence, $J_0 = 0$. We see that the control $u = -x_2 - v$ does not exclude the influence of perturbations on the equations of motion but ensures the constancy of the objective functional at the minimum level.

It is clear that a special theory is required to solve more complicated problems of invariance. We will get acquainted with one of the approaches to the problem [6], using the theory of the field of extremals. Consider a special case of the PG-problem (PG1-problem)

$$J_0 = \Phi_0(x(t_1), t_1) \to \mathrm{inv}, \quad \Phi_1(x(t_1), t_1) = 0, \quad \dot x = f(x, v, t), \quad x(t_0) = x_0, \quad v \in V, \quad t_0 \le t_1,$$

where $v = v(t)$ is an arbitrary piecewise continuous vector function on R with values in a compact set $V \subset R^q$. We are interested in the question of independence (invariance) of the functional $J_0$ from perturbations $v(t)$ in the PG1-problem. When deriving the invariance criterion, we additionally assume that on the terminal surface $M_1 = \{(x, t) \in R^n \times R : \Phi_1(x, t) = 0\}$ the following condition is satisfied:

$$\widetilde{\Phi}_1(x, v, t) = \Phi_{1x}(x, t)'f(x, v, t) + \Phi_{1t}(x, t) \ne 0, \quad (x, t) \in M_1, \ v \in V. \tag{14.16}$$
We start with the necessary conditions for invariance. Let the objective functional of the PG1-problem be independent of perturbations. Then on any two processes

$$x(t), v(t), t_0, t_1; \qquad \tilde x(t), \tilde v(t), t_0, \tilde t_1 \tag{14.17}$$

of this problem, the following equality is true:

$$\Delta J_0 = \Phi_0(\tilde x(\tilde t_1), \tilde t_1) - \Phi_0(x(t_1), t_1) = 0. \tag{14.18}$$

For definiteness, we regard the first process as the fixed basic one and the second one as arbitrary. To obtain meaningful conclusions from equality (14.18), we construct a special field of extremals, replacing the fixed initial values $x_0, t_0$ in the PG1-problem with the parameters $\xi \in R^n$, $\tau \in R$. For the basic perturbation $v(t)$ and for parameters $\xi, \tau$, we define the functions

$$\lambda(\xi, \tau) = (\lambda_0(\xi, \tau), \lambda_1(\xi, \tau)), \quad \psi(t; \xi, \tau), \quad x(t; \xi, \tau), \quad t_1(\xi, \tau) \tag{14.19}$$

from the conditions

$$\begin{aligned} & \dot x(t; \xi, \tau) = f(x(t; \xi, \tau), v(t), t), \quad \tau \le t \le t_1(\xi, \tau), \\ & x(\tau; \xi, \tau) = \xi, \quad \Phi_1\bigl(x(t_1(\xi, \tau); \xi, \tau), t_1(\xi, \tau)\bigr) = 0, \\ & \dot\psi(t; \xi, \tau) = -H_x\bigl(\psi(t; \xi, \tau), x(t; \xi, \tau), v(t), t\bigr), \quad \tau \le t \le t_1(\xi, \tau), \\ & \psi(t_1(\xi, \tau); \xi, \tau) = -\sum_{i = 0, 1} \lambda_i(\xi, \tau)\,\Phi_{ix}\bigl(x(t_1(\xi, \tau); \xi, \tau), t_1(\xi, \tau)\bigr), \\ & \sum_{i = 0, 1} \lambda_i(\xi, \tau)\,\widetilde{\Phi}_i\bigl(x(t_1(\xi, \tau); \xi, \tau), v(\tau), t_1(\xi, \tau)\bigr) = 0. \end{aligned} \tag{14.20}$$
According to the definition, the integral curves $(x(t; \xi, \tau), t)$ emanate from the points $(\xi, \tau)$ and fall on the surface $M_1$ at $t = t_1(\xi, \tau)$; the conjugate function $\psi(t; \xi, \tau)$ and the Lagrange multipliers $\lambda_0(\xi, \tau)$, $\lambda_1(\xi, \tau)$ satisfy the transversality conditions of the maximum principle for the PG1-problem with $x_0 = \xi$, $t_0 = \tau$. We may take $\lambda_0(\xi, \tau) = 1$. Then, by virtue of (14.16), the multiplier $\lambda_1(\xi, \tau)$ is determined uniquely from the last equality of (14.20).

We denote by $E_1$ the set of points $(\xi, \tau) \in R^n \times R$ for which conditions (14.20) are satisfied. Obviously, $M_1 \subset E_1$ and $(x_0, t_0) \in E_1$. Due to the first two conditions (14.20) and the accepted assumptions, there is a correspondence between points $(\xi, \tau) \in E_1$ and points $(\xi_1, \tau_1) = (x(t_1(\xi, \tau); \xi, \tau), t_1(\xi, \tau)) \in M_1$. Therefore, the set $E_1$ can be equivalently defined as the set of points of the integral curves of the Cauchy problem $\dot x = f(x, v(t), t)$, $x(\tau_1) = \xi_1$ for all $(\xi_1, \tau_1) \in M_1$.

Let us now consider a process $\tilde x(t), \tilde v(t), t_0, \tilde t_1$ of the form (14.17) whose integral curve $(\tilde x(t), t)$ lies completely in the set $E_1$, that is,

$$\tilde x(t) = x(t; \tilde x(t), t), \quad t_0 \le t \le \tilde t_1. \tag{14.21}$$

The subset of the space $R^n \times R$ filled by the integral curves of all such processes (with property (14.21)) will be denoted by E. By construction, $E \subset E_1$. Due to the uniqueness of solutions of the system of differential equations (14.20), we have

$$\psi(t; \tilde x(t), t) = \psi(t; x(t; \tilde x(t), t), t) = \hat\psi(\tilde x(t), t). \tag{14.22}$$

Since the process $\tilde x(t), \tilde v(t), t_0, \tilde t_1$ is arbitrary, formula (14.22) defines the conjugate supporting function $\hat\psi(x, t)$ on the set E. By construction, we have $\hat v(x, t) = v(t)$ for all $(x, t) \in E$. By virtue of the choice $\lambda_0(\xi, \tau) = 1$, the field E is normal. Suppose additionally that E is open and Lipschitz. Let us choose an arbitrary fixed process $\tilde x(t), \tilde v(t), t_0, \tilde t_1$ of the PG1-problem whose integral curve lies entirely in E. Using formulas (14.13), (14.18), we obtain

$$0 = \Phi_0(\tilde x(\tilde t_1), \tilde t_1) - \Phi_0(x(t_1), t_1) = -\int_{t_0}^{\tilde t_1} \Delta_{\tilde v(t)} H\bigl(\hat\psi(\tilde x(t), t), \tilde x(t), v(t), t\bigr)\,dt. \tag{14.23}$$
It is easy to see that the openness of the set E and the assumption (14.16) admit the application in formula (14.23) of the needle variation

$$\tilde v(t) = v, \ t \in [\tau, \tau + \varepsilon), \qquad \tilde v(t) = v(t), \ t \notin [\tau, \tau + \varepsilon)$$

of the basic perturbation with parameters $v \in V$, $\tau \in [t_0, \tilde t_1)$ and small $\varepsilon > 0$. Applying it to (14.23) and then passing to the limit as $\varepsilon \to 0$, we obtain

$$\Delta_v H\bigl(\hat\psi(\tilde x(\tau), \tau), \tilde x(\tau), v(\tau), \tau\bigr) = 0, \quad \tau \in [t_0, \tilde t_1).$$

Since the processes $\tilde x(t), \tilde v(t), t_0, \tilde t_1$ are arbitrary, this implies

$$\Delta_v H\bigl(\hat\psi(x, t), x, v(t), t\bigr) = 0, \quad (x, t) \in E, \ v \in V. \tag{14.24}$$

Using the exact formula for large increments of the functional, it is easy to verify that the last equality is sufficient for condition (14.18) for any $(x_0, t_0) \in E$. Let us formulate the obtained conclusion under the above assumptions and notation.

Theorem 14.2 (criterion of invariance) Let $E \subset E_1$ be an open Lipschitz subset of $R^n \times R$ in the PG1-problem. For the invariance of the functional $J_0$, it is necessary and sufficient that for some supporting perturbation $v(t)$ the equality

$$\hat\psi(x, t)'\,[f(x, v, t) - f(x, v(t), t)] = 0$$

holds for all $(x, t) \in E$, $v \in V$. The invariance of the functional $J_0$ takes place for any initial values $(x_0, t_0) \in E$.

We show one of the applications of Theorem 14.2.
14.6
Examples of an Invariant System
Example 14.3 Let us find out when, in the simplest linear stationary problem of invariance

$$J_0 = c'x(t_1) \to \mathrm{inv}, \quad \dot x = Ax + Bu, \quad x(t_0) = 0, \quad u \in U, \quad t_0 \le t \le t_1, \tag{14.25}$$

the functional $J_0$ does not depend on the control for a given time segment $[t_0, t_1]$. We assume that $c \ne 0$, U is a compact set in $R^r$, $0 \in \operatorname{int} U$. Specializing conditions (14.20) to problem (14.25) and taking the basic control (perturbation) $u = 0$ and the function $\Phi_1(x, t) = t - t_1$, we obtain the Cauchy problems

$$\dot x = Ax, \ x(\tau) = \xi, \qquad \dot\psi = -A'\psi, \ \psi(t_1) = -c,$$

where $\xi \in R^n$, $\tau \in R$ are parameters. Using the fundamental matrix $F(t, \tau)$ of the homogeneous equation $\dot x = Ax$ and the Cauchy formula, we have

$$x(t; \xi, \tau) = F(t, \tau)\xi, \quad \psi(t; \xi, \tau) = -F(t_1, t)'c, \quad \hat\psi(\xi, \tau) = \psi(\tau; \xi, \tau) = -F(t_1, \tau)'c.$$

Solutions of a stationary linear system of differential equations are defined on the whole real line. Hence, extremals fill the entire space $R^n \times R$. By Theorem 14.2, the condition

$$c'F(t_1, t)Bu = 0, \quad t \in [t_0, t_1], \ u \in U \tag{14.26}$$

is the invariance criterion. Since $0 \in \operatorname{int} U$, equality (14.26) holds for arbitrary $u \in U$ if and only if

$$c'F(t_1, t)B = 0, \quad t_0 \le t \le t_1.$$

Differentiating the last identity $n - 1$ times, we obtain

$$c'F(t_1, t)B = c'F(t_1, t)AB = \ldots = c'F(t_1, t)A^{n-1}B = 0, \quad t_0 < t < t_1.$$

In the limit as $t \to t_1$, $t < t_1$, we arrive at a criterion for the invariance of the objective functional:

$$c'\,[B, AB, \ldots, A^{n-1}B] = 0.$$

This relation means that the system of differential equations (14.25) is not completely controllable on the segment $[t_0, t_1]$. The reason was discussed in Sect. 4.11; here it can be explained easily. According to the Cauchy formula, the reachability set $Q(t_1)$ of the system (14.25) consists of the points

$$x(t_1) = \int_{t_0}^{t_1} F(t_1, t)Bu(t)\,dt,$$

when the control runs through the entire class $K(R \to U)$. Hence, by virtue of (14.26), we have $c'x(t_1) = 0$, that is, the reachability set of an invariant (and at the same time not completely controllable) system lies in a plane orthogonal to the vector c and passing through the origin of the space $R^n$.

Example 14.4 Consider the linear-quadratic invariance problem

$$2J_0 = \|x(t_1)\|^2 \to \mathrm{inv}, \quad \dot x = A(t)x + B(t)u + C(t)v, \quad x(t_0) = 0, \quad t \in [t_0, t_1] \tag{14.27}$$
in standard vector-matrix notation with corresponding continuous matrices $A(t)$, $B(t)$, $C(t)$ on R and given $t_0, t_1$. We have $J_0 = 0$ if $x = 0$, $u = 0$, $v = 0$. The problem is to keep this acceptable value of the functional under a perturbation $v \ne 0$ by means of the control correction u. Taking the basic “perturbation” $u = 0$, $v = 0$ and the function $\Phi_1(x, t) = t - t_1$, we construct a field of extremals for the problem (14.27) by conditions (14.20). For arbitrary parameters $\xi \in R^n$, $\tau \le t_1$, we find solutions $x(t; \xi, \tau)$, $\psi(t; \xi, \tau)$ of the Cauchy problems

$$\dot x = A(t)x, \ x(\tau; \xi, \tau) = \xi; \qquad \dot\psi = -A(t)'\psi, \ \psi(t_1; \xi, \tau) = -x(t_1; \xi, \tau).$$

Using a fundamental matrix $F(t, \tau)$ of the system $\dot x = A(t)x$ and the Cauchy formula, we obtain

$$x(t; \xi, \tau) = F(t, \tau)\xi, \quad \psi(t; \xi, \tau) = -F(t_1, t)'F(t_1, \tau)\xi, \quad t_0 \le t \le t_1.$$

Consequently, the function

$$\hat\psi(\xi, \tau) = \psi(\tau; \xi, \tau) = -F(t_1, \tau)'F(t_1, \tau)\xi$$

is defined in the field of extremals $E = R^n \times [t_0, t_1]$. By Theorem 14.2, for the invariance of the system it is necessary and sufficient that the condition

$$\hat\psi(x, t)'\,[B(t)u + C(t)v] = -x'F(t_1, t)'F(t_1, t)\,[B(t)u + C(t)v] = 0, \quad t_0 \le t \le t_1 \tag{14.28}$$

be fulfilled. Assume that an arbitrary fixed perturbation $v(t)$ corresponds to a piecewise continuous solution $u(t)$ of Eq. (14.28) and a solution $x(t)$ of the system (14.27), that is, the equalities

$$x(t)'F(t_1, t)'F(t_1, t)\,[B(t)u(t) + C(t)v(t)] = 0, \quad \dot x(t) = A(t)x(t) + B(t)u(t) + C(t)v(t), \quad x(t_0) = 0$$

hold for $t \in [t_0, t_1]$. Using these equalities, it is easy to check that

$$2J_0 = \|x(t_1)\|^2 = \int_{t_0}^{t_1} \frac{d}{dt}\bigl[x(t)'F(t_1, t)'F(t_1, t)x(t)\bigr]\,dt = 0.$$

If the choice of control from condition (14.28) is possible for any $x \ne 0$, $t \in [t_0, t_1]$, then the functional $J_0$ does not depend on the perturbations.
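The algebraic criterion $c'[B, AB, \ldots, A^{n-1}B] = 0$ of Example 14.3 is easy to test for concrete data. The following is a minimal sketch in plain Python; the helper names and the sample matrices are illustrative assumptions, not taken from the book.

```python
def mat_vec(a, v):
    """Product of a square matrix (list of rows) and a vector."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def is_invariant(c, a, b_columns, tol=1e-12):
    """Check c'[B, AB, ..., A^(n-1) B] = 0 column by column."""
    n = len(a)
    for column in b_columns:
        v = list(column)
        for _ in range(n):
            if abs(sum(ci * vi for ci, vi in zip(c, v))) > tol:
                return False
            v = mat_vec(a, v)  # next vector A^(j+1) b of the sequence
    return True


# Hypothetical data: the second coordinate is not influenced by the control.
A = [[1.0, 0.0], [0.0, 2.0]]
B_COLUMNS = [[1.0, 0.0]]  # B has the single column (1, 0)'

if __name__ == "__main__":
    print(is_invariant([0.0, 1.0], A, B_COLUMNS))  # functional J0 = x2(t1)
    print(is_invariant([1.0, 0.0], A, B_COLUMNS))  # functional J0 = x1(t1)
```

In this sample the functional $x_2(t_1)$ is invariant, since the control never reaches the second coordinate, while $x_1(t_1)$ is not.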
Exercise Set
1. Will the exact formula for the increment of the functional remain valid for the PG-problem with inequality constraints?
2. What changes will arise in the theory of the field of extremals for problems with a moving left endpoint of the integral curve?
3. Write down the exact formula for large increments of the functional for the S-problem.
4. Construct the field of extremals in the example “mass-spring”.
Chapter 15
Sufficient Optimality Conditions
Abstract We introduce the Krotov method for obtaining sufficient optimality conditions for optimal control problems with mixed constraints. The application of the sufficient conditions is illustrated by the solution of particular examples and of the problem of the analytical construction of a regulator. The relationship between sufficient optimality conditions and the Bellman method of dynamic programming is considered.
15.1
Common Problem of Optimal Control
Here we get acquainted with another approach [11] to solving the general optimal control problem, based on sufficient optimality conditions. The idea behind them is simple. Assume that a lower bound of the objective functional on the set of processes of the optimal control problem is known. If this lower bound is attained on some process, that process is optimal. The main difficulty is to obtain a lower bound of the objective functional and to check that it is attained. The object of our attention is the common optimal control problem (C-problem)

$$J = \Phi(x(t_1)) + \int_{t_0}^{t_1} F(x(t), u(t), t)\,dt \to \min,$$

$$\dot x = f(x, u, t), \quad x(t_0) = x_0, \quad x(t_1) \in G, \quad (x(t), u(t)) \in V(t), \quad t \in [t_0, t_1]$$

with fixed time, a fixed left end and a moving right end of the trajectory, and mixed constraints on the control and the phase state of the control object. Here, the scalar functions $\Phi(x)$ and $F(x, u, t)$ are continuous on the sets $R^n$ and $R^n \times R^r \times R$ respectively, the vector function $f(x, u, t)$ meets the requirements of Sect. 2.3, the sets $G \subset R^n$ and $V(t) \subset R^n \times R^r$ for $t \in [t_0, t_1]$ are given, and the moments of time $t_0, t_1$, $t_0 < t_1$, and the point $x_0 \in R^n$ are fixed. Understanding a control $u(t)$ and a solution $x(t)$ of the system $\dot x = f(x, u(t), t)$ in the previous sense, we call the pair $x(t), u(t)$ a process of the C-problem if it satisfies all of its conditions, except, possibly, the first. The C-problem is to find the optimal process $x(t), u(t)$ with the minimum value of the objective functional J.
15.2
Basic Theorems
Introduce a function $\varphi(x, t)$ of the class $C^1(R^n \times R \to R)$ and define the scalar functions

$$\begin{aligned} Q(x) & = \Phi(x) + \varphi(x_0, t_0) - \varphi(x, t_1), \quad x \in R^n, \\ P(x, u, t) & = F(x, u, t) + \varphi_x(x, t)'f(x, u, t) + \varphi_t(x, t), \quad (x, u, t) \in R^n \times R^r \times R. \end{aligned} \tag{15.1}$$

For an arbitrary process $x(t), u(t)$ of the C-problem we have

$$J = Q(x(t_1)) + \int_{t_0}^{t_1} P(x(t), u(t), t)\,dt. \tag{15.2}$$

Indeed, in the notation (15.1) we have

$$\begin{aligned} Q(x(t_1)) + \int_{t_0}^{t_1} P(x(t), u(t), t)\,dt & = \Phi(x(t_1)) + \varphi(x_0, t_0) - \varphi(x(t_1), t_1) \\ & \quad + \int_{t_0}^{t_1} F(x(t), u(t), t)\,dt + \int_{t_0}^{t_1} \bigl[\varphi_x(x(t), t)'f(x(t), u(t), t) + \varphi_t(x(t), t)\bigr]\,dt \\ & = J + \varphi(x(t_0), t_0) - \varphi(x(t_1), t_1) + \int_{t_0}^{t_1} d\varphi(x(t), t) = J. \end{aligned}$$
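Identity (15.2) holds for any smooth $\varphi$, which can be confirmed on a concrete process. The following is a minimal sketch under assumed sample data that are not from the book: the system $\dot x = u$ on $[0, 1]$ with $u(t) = \cos t$, $x(t) = \sin t$, the costs $\Phi(x) = x$, $F = x^2 + u^2$, and the test function $\varphi(x, t) = t x^2$.

```python
import math

T0, T1 = 0.0, 1.0


def integrate(g, a, b, n=20000):
    """Composite midpoint rule for a scalar function g on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def x(t):
    return math.sin(t)   # trajectory of x' = u with x(0) = 0


def u(t):
    return math.cos(t)   # sample control


def terminal_cost(z):
    return z             # Phi(x) = x


def running_cost(z, w, t):
    return z * z + w * w  # F(x, u, t) = x^2 + u^2


def phi(z, t):
    return t * z * z     # arbitrary smooth test function phi(x, t)


def q(z):
    # Q(x) = Phi(x) + phi(x0, t0) - phi(x, t1), see (15.1)
    return terminal_cost(z) + phi(x(T0), T0) - phi(z, T1)


def p(z, w, t):
    # P = F + phi_x * f + phi_t with f = u, phi_x = 2 t x, phi_t = x^2
    return running_cost(z, w, t) + 2.0 * t * z * w + z * z


j_direct = terminal_cost(x(T1)) + integrate(lambda t: running_cost(x(t), u(t), t), T0, T1)
j_via_qp = q(x(T1)) + integrate(lambda t: p(x(t), u(t), t), T0, T1)

if __name__ == "__main__":
    print(j_direct, j_via_qp)
```

For these data both values approximate $1 + \sin 1$, since $\int_0^1 (\sin^2 t + \cos^2 t)\,dt = 1$ and the terms containing $\varphi$ cancel.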
With the aid of (15.2), we determine a lower bound of the objective functional on the set of processes:

$$J = Q(x(t_1)) + \int_{t_0}^{t_1} P(x(t), u(t), t)\,dt \ge \inf_{x \in G} Q(x) + \int_{t_0}^{t_1} \inf_{(x, u) \in V(t)} P(x, u, t)\,dt.$$

From here, the following statements [11] can be made.
Theorem 15.1 (sufficient conditions of optimality) For the optimality of a process $x(t), u(t)$ of the C-problem, it is sufficient that there exists a function $\varphi(x, t)$ satisfying the conditions

$$Q(x(t_1)) = \inf_{x \in G} Q(x), \quad P(x(t), u(t), t) = \inf_{(x, u) \in V(t)} P(x, u, t), \quad t \in [t_0, t_1] \tag{15.3}$$

or the less restrictive conditions

$$Q(x(t_1)) = \inf_{x \in G} Q(x), \quad \int_{t_0}^{t_1} P(x(t), u(t), t)\,dt = \int_{t_0}^{t_1} \inf_{(x, u) \in V(t)} P(x, u, t)\,dt.$$
Example 15.1 We use the sufficient conditions to find a solution of the problem

$$J = \int_0^1 [x(t) - u(t)]\,dt \to \min, \quad \dot x = u, \quad x(0) = 0, \quad x(1) = 0, \quad |u| \le x \le 1, \quad 0 \le t \le 1.$$

Here $n = r = 1$, $x_0 = 0$, $t_0 = 0$, $t_1 = 1$. The functions $\Phi(x) = 0$, $f(x, u, t) = u$, $F(x, u, t) = x - u$ do not depend on t, the set G consists of the single point $x = 0$ on the number line R, and the set $V(t)$ given by the inequalities $|u| \le x \le 1$ is a triangle in the plane $R^2$ of the variables x, u. Functions (15.1) take the form

$$Q(x) = \varphi(0, 0) - \varphi(x, 1), \quad P(x, u, t) = x - u + \varphi_x(x, t)u + \varphi_t(x, t).$$

So that the function P does not depend on u, we put $\varphi(x, t) = x$. Then

$$Q(x) = -x, \quad P(x, u, t) = x, \quad \min_{x \in G} Q(x) = \min_{x = 0}(-x) = 0, \quad \arg\min_{x \in G} Q(x) = 0,$$

$$\min_{(x, u) \in V(t)} P(x, u, t) = \min_{|u| \le x \le 1} x = 0, \quad \arg\min_{(x, u) \in V(t)} P(x, u, t) = (0, 0), \quad t \in [0, 1].$$

The minimum point $x = 0$, $u = 0$ forms the process $x(t) = 0$, $u(t) = 0$ of the problem, which satisfies the sufficient optimality conditions (15.3). Therefore, it is optimal.

A solution of a general optimal control problem in a given class of controls may not exist. In this case, it is natural to regard as a solution of the C-problem any minimizing sequence $\{x^s(t), u^s(t)\}$ of processes, that is, a sequence along which the corresponding values $\{J_s\}$ of the objective functional converge to its infimum $J_*$ on the set of processes of the problem: $J_s \to J_*$.
We provide sufficient conditions for a sequence to be minimizing. Consider an arbitrary fixed sequence $\{x^s(t), u^s(t)\}$ of processes of the C-problem. We introduce a function $\varphi(x, t)$ of the class $C^1(R^n \times R \to R)$ and the corresponding functions (15.1). By formula (15.2), for each process $x^s(t), u^s(t)$ we compute the corresponding value of the objective functional

$$J_s = Q(x^s(t_1)) + \int_{t_0}^{t_1} P(x^s(t), u^s(t), t)\,dt$$

and determine its lower bound

$$J_* = \inf_{x \in G} Q(x) + \int_{t_0}^{t_1} \inf_{(x, u) \in V(t)} P(x, u, t)\,dt.$$

Obviously,

$$J_s = Q(x^s(t_1)) + \int_{t_0}^{t_1} P(x^s(t), u^s(t), t)\,dt \ge \inf_{x \in G} Q(x) + \int_{t_0}^{t_1} \inf_{(x, u) \in V(t)} P(x, u, t)\,dt = J_*.$$
From here, an analogue of Theorem 15.1 follows immediately.

Theorem 15.2 (sufficient conditions for a minimizing sequence) For a sequence $\{x^s(t), u^s(t)\}$ of processes of the C-problem to be minimizing, it is sufficient that there exists a function $\varphi(x, t)$ satisfying the conditions

$$Q(x^s(t_1)) \to \inf_{x \in G} Q(x), \quad \int_{t_0}^{t_1} P(x^s(t), u^s(t), t)\,dt \to \int_{t_0}^{t_1} \inf_{(x, u) \in V(t)} P(x, u, t)\,dt.$$
Example 15.2 We apply Theorem 15.2 to solve the problem

$$J = \int_0^1 x^2(t)\,dt \to \min, \quad \dot x = u, \quad x(0) = x(1) = 0, \quad |u| = 1, \quad t \in [0, 1].$$

By formula (15.1), we compose the functions

$$Q(x) = \varphi(0, 0) - \varphi(x, 1), \quad P(x, u, t) = x^2 + \varphi_x(x, t)u + \varphi_t(x, t).$$

Let us put $\varphi(x, t) = 1$. Then $Q(x) = 0$, $P(x, u, t) = x^2$. The function P attains its minimum at the point $x = 0$. The function $x(t) = 0$ cannot be a solution of the differential equation because of the condition $|u| = 1$. We construct its approximation by the processes $x^s(t), u^s(t)$, putting

$$x^s(t) = \int_0^t u^s(\tau)\,d\tau, \quad u^s(t) = \operatorname{sign}\sin 2\pi st, \quad t \in R, \quad s = 1, 2, \ldots$$

The relay control $u^s(t)$ alternately takes the values $+1$ and $-1$ on intervals of constancy of the same length $1/(2s)$. The corresponding trajectory $x^s(t)$ is a sawtooth curve (Fig. 15.1) with s teeth that approximates the function $x(t) = 0$ on the segment $[0, 1]$ with an accuracy of

$$J_s = \int_0^1 (x^s(t) - x(t))^2\,dt = \int_0^1 (x^s(t))^2\,dt = 2s\int_0^{1/2s} t^2\,dt = \frac{1}{12s^2}.$$

Fig. 15.1 Graph of the function $x^s(t)$, $t \in [0, 1]$, for s = 4
For the sequence $\{x^s(t), u^s(t)\}$, the conditions of Theorem 15.2 hold:

$$Q(x^s(1)) = \min_{x \in G} Q(x) = 0, \quad J_s = \int_0^1 P(x^s(t), u^s(t), t)\,dt = \int_0^1 (x^s(t))^2\,dt = \frac{1}{12s^2} \to \int_0^1 \min_{(x, u) \in V} P(x, u, t)\,dt = 0.$$

Consequently, this sequence is minimizing.
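The value $J_s = 1/(12s^2)$ in Example 15.2 can be reproduced by integrating the relay control directly. The following is a minimal sketch; the function name and the grid size are illustrative choices. Because $x^s$ is linear on each grid cell, the integral of $(x^s)^2$ over a cell is computed exactly.

```python
import math


def sawtooth_cost(s, cells_per_half_tooth=200):
    """Return (J_s, x_s(1)) for the relay control u_s(t) = sign sin(2*pi*s*t)."""
    n_cells = 2 * s * cells_per_half_tooth  # grid aligned with the switching points
    h = 1.0 / n_cells
    x_left = 0.0
    cost = 0.0
    for i in range(n_cells):
        t_mid = (i + 0.5) * h
        control = 1.0 if math.sin(2.0 * math.pi * s * t_mid) > 0.0 else -1.0
        x_right = x_left + h * control
        # Exact integral of x^2 over a cell on which x is linear.
        cost += h * (x_left ** 2 + x_left * x_right + x_right ** 2) / 3.0
        x_left = x_right
    return cost, x_left


if __name__ == "__main__":
    for s in (1, 2, 4, 8):
        cost, x_end = sawtooth_cost(s)
        print(s, cost, 1.0 / (12.0 * s * s), x_end)
```

The printed costs decrease as $1/s^2$ while the terminal condition $x^s(1) = 0$ is kept, which is exactly the behavior of a minimizing sequence whose limit is not attained.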
15.3
Analytical Construction of the Controller
The problem of the analytical construction of a regulator is to maintain a controlled object in a small neighborhood of a given trajectory. Let the motion of the object be described by a system of differential equations

$$\dot x = f(x, u)$$

with a given initial state $x(t_0) = x_0 \ne 0$. Assume that $f(0, 0) = 0$ and that $x = 0$ is the desired state of equilibrium of the object for $u = 0$. It is thus necessary to bring the trajectory $x(t)$ of the object closer to the equilibrium $x = 0$ by means of a correcting control $u(t)$ in a finite amount of time. Formally, we arrive at the problem of the analytical construction of a regulator (AC-problem)

$$J = x(t_1)'\Phi x(t_1) + \int_{t_0}^{t_1} \bigl[x(t)'Cx(t) + u(t)'Du(t)\bigr]\,dt \to \min,$$

$$\dot x = Ax + Bu, \quad x(t_0) = x_0, \quad t \in [t_0, t_1].$$

Here, the quadratic objective functional with given weight matrices characterizes the closeness of the processes $x(t), u(t)$ to the desired process $x = 0$, $u = 0$, and the stationary linear system of differential equations is obtained by linearizing the original non-linear system along the process $x = 0$, $u = 0$, i.e., $A = f_x(0, 0)$, $B = f_u(0, 0)$. Let us assume that $\Phi$, C are symmetric non-negative definite matrices and that D is a symmetric positive definite matrix.

Obviously, the AC-problem is a particular case of the C-problem with the sets $G = R^n$, $V(t) = R^n \times R^r$. We apply the sufficient optimality conditions to solve it. Using formulas (15.1), we compose the functions

$$\begin{aligned} Q(x) & = x'\Phi x + \varphi(x_0, t_0) - \varphi(x, t_1), \quad x \in R^n, \\ P(x, u, t) & = x'Cx + u'Du + \varphi_x(x, t)'(Ax + Bu) + \varphi_t(x, t), \quad (x, u, t) \in R^n \times R^r \times R. \end{aligned}$$

We seek the function $\varphi$ in the quadratic form

$$\varphi(x, t) = x'K(t)x$$

with a symmetric differentiable matrix $K(t)$. After simple transformations, we obtain

$$\begin{aligned} Q(x) & = x'[\Phi - K(t_1)]x + x_0'K(t_0)x_0, \\ P(x, u, t) & = x'\bigl[\dot K(t) + K(t)A + A'K(t) + C\bigr]x + u'Du + 2x'K(t)Bu. \end{aligned}$$

Let us find the minimum of the function $P(x, u, t)$ on the set $V = R^n \times R^r$. For a fixed $t \in [t_0, t_1]$, we have

$$\min_{(x, u) \in R^n \times R^r} P(x, u, t) = \min_{x \in R^n}\,\min_{u \in R^r} P(x, u, t).$$

By virtue of the positive definiteness of the matrix D, the condition

$$P_u(x, u, t) = 2[Du + B'K(t)x] = 0$$
is necessary and sufficient for the minimum of the function P with respect to u. We use it to obtain the minimum point

$$u(x, t) = -D^{-1}B'K(t)x \tag{15.4}$$

and

$$\min_{u \in R^r} P(x, u, t) = x'\bigl[\dot K(t) + K(t)A + A'K(t) - K(t)BD^{-1}B'K(t) + C\bigr]x.$$

In order to eliminate the dependence of the functions $Q(x)$ and $\min_{u \in R^r} P(x, u, t)$ on x, we subject the matrix $K(t)$ to the conditions

$$\dot K = -KA - A'K + KBD^{-1}B'K - C, \quad K(t_1) = \Phi. \tag{15.5}$$

If the Cauchy problem (15.5) has a solution $K(t)$ on the entire segment $[t_0, t_1]$, then formula (15.4) completely defines the synthesized control $u(x, t)$ and

$$\min_{x \in R^n} Q(x) = x_0'K(t_0)x_0, \quad \min_{x \in R^n}\,\min_{u \in R^r} P(x, u, t) = 0. \tag{15.6}$$

The solution $x(t)$ of the linear Cauchy problem

$$\dot x = \bigl[A - BD^{-1}B'K(t)\bigr]x, \quad x(t_0) = x_0$$

corresponding to the control $u(x, t)$ is defined on the whole segment $[t_0, t_1]$. By virtue of equalities (15.6), the conditions (15.3) of Theorem 15.1 hold for the process $x(t)$, $u(t) = -D^{-1}B'K(t)x(t)$. Consequently, the process $x(t), u(t)$ is optimal. With the aid of formulas (15.2), (15.6), we determine the minimum of the objective functional

$$J = x_0'K(t_0)x_0. \tag{15.7}$$
In summary, we have the following.

Theorem 15.3 Let the data of the AC-problem satisfy the above conditions and let the solution $K(t)$ of the Cauchy problem (15.5) exist over the entire segment $[t_0, t_1]$. Then the synthesized control (15.4) is optimal for any initial value $x_0$. The minimum value of the objective functional in the AC-problem equals $x_0'K(t_0)x_0$.

The above arguments prove this theorem. Note that the assumption of the existence of a solution of the matrix Riccati equation (15.5) in the conditions of the theorem is essential. For example, the scalar Riccati equation $\dot y = y^2$ has the solution $y = -1/(t + c)$ with a singularity at $t = -c$ for any constant c.

Thus, success in solving the AC-problem is a result of the selection of the function $\varphi$ in a quadratic form with variable coefficients. Formula (15.7) shows that such a choice corresponds to the essence of the problem. This formula also clarifies the meaning of the function $\varphi$: the value $\varphi(\xi, \tau)$ is the minimum of the objective functional in the AC-problem if we take $\xi \in R^n$, $\tau \le t_1$ as the initial values of the linearized system of equations. Everything said above remains true, with natural changes, for the AC-problem with variable coefficients. In technical applications, automatic devices (regulators) based on the synthesized control $u(x, t)$ are constructed to correct the motion of a controlled object according to its deviation from the equilibrium. This explains the name of the AC-problem.
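The construction of this section can be traced on a scalar AC-problem. The following is a minimal sketch, assuming scalar data a, b, c, d, phi in place of the matrices A, B, C, D, $\Phi$; the fourth-order Runge-Kutta and Heun schemes are our own choice of discretization and are not prescribed by the text. It integrates (15.5) backward from $K(t_1) = \Phi$, applies the feedback (15.4), and compares the resulting cost with formula (15.7).

```python
def riccati_rhs(k, a, b, c, d):
    """Scalar form of (15.5): dK/dt = -2 a K + (b^2 / d) K^2 - c."""
    return -2.0 * a * k + (b * b / d) * k * k - c


def solve_riccati(a, b, c, d, phi, t0, t1, n):
    """Integrate (15.5) backward from K(t1) = phi; return K on a uniform grid."""
    h = (t1 - t0) / n
    ks = [0.0] * (n + 1)
    ks[n] = phi
    for i in range(n, 0, -1):
        k = ks[i]
        # Classical Runge-Kutta step with the negative step -h.
        s1 = riccati_rhs(k, a, b, c, d)
        s2 = riccati_rhs(k - 0.5 * h * s1, a, b, c, d)
        s3 = riccati_rhs(k - 0.5 * h * s2, a, b, c, d)
        s4 = riccati_rhs(k - h * s3, a, b, c, d)
        ks[i - 1] = k - h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return ks


def closed_loop_cost(a, b, c, d, phi, x0, t0, t1, ks):
    """Cost of the feedback u = -(b K / d) x, computed with Heun and trapezoid rules."""
    n = len(ks) - 1
    h = (t1 - t0) / n

    def rate(x, k):
        return (a - b * b * k / d) * x      # closed-loop right-hand side

    def running(x, k):
        u = -(b * k / d) * x                # synthesized control (15.4)
        return c * x * x + d * u * u

    x = x0
    integral = 0.0
    for i in range(n):
        x_pred = x + h * rate(x, ks[i])
        x_next = x + 0.5 * h * (rate(x, ks[i]) + rate(x_pred, ks[i + 1]))
        integral += 0.5 * h * (running(x, ks[i]) + running(x_next, ks[i + 1]))
        x = x_next
    return phi * x * x + integral


# Sample data with a known solution: K' = K^2, K(1) = 1, so K(t) = 1 / (2 - t).
K_GRID = solve_riccati(0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 2000)
COST = closed_loop_cost(0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 0.0, 1.0, K_GRID)

if __name__ == "__main__":
    print(K_GRID[0], COST)  # compare with K(0) = 1/2 and x0^2 K(0) = 2
```

For the sample data the Riccati equation reduces to $\dot K = K^2$, the scalar equation mentioned above, and its solution stays bounded on $[0, 1]$ because the singularity lies at $t = 2$.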
15.4
Relation with Dynamic Programming
We show that, for a certain choice of the function $\varphi$, the sufficient conditions of optimality yield the fundamental equation of dynamic programming [3]. Let us return to the C-problem formulated in Sect. 15.1. For convenience of presentation, we introduce sets associated with the set $V(t) \subset R^n \times R^r$ for $t \in [t_0, t_1]$. The first of them, $U(x, t) \subset R^r$, is the section of $V(t)$ at a fixed $x \in R^n$:

$$U(x, t) = \{u \in R^r : (x, u) \in V(t)\}.$$

The second set, $X(t) \subset R^n$, is the orthogonal projection of $V(t)$ onto $R^n$:

$$X(t) = \{x \in R^n : U(x, t) \ne \emptyset\}$$

(Fig. 15.2). The third set, $D \subset R^{n+1}$, is the graph of the multivalued function $t \to X(t)$ on $[t_0, t_1]$:

$$D = \{(x, t) \in R^{n+1} : x \in X(t), \ t \in [t_0, t_1]\}$$

(Fig. 15.3). Obviously, if $t \in [t_0, t_1]$, then each point $(x, u) \in V(t)$ corresponds to a pair of points $x \in X(t)$, $u \in U(x, t)$ and vice versa. This means that the exact lower bound of a function $g(x, u)$ defined on $V(t)$ can be found sequentially: first over $u \in U(x, t)$ with fixed $x \in X(t)$ and then over $x \in X(t)$. In other words,

$$\inf_{(x, u) \in V(t)} g(x, u) = \inf_{x \in X(t)}\,\inf_{u \in U(x, t)} g(x, u). \tag{15.8}$$

Fig. 15.2 Projection X(t) and section U(x, t) of a set V(t) for a fixed t

Fig. 15.3 Set D in the space $R^{n+1}$
After this digression we return to the main topic. For an arbitrary function $\varphi(x, t)$ of the class $C^1(R^n \times R \to R)$ we define the functions Q, P by formulas (15.1). We find

$$\inf_{x \in G} Q(x) = \varphi(x_0, t_0) + \inf_{x \in G}\,[\Phi(x) - \varphi(x, t_1)]$$

and, using formula (15.8),

$$\inf_{(x, u) \in V(t)} P(x, u, t) = \inf_{x \in X(t)}\,\inf_{u \in U(x, t)} P(x, u, t) = \inf_{x \in X(t)} \Bigl\{\varphi_t(x, t) + \inf_{u \in U(x, t)} \bigl[F(x, u, t) + \varphi_x(x, t)'f(x, u, t)\bigr]\Bigr\}, \quad t \in [t_0, t_1].$$

We subject the function $\varphi(x, t)$ to the conditions

$$\varphi_t(x, t) + \inf_{u \in U(x, t)} \bigl[F(x, u, t) + \varphi_x(x, t)'f(x, u, t)\bigr] = 0, \ (x, t) \in D, \qquad [\varphi(x, t_1) - \Phi(x)]\big|_{x \in G} = 0. \tag{15.9}$$

If a solution $\varphi(x, t)$ of Eq. (15.9) exists, then

$$\inf_{x \in G} Q(x) = \varphi(x_0, t_0), \quad \inf_{(x, u) \in V(t)} P(x, u, t) = 0, \quad t \in [t_0, t_1]. \tag{15.10}$$

Suppose that the equalities (15.9) are identically satisfied for the function $\varphi(x, t)$ and the infimum in the first of them is attained at a point $u(x, t) \in U(x, t)$. That is,

$$\varphi_t(x, t) + F(x, u(x, t), t) + \varphi_x(x, t)'f(x, u(x, t), t) = 0, \quad (x, t) \in D. \tag{15.11}$$

Suppose further that the solution $x(t)$ of the Cauchy problem

$$\dot x = f(x, u(x, t), t), \quad x(t_0) = x_0$$
exists and $(x(t), t) \in D$, $t \in [t_0, t_1]$, and $x(t_1) \in G$. Putting $x = x(t)$ in the identity (15.11) and denoting $u(t) = u(x(t), t)$, we have
\[ \varphi_t(x(t), t) + F(x(t), u(t), t) + \varphi_x(x(t), t)' f(x(t), u(t), t) = P(x(t), u(t), t) = 0, \quad t \in [t_0, t_1]. \]
Besides, we use the definition of $Q$ to obtain
\[ Q(x(t_1)) = \Phi(x(t_1)) + \varphi(x_0, t_0) - \varphi(x(t_1), t_1) = \varphi(x_0, t_0). \]
From the last two equalities and (15.9), we conclude that the process $x(t)$, $u(t)$ satisfies the sufficient conditions of Theorem 15.1,
\[ Q(x(t_1)) = \varphi(x_0, t_0) = \inf_{x \in G} Q(x), \qquad P(x(t), u(t), t) = 0 = \inf_{(x, u) \in V(t)} P(x, u, t), \quad t \in [t_0, t_1], \tag{15.12} \]
and, as a consequence, it is optimal. By (15.2), (15.11), we find the minimum of the objective functional on the set of processes of the C-problem:
\[ \min J = \varphi(x_0, t_0). \tag{15.13} \]
The partial differential equation (15.9) is referred to as the fundamental equation of dynamic programming, or the Bellman equation. Its analog in the calculus of variations is the Hamilton-Jacobi equation. In accordance with formula (15.13), the solution $\varphi(x, t)$ of the Bellman equation describes the dependence of the minimum of the objective functional on the initial values of the C-problem. The boundary value problem (15.9) has to be studied separately, because the left side of the equation may lack smoothness and the solution may fail to be unique. For example, the boundary value problem
\[ \varphi_t - x|\varphi_x - 1| + x = 0, \quad x \ge 0, \ 0 \le t \le 1, \quad \varphi(0, 1) = 0 \]
in Example 15.1 has two solutions: $\varphi_1(x, t) = 0$, $\varphi_2(x, t) = 2x$.

Exercise Set
1. Write the fundamental equation of dynamic programming for the S-problem.
2. Using the previous exercise, show that under suitable smoothness of the data of the S-problem and of a solution $\varphi(x, t)$ of the Bellman equation, the function $\psi(t) = \varphi_x(x(t), t)$ satisfies the conjugate Cauchy problem along the optimal process $x(t)$, $u(t)$, and the triple of functions $\psi(t)$, $x(t)$, $u(t)$ satisfies the condition of the maximum of the Hamiltonian.
3. Consider the G-problem in which the terminal time $t_1$ is not fixed and takes values in a given set $T \subset R$. How will the formulations of Theorems 15.1 and 15.2 change for this problem?
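The non-uniqueness noted for the boundary value problem of Example 15.1 can be checked directly. The following sketch is an illustration only; it assumes the equation reads $\varphi_t - x|\varphi_x - 1| + x = 0$ (the signs are inferred from the two stated solutions), and the helper name `bellman_residual` is hypothetical.

```python
def bellman_residual(phi_t, phi_x, x):
    """Left side of the assumed equation phi_t - x*|phi_x - 1| + x."""
    return phi_t - x * abs(phi_x - 1.0) + x


# Each candidate is described by its constant partial derivatives (phi_t, phi_x).
candidates = {
    "phi1(x, t) = 0": (0.0, 0.0),
    "phi2(x, t) = 2x": (0.0, 2.0),
}

worst = {}
for name, (phi_t, phi_x) in candidates.items():
    # Largest residual over a grid of x in [0, 5]; t does not enter the residual.
    worst[name] = max(abs(bellman_residual(phi_t, phi_x, 0.1 * k)) for k in range(51))
    print(name, worst[name])
```

Both candidates also vanish at $x = 0$, so under the stated assumption they satisfy the same boundary condition.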
Conclusion
The material presented in this book covers the theory of linear systems and the theory of necessary and sufficient conditions of optimality with relative completeness. Of course, there are many interesting and important applications of optimal control whose coverage would require a significant expansion of the book; these are left outside our scope for now. In writing this book, we have tried to present the material in the simplest and most intelligible form; it is designed as a first acquaintance with the subject. Whether we have succeeded is for the readers to judge. We will be grateful for any recommendations for improving the content of this book.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7
Appendix
A.1. Elements of Multidimensional Geometry
A.1.1. Finite-Dimensional Vector Space
Consider the set $R^n$ of ordered sequences of $n \ge 1$ real numbers
\[ x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, \ \dots \]
The sequences $x, y, z, \dots$ are called vectors (points), and their constituent numbers are called coordinates. We define the operations of vector addition and multiplication by real numbers $\lambda, \mu, \dots$:
\[ x + y = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix}, \qquad \lambda x = \lambda \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \lambda x_1 \\ \vdots \\ \lambda x_n \end{pmatrix}. \]
These operations have the properties
1) $x + y = y + x$,
2) $(x + y) + z = x + (y + z)$,
3) $1x = x$,
4) $\lambda(\mu x) = (\lambda\mu)x$,
5) $(\lambda + \mu)x = \lambda x + \mu x$,
6) $\lambda(x + y) = \lambda x + \lambda y$.
The set $R^n$ contains the zero vector $0 = 0x$, and it contains the opposite vector $-x = (-1)x$ for each $x \in R^n$. By the vector addition rule, we have $x + 0 = x$, $x - x = 0$.
Thus, the set $R^n$ meets all the axioms of a linear space. We introduce in $R^n$ the transposition of a vector
\[ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}' = (x_1, \dots, x_n), \qquad (x_1, \dots, x_n)' = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \]
the dot product of two vectors
\[ x'y = (x_1, \dots, x_n) \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = x_1 y_1 + \dots + x_n y_n \]
and the Euclidean norm of a vector
\[ \|x\| = (x'x)^{1/2} \]
with the properties
1) $(x')' = x$,
2) $x'y = y'x$,
3) $(\lambda x)'y = \lambda x'y$,
4) $x'(y + z) = x'y + x'z$,
5) $\|x\| = 0 \Leftrightarrow x = 0$,
6) $\|\lambda x\| = |\lambda| \|x\|$,
7) $\|x + y\| \le \|x\| + \|y\|$,
8) $|x'y| \le \|x\| \|y\|$.
Inequality 8) is referred to as the Cauchy-Bunyakovskii inequality. The set $R^1$ is referred to as the real line and is denoted by $R$.
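As a quick sanity check of the dot product, the Euclidean norm, and properties 7) and 8), the following sketch tests the triangle and Cauchy-Bunyakovskii inequalities on random vectors. It is an illustration rather than a proof; the helper names `dot` and `norm`, the dimension 4, and the tolerance are assumptions made for this example.

```python
import math
import random


def dot(x, y):
    """Dot product x'y = x_1*y_1 + ... + x_n*y_n."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean norm ||x|| = (x'x)**(1/2)."""
    return math.sqrt(dot(x, x))


rng = random.Random(1)  # fixed seed so the run is reproducible
tol = 1e-12             # small slack for floating-point rounding
violations = 0
for _ in range(1000):
    x = [rng.uniform(-5.0, 5.0) for _ in range(4)]
    y = [rng.uniform(-5.0, 5.0) for _ in range(4)]
    s = [a + b for a, b in zip(x, y)]
    cauchy_ok = abs(dot(x, y)) <= norm(x) * norm(y) + tol
    triangle_ok = norm(s) <= norm(x) + norm(y) + tol
    if not (cauchy_ok and triangle_ok):
        violations += 1

print("violations:", violations)
```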
A.1.2. Geometric Objects in $R^n$
Fix a nonzero vector $c \in R^n$ and an arbitrary number $\beta \in R$. The set $P$ of points $x \in R^n$ satisfying the equation $c'x = \beta$ is referred to as a plane, and the vector $c$ is referred to as the normal vector (or the normal, if $\|c\| = 1$) of the plane. If $\bar{x} \in P$, i.e., $c'\bar{x} = \beta$, then the equation of the plane can be written as
\[ c'x = c'\bar{x} \quad \text{or} \quad c'(x - \bar{x}) = 0. \]
In this case, we say that the plane $P$ passes through the point $\bar{x}$. It is convenient to identify the plane with its equation and to talk about the "plane" $c'x = \beta$ or $c'(x - \bar{x}) = 0$.

For any point $x \in R^n$, exactly one of the following conditions is satisfied: $c'x < \beta$, $c'x = \beta$, $c'x > \beta$. The sets
\[ P^- = \{ x \in R^n : c'x \le \beta \}, \qquad P^+ = \{ x \in R^n : c'x \ge \beta \} \]
are referred to as the closed half-spaces formed by the plane $P$. Along with them, one considers the open half-spaces $P^- \setminus P$, $P^+ \setminus P$ defined by the strict inequalities $c'x < \beta$, $c'x > \beta$.

Fig. A.1.1 Plane, closed and open half-spaces in $R^2$

The Cauchy-Bunyakovskii inequality allows us to associate with each pair of nonzero vectors $x, y \in R^n$ the angle $\alpha$, $0 \le \alpha \le \pi$, between them according to the formula
\[ \cos \alpha = \frac{x'y}{\|x\| \|y\|}. \]
If $x'y = 0$, i.e., $\alpha = \pi/2$, then the vectors $x$, $y$ are orthogonal. Obviously, the plane $c'x = 0$ consists of the vectors $x$ orthogonal to the normal $c$. The half-spaces $c'x < 0$, $c'x > 0$ are composed of the vectors $x$ forming obtuse and acute angles, respectively, with the normal $c$ (Fig. A.1.1).

Introduce in $R^n$ the sets
$[x, y] = \{ z = (1 - \lambda)x + \lambda y : 0 \le \lambda \le 1 \}$, the segment with the ends $x$, $y$,
$(x, y) = \{ z = (1 - \lambda)x + \lambda y : 0 < \lambda < 1 \}$, the interval,
$(x, y] = \{ z = (1 - \lambda)x + \lambda y : 0 < \lambda \le 1 \}$, the half-interval,
$[x, y) = \{ z = (1 - \lambda)x + \lambda y : 0 \le \lambda < 1 \}$, the half-segment.
When $x = y$, the segment, interval, half-interval and half-segment degenerate to the point $x$ and are called degenerate. If $x \ne y$, they are non-degenerate. The points $z = x + \lambda y$ for $-\infty < \lambda < \infty$ form a line, and for $\lambda \ge 0$ they form a ray.

A set $A \subset R^n$ is convex if, together with any two of its points $x$, $y$, it contains the segment $[x, y]$. Examples of convex sets are a segment, a ray, a line, a plane, open and closed half-spaces, a closed ball $S(y, r) = \{ x \in R^n : \|x - y\| \le r \}$ of radius $r \ge 0$ centered at $y$, and the space $R^n$ itself.
A set $K \subset R^n$ is a cone if, along with each vector $x$, it contains the vector $\lambda x$ for every $\lambda > 0$. Cones may be convex or non-convex. A plane passing through the origin, and the closed or open half-spaces formed by this plane, are convex cones. The union of two different lines passing through the origin is a non-convex cone.
A.2. Elements of Convex Analysis
A.2.1. Separability of Sets
Let $A$, $B$ be sets and $P$ be a plane in the space $R^n$. If $A \subset P^-$, $B \subset P^+$, we say that the plane $P$ separates the sets $A$, $B$, or that $A$, $B$ are separable by the plane $P$ (Fig. A.2.1). Analytically, the separation of the sets $A$, $B$ implies the existence of a vector $c \ne 0$ and a number $\beta$ such that
\[ c'x \le \beta \le c'y, \quad x \in A, \ y \in B. \tag{A2.1} \]
Obviously, the converse is also true, since the inequality (A2.1) is equivalent to the separability of the sets $A$, $B$ by the plane $c'x = \beta$. If for some $\varepsilon > 0$ the inequalities (A2.1) hold in the strengthened form
\[ c'x \le \beta - \varepsilon < \beta + \varepsilon \le c'y, \quad x \in A, \ y \in B, \tag{A2.2} \]
then we say that the sets $A$, $B$ are strictly separated (Fig. A.2.1).

Theorem A.2.1 (Strict Separability of Sets) If convex closed sets $A, B \subset R^n$ are disjoint and one of them is bounded, then they are strictly separated by some plane.

Proof Suppose for definiteness that the set $A$ is bounded. We choose an arbitrary point $z \in B$ and construct a ball $S(z, r)$ containing the set $A$ for some $r > 0$. Let $B_1 = B \cap S(z, r)$. By construction, $B_1$ is bounded and closed (Fig. A.2.2). By the Weierstrass theorem, the continuous function $(x, y) \to \|x - y\|^2$ on the closed bounded set $A \times B_1 \subset R^n \times R^n$ has a minimum at some point $(a, b)$:
\[ \|a - b\|^2 \le \|\tilde{x} - \tilde{y}\|^2, \quad (\tilde{x}, \tilde{y}) \in A \times B_1. \tag{A2.3} \]
Fig. A.2.1 Separable and strictly separable sets A and B

Fig. A.2.2 Construction of the set $B_1$

Put
\[ \tilde{x} = a + \lambda(x - a), \qquad \tilde{y} = b + \mu(y - b), \]
where $x$, $y$ are arbitrary fixed points of the sets $A$, $B_1$, respectively, and $\lambda$, $\mu$ are small non-negative numbers. The sets $A$, $B_1$ are convex, and therefore $\tilde{x} \in A$, $\tilde{y} \in B_1$ and $(\tilde{x}, \tilde{y}) \in A \times B_1$. Setting $c = b - a \ne 0$, we represent the inequality (A2.3) in the form
\[ \|c\|^2 \le \| -c + \lambda(x - a) - \mu(y - b) \|^2 \]
or, after obvious transformations,
\[ -\lambda c'(x - a) + \mu c'(y - b) + o(\lambda + \mu) \ge 0. \]
Putting $\lambda \to 0$, $\mu = 0$ and then $\lambda = 0$, $\mu \to 0$, we obtain
\[ c'(x - a) \le 0, \quad x \in A, \tag{A2.4} \]
\[ c'(y - b) \ge 0, \quad y \in B_1. \tag{A2.5} \]
In inequality (A2.5), $B_1$ can be replaced by $B$. Indeed, if $\hat{y}$ is an arbitrary point of the set $B$, then due to the convexity of $B$, the point $z = b + \nu(\hat{y} - b)$ for a small $\nu > 0$ lies simultaneously in the sets $B$ and $B_1$. Putting $y = z$ in (A2.5), we get $\nu c'(\hat{y} - b) \ge 0$, or $c'(\hat{y} - b) \ge 0$. Let us verify that the plane
\[ c'x = c'\bar{x}, \qquad \bar{x} = a + \tfrac{1}{2}c = b - \tfrac{1}{2}c \]
strictly separates the sets $A$ and $B$. Applying inequality (A2.4), we have
\[ c'(x - \bar{x}) = c'\left(x - a - \tfrac{1}{2}c\right) = c'(x - a) - \tfrac{1}{2}\|c\|^2 \le -\tfrac{1}{2}\|c\|^2, \quad x \in A; \]
then we use (A2.5), with $B$ in place of $B_1$, to obtain
\[ c'(y - \bar{x}) = c'\left(y - b + \tfrac{1}{2}c\right) = c'(y - b) + \tfrac{1}{2}\|c\|^2 \ge \tfrac{1}{2}\|c\|^2, \quad y \in B. \]
From the last two inequalities,
\[ c'x \le c'\bar{x} - \tfrac{1}{2}\|c\|^2 < c'\bar{x} + \tfrac{1}{2}\|c\|^2 \le c'y, \quad x \in A, \ y \in B. \]
Thus, the theorem has been proven. Figure A.2.3 demonstrates that all of the assumptions of the separation Theorem A.2.1 are essential.

Fig. A.2.3 Examples of sets that do not meet the conditions of Theorem A.2.1
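The construction used in the proof of Theorem A.2.1 can be tried out on a simple case. The sketch below assumes that $A$ and $B$ are two disjoint closed discs in $R^2$, for which the closest points $a$, $b$ are known explicitly; it then builds $c = b - a$, $\bar{x} = a + c/2$ and checks the final chain of inequalities on random samples. All names (`closest_points`, `sample_disc`) and the numerical data are hypothetical choices for this illustration.

```python
import math
import random


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def closest_points(p, r_a, q, r_b):
    """Closest points a in A = S(p, r_a) and b in B = S(q, r_b) of two disjoint discs."""
    d = [qi - pi for pi, qi in zip(p, q)]
    dist = math.hypot(*d)
    assert dist > r_a + r_b, "the discs must be disjoint"
    e = [di / dist for di in d]                  # unit vector pointing from p to q
    a = [pi + r_a * ei for pi, ei in zip(p, e)]
    b = [qi - r_b * ei for qi, ei in zip(q, e)]
    return a, b


def sample_disc(center, radius, rng):
    """Random point of the closed disc S(center, radius)."""
    ang = rng.uniform(0.0, 2.0 * math.pi)
    rad = radius * math.sqrt(rng.random())
    return [center[0] + rad * math.cos(ang), center[1] + rad * math.sin(ang)]


p, r_a, q, r_b = [0.0, 0.0], 1.0, [4.0, 3.0], 2.0   # assumed example data
a, b = closest_points(p, r_a, q, r_b)
c = [bi - ai for ai, bi in zip(a, b)]                # normal c = b - a
x_bar = [ai + 0.5 * ci for ai, ci in zip(a, c)]      # midpoint of the segment [a, b]
half = 0.5 * dot(c, c)                               # (1/2) * ||c||^2

rng = random.Random(0)
max_a = max(dot(c, sample_disc(p, r_a, rng)) for _ in range(1000))
min_b = min(dot(c, sample_disc(q, r_b, rng)) for _ in range(1000))
print(max_a, dot(c, x_bar) - half, dot(c, x_bar) + half, min_b)
```

In this example the sampled values of $c'x$ on $A$ stay below $c'\bar{x} - \frac{1}{2}\|c\|^2$ and those of $c'y$ on $B$ stay above $c'\bar{x} + \frac{1}{2}\|c\|^2$, as the proof predicts.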
A.2.2. Reference Plane
A plane $c'(x - \bar{x}) = 0$ is said to be a reference plane to the set $A \subset R^n$ at a point $\bar{x} \in A$ if $c'(x - \bar{x}) \le 0$, $x \in A$. In other words, if a set $A$ has a reference plane at some point, then it is located in a closed half-space formed by this reference plane (Fig. A.2.4).

Fig. A.2.4 Set A lies on one side of the reference plane

Theorem A.2.2 (Existence of a Reference Plane) If $A \subset R^n$ is a convex closed set and $A \ne R^n$, then there exists a reference plane at each boundary point of $A$.

Proof Let a set $A$ satisfy the conditions of the theorem. Choose any boundary point $\bar{x} \in A$ and construct a sequence of points $\{x^k\}$ converging to $\bar{x}$ from the outside of $A$: $x^k \to \bar{x}$, $x^k \notin A$, $k = 1, 2, \dots$. By the separation theorem, for every point $x^k$ there exists a vector $c^k \in R^n$ with the property
\[ (c^k)'x^k > (c^k)'x, \quad x \in A. \]
It is obvious that $c^k \ne 0$; therefore, without loss of generality, we can assume $\|c^k\| = 1$. Then the sequence $\{c^k\}$ is bounded. From $\{c^k\}$ we extract a convergent subsequence $\{c^{k_i}\}$: $c^{k_i} \to c$, $\|c\| = 1$. Putting $k = k_i$ in the last inequality and passing to the limit as $k_i \to \infty$, we obtain
\[ c'\bar{x} \ge c'x, \quad x \in A. \]
Therefore, the plane $c'(x - \bar{x}) = 0$ is a reference plane to the set $A \subset R^n$ at the point $\bar{x}$, and the theorem is proven.

Fig. A.2.5 An open triangle, a closed square and a closed disc have an empty, a finite and an infinite set of extreme points, respectively
A.2.3. Representation of a Convex Set
Convex combinations of points and extreme points play an important role in the description of a convex set. Let $x^1, \dots, x^m$ be some points in the space $R^n$. A point $x = \lambda_1 x^1 + \dots + \lambda_m x^m$ is called a convex combination of the points $x^1, \dots, x^m$ if $\lambda_1 \ge 0, \dots, \lambda_m \ge 0$, $\lambda_1 + \dots + \lambda_m = 1$. The numbers $\lambda_1, \dots, \lambda_m$ are referred to as the coefficients of the convex combination. It is easy to verify that all convex combinations of two points $x^1$, $x^2$ form the segment $[x^1, x^2]$ and that the set of convex combinations of a finite number of points is convex.

A point $x$ of a set $A \subset R^n$ is an extreme point of this set if it is not the midpoint of any non-degenerate segment $[x^1, x^2] \subset A$. In other words, if a point $x \in A$ can be represented as $x = \frac{1}{2}x^1 + \frac{1}{2}x^2$, $x^1, x^2 \in A$, $x^1 \ne x^2$, then it is not an extreme point of $A$. Figure A.2.5 shows convex sets in the plane with different sets of extreme points.
Fig. A.2.6 Explanation to the proof of Theorem A.2.3
Theorem A.2.3 (Representation of a Convex Set) Every convex compact set $A \subset R^n$ has at least one extreme point. Any point of the set $A$ can be represented as a convex combination of a finite number of extreme points of $A$.

Proof We prove the theorem by induction on the dimension $n$ of the space containing the set $A$. For $n = 1$, a convex compact set on the real line $R$ is a segment, and the theorem is obviously true. Suppose that the theorem is true for every convex compact set in a space of dimension $n - 1$, and let $A$ be a convex compact set in $R^n$. We consider the cases of boundary and interior points of $A$ separately.

First, choose an arbitrary boundary point $\bar{x} \in A$. By Theorem A.2.2, there exists a reference plane $P$ to the set $A$ at the point $\bar{x}$ with normal $c$ for which
\[ c'(x - \bar{x}) \le 0, \quad x \in A \tag{A2.6} \]
(Fig. A.2.6). The intersection $A \cap P = A_0$ is a convex compact set lying in a space of dimension $n - 1$. By the induction hypothesis, the point $\bar{x}$ can be represented as a convex combination
\[ \bar{x} = \sum_{i=1}^{m} \lambda_i x^i \tag{A2.7} \]
of extreme points $x^1, \dots, x^m \in A_0$ with coefficients $\lambda_1, \dots, \lambda_m$. We show that $x^1, \dots, x^m$ are extreme points of $A$. Assume the contrary: some point $x^k \in \{x^1, \dots, x^m\}$ is not an extreme point of $A$. Then it can be written as
\[ x^k = \tfrac{1}{2}y + \tfrac{1}{2}z, \quad y, z \in A, \ y \ne z. \tag{A2.8} \]
Since $x^k \in P \cap A$, we can use (A2.6), (A2.8) to obtain
\[ c'\bar{x} = c'x^k = c'\left(\tfrac{1}{2}y + \tfrac{1}{2}z\right) = \tfrac{1}{2}c'y + \tfrac{1}{2}c'z \le \tfrac{1}{2}c'\bar{x} + \tfrac{1}{2}c'\bar{x} = c'\bar{x}. \]
Consequently, $c'y = c'z = c'\bar{x}$ and $y, z \in A_0$. Then, by the representation (A2.8), $x^k$ is not an extreme point of $A_0$, which contradicts the choice of $x^k$. Thus, the set $A$ has extreme points $x^1, \dots, x^m$, and the representation (A2.7) holds.

We now choose an arbitrary interior point $x \in A$. We draw some line $L$ through $x$; since $A$ is a compact set, the line $L$ intersects the boundary $\partial A$ in two points $u$, $v$. Then there is a number $\gamma \in [0, 1]$ such that
\[ x = (1 - \gamma)u + \gamma v. \]
By what has been proved above, the boundary points $u$, $v$ can be represented as convex combinations
\[ u = \sum_{i=1}^{p} \alpha_i u^i, \qquad v = \sum_{j=1}^{q} \beta_j v^j \]
of extreme points $u^1, \dots, u^p$ and $v^1, \dots, v^q$ of the set $A$ with the corresponding coefficients $\alpha_1, \dots, \alpha_p$ and $\beta_1, \dots, \beta_q$. Substituting these convex combinations into the previous equality, we get
\[ x = (1 - \gamma)u + \gamma v = \sum_{i=1}^{p} (1 - \gamma)\alpha_i u^i + \sum_{j=1}^{q} \gamma\beta_j v^j. \]
It is easy to see that the numbers $(1 - \gamma)\alpha_i$, $i = 1, \dots, p$, and $\gamma\beta_j$, $j = 1, \dots, q$, are non-negative and that their sum is 1:
\[ \sum_{i=1}^{p} (1 - \gamma)\alpha_i + \sum_{j=1}^{q} \gamma\beta_j = (1 - \gamma)\sum_{i=1}^{p} \alpha_i + \gamma\sum_{j=1}^{q} \beta_j = 1, \]
i.e., they are the coefficients of a convex combination. Therefore, $x$ is a convex combination of extreme points of $A$, and the theorem has been proven.
A.2.4. Convex Hull of a Set
The set of all convex combinations of points of a set $A \subset R^n$ is the convex hull of $A$ and is denoted by $\mathrm{co}A$. Obviously, $\mathrm{co}A$ is the smallest convex set containing $A$ (Fig. A.2.7).

Fig. A.2.7 The set A and its convex hull coA in $R^2$
Theorem A.2.4 Each point of the convex hull of a set $A \subset R^n$ can be represented as a convex combination of not more than $n + 1$ points of $A$.

Proof Pick any point $x \in \mathrm{co}A$. By the definition of a convex hull,
\[ x = \sum_{i=1}^{m} \lambda_i x^i, \quad x^i \in A, \ \lambda_i \ge 0, \ i = 1, \dots, m, \quad \sum_{i=1}^{m} \lambda_i = 1. \tag{A2.9} \]
Suppose $m > n + 1$ and $\lambda_i > 0$, $i = 1, \dots, m$. The vectors $y^i = (x^i, 1)$, $i = 1, \dots, m$, of dimension $n + 1$ in the space $R^{n+1}$ are linearly dependent; that is, there are numbers $\alpha_i$, $i = 1, \dots, m$, not all equal to zero, such that
\[ \sum_{i=1}^{m} \alpha_i y^i = \sum_{i=1}^{m} \alpha_i (x^i, 1) = \left( \sum_{i=1}^{m} \alpha_i x^i, \ \sum_{i=1}^{m} \alpha_i \right) = 0. \]
Consequently,
\[ \sum_{i=1}^{m} \alpha_i x^i = 0, \qquad \sum_{i=1}^{m} \alpha_i = 0. \tag{A2.10} \]
From (A2.9) and (A2.10), for any $\theta \in R$ we have
\[ x = \sum_{i=1}^{m} \lambda_i x^i - \theta \sum_{i=1}^{m} \alpha_i x^i = \sum_{i=1}^{m} (\lambda_i - \theta\alpha_i) x^i. \tag{A2.11} \]
By the second condition of (A2.10), there are positive numbers among $\alpha_1, \dots, \alpha_m$, so the system of inequalities $\lambda_i - \theta\alpha_i \ge 0$, $i = 1, \dots, m$, has a largest solution $\bar{\theta} = \min \{ \lambda_i / \alpha_i : \alpha_i > 0 \} > 0$. If $\theta = \bar{\theta}$, all coefficients $\lambda_i - \bar{\theta}\alpha_i$ of the expansion (A2.11) are non-negative, and there are zeros among them. Besides,
\[ \sum_{i=1}^{m} (\lambda_i - \bar{\theta}\alpha_i) = \sum_{i=1}^{m} \lambda_i - \bar{\theta} \sum_{i=1}^{m} \alpha_i = \sum_{i=1}^{m} \lambda_i = 1. \]
Then the convex combination (A2.11) contains fewer than $m$ points. Continuing this process, we obtain the required result. The theorem is thus proven.

Theorem A.2.5 The convex hull of a compact set $A \subset R^n$ is compact.

Proof Let $A$ be a compact, i.e., bounded and closed, set in $R^n$. Since $A$ is bounded, there is a number $r > 0$ such that $A \subset S(0, r)$. Then for any convex combination (A2.9), we have
\[ \|x\| = \left\| \sum_{i=1}^{m} \lambda_i x^i \right\| \le \sum_{i=1}^{m} \lambda_i \|x^i\| \le r \sum_{i=1}^{m} \lambda_i = r. \]
Hence, $\mathrm{co}A \subset S(0, r)$ is a bounded set. We show that the limit $x$ of any convergent sequence $\{x^k\} \subset \mathrm{co}A$ belongs to $\mathrm{co}A$. By Theorem A.2.4, each point $x^k \in \mathrm{co}A$ can be represented in the form
\[ x^k = \sum_{i=1}^{n+1} \lambda_{ik} x^{ik}, \quad x^{ik} \in A, \ \lambda_{ik} \ge 0, \ i = 1, \dots, n + 1, \quad \sum_{i=1}^{n+1} \lambda_{ik} = 1. \tag{A2.12} \]
Since the set $A$ and the segment $[0, 1]$ are compact, we can assume without loss of generality that the sequences $\{x^{ik}\} \subset A$, $\{\lambda_{ik}\} \subset [0, 1]$ converge to the corresponding limits $x^i \in A$, $\lambda_i \in [0, 1]$ for every $i = 1, \dots, n + 1$. Passing in (A2.12) to the limit as $k \to \infty$, we obtain
\[ x = \sum_{i=1}^{n+1} \lambda_i x^i, \quad x^i \in A, \ \lambda_i \ge 0, \ i = 1, \dots, n + 1, \quad \sum_{i=1}^{n+1} \lambda_i = 1. \]
Consequently, $x \in \mathrm{co}A$ and $\mathrm{co}A$ is a closed set, and the theorem has been proven.
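The reduction step in the proof of Theorem A.2.4 is constructive and can be sketched in code. The example below is a minimal pure-Python illustration under stated assumptions: it finds a nonzero vector $\alpha$ with $\sum \alpha_i x^i = 0$, $\sum \alpha_i = 0$ by Gauss-Jordan elimination, takes $\theta = \min\{\lambda_i/\alpha_i : \alpha_i > 0\}$, and drops the points whose coefficients vanish. The function names `null_vector` and `caratheodory`, the tolerance, and the sample data are hypothetical.

```python
def null_vector(rows, tol=1e-12):
    """Nonzero v with rows*v = 0; assumes more columns than rows."""
    a = [list(r) for r in rows]
    nrows, ncols = len(a), len(a[0])
    pivots, r = [], 0
    for col in range(ncols):
        if r == nrows:
            break
        p = max(range(r, nrows), key=lambda i: abs(a[i][col]))
        if abs(a[p][col]) < tol:
            continue                      # no pivot in this column
        a[r], a[p] = a[p], a[r]
        piv = a[r][col]
        a[r] = [v / piv for v in a[r]]
        for i in range(nrows):
            if i != r:
                f = a[i][col]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(col)
        r += 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0.0] * ncols
    v[free] = 1.0                         # one free variable set to 1, the others to 0
    for row, pc in zip(a, pivots):
        v[pc] = -row[free]
    return v


def caratheodory(points, weights, tol=1e-12):
    """Reduce a convex combination in R^n to at most n + 1 points."""
    pts = [list(p) for p in points]
    lam = list(weights)
    n = len(pts[0])
    while len(pts) > n + 1:
        # Rows encode sum(alpha_i * x^i) = 0 and sum(alpha_i) = 0.
        rows = [[p[k] for p in pts] for k in range(n)] + [[1.0] * len(pts)]
        alpha = null_vector(rows)
        theta = min(l / al for l, al in zip(lam, alpha) if al > tol)
        lam = [l - theta * al for l, al in zip(lam, alpha)]
        keep = [i for i, l in enumerate(lam) if l > tol]
        pts = [pts[i] for i in keep]
        lam = [lam[i] for i in keep]
    return pts, lam


points = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]]
weights = [0.2] * 5                       # represents the point (0.5, 0.5)
reduced_points, reduced_weights = caratheodory(points, weights)
print(reduced_points, reduced_weights)
```

The sketch relies on floating-point tolerances, so it is meant to show the structure of the argument rather than to serve as a robust implementation.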
A.3. Maximum and Minimum of a Function
Consider a real function $y = f(x)$ defined on a set $D \subset R^n$. If for some point $a \in D$ the inequality $f(a) \le f(x)$ holds for every $x \in D$, then $f(a)$ is the minimum and $a$ is a minimum point (or point of minimum) of the function $f$ on $D$, and we write
\[ f(a) = \min_{x \in D} f(x), \qquad a = \arg\min_{x \in D} f(x). \]
The set of all minimum points is denoted by $\mathrm{Arg}\min_{x \in D} f(x)$. By definition, the function $f$ takes the same value $\min_{x \in D} f(x)$ on the set $\mathrm{Arg}\min_{x \in D} f(x)$. Replacing the inequality sign in the definition of the minimum with the opposite one, we arrive at the concepts of the maximum and a maximum point of the function $f$ on $D$. We use similar notation:
\[ f(a) = \max_{x \in D} f(x), \qquad a = \arg\max_{x \in D} f(x), \qquad \mathrm{Arg}\max_{x \in D} f(x) = \left\{ a \in D : f(a) = \max_{x \in D} f(x) \right\}. \]
The Weierstrass theorem states that continuity of the function $f(x)$ on $D$ and compactness of the set $D \subset R^n$ are sufficient for the existence of the minimum and maximum.
A.3.1. Properties of a Maximum and Minimum
The following properties of the minimum and maximum are a direct result of the definitions:
1) $\min_D (c + f) = c + \min_D f$,
2) $\min_D (kf) = k \min_D f$,
3) $\min_D (-f) = -\max_D f$,
4) $\min_D (f_1 + f_2) \ge \min_D f_1 + \min_D f_2$,
5) $\min_{D_1} f \ge \min_D f$, $D_1 \subset D$,
6) $\max_D (c + f) = c + \max_D f$,
7) $\max_D (kf) = k \max_D f$,
8) $\max_D (-f) = -\min_D f$,
9) $\max_D (f_1 + f_2) \le \max_D f_1 + \max_D f_2$,
10) $\max_{D_1} f \le \max_D f$, $D_1 \subset D$,
where $c$ and $k \ge 0$ are constants.

Let us verify, for example, property 3). Let $a = \arg\min_{x \in D} [-f(x)]$. By the definition of the minimum,
\[ -f(a) = \min_{x \in D} [-f(x)] \le -f(x), \quad x \in D. \]
From here,
\[ f(a) = -\min_{x \in D} [-f(x)] \ge f(x), \quad x \in D, \]
or
\[ f(a) = -\min_{x \in D} [-f(x)] = \max_{x \in D} f(x). \]
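Properties 3), 4) and 5) can be illustrated on a finite grid, where minima and maxima always exist. The sketch below uses two arbitrarily chosen functions `f1`, `f2` and a grid `D` on $[-2, 2]$; these choices and the small tolerance are assumptions made only for the example.

```python
import math


def f1(x):
    return (x - 1.0) ** 2


def f2(x):
    return math.sin(3.0 * x)


D = [i / 100.0 for i in range(-200, 201)]   # finite grid on [-2, 2]
D1 = [x for x in D if x >= 0.0]             # a subset of D

# Property 3): min(-f) = -max f
lhs3 = min(-f1(x) for x in D)
rhs3 = -max(f1(x) for x in D)

# Property 4): min(f1 + f2) >= min f1 + min f2
lhs4 = min(f1(x) + f2(x) for x in D)
rhs4 = min(f1(x) for x in D) + min(f2(x) for x in D)

# Property 5): the minimum over a subset is not smaller than over the whole set
lhs5 = min(f2(x) for x in D1)
rhs5 = min(f2(x) for x in D)

print(lhs3 == rhs3, lhs4 >= rhs4, lhs5 >= rhs5)
```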
A.3.2. Continuity of a Maximum and Minimum
Theorem A.3.1 Let $A \subset R^m$ be an open set, $B \subset R^n$ be a compact set and $f(x, y)$ be a function of the class $C(A \times B \to R)$. Then the function
\[ \mu(x) = \max_{y \in B} f(x, y) \]
is defined and continuous on $A$. If for every $x \in A$ the point of maximum
\[ y(x) = \arg\max_{y \in B} f(x, y) \]
is unique, then the function $y(x)$ is also continuous on $A$.
Proof By the Weierstrass theorem, the continuous function $y \to f(x, y)$ on the compact set $B$ has a maximum point $y(x)$ for each $x \in A$; therefore the function
\[ \mu(x) = \max_{y \in B} f(x, y) = f(x, y(x)) \]
is defined on $A$. According to the definition of a maximum, we have
\[ \mu(x) = f(x, y(x)) \ge f(x, y(\bar{x})), \qquad \mu(\bar{x}) = f(\bar{x}, y(\bar{x})) \ge f(\bar{x}, y(x)) \]
for any $x, \bar{x} \in A$. From here, the two-sided estimate follows:
\[ f(x, y(\bar{x})) - f(\bar{x}, y(\bar{x})) \le \mu(x) - \mu(\bar{x}) \le f(x, y(x)) - f(\bar{x}, y(x)). \]
Let $\{x^k\}$ be an arbitrary sequence of points of the set $A$ converging to $\bar{x}$ and let $\{y^k\}$ be the corresponding sequence of points $y^k = y(x^k)$ of the set $B$. Since $B$ is bounded, the sequence $\{y^k\}$ is bounded, and hence it has a convergent subsequence. Without loss of generality, we can assume that the sequence $\{y^k\}$ itself converges to some $\bar{y}$. Obviously, $\bar{y} \in B$. For $x = x^k$ the two-sided estimate has the form
\[ f(x^k, y(\bar{x})) - f(\bar{x}, y(\bar{x})) \le \mu(x^k) - \mu(\bar{x}) \le f(x^k, y^k) - f(\bar{x}, y^k). \]
Let $k \to \infty$ and pass to the limit in the last inequality. Taking the continuity of $f$ into account, we obtain $\mu(x^k) - \mu(\bar{x}) \to 0$. Since the sequence $\{x^k\}$ and the point $\bar{x}$ were chosen arbitrarily, this proves the continuity of the function $\mu$ on the set $A$.

We prove the second part of the theorem by contradiction. Suppose the function $y(x)$ is uniquely defined on the set $A$ but has at least one point of discontinuity $a \in A$. Then there is a sequence $\{\tilde{x}^k\} \subset A$, $\tilde{x}^k \to a$, along which the corresponding values $\tilde{y}^k = y(\tilde{x}^k)$ converge to a limit $b \ne y(a)$ (Fig. A.3.1). By definition,
\[ \mu(a) = f(a, y(a)). \]
At the same time, by continuity,
\[ \mu(a) = \lim_{k \to \infty} \mu(\tilde{x}^k) = \lim_{k \to \infty} f(\tilde{x}^k, y(\tilde{x}^k)) = \lim_{k \to \infty} f(\tilde{x}^k, \tilde{y}^k) = f(a, b). \]
The last two equalities show that the function $y \to f(a, y)$ has two different maximum points on the set $B$. This contradiction proves the continuity of the function $y(x)$ on $A$, and the theorem is proven.

Fig. A.3.1 Explanation of the proof of Theorem A.3.1

Remark A.3.1 Under the assumptions of Theorem A.3.1, we can prove in the same manner the continuity of the functions
\[ \mu(x) = \min_{y \in B} f(x, y), \qquad y(x) = \arg\min_{y \in B} f(x, y). \]
Remark A.3.2 The requirement of compactness of $B$ in the theorem can be replaced by the requirement that a maximum point $y(x) \in B$ exists for every $x \in A$ and that the set $B$ is bounded.
A.4. Tasks and Solutions
Exercises and solutions of thematic problems are given below to consolidate the theoretical material.
A.4.1. Tasks
Fundamental Matrix
Find a fundamental matrix for the given system of differential equations.
1. 4. 7.
x_ 1 ¼ 2x1 þ x2 ; x_ ¼ 2x2 2 x_ 1 ¼ 3x1 þ 2x2 ; x_ 2 ¼ 2x1 þ x2 x_ 1 ¼ 2x2 ; x_ 2 ¼ 2x1
8 10. > < x_ 1 ¼ x2 þ x3 x_ 2 ¼ x1 þ x2 x3 ; > : x_ 3 ¼ x2 x3
x_ 1 ¼ x2 ; x_ ¼ x1 2 5. x_ 1 ¼ x1 þ x2 ; x_ ¼ x2 2 8. x_ 1 ¼ 2x2 ; x_ 2 ¼ x1 þ x2 2.
8 11. > < x_ 1 ¼ x2 x_ 2 ¼ x1 þ x2 : > : x_ 3 ¼ x1
x_ 1 ¼ 2x1 þ x2 ; x_ ¼ 3x1 þ 4x2 2 6. x_ 1 ¼ 4x1 x2 ; x_ ¼ x1 þ 2x2 8 2 9. > < x_ 1 ¼ x1 þ x2 ; x_ 2 ¼ 2x2 > : x_ 3 ¼ x1 þ 2x2 x3 3.
Reachability Set
Using the definition and the Cauchy formula, write the reachability set $Q(t_1)$ of a linear system for a given initial value, range of control and time $t_1$.
1. 2. 3.
x_ 1 ¼ x1 x2 , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 2, juj 1,t 1 ¼ 2: x_ ¼ 2x1 x2 þ u 2 x_ 1 ¼ x1 þ x2 u1 , x1 ð1Þ ¼ x2 ð1Þ ¼ 0, ju1 j 1, ju2 j 3, t 1 ¼ 4: x_ ¼ x1 þ x2 þ u2 2 x_ 1 ¼ 3x1 x2 u1 þ 2u2 , x1 ð0Þ ¼ 1, x2 ð0Þ ¼ 0, ju1 j 1, x_ 2 ¼ x1 x2 þ u2 ju2 j 1, t 1 ¼ 3:
x_ 1 ¼ 2x1 x2 þ u , x1 ð0Þ ¼ 1, x2 ð0Þ ¼ 2, 12 u 2, t 1 ¼ 5: x_ 2 ¼ 4x1 þ 2x2 ( 5. x_ 1 ¼ x1 þ x2 u1 þ u2 , x1 ð2Þ ¼ 2, x2 ð2Þ ¼ 1, ju1 j 1, ju2 j 1, t 1 ¼ 7: 1 x_ 2 ¼ x1 þ x2 þ u2 2 6. x_ 1 ¼ 4x1 x2 2u , x1 ð0Þ ¼ x2 ð0Þ ¼ 0, juj 1, t 1 ¼ 2: x_ ¼ x1 x2 þ u ( 2 7. x_ 1 ¼ 2x1 þ x2 þ u2 1 , x1 ð1Þ ¼ 15 , x2 ð1Þ ¼ 8, ju1 j 14 , ju2 j 1, t 1 ¼ 6: x_ 2 ¼ 4x1 2x2 þ u1 3 8. x_ 1 ¼ 7x1 3x2 , x1 ð3Þ ¼ 1, x2 ð3Þ ¼ 0, 2 u 3, t 1 ¼ 9: x_ 2 ¼ x1 þ x2 2u ( 9. x_ 1 ¼ 4x1 þ x2 2u1 u2 , x1 ð1Þ ¼ 1, x2 ð1Þ ¼ 1, ju1 j 1, ju2 j 1, t 1 ¼ 4: 1 x_ 2 ¼ 2x1 þ x2 þ u1 5 ( 10. x_ 1 ¼ 3x1 4x2 þ u , x1 ð0Þ ¼ 5, x2 ð0Þ ¼ 1, 1 u 0, t 1 ¼ 1: 1 x_ 2 ¼ x1 þ x2 u 2 4.
Reference Plane
Construct a reference plane with a given normal vector $c$ to a reachability set $Q(t_1) \subset R^2$ that satisfies the appropriate conditions.
1.
0 x_ 1 ¼ 4x1 þ 3x2 2u : , x1 ð2Þ ¼ 1, x2 ð2Þ ¼ 0, juj 1,t 1 ¼ 7, c ¼ 1 x_ 2 ¼ x1 x2 þ u
2. 3. 4. 5. 6.
7.
8. 9.
2 x_ 1 ¼ x1 2x2 þ 3u : , x1 ð0Þ ¼ 0,x2 ð0Þ ¼ 2, 2 u 1,t 1 ¼ 3,c ¼ 1 x_ 2 ¼ 5x1 x2 u 1 x_ 1 ¼ 2x1 þ x2 : , x1 ð1Þ ¼ 1, x2 ð1Þ ¼ 0, juj 1,t 1 ¼ 4, c ¼ 1 x_ 2 ¼ 7x1 þ u x_ 1 ¼ x1 x2 u1 þ u2 1 : , x1 ð0Þ ¼ 15 ,x2 ð0Þ ¼ 1, ju1 j 1, ju2 j 1,t1 ¼ 5,c ¼ x_ 2 ¼ 2x1 2u2 1 ! 0 x_ 1 ¼ 3x1 4x2 þ 3u2 , x1 ð0Þ ¼ x2 ð0Þ ¼ 0, ju1 j 12 , ju2 j 1,t 1 ¼ 3,c ¼ 1 : x_ 2 ¼ x1 þ x2 u1 3 x_ 1 ¼ x1 þ x2 þ 3u2 , x1 ð1Þ ¼ 2,x2 ð1Þ ¼ 1, ju1 j 2, ju2 j 3, x_ 2 ¼ 2x1 þ x2 þ u1 2u2 1 t 1 ¼ 4,c ¼ : 1 x_ 1 ¼ 7x1 þ 3x2 u1 þ u2 , x1 ð0Þ ¼ 1, x2 ð0Þ ¼ 0, ju1 j 1, ju2 j 1, x_ 2 ¼ 2x1 þ x2 2u2 2 t 1 ¼ 5, c ¼ : 0 0 x_ 1 ¼ 9x1 x2 u : , x1 ð3Þ ¼ 25 ,x2 ð3Þ ¼ 1, juj 1,t 1 ¼ 7, c ¼ 7 x_ 2 ¼ x2 þ 2u x_ 1 ¼ 5x1 x2 3u1 þ u2 , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 1, ju1 j 2, ju2 j 1, x_ 2 ¼ x1 þ x2 þ 2u2 3 t 1 ¼ 2, c ¼ : 1
Point-to-Point Controllability Verify if the given system is point-to-point controllable. Find a control with minimal norm if a system is point-to-point controllable. 1.
2.
8 0 1 0 1 0 1 > < x_ 1 ¼ x2 B C B C x_ 2 ¼ 2x2 þ u1 , t 0 ¼ 0, t 1 ¼ 2,xðt 0 Þ ¼ @ 0 A, xðt 1 Þ ¼ @ 1 A, u 2 R2 : > : 1 1 x_ ¼ x1 þ u2 8 3 0 1 0 1 _ ¼ x u þ u x > 1 1 1 2 1 1 > < B C B C x_ 2 ¼ x3 þ u1 , t 0 ¼ 2, t 1 ¼ 5,xðt 0 Þ ¼ @ 2 A, xðt 1 Þ ¼ @ 2 A, u 2 R3 : > 1 > : x_ 3 ¼ u2 þ u3 3 0 2
8 0 1 0 1 1 > 0 1 > < x_ 1 ¼ x3 3 u1 þ u3 B C B C , t ¼ 1,t ¼ 2, x ð t Þ ¼ 0 ,x ð t Þ ¼ 1 @ A 1 @ A,u 2 R3 : 0 1 0 x_ 2 ¼ u1 þ u2 > > : 0 2 x_ 3 ¼ x2 þ u1 8 0 1 0 1 4. > x_ 1 ¼ x2 u1 2 0 < 1 B C B C x_ 2 ¼ 2x3 þ u3 , t 0 ¼ 0,t 1 ¼ 3, xðt 0 Þ ¼ @ 1 A, xðt 1 Þ ¼ @ 0 A,u 2 R3 : > 5 : 3 0 x_ 3 ¼ u1 u2 8 0 1 0 1 5. > 1 1 < x_ 1 ¼ x2 B C B C x_ 2 ¼ x3 u1 þ u2 , t 0 ¼ 2, t 1 ¼ 5,xðt 0 Þ ¼ @ 0 A,xðt 1 Þ ¼ @ 0 A, u 2 R2 : > : 2 0 x_ ¼ 3u1 8 3 0 1 0 1 1 6. > 0 1 > < x_ 1 ¼ x2 u1 þ 2 u2 B C B C , t 0 ¼ 0,t 1 ¼ 4, xðt 0 Þ ¼ @ 0 A,xðt 1 Þ ¼ @ 1 A,u 2 R3 : x_ 2 ¼ 2u1 u3 > > : 1 0 x_ ¼ x1 þ u2 8 3 0 1 0 1 7. > x_ 1 ¼ x2 u1 1 1 > < B C B C x_ 2 ¼ u2 , t 0 ¼ 1,t 1 ¼ 5, xðt 0 Þ ¼ @ 0 A, xðt 1 Þ ¼ @ 1 A,u 2 R3 : > 1 > : x_ 3 ¼ x1 þ u2 u3 2 2 2 8 0 1 0 1 8. > 0 1 < x_ 1 ¼ x2 u3 B C B C x_ 2 ¼ x3 þ u1 þ u2 , t 0 ¼ 0,t 1 ¼ 2, xðt 0 Þ ¼ @ 0 A, xðt 1 Þ ¼ @ 0 A,u 2 R3 : > : 1 2 x_ ¼ 3u1 8 3 0 1 0 1 1 9. > 1 1 > < x_ 1 ¼ x2 2 u1 B C B C 2 x_ 2 ¼ x3 þ u1 þ u2 , t 0 ¼ 2, t 1 ¼ 3,xðt 0 Þ ¼ @ 2 A,xðt 1 Þ ¼ @ 0 A, u 2 R : > > : 0 0 x_ ¼ 2u1 u2 8 3 0 1 0 1 10. > 0 1 < x_ 1 ¼ x1 x3 u1 B C B C , t 0 ¼ 0, t 1 ¼ 5,xðt 0 Þ ¼ @ 0 A, xðt 1 Þ ¼ @ 0 A, u 2 R2 : x_ 2 ¼ u1 þ u2 > : 0 7 x_ 3 ¼ x1 u2
3.
Total Controllability Verify the total controllability of the given system. 8 1. > < x_ 1 ¼ x1 þ u1 þ u3 ; x_ 2 ¼ x3 þ u3 > : x_ 3 ¼ u1 þ u2
8 2. > < x_ 1 ¼ x2 þ x3 u1 þ u2 x_ 2 ¼ x1 þ 2x2 þ 2u2 ; > : x_ 3 ¼ x3 2u1 þ u2
8 8 t 4. > 3. > 2 x_ 1 ¼ x1 þ t 2 x3 tu2 _ x x ¼ t x þ þ u > > 1 1 2 1 > < 3 > 1 > < x_ 2 ¼ ðt 3 1Þx1 3 x3 þ u1 tu2 ; t3 > t ; > x_ 2 ¼ x1 þ 2x2 : > 2 > x_ 3 ¼ t 5 x1 þ x2 þ u2 > > 1 > : x_ 3 ¼ x1 þ tx2 þ ð cos t Þu2 t 8 8 2 t 5. > x_ 1 ¼ t 2 x2 tu1 6. > < x_ 1 ¼ x2 þ ð sin t Þx3 e u2 > < x_ 2 ¼ x3 þ ð lnt Þu2 x_ 2 ¼ tx1 þ e2tþ1 x3 þ u1 t 2 u2 ; ; > > : 3 > : x_ ¼ t x þ ð cos tÞx þ ðt 1Þu x_ 3 ¼ x1 þ t 2 x2 u1 ð sin t Þu2 3 1 2 2 2 8 8 7. > x_ 1 ¼ x2 u1 þ 2u2 8. > x_ 1 ¼ x1 t2 x2 tu1 þ u3 > < < 1 x_ 2 ¼ sin ðt 3 1Þx1 ln 2 t u1 tu2 ; ; x_ 2 ¼ x2 þ u3 > 2 > : > : x_ ¼ x t2 x þ 1 u þ tg3 tu x_ 3 ¼ x1 u2 þ u3 3 1 3 2 3 2 1 8 8 2 3 t 2 9. > 10. > x_ 1 ¼ t x1 þ e t x3 u1 þ ð cos tÞu2 x_ 1 ¼ tx3 þ ðarctgt Þt u2 > < < cos t x_ 2 ¼ x1 t2 x3 þ u1 x þ tu1 u2 ; x_ 2 ¼ ln ðt3 1Þx1 ; > t 2 > : 1 > : x_ 3 ¼ tx1 x2 þ ð ln tÞu1 u2 x_ 3 ¼ x3 þ u1 þ u2 t 8 2 11.> < x_ 1 ¼ ðt þ 2t 1Þx1 x2 ð sin t Þu2 x_ 2 ¼ et ð ln tÞx1 x3 þ u1 ðctgt Þu2 : > : x_ 3 ¼ x1 tx2 þ u1 ð arcsin tÞu2
Minimum Time Problem Determine an optimal process for the minimum time problem.
1.
t1 !
2.
t1 !
3.
t1 !
4.
t1 !
5.
t1 !
6.
t1 !
x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2
¼ x2 u , ¼ x1 ¼ 9x2 u ¼ x1 ¼ x2 u
, ¼ 9x1 ¼ 3x2 , ¼ 3x1 þ u ¼ 4x2 þ u , ¼ x1 ¼ 2x2 u , ¼ 2x1
0 2 : , xð t 1 Þ ¼ juj 2, xð0Þ ¼ 0 3 0 1 : , juj 1, xð0Þ ¼ , xðt 1 Þ ¼ 0 0 0 2 : , xðt 1 Þ ¼ juj 1, xð0Þ ¼ 0 1 0 2 : , xð t 1 Þ ¼ juj 1, xð0Þ ¼ 0 1 0 1 : , xðt 1 Þ ¼ juj 1, xð0Þ ¼ 0 3 0 2 : , xðt 1 Þ ¼ juj 2, xð0Þ ¼ 0 1
7.
t1 !
8.
t1 !
9.
t1 !
10.
t1 !
11.
t1 !
12.
t1 !
13.
t1 !
14.
t1 !
15.
t1 !
x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2 x_ 1 min , x_ 2
0 : , juj 1, xð0Þ ¼ , xðt 1 Þ ¼ 0 1 ¼ 16x1 0 ¼ 2x2 u 4 : , juj 4, xð0Þ ¼ , xð t 1 Þ ¼ 0 1 ¼ 8x1 0 ¼ 5x2 5 : , juj 5, xð0Þ ¼ , xðt 1 Þ ¼ 0 ¼ 5x1 þ u 3 0 ¼ x2 1 : , juj 1, xð0Þ ¼ , xðt 1 Þ ¼ 0 ¼ 4x1 þ u 1 0 ¼ x2 þ u 3 : , juj 2, xð0Þ ¼ , xð t 1 Þ ¼ 0 3 ¼ x1 0 ¼ 25x2 4 : , juj 3, xð0Þ ¼ , xðt 1 Þ ¼ 0 ¼ x1 u 1 0 4 ¼ 2x2 u : , juj 4, xð0Þ ¼ , xð t 1 Þ ¼ 0 1 ¼ 8x1 0 ¼ ðt 1Þx2 1 : , juj 1, xð0Þ ¼ , xð t 1 Þ ¼ 0 ¼u 1 ¼ u 2 0 , 0 u 2, xð0Þ ¼ : , xð t 1 Þ ¼ ¼ 7x1 5 1
¼ x2 u
2
Hint: for problems 1–15, draw the phase portrait and then determine the structure of an optimal control for the given initial and terminal points.
Observation Problem Restore a vector x(t1) ¼ (x1(t1), x2(t1)) by using known measurements. 1. 2. 3. 4. 5. 6.
x_ 1 ¼ 5x2 , x_ 2 ¼ 2u x_ 1 ¼ u , x_ 2 ¼ 3x1 x_ 1 ¼ 2x2 , x_ 2 ¼ 3u x_ 1 ¼ 2u , x_ ¼ 5x1 2 x_ 1 ¼ 3x2 , x_ ¼ u 2 x_ 1 ¼ 2u , x_ 2 ¼ 3x1
uðt Þ ¼ 12 , x1 ðt Þ þ 2x2 ðt Þ ¼ 2t 2 t, 0 t t 1 : uðt Þ ¼ 1, 2x1 ðt Þ x2 ðt Þ ¼ 3t 2 þ 2t , 0 t t 1 : uðt Þ ¼ 3, x1 ðt Þ þ 4x2 ðt Þ ¼ t 2 þ 2t, 0 t t 1 : 2
uðt Þ ¼ 2, x2 ðt Þ ¼ t3 t, 0 t t 1 : uðt Þ ¼ 13 , x1 ðt Þ þ 2x2 ðt Þ ¼ t 2 þ t, 0 t t 1 : uðt Þ ¼ 1, 3x1 ðt Þ ¼ 2t 2 6t, 0 t t 1 :
x_ 1 x_ 2 8. x_ 1 x_ 2 9. x_ 1 x_ 2 10. x_ 1 x_ 2 11. x_ 1
7.
x_ 2 x_ 1 x_ 2 13. x_ 1 x_ 2 14. x_ 1 x_ 2 15. x_ 1 x_ 2 12.
¼ 5x2
, uðt Þ ¼ 3, 2x1 ðt Þ þ 2x2 ðt Þ ¼ t 2 2t, 0 t t 1 : ¼ 2u ¼u , uðt Þ ¼ 2, x1 ðt Þ x2 ðt Þ ¼ t 2 þ t, 0 t t 1 : ¼ 4x1 ¼ 2x2 , uðt Þ ¼ 12 , 6x1 ðt Þ þ 3x2 ðt Þ ¼ t 2 3t, 0 t t 1 : ¼ u ¼ 3u , uðt Þ ¼ 13 , x1 ðt Þ x2 ðt Þ ¼ t 2 þ 6t, 0 t t 1 : ¼ 2x1 ¼ 4x2 , uðt Þ ¼ 5, 2x1 ðt Þ þ x2 ðt Þ ¼ 4t 2 2t, 0 t t 1 : ¼ 3u ¼ u , uðt Þ ¼ 2, x1 ðt Þ ¼ 6t 2 þ 2t, 0 t t 1 : ¼ 5x1 ¼ x2 , uðt Þ ¼ 12 , 2x1 ðt Þ þ x2 ðt Þ ¼ t 2 t, 0 t t 1 : ¼ 4u ¼ 3u , uðt Þ ¼ 1, 3x2 ðt Þ ¼ 4t 2 þ 3t, 0 t t 1 : ¼ 2x1 ¼ 7x2 , uðt Þ ¼ 1, x1 ðt Þ þ x2 ðt Þ ¼ 5t 2 4t , 0 t t 1 : ¼u
Identification Problem Restore a vector w 2 Rr by known measurements. Set x1(0) ¼ x2(0) ¼ 0 in problems 1–10. 1. 2. 3. 4. 5. 6. 7.
x_ 1 ¼ 3w 2 , 4x1 ðt Þ þ 3x2 ðt Þ ¼ t6 , 0 t t 1 : x_ 2 ¼ 2x1 x_ 1 ¼ 3x2 , 2x1 ðt Þ 3x2 ðt Þ ¼ t 2 þ 3t , 0 t t 1 : x_ 2 ¼ w x_ 1 ¼ 4w 2 , x1 ðt Þ þ 3x2 ðt Þ ¼ t2 t, 0 t t 1 : x_ 2 ¼ 5x1 x_ 1 ¼ 2x2 , 12 x1 ðt Þ þ x2 ðt Þ ¼ t 2 þ 3t, 0 t t 1 : x_ 2 ¼ 3w x_ 1 ¼ x2 þ w1 , x1 ðt Þ þ x2 ðt Þ ¼ t 2 2t , 0 t t 1 : x_ 2 ¼ w2 x_ 1 ¼ w1 , x1 ðt Þ x2 ðt Þ ¼ t 2 þ t, 0 t t 1 : x_ 2 ¼ 3x1 þ w2 x_ 1 ¼ 4x2 , x1 ð t Þ þ x2 ð t Þ ¼ t 2 , 0 t t 1 : x_ 2 ¼ w
x_ 1 x_ 2 9. x_ 1 x_ 2 10. x_ 1 x_ 2
8.
¼ 2x2
2
, x2 ðt Þ ¼ t2 þ t, 0 t t 1 : ¼ w ¼ tw , x1 ðt Þ þ x2 ðt Þ ¼ 5t 2 t, 0 t t 1 : ¼ x1 ¼ 3x2 , 1 x1 ðt Þ ¼ 3t, 0 t t 1 : ¼ ðt 1Þw 2
Synthesis of Control Construct a synthesized control in the optimal control problems below for arbitrary initial conditions $x_1(0) = \xi_1$, $x_2(0) = \xi_2$.
1.
t 1 ! min ,
2.
t 1 ! min ,
3.
t 1 ! min ,
4.
t 1 ! min ,
5.
t 1 ! min ,
6.
t 1 ! min ,
7.
t 1 ! min ,
x_ 1 ¼ u , x1 ðt 1 Þ ¼ x2 ðt 1 Þ ¼ 0, juj 1: x_ ¼ x1 2 x_ 1 ¼ x2 , x1 ðt 1 Þ ¼ x2 ðt 1 Þ ¼ 0, 0 u 1: x_ 2 ¼ u x_ 1 ¼ x2 , x1 ðt 1 Þ ¼ 0, 1 u 1: x_ 2 ¼ u x_ 1 ¼ x2 , x2 ðt 1 Þ ¼ 0, 1 u 1: x_ 2 ¼ u x_ 1 ¼ u , x1 ðt 1 Þ ¼ 0, 0 u 1: x_ ¼ x1 2 x_ 1 ¼ x2 , x1 ðt 1 Þ ¼ x2 ðt 1 Þ ¼ 0, juj 1: x_ ¼ x1 þ u 2 x_ 1 ¼ x2 , x2 ðt 1 Þ þ x22 ðt 1 Þ ¼ 1, juj 1: x_ 2 ¼ x1 þ u 1
Variants of Tasks. Tests Find an optimal process. Variant 1 1)
x1 ðπ Þ ! min ,
2) R1 0
x_ 1 ¼ 9x2 u , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 2, 0 t π: x_ 2 ¼ x1
x_ 2 x dt xð21Þ ! min , 0 x_ 12 , xð0Þ ¼ 1:
3) R2
j€xjdt ! min , €x 2, xð0Þ ¼ x_ ð0Þ ¼ 0, xð2Þ ¼ 3: x_ 1 ¼ x2 þ tu 4) t1 ! min , , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 2, x1 ðt 1 Þ ¼ 15, x2 ðt1 Þ ¼ 5, juj 1: x_ 2 ¼ u 0
Variant 2 1)
2x21 ðt 1 Þ !
max ,
2) R6
x_ 1 ¼ x2 þ tu , x1 ð0Þ ¼ 2, x2 ð0Þ ¼ 5, x2 ðt 1 Þ 15 x1 ðt 1 Þ ¼ 4, juj 1: x_ 2 ¼ x1 u
x_ 2 þ 2x dt ! min , 12 x_ 0,xð0Þ ¼ 0: 0 x_ 1 ¼ x2 3) x1 ðπ Þ ! min , , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 1,0 t π: x_ 2 ¼ 4x1 þ u 4) t 1 ! min , 3 €x 1, xð0Þ ¼ 3, x_ ð0Þ ¼ x_ ðt 1 Þ ¼ 0, xðt 1 Þ ¼ 5:
Variant 3 1)
x2 ðπ Þ ! min ,
2) R1 0
3)
x_ 1 ¼ 3x2 , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 1, 0 t π: x_ 2 ¼ 3x1 þ u
€x2 dt ! min , j€xj 1, xð0Þ ¼ x_ ð0Þ, xð1Þ ¼ 11 24 :
Rπ
x sin tdt ! min , jx_ j 1, xðπ Þ ¼ 0: x_ 1 ¼ x2 tu , x1 ðt 0 Þ ¼ 0, x2 ðt 0 Þ ¼ 0, x1 ðt 1 Þ ¼ 6, t 1 t 0 ! min , x_ 2 ¼ 2u x2 ðt 1 Þ ¼ 2, 1 u 3, t 0 ¼ 1:
π
4)
Variant 4 1)
x2 ðπ Þ ! max ,
2) R1 0
3)
x_ 1 ¼ 2x2 u , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 2, 0 t π: x_ 2 ¼ 2x1
x_ 2 dt ! min , 0 x_ 12 , xð1Þ ¼ 1:
x2 ðπ Þ ! min , (1 x_ 1 ¼ x2 þ u
, x1 ð0Þ ¼ 2, x2 ð0Þ ¼ 1, juj 2, x1 ðt 1 Þ þ 2x2 ðt 1 Þ ¼ 4: x_ 2 ¼ x1 u 4) t 1 ! min , j€xj 2, xð0Þ ¼ ξ1 , x_ ð0Þ ¼ ξ2 , x_ ðt 1 Þ ¼ 0:
Variant 5 1) 2)
x1 ðt 1 Þ þ x2 ðt 1 Þ ! max,
(
t 1 ! min ,
x_ 1 ¼ x2 þ u ,x1 ð0Þ ¼ 0,x2 ð0Þ ¼ 0, juj 1,x1 ðt1 Þ 4x2 ðt1 Þ ¼ 0: x_ 2 ¼ u
x_ 1 ¼ x2 þ u2 u
, x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 1, x_ 2 ¼ x1 u2 þ u 1 5 1 5 x1 ðt 1 Þ ¼ e2 e2 þ 2, x2 ðt 1 Þ ¼ e2 þ e2 2, juj 1: 2 2 2 2 3) R6 2 x_ þ x dt ! max , 2 x_ 1, xð6Þ ¼ 0: 0 x_ 1 ¼ x2 þ 2u 4) x1 ðπ Þ ! max , ,x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 1, 0 t π: x_ 2 ¼ x1
Variant 6 1) R74π
x sin tdt ! min , 1 x_ 0, x 74 π ¼ 0: 0 2) x_ 1 ¼ 2x2 þ u x2 ðπ Þ ! min , ,x1 ð0Þ ¼ 0,x2 ð0Þ ¼ 0, juj 2, 0 t π: x_ 2 ¼ 2x1 3) t 1 ! min , j€xj 2,xð0Þ ¼ 1, x_ ð0Þ ¼ 0, xðt 1 Þ ¼ 0, x_ ðt 1 Þ ¼ 3: 4) R2 2 €x dt ! min ,€x 6, xð0Þ ¼ x_ ð0Þ ¼ 0, xð2Þ ¼ 17: 0
Variant 7
x_ 1 ¼ x2 þ u ,x1 ð0Þ ¼ 2,x2 ð0Þ ¼ 3, 1 u 2,x1 ðt 1 Þ þ x2 ðt 1 Þ ¼ 0: x_ ¼ u (2 2) x_ 1 ¼ x2 þ u2 u , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 1, t 1 ! min , x_ 2 ¼ x1 u2 þ u 1 1 1 1 1 1 x1 ðt 1 Þ ¼ e2 e2 , x2 ðt 1 Þ ¼ e2 þ e2 þ , juj 1: 2 4 4 2 4 4 3) R5 3x þ x_ 2 dt ! max , 12 x_ 1, xð5Þ ¼ 0: 0 4) x_ 1 ¼ 4x2 x2 ð2π Þ ! min , , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 1, 0 t 2π: x_ 2 ¼ x1 þ u
1)
t 1 ! min,
Variant 8 1) R2
x_ 2 2x dt xð32Þ ! min , 1 x_ 12 , xð0Þ ¼ 0: 0 3 2) x_ 1 ¼ 4x2 x2 2 π ! min , , x1 ð0Þ ¼ 0 x2 ð0Þ ¼ 0, juj 2, 0 t 32 π: x_ 2 ¼ x1 þ u 3) R3 €x2 dt ! max , €x 30, xð3Þ ¼ 0, x_ ð3Þ ¼ 0, xð0Þ ¼ 11: 0
4) t 1 ! min , 0 €x 2, xð1Þ ¼ 2, xðt 1 Þ ¼ 0, x_ ð1Þ ¼ 1, x_ ðt 1 Þ ¼ 3:
Variant 9 1)
Rπ
x cos tdt ! min , 0 x_ 1, xðπ Þ ¼ 0: 2) x_ 1 ¼ x2 , x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 1,0 t 2π: 2x2 ð2π Þ ! min , x_ 2 ¼ 9x1 þ u x_ 1 ¼ 2x2 þ tu 3) π
t1 ! min,
4)
x_ 2 ¼ u
x1 ð1Þ þ x2 ð1Þ ! max ,
,x1 ð0Þ ¼ 0,x2 ð0Þ ¼ 2,x1 ðt1 Þ ¼ 15,x2 ðt 1 Þ ¼ 5, 1 u 3:
x_ 1 ¼ x2 þ u ,x1 ð0Þ ¼ 0, x2 ð0Þ ¼ 0, juj 1,x1 ðt 1 Þ þ 2x2 ðt1 Þ 4: x_ 2 ¼ u
Variants of Tasks. Verification of a Process on Optimality Given a process of a two-point minimum time problem, determine whether it is optimal. Variant 1 ( t 1 t 0 ! min ,
x_ 1 ¼ x1 x2 þ 2u
, x1 ðt 0 Þ ¼ 1, x2 ðt 0 Þ ¼ 1, x_ 2 ¼ x1 x2 u 1 π 3 3 π 1 x1 ðt 1 Þ ¼ e2 eπ þ , x2 ðt 1 Þ ¼ e2 eπ þ , juj 1: 2 2 2 2 π π uðt Þ ¼ 0, 0 t < ; uðt Þ ¼ 1, t π: 2 2
Variant 2 ( t 1 t 0 ! min ,
x_ 1 ¼ 4x2 u
, x1 ðt 0 Þ ¼ 0, x2 ðt 0 Þ ¼ 0, x_ 2 ¼ x1 þ 2u 3 10 5 3 x1 ðt 1 Þ ¼ , x2 ðt 1 Þ ¼ þ , 1 u 2: 2 π 4 π π 8 π π uðt Þ ¼ 1, 0 t < ; uðt Þ ¼ t 2, t : 4 π 4 2
Variant 3 ( t 1 t 0 ! min ,
x_ 1 ¼ 2x1 u
, x1 ðt 0 Þ ¼ 1, x2 ðt 0 Þ ¼ 1, x_ 2 ¼ x1 þ 2x2 þ 2u 5 3 1 31 15 3 þ , x2 ð t 1 Þ ¼ e 2 e2 þ , juj 1: x1 ð t 1 Þ ¼ e2 e2 4 4 2 8 8 4 1 uðt Þ ¼ , 0 t 1, uðt Þ ¼ 1, 1 t 2: 2 Variant 4 (
x_ 1 ¼ x1 x2 þ u
5 1 , x1 ð t 0 Þ ¼ , x2 ð t 0 Þ ¼ , 2 2 x_ 2 ¼ x1 þ x2 u 1 1 1 3 x1 ðt 1 Þ ¼ e2 e2 þ 1 þ , x2 ðt 1 Þ ¼ e2 e2 1 þ , juj 1: 2 2 2 2 uðt Þ ¼ 1, 0 t 1; uðt Þ ¼ 1, 1 t 2: t 1 t 0 ! min ,
Variant 5 ( t 1 t 0 ! min ,
x_ 1 ¼ x1 x2 þ 2u
, x1 ðt 0 Þ ¼ 1 e2 , x2 ðt 0 Þ ¼ 0, x_ 2 ¼ 2x2 u 1 3 3 1 x1 ðt 1 Þ ¼ e5 e3 þ 3e , x2 ðt 1 Þ ¼ e6 e2 þ , 2 u 1: 2 2 2 2 uðt Þ ¼ 2, 1 t < 3; uðt Þ ¼ 1, 3 t 4:
Variant 6 8 < x_ ¼ x x 1 u 1 1 2 2 , x1 ðt 0 Þ ¼ 1, x2 ðt 0 Þ ¼ 0, t 1 t 0 ! min , : x_ 2 ¼ 4x1 þ x2 þ u 2 1 1 1 4 2 1 x1 ðt 1 Þ ¼ e6 e3 þ 3e þ e2 þ ,x2 ðt 1 Þ ¼ e6 þ e3 þ e2 , 3 3 2 6 3 3 3 1 u 2: uðt Þ ¼ 1, 1 t < 2; uðt Þ ¼ 1, 2 t 3: Variant 7 (
x_ 1 ¼ x1 þ x2 u
1 , x1 ðt 0 Þ ¼ 0, x2 ðt 0 Þ ¼ 1, x1 ðt 1 Þ ¼ eπ þ , 2 x_ 2 ¼ x1 þ x2 þ u
t 1 t 0 ! min ,
3 π x2 ðt 1 Þ ¼ eπ þ e2 , juj 1: 2 π 1 π uðt Þ ¼ 1, 0 t < , uðt Þ ¼ , t π: 2 2 2 Variant 8 ( t 1 t 0 ! min ,
x_ 1 ¼ x2 x_ 2 ¼ u
, x1 ðt 0 Þ ¼ 0, x2 ðt 0 Þ ¼ 0,
3 x1 ðt 1 Þ ¼ , x2 ðt 1 Þ ¼ 1, juj 1: 2 uðt Þ ¼ 1, 0 t < 1; uðt Þ ¼ 1, 1 t < 2; uðt Þ ¼ 1, 2 t 3: Variant 9 ( t 1 t 0 ! min ,
x_ 1 ¼ x2
, x1 ðt 0 Þ ¼ 0, x2 ðt 0 Þ ¼ 0, x1 ðt 1 Þ ¼ 0, x2 ðt 1 Þ ¼ 1, juj 2: x_ 2 ¼ u uðt Þ ¼ 1, 0 t < 1; uðt Þ ¼ 1, 1 t 2:
A.4.2. Examples of a Solution Point-to-Point Controllability 8 > < x_ 1 ¼ x2 Verify the point-to-point controllability of the system x_ 2 ¼ u from the > : _ x ¼ x þ u 3 1 0 1 0 1 0 2 B C B C position x0 ¼ @ 0 A, t 0 ¼ 0 to position x1 ¼ @ 1 A, t 1 ¼ 1 for u 2 R. If this 1 0 model is point-to-point controllable, determine a control with a minimal norm that steers the system from a given position to another. Solution We use the Kalman Theorem 4.2. By condition 0
\[
n = 3, \qquad
A(t) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
B(t) = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}.
\]

The fundamental matrix is

\[
F(t, \tau) = \begin{pmatrix} 1 & t - \tau & 0 \\ 0 & 1 & 0 \\ t - \tau & \dfrac{(t - \tau)^2}{2} & 1 \end{pmatrix}.
\]

According to the Kalman theorem, a linear system is point-to-point controllable if and only if the system of linear algebraic equations

\[
W(t_0, t_1)\, z = x_1 - F(t_1, t_0)\, x_0
\]

with the matrix of coefficients

\[
W(t_0, t_1) = \int_{t_0}^{t_1} F(t_1, t) B(t) B(t)' F(t_1, t)' \, dt
\]

has a solution. Here
0
1 1 B 0 1 F ð1, 0Þ ¼ B @ 1 1 2
0
1
1 0 B3 B C 1 0 C, W ð0, 1Þ ¼ B B A B2 @ 1 5 8
1 5 0 1 8 C 2 C B C 7 C C, x1 F ð1, 0Þx0 ¼ @ 1 A: 1 6 C A 1 7 83 6 60 1 2
The linear system 1 5 0 1 0 1 8 C 2 C z1 C 7 CB C B C@ z 2 A ¼ @ 1 A 1 6 C A z3 1 7 83 6 60 0 1 0 1 z1 312 B C B C has a unique solution z ¼ @ z2 A ¼ @ 613 A and, accordingly, the original 660 z3 system is point-to-point controllable. We determine a control with a minimal norm. By (4.15) we have 0
1 B3 B B1 B B2 @ 5 8
1 2
0
10 1 1t 312 2C B ð1 t Þ CB 613 C uz ðt Þ ¼ Bðt Þ0 F ðt 1 , t Þ0 z ¼ ð0 1 1ÞB A @1 t 1 A@ 2 660 0 0 1 2 ¼ 330ð1 t Þ 312ð1 t Þ þ 47: 1
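The entries of $W(0, 1)$ in this example can be checked independently. The following Python sketch is an illustration only and is not part of the original solution. It assumes the data of this example, $B = (0, 1, 1)'$ and the fundamental matrix $F(t, \tau)$ given above, so that $F(1, t)B = (s, 1, s^2/2 + 1)'$ with $s = 1 - t$, and it integrates the products exactly in rational arithmetic. The helper names `poly_mul`, `integrate_01` and `det3` are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_01(p):
    """Exact integral over [0, 1] of a polynomial given by its coefficients."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Components of v(s) = F(1, t) B with s = 1 - t (assumed data of this example):
# v = (s, 1, s^2/2 + 1), stored as coefficient lists in s.
v = [
    [Fraction(0), Fraction(1)],
    [Fraction(1)],
    [Fraction(1), Fraction(0), Fraction(1, 2)],
]

# W(0, 1) is the integral over [0, 1] of v(s) v(s)'.
W = [[integrate_01(poly_mul(v[i], v[j])) for j in range(3)] for i in range(3)]
det_W = det3(W)

for row in W:
    print([str(x) for x in row])
print("det W =", det_W)
```

A nonzero determinant means that the linear system $W(0, 1) z = r$ has a unique solution for every right-hand side $r$, which is the solvability required by the Kalman theorem.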
0
Non-stationary System. Rank Criterion

Verify the total controllability of the system

\[
\begin{cases}
\dot x_1 = x_2, \\
\dot x_2 = (t - 1) u, \\
\dot x_3 = x_1,
\end{cases}
\qquad u \in R,
\]

on the time segment $[0, 1]$.

Solution Here

\[
n = 3, \qquad
A(t) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
B(t) = \begin{pmatrix} 0 \\ t - 1 \\ 0 \end{pmatrix}.
\]

A fundamental matrix has the form

\[
F(t, \tau) = \begin{pmatrix} 1 & t - \tau & 0 \\ 0 & 1 & 0 \\ t - \tau & \dfrac{(t - \tau)^2}{2} & 1 \end{pmatrix}.
\]

We use criterion (4.16)
according to which the condition rank W(t0, t1) ¼ n (in this case rank W(0, 1) ¼ 3) is necessary and sufficient for the total controllability of the system on the segment [0,1]. Since the determinant of the matrix 0
8 B 15 B B 1 W ð0, 1Þ ¼ B B 4 @ 1 3
1 4 1 3 1 10
1 1 3 C C 1 C C 10 C A 173 140
is different from zero (check it!), the criterion of the total controllability holds, and the system is totally controllable on [0,1].
Non-stationary System. Krasovskii's Theorem

Verify the total controllability of the system

\[
\begin{cases}
\dot x_1 = t x_3 - t^2 u_1, \\
\dot x_2 = x_1 - t^2 x_3 + u_2, \\
\dot x_3 = t x_1 - \dfrac{1}{t} x_2 + (\ln t)\, u_1 - u_2,
\end{cases}
\qquad u \in R^2,
\]

on the time segment $\left[\tfrac{1}{2}, \tfrac{3}{2}\right]$.

Solution We use a sufficient condition for the total controllability (Krasovskii Theorem 4.3). Here

\[
n = 3, \qquad
A(t) = \begin{pmatrix} 0 & 0 & t \\ 1 & 0 & -t^2 \\ t & -\dfrac{1}{t} & 0 \end{pmatrix}, \qquad
B(t) = \begin{pmatrix} -t^2 & 0 \\ 0 & 1 \\ \ln t & -1 \end{pmatrix}.
\]

Construct the matrix $K(t) = (K_0(t), K_1(t), \dots, K_{n-1}(t))$ using the recursion

\[
K_{m+1}(t) = -A(t) K_m(t) + \dot K_m(t), \quad m = 0, \dots, n - 2, \qquad K_0(t) = B(t).
\]

We get

\[
K(t) = (K_0(t), K_1(t), \dots) =
\begin{pmatrix}
-t^2 & 0 & -2t - t \ln t & t & \dots \\
0 & 1 & t^2 (1 + \ln t) & -t^2 & \dots \\
\ln t & -1 & \dfrac{1}{t} + t^3 & \dfrac{1}{t} & \dots
\end{pmatrix}.
\]

To solve the problem, it is enough to take the matrices $K_0(t)$ and $K_1(t)$. If we pick $t = \tau = 1 \in \left[\tfrac{1}{2}, \tfrac{3}{2}\right]$, then the rank of the matrix

\[
K(1) =
\begin{pmatrix}
-1 & 0 & -2 & 1 & \dots \\
0 & 1 & 1 & -1 & \dots \\
0 & -1 & 2 & 1 & \dots
\end{pmatrix}
\]

is equal to 3. Consequently, the system is totally controllable.
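The first two blocks of $K(t)$ and the rank at $t = 1$ can be reproduced numerically. The Python sketch below is illustrative only. It assumes the matrices $A(t)$ and $B(t)$ of this example and the recursion $K_1 = -A K_0 + \dot K_0$, with the derivative of $B(t)$ written out by hand. The function names (`K0_K1`, `rank`, `dB`) are hypothetical.

```python
import math

def A(t):
    return [[0.0, 0.0, t],
            [1.0, 0.0, -t * t],
            [t, -1.0 / t, 0.0]]

def B(t):
    return [[-t * t, 0.0],
            [0.0, 1.0],
            [math.log(t), -1.0]]

def dB(t):
    """Entrywise time derivative of B(t), written out by hand."""
    return [[-2.0 * t, 0.0],
            [0.0, 0.0],
            [1.0 / t, 0.0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def K0_K1(t):
    """Return the 3x4 block (K0(t), K1(t)) with K1 = -A K0 + dK0/dt."""
    K0 = B(t)
    AK0 = matmul(A(t), K0)
    dK0 = dB(t)
    K1 = [[-AK0[i][j] + dK0[i][j] for j in range(2)] for i in range(3)]
    return [K0[i] + K1[i] for i in range(3)]

def rank(M, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        if r == len(M):
            break
        p = max(range(r, len(M)), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) < tol:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

K_at_1 = K0_K1(1.0)
rank_at_1 = rank(K_at_1)
print(K_at_1)
print("rank:", rank_at_1)
```

Since the rank already equals $n = 3$ for the blocks $K_0$ and $K_1$, the block $K_2$ does not need to be computed.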
Stationary System. Total Controllability

Verify the total controllability of the system

\[
\begin{cases}
\dot x_1 = x_1 + u_1 + u_2, \\
\dot x_2 = x_3 + u_2, \\
\dot x_3 = u_1,
\end{cases}
\qquad u \in R^2.
\]

Solution The system of equations has constant coefficients (it is stationary), so to verify the total controllability, we can apply Kalman Theorem 4.4. Here

\[
n = 3, \qquad
A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]

Verify the condition (4.24): $\operatorname{rank}(B, AB, \dots, A^{n-1}B) = n$. We have

\[
\operatorname{rank}(B, AB, A^2 B) = \operatorname{rank}
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} = 3.
\]
Consequently, the original system is totally controllable.
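The rank condition can be confirmed with a few lines of exact arithmetic. The Python sketch below is illustrative only. It assumes the coefficients of the system are read as $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}$ (in particular, the coefficient of $x_1$ in the first equation is assumed to be $+1$). The helper names `controllability_matrix` and `rank` are hypothetical.

```python
from fractions import Fraction

# Assumed data of the example.
A = [[1, 0, 0],
     [0, 0, 1],
     [0, 0, 0]]
B = [[1, 1],
     [0, 1],
     [1, 0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def controllability_matrix(A, B):
    """Build (B, AB, ..., A^(n-1) B) by stacking the blocks side by side."""
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(matmul(A, blocks[-1]))
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

C = controllability_matrix(A, B)
rank_C = rank(C)
print(C)
print("rank:", rank_C)
```

Exact rational arithmetic is used here so that the rank decision does not depend on a floating-point tolerance.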
Minimum Time Problem

Solve the minimum time problem

\[
t_1 \to \min, \qquad
\begin{cases} \dot x_1 = x_2, \\ \dot x_2 = 2u, \end{cases}
\qquad |u| \le 1, \qquad
x(0) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad
x(t_1) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\]

on the phase plane $x = (x_1, x_2)'$.

Solution The conditions of the problem satisfy Theorem 5.5 about $n$ intervals. Indeed, here $n = 2$, $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$ and

1. $\operatorname{rank}(Bw, ABw) = \operatorname{rank} \begin{pmatrix} 0 & 2w \\ 2w & 0 \end{pmatrix} = 2$ for $w \ne 0$;
2. the polyhedron $U$ is given by the inequality $|u| \le 1$;
3. the eigenvalues of the matrix $A$ are real ($\lambda_1 = 0$, $\lambda_2 = 0$).

So, an extreme control has no more than one switching point (two intervals of constancy). Consequently, there are four possible options:

1) $u(t) = 1$, $0 \le t \le t_1$;
2) $u(t) = -1$, $0 \le t \le t_1$;
3) $u(t) = 1$ for $0 \le t < \tau$, $u(t) = -1$ for $\tau \le t \le t_1$;
4) $u(t) = -1$ for $0 \le t < \tau$, $u(t) = 1$ for $\tau \le t \le t_1$.

An analysis of these options shows that the optimal control follows the third case. The optimal trajectory is the solution of the Cauchy problem with control 3) substituted into the system of differential equations. If $u(t) = 1$, $0 \le t < \tau$, then the system of differential equations with the given initial conditions has the solution $x_1(t) = t^2$, $x_2(t) = 2t$. When $t = \tau$, we obtain $x_1(\tau) = \tau^2$, $x_2(\tau) = 2\tau$. On the segment $\tau \le t \le t_1$, by condition 3), the control is $u(t) = -1$. The general solution of the differential equations has the form $x_1(t) = -t^2 + c_2 t + c_1$, $x_2(t) = -2t + c_2$, where $c_1$, $c_2$ are the constants of integration. Continuity of the trajectory at $t = \tau$ gives $c_2 = 4\tau$, $c_1 = -2\tau^2$. The condition that the trajectory passes through the point $x_1(t_1) = 1$, $x_2(t_1) = 0$ at the moment $t = t_1$ then gives $t_1 = 2\tau$ and $2\tau^2 = 1$, so we get $t_1 = \sqrt{2}$ and $\tau = \dfrac{\sqrt{2}}{2}$.
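The switching time and the terminal time found above can be checked by direct simulation. The following Python sketch is illustrative only. It assumes the data of this example (start at the origin, target $(1, 0)$, control $+1$ followed by $-1$) and uses an update that is exact while $u$ is constant over a step. The function name `simulate` and the step count are hypothetical choices.

```python
import math

def simulate(tau, t1, steps_per_arc=1000):
    """Integrate x1' = x2, x2' = 2u from x(0) = (0, 0)
    with u = +1 on [0, tau) and u = -1 on [tau, t1]."""
    x1, x2 = 0.0, 0.0
    for u, length in ((1.0, tau), (-1.0, t1 - tau)):
        h = length / steps_per_arc
        for _ in range(steps_per_arc):
            # With u constant over the step this update is exact:
            # x1(t+h) = x1 + x2*h + (2u)*h^2/2, x2(t+h) = x2 + 2u*h.
            x1 += x2 * h + u * h * h
            x2 += 2.0 * u * h
    return x1, x2

tau = math.sqrt(2) / 2   # switching time from the solution
t1 = math.sqrt(2)        # terminal time from the solution
x1_end, x2_end = simulate(tau, t1)
print(x1_end, x2_end)
```

Up to rounding, the simulated trajectory ends at the target point $(1, 0)$, which agrees with $t_1 = \sqrt{2}$ and $\tau = \sqrt{2}/2$.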
S-Problem

Find an optimal process in the simplest problem of optimal control

\[
x_2(\pi) \to \min, \qquad
\begin{cases} \dot x_1 = u, \\ \dot x_2 = -\dfrac{x_1}{2} \sin t, \end{cases}
\qquad x_1(0) = 0, \; x_2(0) = 0, \quad 0 \le u \le 1, \quad 0 \le t \le \pi.
\]

Solution In this case, the problem is linearly-convex, so the maximum principle is a necessary and sufficient condition of optimality. We write the Hamiltonian

\[
H(\psi, x, u, t) = \psi_1 u - \psi_2 \frac{x_1}{2} \sin t
\]

and the conjugate Cauchy problem

\[
\dot\psi_1 = \frac{1}{2} \psi_2 \sin t, \quad \dot\psi_2 = 0, \qquad \psi_1(\pi) = 0, \quad \psi_2(\pi) = -1.
\]

From here, we get

\[
\psi_1(t) = \frac{1}{2} \cos t + \frac{1}{2}, \qquad \psi_2(t) = -1.
\]

According to the maximum principle, an optimal control satisfies the condition

\[
u(t) = \arg\max_{0 \le u \le 1} \left( \frac{1}{2} \cos t + \frac{1}{2} \right) u = 1, \qquad 0 \le t \le \pi.
\]

Substituting the optimal control into the original differential equations and integrating them, we obtain the optimal trajectory

\[
x_1(t) = t, \qquad x_2(t) = \frac{t}{2} \cos t - \frac{1}{2} \sin t, \qquad 0 \le t \le \pi.
\]
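As a numerical sanity check, the terminal value $x_2(\pi)$ under $u \equiv 1$ can be compared with a few other admissible controls. The Python sketch below is illustrative only. It assumes the dynamics $\dot x_1 = u$, $\dot x_2 = -\tfrac{1}{2} x_1 \sin t$ of this example and integrates them with a classical fourth-order Runge-Kutta scheme. The comparison controls and the name `terminal_x2` are hypothetical choices, and a comparison with a handful of controls is of course not a proof of optimality.

```python
import math

def terminal_x2(u, steps=2000):
    """RK4 integration of x1' = u(t), x2' = -(x1/2) sin t on [0, pi]; returns x2(pi)."""
    def f(t, x1, x2):
        return u(t), -0.5 * x1 * math.sin(t)

    h = math.pi / steps
    t, x1, x2 = 0.0, 0.0, 0.0
    for _ in range(steps):
        k1 = f(t, x1, x2)
        k2 = f(t + h / 2, x1 + h / 2 * k1[0], x2 + h / 2 * k1[1])
        k3 = f(t + h / 2, x1 + h / 2 * k2[0], x2 + h / 2 * k2[1])
        k4 = f(t + h, x1 + h * k3[0], x2 + h * k3[1])
        x1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        x2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return x2

# A few admissible controls with values in [0, 1].
candidates = {
    "u = 1 (stated optimum)": lambda t: 1.0,
    "u = 1/2": lambda t: 0.5,
    "u = 0": lambda t: 0.0,
    "u = 1 on [0, pi/2), then 0": lambda t: 1.0 if t < math.pi / 2 else 0.0,
}
results = {name: terminal_x2(u) for name, u in candidates.items()}
for name, value in results.items():
    print(f"{name}: x2(pi) = {value:.6f}")
```

For $u \equiv 1$ the closed-form trajectory gives $x_2(\pi) = \tfrac{\pi}{2} \cos \pi - \tfrac{1}{2} \sin \pi = -\tfrac{\pi}{2}$, which the integration should reproduce.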
G-Problem

Determine an optimal process in the general problem

\[
t_1 \to \min, \qquad
\begin{cases} \dot x_1 = x_2, \\ \dot x_2 = u, \end{cases}
\qquad x_1(0) = \xi_1, \; x_2(0) = \xi_2, \quad x_2(t_1) = 0, \quad |u| \le 1, \quad t_1 \ge 0,
\]

where $\xi = (\xi_1, \xi_2)$ is a parameter vector.

Solution We construct the Lagrange function and the Hamiltonian

\[
L = \lambda_0 t_1 + \lambda_1 x_2(t_1), \qquad H = \psi_1 x_2 + \psi_2 u
\]

and write the conjugate system of differential equations and the transversality conditions

\[
\dot\psi_1 = 0, \quad \dot\psi_2 = -\psi_1, \qquad \psi_1(t_1) = 0, \quad \psi_2(t_1) = -\lambda_1, \qquad \lambda_0 + \lambda_1 \dot x_2(t_1) = 0
\]

($\lambda_0$, $\lambda_1$, $t_1$ depend on $\xi$). From here, we obtain

\[
\psi_1(t) = 0, \qquad \psi_2(t) = -\lambda_1, \qquad t \in [0, t_1].
\]

The condition of the maximum of the function $H$ with respect to $u$ gives the constant control $u = \operatorname{sign} \psi_2(t) = -\operatorname{sign} \lambda_1$, $t \in [0, t_1]$. Then the last transversality condition has the form

\[
\lambda_0 + \lambda_1 \dot x_2(t_1) = \lambda_0 + \lambda_1 u = \lambda_0 - \lambda_1 \operatorname{sign} \lambda_1 = \lambda_0 - |\lambda_1| = 0.
\]

If $\lambda_0 = 0$, then the above equality gives $\lambda_1 = 0$, which leads to the triviality of the Lagrange multipliers and contradicts the maximum principle. Therefore, without loss of generality, we set $\lambda_0 = |\lambda_1| = 1$. As a result, $u = -\operatorname{sign} \lambda_1$, $t \in [0, t_1]$. This is a constant extreme control that takes the value $+1$ or $-1$. The control $u = +1$ generates the trajectory

\[
x_1(t) = \frac{t^2}{2} + \xi_2 t + \xi_1, \qquad x_2(t) = t + \xi_2,
\]

which intersects the line $x_2 = 0$ at the moment $t_1 = -\xi_2$ for $\xi_2 < 0$. Analogously, the trajectory

\[
x_1(t) = -\frac{t^2}{2} + \xi_2 t + \xi_1, \qquad x_2(t) = -t + \xi_2,
\]

corresponding to the control $u = -1$, intersects the line $x_2 = 0$ at the moment $t_1 = \xi_2$ for $\xi_2 > 0$. The extreme control $u = -\operatorname{sign} \xi_2$ is unambiguously defined, except on the straight line $\xi_2 = 0$. On this straight line, the solution to the problem is trivial and does not depend on the control; therefore we accept $u = 1$ for $\xi_2 = 0$. Then the extremal control $u = -\operatorname{sign} \xi_2$ is unique and, therefore, optimal for all points $\xi \in R^2$. Replacing here the parameters $\xi_1$, $\xi_2$ by the phase variables $x_1$, $x_2$, we obtain the optimal synthesized control

\[
u(x_1, x_2) = -\operatorname{sign} x_2.
\]
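The synthesized control can be illustrated with a small computation. The Python sketch below is illustrative only. It assumes the feedback $u = -\operatorname{sign} x_2$ with the value $u = 1$ on the line $x_2 = 0$, as chosen in the text, and compares the resulting hitting time with that of other constant admissible controls on a grid. The function names and the sample points are hypothetical.

```python
def synthesized_control(x1, x2):
    """Feedback u(x1, x2) = -sign(x2); u = 1 is used on the line x2 = 0."""
    return -1.0 if x2 > 0 else 1.0

def hitting_time(xi2, u):
    """Time at which x2(t) = xi2 + u*t reaches 0 under a constant control u,
    or None if the line x2 = 0 is never reached."""
    if xi2 == 0:
        return 0.0
    if u == 0 or -xi2 / u < 0:
        return None
    return -xi2 / u

# Hypothetical sample initial points (xi1, xi2).
samples = [(0.0, 2.0), (1.0, -3.0), (-2.0, 0.5), (4.0, 0.0)]
report = []
for xi1, xi2 in samples:
    u_star = synthesized_control(xi1, xi2)
    t_star = hitting_time(xi2, u_star)
    # Constant controls u = -1.0, -0.9, ..., 1.0 for comparison.
    others = [hitting_time(xi2, k / 10.0) for k in range(-10, 11)]
    best_other = min(t for t in others if t is not None)
    report.append((xi1, xi2, u_star, t_star, best_other))

for row in report:
    print(row)
```

The hitting time under the feedback equals $|\xi_2|$. Since $|\dot x_2| = |u| \le 1$, no admissible control can bring $x_2$ to zero in less than $|\xi_2|$ units of time, which is consistent with the optimality claim.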
Variational Problem

Problem 1 Solve the problem of the variational calculus

\[
\int_0^4 \left( \dot x^2 + x \right) dt \to \min, \qquad |\dot x| \le 1, \quad x(0) = 0
\]

by using methods of optimal control.

Solution We introduce new variables $\dot x = u$, $x_1 = x$, transform the Lagrange functional to a Mayer functional, and reduce the given problem to the simplest linearly-convex optimal control problem

\[
x_2(4) \to \min, \qquad
\begin{cases} \dot x_1 = u, \\ \dot x_2 = x_1 + u^2, \end{cases}
\qquad x_1(0) = 0, \; x_2(0) = 0, \quad |u| \le 1, \quad 0 \le t \le 4.
\]

To solve this problem, we apply the maximum principle, which in this case is a necessary and sufficient condition of optimality. We construct the Hamiltonian

\[
H(\psi, x, u, t) = \psi_1 u + \psi_2 \left( x_1 + u^2 \right)
\]

and the conjugate Cauchy problem

\[
\dot\psi_1 = -\psi_2, \quad \dot\psi_2 = 0, \qquad \psi_1(4) = 0, \quad \psi_2(4) = -1.
\]

The solution of the last problem is $\psi_1(t) = t - 4$, $\psi_2(t) = -1$. Following the maximum principle, we determine the optimal control from the condition

\[
u(t) = \arg\max_{|u| \le 1} \left[ (t - 4) u - u^2 \right] =
\begin{cases}
-1, & 0 \le t < 2, \\
\dfrac{t}{2} - 2, & 2 \le t \le 4.
\end{cases}
\]

The corresponding optimal trajectory is obtained by integrating the appropriate differential equations, and it has the form

\[
\begin{cases} x_1(t) = -t, \\ x_2(t) = -\dfrac{t^2}{2} + t, \end{cases} \quad 0 \le t < 2;
\qquad
\begin{cases} x_1(t) = \dfrac{t^2}{4} - 2t + 1, \\ x_2(t) = \dfrac{t^3}{6} - 2t^2 + 5t - \dfrac{10}{3}, \end{cases} \quad 2 \le t \le 4.
\]

Thus, the solution of the given variational problem is the function

\[
x(t) = -t, \quad 0 \le t < 2; \qquad x(t) = \frac{t^2}{4} - 2t + 1, \quad 2 \le t \le 4.
\]
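The value of the functional on the stated solution can be checked numerically. The Python sketch below is illustrative only. It assumes the functional $\int_0^4 (\dot x^2 + x)\,dt$ with $x(0) = 0$ and $\dot x = u$, evaluates it with a midpoint rule, and compares the stated control with a few constant admissible controls. The function names and the comparison controls are hypothetical, and the comparison is a plausibility check rather than a proof.

```python
def cost(u, t_end=4.0, steps=4000):
    """Midpoint-rule value of the integral of (u^2 + x) over [0, t_end],
    where x' = u and x(0) = 0."""
    h = t_end / steps
    x = 0.0
    total = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * h
        u_mid = u(t_mid)
        x_mid = x + 0.5 * h * u_mid   # state at the middle of the step
        total += h * (u_mid ** 2 + x_mid)
        x += h * u_mid                # state at the end of the step
    return total

def u_stated(t):
    """Control obtained from the maximum principle in this example."""
    return -1.0 if t < 2.0 else t / 2.0 - 2.0

costs = {
    "stated control": cost(u_stated),
    "u = -1": cost(lambda t: -1.0),
    "u = -1/2": cost(lambda t: -0.5),
    "u = 0": cost(lambda t: 0.0),
}
for name, value in costs.items():
    print(f"{name}: J = {value:.6f}")
```

From the closed-form trajectory, the functional equals $x_2(4) = \tfrac{64}{6} - 32 + 20 - \tfrac{10}{3} = -\tfrac{14}{3}$, which the midpoint rule should reproduce up to discretization error.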
Problem 2 Solve the problem of the calculus of variations

\[
\int_0^2 |\ddot x| \, dt \to \min, \qquad |\ddot x| \le 1, \quad x(0) = 0, \quad x(2) = -1
\]

by using methods of optimal control.

Solution We put $\ddot x = u$, $x_1 = x$, $x_2 = \dot x$ and transform the integral functional into a terminal functional. We then represent the given problem as a linearly-convex general optimal control problem (G-problem)

\[
x_3(2) \to \min, \qquad
\begin{cases} \dot x_1 = x_2, \\ \dot x_2 = u, \\ \dot x_3 = |u|, \end{cases}
\qquad x_1(0) = 0, \; x_3(0) = 0, \; x_1(2) = -1, \quad |u| \le 1, \quad 0 \le t \le 2.
\]

We form the Lagrange function and the Hamiltonian

\[
L = \lambda_0 x_3(2) + \lambda_1 x_1(0) + \lambda_2 x_3(0) + \lambda_3 \left[ x_1(2) + 1 \right], \qquad
H = \psi_1 x_2 + \psi_2 u + \psi_3 |u|,
\]

and the conjugate system of differential equations

\[
\dot\psi_1 = 0, \quad \dot\psi_2 = -\psi_1, \quad \dot\psi_3 = 0
\]

with the conditions of transversality

\[
\psi_1(0) = \lambda_1, \; \psi_2(0) = 0, \; \psi_3(0) = \lambda_2; \qquad
\psi_1(2) = -\lambda_3, \; \psi_2(2) = 0, \; \psi_3(2) = -\lambda_0.
\]

We then integrate the conjugate equations and satisfy the transversality conditions to obtain

\[
\psi_1(t) = 0, \quad \psi_2(t) = 0, \quad \psi_3(t) = -\lambda_0, \qquad \lambda_1 = 0, \quad \lambda_2 = -\lambda_0, \quad \lambda_3 = 0.
\]

If $\lambda_0 = 0$, then all Lagrange multipliers are equal to zero, which contradicts the maximum principle. Therefore, without loss of generality, we assume $\lambda_0 = 1$. As a result, the Hamiltonian has the form

\[
H = -\lambda_0 |u| = -|u|.
\]

On the segment $|u| \le 1$, the function $H$ has a unique maximum point $u = 0$. Since the maximum principle is a necessary and sufficient condition of optimality for a linearly-convex G-problem, the control $u(t) = 0$ is optimal. Integrating the original differential equations and satisfying the boundary conditions, we determine the corresponding optimal trajectory

\[
x_1(t) = -\frac{t}{2}, \qquad x_2(t) = -\frac{1}{2}, \qquad x_3(t) = 0.
\]

Thus, the function $x(t) = -\dfrac{t}{2}$ is the solution of the given variational problem.
Bibliography
1. Alekseev V.M., Tihomirov V.M., Fomin S.V. Optimal control. - M.: Nauka, 1979.
2. Aschepkov L.T. Optimal control of discontinuous systems. - Novosibirsk: Nauka, 1987.
3. Bellman R. Dynamic programming. - M.: Publishing House of Foreign Literature, 1960.
4. Vasilyev F.P. Numerical methods for solving extreme problems. - M.: Nauka, 1986.
5. Velichenko V.V. On optimal control problems for equations with discontinuous right-hand sides // Automation and Remote Control. 1966. #7. p. 20–30.
6. Velichenko V.V. On variational method in the problem of invariance controlled systems // Automation and Remote Control. 1972. #4. p. 22–35.
7. Velichenko V.V. On the method of extremal field in sufficient optimality conditions // Journal of Computational Mathematics and Mathematical Physics. 1974. V.14. #1. p. 45–67.
8. Gabasov R., Kirillova F.M. Optimization of linear systems. - Minsk: Publishing House of Belarussian State University, 1973.
9. Kalman R., Falb P., Arbib M. Essays on mathematical systems theory. - M.: Mir, 1971.
10. Krasovskii N.N. The theory of motion control. - M.: Nauka, 1968.
11. Krotov V.F., Boukreev V.Z., Gurman V.I. New methods of calculus of variations in dynamics of flight. - M.: Mashinostroenie, 1969.
12. Pontryagin L.S. Ordinary differential equations. - M.: Nauka, 1965.
13. Pontryagin L.S., Boltyanskii V.G., Gamkrelidze R.V., Mishchenko E.F. Mathematical theory of optimal processes. - M.: Fizmatgiz, 1961.
14. Rozonoer L.I. The principle of maximum of Pontryagin in the theory of optimal systems. I-III // Automation and Remote Control. 1959. V.20. #10. p. 1320–1344; #11. p. 1441–1458; #12. p. 1561–1578.
15. Tyatyushkin A.I. Numerical methods and software for optimization of controlled systems. - Novosibirsk: Nauka, 1992.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 L. T. Ashchepkov et al., Optimal Control, https://doi.org/10.1007/978-3-030-91029-7