354 61 20MB
English Pages 245 Year 1995
Optimal Control and the Calculus of Variations
Optimal Control and the Calculus of Variations ENID R. PINCH Department of Mathematics University of Manchester
OXFORD UNIVERSITY PRESS
This book has been printed digitally and produced in a standard spedftcation in order to ensure its continuing availability
OXFORD UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP Oxford University Press is a department of the University of Oxford. It fmthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford NewYork Auckland Bangkok Buenos Aires Cape Town Chennai Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Sao Paulo Shanghai Singapore Taipei Tokyo Toronto Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York © E. R. Pinch 1993
The moral rights of the author have been asserted Database right Oxford University Press (maker) Reprinted 2002 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer ISBN 0-19-851489·1
Preface
The aim of this book is to introduce the theory of optimal control to readers who have a serious interest in its mathematical aspects. It assumes a background in analysis, algebra, and mathematical methods similar to that acquired by a diligent British undergraduate after two years of study. A list of books that cover this material is given in the bibliography. The central result of optimal control theory is the maximum principle of Pontryagin. A number of authors prove it rigorously on the assumption that the reader has a strong background in measure theory and functional analysis. A disadvantage of this approach is that it makes the proof inaccessible to many applied mathematicians and engineers who need to use the result to solve problems. What is required is a rigorous but elementary proof of the maximum principle and this is presented in Chapter 16. The author has taught optimal control theory to third-year mathematics undergraduates for a number of years and has tried several different approaches. The experience has made it clear that students understand the concepts of optimal control theory and appreciate the power and elegance of the maximum principle much more readily if they meet it after studying the optimization of functions and the methods of the calculus of variations. This is the approach adopted in this book; we start by looking at the problem of finding the global minimum of a function of one variable and build up from there. The text contains many worked examples and also exercises for the student. It is essential to work through these exercises; it is only while applying a general theory to specific examples that a proper understanding of the theory starts to emerge. The process can be painful but it is essential. The answers to the exercises are given in a section at the end of the book together with hints intended to help students with the more difficult problems. There are a number of people here at Manchester whose help has been invaluable. Miss Francesca Entwistle typed the manuscript with great skill and patience. Dr Paul Martin read the typescript with eagle-eyed efficiency. The author is grateful to Dr Grant Walker for helpful discussions when the section on convex sets was
Vl
PREFACE
being written. Above all, thanks are due to Dr Graham Little for his generous help with the most difficult section of the proof of the maximum principle. The diagrams were drawn by the author using the Ghost80 package on the Amdahl 5890-300E at Manchester. The experience was an enlightening one. Using a good graphics package to trace out optimal paths and switching curves is an activity that the author can recommend; it is not always pleasant but it affords insights not otherwise attainable.
Manchester June 1992
E.R.P.
Contents
1
1
Introduction The maxima and minima of functions The calculus of variations Optimal control
2
Optimization in ~n Functions of one variable Critical points, end-points, and points of discontinuity Functions of several variables Minimization with constraints A geometrical interpretation Distinguishing maxima from minima
13
3
The calculus of variations The fixed end-point problem Problems in which the end-points are not fixed Finding minimizing curves Isoperimetric problems Sufficiency conditions Fields of extremals Hilbert's invariant integral Semi-fields and the Jacobi condition
33
4
Optimal control 1: Theory Introduction Control of a simple first-order system Systems governed by ordinary differential equations The optimal control problem The Pontryagin maximum principle Optimal control to target curves
70
5
Optimal control II: Applications Introduction Time-optimal control of linear systems Optimal control to target curves Singular controls
1 4 9
13 16 18 22 25 28
33 41 46 54 58 59 62 66
70 70 72 74 80 100
103
103 103 139 149
CONTENTS
Vlll
6
Fuel-optimal control Problems where the cost depends on x(t 1 ) Linear systems with quadratic cost The steady-state Riccati equation The calculus of variations revisited
151 159 163 168 170
Proof of the maximum principle of Pontryagin Convex sets in ~" The linearized state equations The behaviour of H on an optimal path Sufficiency conditions for optimal control
175
176 182 184 206
Appendix: Answers and hints for the exercises
208
Bibliography
231
Index
233
1
Introduction
The shortest distance between two points in a plane is obtained by drawing the straight line that joins them. The circle is the shape that encloses maximum area for a given length of perimeter. These well-known facts were known to the Ancient Greeks and are the oldest known solutions in the class of problems discussed in what we now call optimization theory. But, although the Greeks' geometrical insight gave them the answers to a few problems in this field, it was not until the eighteenth century that a systematic theory began to emerge. That the theory of optimization continues to be an area of active research for mathematicians is an indication both of the inherent beauty of the subject and of its relevance to modern developments in science, industry and commerce. Our aim is to understand the theory of optimal control, an area developed since the 1950s in response to American and Russian efforts to explore the solar system. The mathematical problems of space navigation include optimization problems. One wishes to construct trajectories along which a space vehicle, controlled by a small rocket motor, will reach its destination in minimum time or using the minimum amount of fuel. These new problems were not soluble by the methods that were currently available and the theory, whose roots lay back in the eighteenth century, had to be extended to meet the new challenge. 1t,he theory of optimal control is best appreciated when it is placed in its historical context and seen as the latest chapter of a story that starts in the early eighteenth century. The maxima and minima of functions Let f(x) be a function of the scalar variable x and suppose it is defined and continuously differentiable for all x. Every student of calculus knows that to find its minimum value we must first find the points at which the first derivative f'(x) is zero. Suppose f'(a) = 0 and let e be any small positive number. Then if f'(a- e)< 0 and f'(a + e) > 0 we know that f(x) has a local minimum at x = a. Similarly if f'(a - e) > 0 and f'(a + e) < 0 there is a local maximum.
2
INTRODUCTION
This process gives us the local maxima and m1mma but the existence of a local maximum (or minimum) does not imply that f(x) takes its largest (or smallest) value at that point because an absolute or global maximum (or minimum) need not exist. Thus f(x) = x 3 - 3x + 4 has a local maximum at x = -1 but it is not true that f(x) ::::;; f( --1) for all x. This is a difficulty that is present in all types of optimization problems and, as we shall see, in more difficult problems we have to content ourselves with conditions that ensure a local optimum. Observe too that our familiar rule for local maxima and minima is valid only if f(x) is nicely behaved; for f(x) = lxl there is no point at which f'(x) = 0 but it has a global minimum at x = 0 (and at this point f'(x) does not exist). Values of x at which the derivative fails to exist or f(x) is discontinuous need to be carefully investigated because f(x) could achieve its global maximum or minimum at such a point. We discuss this further in Chapter 2. The effect of constraints
The presence of constraints alters the character of an optimization problem in a dramatic way. The presence or absence of a constraint can be crucial to the existence of an optimal solution. This effect can be seen if we consider the very simple problem of finding the maximum and minimum of the function f(x, y) = x + 2y, where x andy are two independent variables with range - oo < x, y < + oo. This problem has no solution. The function f(x, y) takes all values from - oo to + oo and has neither a maximum nor a minimum. Now suppose we constrain the values of x andy by insisting that they always satisfy the equation x 2 + 4y 2 = 8. We now have a constrained problem. Find the maximum and minimum values of f(x, y)
=---=
x
+ 2y
subject to the constraint
x2
+ 4y 2 = 8.
There is an elegant method ascribed to Lagrange (1736--1813) that will solve problems of this type and we will meet it in Chapter 2. Here we will use a simple graphical method to convince ourselves that this constrained problem is soluble even though the unconstrained problem is not. Let us plot in Figure 1.1 the contour lines f(x, y) = constant and note that, as we move from left to right across the diagram, the value of f(x, y) increases.
3
THE MAXIMA AND MINIMA OF FUNCTIONS li\y
X -~--t------"'---c;>
Figure 1.1
Constrained optimization-a graphical view
We now superimpose the constraint curve C(J onto our contour map and recall that we are seeking the maximum and minimum values that f can achieve on C(J, From the diagram it is clear that the constrained maximum must be attained at P and the constrained minimum at Q. It can be shown that Pis x = 2, y = 1 and Q is x = -2, y = -1. Thus the constrained problem is soluble. This phenomenon is universal in optimization problems; constraints are not just trivial additions to the problem that need to be checked after the solution has been found. They are an essential part of the structure of the problem and are often crucial to the very existence of a solution. Note however that we are not claiming here that all constrained optimization problems are soluble. A useful tool for finding the local maxima and minima of a function is the Taylor series representation of the function. As we shall see in later chapters, it remains useful in more complex problems. In Chapter 2 we will look in more detail at the optimization of functions before we move on to discuss the calculus of variations (Chapter 3) and optimal control (Chapters 4-6). Here we simply
4
INTRODUCTION
note that the ideas met so far (existence, local and global optima, the smoothness (or otherwise) of the function to be optimized, the effect of constraints and the use of the Taylor expansion) are very important in optimization problems. Optimal control also involves some ingredients that are absent from the theory of optimization of functions. These we meet in the calculus of variations.
The calculus of variations The calculus of variations is the name given to the theory of the optimization of integrals. The name itself dates from the mideighteenth century and describes the method used to derive the theory. An early problem involving the minimization of an integral was posed by John Bernoulli (1667-1748) in 1696. It involves a bead sliding under gravity along a smooth wire joining two fixed points A and B (not in the same vertical line) and asks what shape the wire should be in order that the bead, when released from rest at A, should slide to B in minimum time. It is called the brachistochrone problem (from the Greek: brachist = shortest, chronos =time). Figure 1.2 shows the eoordinate system. Without loss of generality we can take A to be the origin. Let B have coordinates x = a, y = b and let y = y(x) be the equation of the arc of the wire joining A to B.
A
X
-·----------------..;>
g
y
Figure 1.2
The brachistochrone problem
THE CALCULUS OF VARIATIONS
5
We are required to minimize
I
Bds
A
U
where s is arc length along the wire and v is the speed of the bead. Now in this problem the total energy is conserved so v = (2gy) 112 • Also we can write ds = (1 + y' 2 ) 112 dx, where y' denotes the first derivative of y with respect to x. We then have to minimize
_ 1 fa (1 +
J[y]-~
(2g)
0
y
y'2)1/2
dx
(1.1)
with y(O) = 0, y(a) = b. Note that each choice of the function y(x) on [0, a] leads to a specific numerical value for the time taken. The integral J[y] acts on a set of functions to produce a corresponding set of numbers. It is useful to compare this to the action of a function f(x) defined on some interval I; this acts on the points in I to produce a corresponding set of numbers. So J[y] and f(x) play a similar role; they both act on a set of mathematical objects to produce a corresponding set of numbers. To mark this similarity, integrals like J[y] are called functionals. Note carefully that the set of objects that functionals operate on is much more complicated than the simple set of points on which f(x) operates. We see this as soon as we start to think about how to minimize the functional defined in (1.1). What exactly do we mean by the phrase 'the set offunctions defined on [0, a] withy(O) = 0, y(a) = b'? Do we mean all such functions, however badly behaved they may be? Clearly we must exclude functions that are discontinuous on [0, a] (there must be a continuous path for the bead to traverse), and also functions for which J[y] fails to exist. The set of functions that remains still contains functions too ill-behaved for our purposes; to derive the fundamental result we need such operations as differentiation and integration by parts. We must therefore insist upon a certain degree of smoothness for our candidate functions y(x). But we must be aware of the consequences of our decision to exclude some types of function; we may fail to find a minimizing function because we rejected it at the start. For example, a sensible move would be to consider only functions that have continuous first derivatives (functions of class C1 ). We can derive a first necessary condition on this assumption, but consider
6
INTRODUCTION
'------~---+----+-~---+-------3>
2
3
X
Figure 1.3 A minimizing curve with a corner the following minimization problem: J[y] =
{3
(y'z- 1)2 dx
(1.2)
with y(O) = 1, y(3) = 2. The integrand is always non-negative, so if there is a curve joining (0, 1) to (3, 2) for which J[y] is zero it must be a minimizing curve. Now y= {
X+
1,
(1.3)
5 -x,
is one path for which J[y] = 0, so it is a minimizing curve. This function does not have a continuous derivative on [0, 3] because there is a discontinuity (or corner) at x = 2. See Figure 1.3. The possibility that the minimizing curve might have corners will have to be borne in mind as we develop the theory. Let us now formulate the simplest possible problem that involves the minimization of a functional.
The simplest problem Find the curve y = y(x), which joins the fixed point x =a, y the fixed point x = c, y = d, that gives the functional
=
b to
THE CALCULUS OF VARIATIONS
7
its minimum value. We assume that the function f(x, y, y'), where y' represents the first derivative of y with respect to x, is differentiable with respect to each of its three variables as many times as are required. We viewy' as a variable with the same status as the dependent variable y. This may at first seem strange, but we wish to find the curve that starts at (a, b), ends at (c, d) and whose shape is such that J[y] is minimized. The shape of the curve is determined by our choice of y' at each x, so it is necessary to view y' as a variable and to be unperturbed by such operations as 8 f joy'. A complete solution to this problem would involve finding conditions that were necessary and sufficient for the curve y = y(x), say, to give J[y] a global minimum in the class of all continuous functions joining (a, b) to (c, d). This is a formidable task and we shall only be able to make progress if we start with a more modest goal. First we restrict y = y(x) to the class of functions that satisfy the end conditions and have y' andy" continuous on [a, c]. These C2 functions we will call the admissible functions. (The possibility of corners will be dealt with in due course.) This restriction means that we can only get a condition that is necessary for a minimumeven if y = y(x) minimizes in C2 there might be a curve with corners that gave J an even smaller value. This step of specifying a class of' admissible functions is essential if we are to make progress in solving problems in the calculus of variations, and we will see it again when we look at optimal control problems. Having chosen our class of admissible functions we now look within that class for a curve that gives J a local minimum. That is, el[y] is smaller than J[y] for all admissible functions y = y(x) that are close to y = y(x) in some sense. But it is not at all clear what is meant by saying that two functions are close to each other. Functional analysis clarifies the situation by setting up spaces of functions and defining precisely what is meant by the distance between two functions. The eighteenth-century mathematicians who invented the calculus of variations did not have this powerful modern theory at their disposal. Many British undergraduates are not familiar with functional analysis either, so we shall develop the calculus of variations in the spirit of Euler and Lagrange . Consider admissible functions which are of the form (1.4) y(x) = y(x) + sry(x), where s is a small constant.
8
INTRODUCTION
Since y(x) must be C2 and satisfy the end conditions y(a) = b, y(c) = d the functions r,(x) must be and satisfy r,(a) = r,(b) = 0. Functions of the form shown in (1.4) are called weak variations of the minimizing curveji(x). We can obtain a necessary condition for ji(x) to be a minimizing curve by observing that if ji(x) minimizes J then
c2
J[y
+ er,]
~
J[y]
for all weak admissible variations. The general theory is given in Chapter 3, where we tackle the problem by expressing f(x, y, y') as a Taylor series in the variables y, y' at each x and retaining only the terms that are of first order in the small quantity e. This will lead us to the so-called first necessary condition derived by Euler in 1746. To prove the basic results in the theory of optimization of functionals we examine the variation in J when we make a small variation in the curve along which it is calculated. This is how the theory was first developed by Euler (1707 -83) and Lagrange (1736-1813) and their contemporaries. It is for this reason that the subject was dubbed the calculus of variations early in its history.
Constrained problems We know that the circle is the shape that encloses the maximum area for a given length of perimeter. This is a constrained problem. The area, which can be expressed as an integral, is to be maximized while the length of the perimeter, which is also expressed as an integral, is made to take a fixed value. We have a whole class of problems of the same type where we optimize (maximize or minimize) J[y] = while l[y] =
f
f
f(x, y, y') dx
g(x, y, y') dx
=K
where K is fixed. Such problems are usually called isoperimetric problems (from the Greek: iso = equal, peri = around) thus taking their collective name from the oldest and most famous problem of this type. As we shall see in Chapter 3, the way in which we solve such problems bears an astonishing resemblance to the method named after
OPTIMAL CONTROL
9
Lagrange but invented by Euler to deal with the constrained minimization of functions. The concept of the Lagrange multiplier is exactly what we need to solve both types of problem. This elegant idea also turns out to be useful in optimal control problems. The calculus of variations approaches the problem of minimizing a functional by first defining a class of admissible variations and then examining the effect upon the value of the functional J of a small variation in the curve along which it is evaluated. This turns out to be a fruitful approach in that it enables us to find conditions that are necessary for a minimum (or maximum). But it does not solve the problem completely. If our curve y = ji(x) satisfies these necessary conditions for a minimum, then we know that no admissible curve that is close toy = .Y(X) can give J a smaller value. No y = ji(x) that fails the necessary conditions can be optimal. But even if the curve satisfies these conditions it still may not be optimal. There are a number of reasons for this. The method assumes that an optimal solution exists, so if we have a problem that does not have an optimal solution our deductions are false. We have also restricted the class of admissible variations; we must be alive to the possibility that the optimal solution lies outside this class. Thus, we usually work with admissible functions that are C2 , but we have already seen that optimal curves can have corners. Although necessary conditions for minima and maxima were known in the mid-eighteenth century, a correct account of sufficiency conditions did not emerge until much later with the work of Jacobi (1804-51), Weierstrass (1815-97) and Hilbert (1862-1943) . It is not at all surprising that sufficient conditions took so long to emerge. The need for rigour in mathematical proofs was not appreciated until the nineteenth century. It is no accident that Weierstrass was also one of the founders of analysis. What is truly remarkable is that Euler (and Lagrange slightly later) derived the correct necessary conditions for the minimization of an integral at such an early date.
Optimal control The idea of a machine that automatically carries out some preassigned task is now a commonplace one. Thermostats and timing devices control the behaviour of central heating boilers and washing machines; a car driver controls the path of the vehicle using the foot pedals and the steering wheel. Such systems can be made to change their behaviour. We have a means of controlling
10
INTRODUCTION
them. This means that the laws that govern their behaviour must contain variables whose values can be changed by someone acting outside and independently of the system itself. We decide to press on the brake rather than the accelerator and the car responds by slowing down. Thus, as well as the so-called state variables that define precisely what the system is doing at time t (for a car its position variables), we have variables called the control variables that can be used to modify the subsequent behaviour of the system. Thus changing the pressure on the accelerator pedal and turning the steering wheel are simply devices that change the magnitude and direction of the total force acting on the car and it is this which, via Newton's laws of motion, determines the subsequent path of the car. We will consider systems whose behaviour can be described by a set of ordinary differential equations. This is the easiest type of system to deal with. The reader should note though that some systems need partial differential equations or even integradifferential equations to model their behaviour adequately. We will let the n-vector x(t) denote the state variables and the m-vector u(t) denote the control variables and suppose that the behaviour of the system is described by :X= f(x, u)
where f is an n-dimensional vector function of x and u and the dot indicates differentiation with respect to the time variable t. We assume that fhas derivatives of all orders with respect to all its variables and that u(t) is reasonably well-behaved. Suppose now that our system is in some given state at t = t 0 , say x(t 0 ) = x 0 , and we wish to control it to some other giyen state at t = t 1 , say x(t 1 ) = x 1 . Each choice of u(t) in t0 ~ t ~ t 1 will give us a different solution for x(t) satisfying x(t 0 ) = x 0 • Some of these paths may pass through x = x\ others will not. Here we meet our first difficulty; it is possible that there is no control that will transfer the system from x 0 to x 1 . Consider the two-dimensional system with one control variable that is governed by the equations
and is to be controlled from x 1 = 1, x 2 = 2 at t = 0 to x 1 = x 2 = 0 at some later time. No choice of u 1 (t) can make the system go from (1, 2) to (0, 0) because the value of u 1 has no effect on the behaviour
OPTIMAL CONTROL
11
of x 1 . Any solution starting with x 1 = 1 must have x 1 = et fort 2:: 0 so x 1 can never be zero and the system is uncontrollable. Before we can discuss optimal control for a system we must establish that the system is controllable from the given initial state to the desired final state. If we can steer the system from x 0 to x 1 in a number of different ways then we have a choice as to which one to use. Some choices will mean that the system will take a very long time to reach x 1 , others may be expensive in energy consumption. We need a measure of the cost incurred during the transfer. A natural way to measure the cost is to set up an integral of some suitable function of the state and control variables so we write the cost integral as J[x]
J t1
=
{ 0 (x,
u) dt.
to
and look for the control that minimizes it. The choice of the function {0 is up to us. If we want the swiftest possible transfer we would put {0 = 1 so that minimizing J would minimize the transfer time t 1 - t 0 • On the other hand if we were concerned about the amount of energy consumed then we would choose {0 appropriately. We now seem to be back on familiar ground. We have to choose a function u(t) on t0 ~ t:::; t 1 so that a functional J is minimized and the state equations x = f(x, u) are satisfied. This looks like a problem in the calculus of variations. The only difference (apart from the vector nature of u(t), which is not a serious difficulty) is that the constraints on x and u are differential equations and not integrals as in the isoperimetric problem we discussed earlier. The calculus of variations can indeed be used on problems with differential constraints; a well-developed theory already exists. Unfortunately it cannot be used to solve the type of problem that naturally arises in control theory because it assumes that there are no constraints on the vaiues that the control variables can take. Control problems almost always have constraints on the control variables. To see this, consider how a driver controls a motor car. She uses the foot pedals and steering wheel to alter the motive force F acting on the car and it is F that controls the car. However powerful the car's engine may be the magnitude ofF is always bounded. Infinite values of IFI are impossible and the control F is subject to the constraint IFI ~ K, where K is a constant. The value of K varies from car to car but the fact that IFI is bounded remains the case. Most control problems share this feature.
12
INTRODUCTION
In Chapter 4 we shall look at a control problem using the classical approach. To do so we shall ignore the constraints on u(t). This approach is a useful one because it leads us to the theorem that we need in order to solve real control problems. What we cannot do is prove the theorem rigorously because in the calculus of variations the class of admissible functions is too narrow. The theorem was proved rigorously in 1956 by L.S. Pontryagin. The proof given here (in Chapter 6) uses analysis rather than functional analysis, but is applicable to real eontrol problems because the class of admissible functions is an appropriate one.
2
Optimization in [Rn
F11..mctions of one variable Before examining then-variable case it will be instructive to look at the familiar problem of finding the maxima and minima of a function of one variable. This is not as straightforward as one might at first think and the difficulties eneountered here are of a similar type to those found in the more technically complex theories of optimization in lffin, the calculus of variations and modern optimal control. Problem 2.1. Let f(x) be a funetion defined on some interval I of the real line. Find the points of I at which f(x) achieves its maximum and minimum values. This is the most general problem we can pose and it is insoluble. We know nothing about the behaviour of f(x) or whether I is open or closed, finite or infinite, nor can we have any grounds for assuming that f(x) has a maximum or minimum. We can, however, state a necessary and sufficient condition that x E I gives f(x) its minimum value (if such a point exists). If f(x):::;; f(x) for all x E I, with equality only for x = x, then f(x) achieves its minimum value at x. (2.1)
There is a similar condition for a maximum. In what follows we shall concentrate mainly on developing the theory for minima. Any theory of minima has to be based on condition (2.1), but the condition itself is of little use in practice. We will need more information about f(x) and I before usable conditions can be developed from it. Thus the familiar condition that f'(x) = 0 is necessary for a local minimum or maximum can only be derived by restricting Problem 2.1 to functions that are suitably well-behaved near x. Before we can go any further we need to distinguish between local and global minima. A point x for which (2.1) is satisfied gives f(x) a global or absolute minimum in that there is no other x E I that gives f(x) a value equal to or smaller than f(x). A point xis
14
OPTIMIZATION IN IR"
said to give f(x) a local minimum if 'VxENc I,
f(x) :s; f(x)
(2.2)
with equality only for x = x, where N is the ~>-neighbourhood lx - xi < s, s small. Let us now establish conditions that we can use to find minimum points for well-behaved functions.
Theorem 2.1. Let f(x) be defined on the open interval (a, b) and have continuous first and second order derivatives in some ~> neighbourhood of x E (a, b). In order that x give f(x) a local minimum it is necessary that f'(x) = 0. Proof. If f(x) is a local minimum then there is a neighbourhood in which
f(x
+ h) - f(x)
2:.
o
N,
in
with equality only for h = 0. Since f(x) has continuous first and second derivatives we can use a Taylor expansion
f(x +h)= f(x) + hf'(x) +
h2
~
2!
f"(x
+ eh),
Then we have
hf'(i)
h2
+ ' f"(i + ()h) 2.
?:. 0
1n
N,
where f"(x + ()h) is bounded. When h > 0 we can deduce that f'(i)
+ '!_ 2
f"(i
+ ()h)
?:. 0.
Ash--+ 0 we obtain f'(x) ?:. 0. When h < 0 a similar argument gives f'(i) :s; 0
so f'(i) = 0 is a necessary condition for a local minimum. It is of course also necessary for a local maximum and points at which the first derivative vanish are called the stationary points of the
function.
•
Note that we need not insist that f" is continuous; all we really need is the assurance that the second derivative is bounded in N.
15
FUNCTIONS OF ONE VARIABLE
Here we have taken the fundamental inequality (2.2) and used a Taylor expansion to deduce a useful necessary condition for a local minimum. This is an important idea and we shall be using it frequently in developing useful techniques in the field of optimization. We can think of it in the following way: the function takes its minimum value at x, so if we change x slightly to x + h, the value off should always increase. If we can expand this new value as a power series in h then, provided the coefficients are bounded, the dominant term locally will be the term in h (called the first variation in the calculus of variations). Once we have dealt with this term we may find that we can squeeze more information out by examining the term in h 2 • For instance, in Theorem 2.1 we deduce that we must have f'(x) = 0 and so
f(x
+ h) -
f(x)
=
h 2 f"(x 2
+ Oh) ;;::: o.
If f"(x) is continuous in Nwe can deduce that f"(x);;::: 0 is necessary for a minimum. But if f"(x) is discontinuous we can do no such thing. Whenever we address the problem of deducing conditions for maxima and minima we must be clear about what sort of function we are dealing with. To illustrate this let us state two familiar theorems, both of which give necessary and sufficient conditions for a local minimum but whose assumptions about the behaviour of f(x) are quite different. Theorem 2.2. Let f(x) be defined on the open interval (a, b) and continuously differentiable in some e-neighbourhood N of x E (a, b). Write the points of N in the form x = x + h, lhl 0
for all h,
where the gradient and the Hessian are evaluated at the point a.
•
The following well-known theorem from the theory of quadratic forms will be useful in applying Theorem 2.5.
Theorem 2.6. The quadratic form h T Hh is positive if and only if det Hand all the principal minors of H are positive. •
21
FUNCTIONS OF SEVERAL VARIABLES
In rR 3 this would require
a2r a2[ axr ax! ax2 azr 0 ax1> 2 , azr > 0, a2{ ax2 axl ax~ azr a2[ a2[ axr ax! ax2 ax! ax3 a2[ a2[ a2[ > 0. ax2 ax! ax~ OXz ax3 azr a2{ a2{ ax3 axl ax3 ax2 ax~ ----
~-
----
The corresponding theorem for a local maximum requires that grad f = 0 and h T Hh < 0 for all h. A quadratic form hT Hh is negative if and only if ( -1)" det H > 0 and the principal minors of H alternate in sign with o2f jaxr < 0. Example 2.1.
Minimize f(x 1 , x 2 ) =xi- x~ where K is rR 2 .
Solution.
f = ( 2x 1 ) ami :"his is zero at x 1
grad
-2x 2
=
x 2 = 0.
0 ) everywhere. Now det H = -4 and a2[jaxr = 2, so the H = (2 0 -2 quadratic form is neither positive nor negative and the origin is neither a maximum nor a minimum. In fact the surface is saddleshaped near the origin. Exercises 2.2. ing functions:
Find the local maxima and minima of the follow-
+ X 2 - 4) 3. (x 2 - xi) 2 + (x 1 - 1) 2 4. (xi - 4) 2 + x~ 5. xi- x~- 2x 1 x 2 + 6 6. xi + x~ + 3xi - 3x~ 2.
X 1 X 2 (X 1
8.
22
OPTIMIZATION IN IR"
Minimization with constraints The proof of Theorem 2.5 relies on the fact that the variables x 1 , x 2 , •• • , xn are independent. If x is constrained to lie on some surface in IR1n then, for a to give f(x) a local minimum, we require that f(a
+ eh) :?: f(a)
for all points a + eh that lie in the tangent plane at a. We can still say that h T grad f = 0 at a is necessary for a minimum but the result holds for h in the tangent plane and not for all possible n-vectors h. The neatest way out of this difficulty is to introduce what are known as Lagrange multipliers. Let us state the problem.
Problem 2.3. constraints
Minimize f(x) given that x satisfies the equality j = 1, 2, ... , m < n,
where the ci are constants. We assume that f and gi are twice-continuously differentiable with respect to X;, i = 1, 2, ... , n in some open region of IR1n. Now suppose that a satisfies the constraint equations and gives f a local minimum. Take a neighbourhood of a, all of whose points satisfy the constraint equations j
=
1, 2, ... , m.
Then for sufficiently small e we have g/a)
+ ehT grad g/a) + O(e 2 )
=
ci,
which gives us m equatiohs hT gradg/a)
=
0
(2.11)
or j
=
1, 2, ... , m.
We can in principle solve (2.11) for m of the components of h (say, h 1 , h 21 ••• , hm) in terms of the remaining n- m components (hm+ 11 ••. , hn) provided the rectangular matrix (8g)8x;) is of rank m. Equations (2.11) restrict the vectors h in that only the last n- m
MINIMIZATION WITH CONSTRAINTS
23
components can be chosen arbitrarily; once these have been chosen the remaining components are completely determined. We now examine the implications of the inequality {(a+ eh)
~
{(a)
for all h satisfying (2.11). Following the method used in proving Theorem 2.5, we use a Taylor expansion. Thus ehT grad {(a)+ O(e 2 )
~
0
and dividing by e ( > 0) we obtain hT grad {(a)= 0
(2.12)
in the limit as e ~ 0. This must hold for all h satisfying (2.11) if a is to minimize f and satisfy the constraints. Now observe that for any set of m scalars J..i hT grad f = hT grad f
+ L J..ihT gradgi j
at a for all h satisfying (2.11). Thus (2.12) is equivalent to hT(grad f
+ ~}..igradgi) = 0
(2.13)
at a for all h satisfying (2.11). Writing this in scalar form, with the first m components of h separated from the rest, gives
A judicious choice of the scalars }..i will now remove the terms involving h1o h 2 , ••• , hm. We pick J..i so that them equations i = 1, 2, ... , m
are satisfied. For this set of values of J..i equation (2.14) reduces to n (of ogj) L hi-+ LAi- =0 i=m+ 1 oxi i oxi
(2.15)
24
OPTIMIZATION IN IR"
at a for all possible choices of the independent numbers hm+ 1 , . . . , hn. So we must have (2.16)
at a, i = m + 1, ... , n. Combining (2.15) and (2.16) we see that a necessary condition for f to have a local minimum at a subject to the constraints gj(x) =~ ci, is that there exists a set of scalars A.i such that grad f
+ L A.i gradgi = 0 j
at a. This is usually written as grad(t
+ ~ A.igi) = 0
(2.17)
at a. Points at which (2.17) holds are called critical points. Thus finding the constrained minimum of f is reduced to finding the unconstrained minimum of f + L 2jgi. The scalars are called the Lagrange multipliers. Take careful note of the way in which this result was obtained. We shall use a similar argument in Chapter 3 when we look at constrained problems in the calculus of variations and again in Chapter 4 when we tackle the problem of optimal control. Before we examine the geometrical significance of condition (2.17), let us state the result we have just established as a theorem. Theorem 2.6. Let f(x) and gi(x) be defined and have continuous second derivatives in some open region of IW. Then a necessary condition that a minimize f(x) subject to the constraints gix) = ci, j = 1, 2, ... , m is that there exist m Lagrange multipliers 2 1 , ), 2 , . .. , Am such that at a.
•
The condition is of course also necessary for a maximum. Note that we have assumed that the second derivatives exist and are continuous. These are not the most stringent conditions under which Theorem 2.6 holds. But it is convenient to assume that the second derivatives are nicely behaved.
25
A GEOMETRICAL INTERPRETATION
A geometrical interpretation The vectors grad f and grad gi are, provided they are not zero, the normals to level surfaces off and gi. Let us first examine a problem in IR: 2 with one constraint g(xv x 2 ) =c. Theorem 2.6 tells us that if we want to minimize f(x 1 , x 2 ) subject to g(x 1 , x 2 ) = c, then we should look for A and a such that grad({ + Ag) = 0 at a. If we can find such a A and a we have grad
f = -A gradg at a,
so, pmvided grad f # 0, grad g # 0 and A # 0, the normal to the constraint curve at a has its normal parallel to grad f(a). Thus the level surface of f that passes through a has the same normal direction as the constraint curve at a. In IR 2 this means that the two curves touch at a. Example 2.2. xi= 0.
Minimize
f =
1 - xf - x~ subject to g = x 2
-
1
+
Solution. Introduce a Lagrange multiplier A and consider grad(1 - xi - x~ + A(x 2 - 1 + xi)) = 0. This gives
--2x 1 + 2AX 1 = 0
and
-2x 2 +A= 0.
There are three unknowns so we need another equation and this is provided by the constraint itself: x2
-
1 +xi= 0.
When we solve these three equations we find the following solutions (i)
x1
= 0, x 2 = 1, A = 2 ± 1/J2, .x2 = !. A = 1.
(ii) .x 1 =
Now sketch the constraint curve and the level curves of f. The constraint is the parabola x2 = 1- xi. The level curves of f are circles centred at 0, f decreases as we move away from the origin. As is seen in Figure 2.1 the points (0, 1) and (± 1/J2, !) are the points where curves f = constant touch the parabola of constraint. It is dear that the minimum is at x 1 = 0, x 2 = 1. Higher-dimensional problems with several constraints exhibit the same geometrical property; the tangent to the constraint set at a lies in the tangent plane to the level surface of f that passes through a. We have to be careful here because we are using the
26
OPTIMIZATION IN
~"
--(0,1)
I I
I
'
''
'
''
''
''
I
I I
I I I I
'
'
'
''
I
I
I
I I I
''
I
''
I I
''
......
Figure 2.1
''
H/~,2, 112)- -- - -
I
'
Lagrange
-
-... -
multipliers~a
geometrical interpretation
term tangent and tangent plane in a more general sense than before. When we have found a local minimum for a problem in ~" with m < n constraints we have a set of numbers Ai, j = 1, 2, ... , m such that grad f = - Li Ai grad gi at a. So the normal to the level surface of f is a particular linear combination of the normals to the constraint surfaces g/x) = 0. A vector t that is perpendicular to all the grad gi(a) is a tangent to the constraint set at a (strictly one should say t lies in the tangent space of the constraints) but since grad f = - Li },i grad gj, we see that t must also be a tangent to (lie in the tangent space of) the level surface of f at a. The following example illustrates this point. Example 2.3.
Find the local maxima and minima of
f (x) = xi + x~ + x~ subject to the constraints
g 1 (x) =xi+ x~ g 2 (x) = xi
+ x~- 5 = 0,
+ x~ + x3 -
2x 1
-
3
=
0.
27
A GEOMETRICAL INTERPRETATION
Solution. that
Consider L(x, l) = f
+ A. 1 g 1 + l 2 g 2
and find x, l such
gradL = 0. Now
+ x 32 + x 33 + A, 1 ( x 21 + x 22 + x 23 + A. 2(xi + x~ + x~- 2x 1 - 3)
L = x 31
5)
so
oL = 3xi + 2A. 1 x 1 + 2A. 2x 1 ax1 oL = 8x 2
aL
-
ax3
=
-
222
=0
3x~ + 2A. 1x 2 + 2A. 2 x 2 = 0 2
3x 3
,
,
+ 2A 1 X 3 + 2A 2 X 3 =
0
and these, together with the two equations of constraint, give five equations for the five unknowns x, l. There are four solutions all of which have x1
(i) (ii) (iii) (iv)
x 2 = 0, x 3 = x 2 = 0, x 3 = x 2 = 2, x 3 = x 2 = -2, X3
=
1,-1 1
=
-l
2, 22 = -l -2, -1 2 = 1. 0, A2 = -~. = 0, A2 = 1.
Whether any of them actually gives a minimum or a maximum we are not yet able to determine. But to illustrate the points discussed above let us examine the solution x 1 = 1, x 2 = 0, x 3 = 2. The constraint set at this point is the circle x~ + x~ = 4 in the plane x 1 = 1 (Figure 2.2). The tangent to the constraint at P is the set of points with position vectors of the form rT = (1, p., 2) for any p..
Now consider the tangent plane to
(2.18)
f =constant at P. At P
gradf~U)
28
OPTIMIZATION IN IR"
Figure 2.2 The plane x 1 = 1
and the tangent plane has equation
(x,
-I, x,, x,- 2)( 1:) ~ 0
or 3x 1
+ 12x3 = 27.
Any r of the form (2.18) lies in this plane so the tangent to the constraint set does indeed lie in the tangent space of f = constant.
Distinguishing maxima from minima For a local minimum we need to show that {(a
+ eh) ~
f(a)
for all h # 0 satisfying gi(a +h)= 0, j = 1, ... , m. We can write
DISTINGUISHING MAXIMA FROM MINIMA
29
this as L(a + eh) ~ L(a) for all h =f. 0 satisfying the constraint, where L = f + Li Aigi. Using a Taylor expansion we obtain ez ehT grad L +- hTHLh + O(e 3 ) ~ 0 at a, 2
where HL is the Hessian of L = f + Li Aigi evaluated at a. If a gives f(x) a local minimum or maximum subject to the constraints then we already know that there exist non-zero Ai such that grad L = 0 at a. So in order for a to give f(x) a minimum we require hTHLh
~
0 at a
for all h =f. 0 satisfying h T grad L = 0. This is clearly a necessary condition but it can also be shown to be sufficient provided the matrix HL and the vectors grad gi satisfy a certain condition at a. If hTHLh > 0 at a then this is sufficient for a local minimum. The problem case is when there is a non-zero h 0 satisfying h~ grad gi = 0 at a, for which h~l{h 0 = 0. Write HL(a) = A and let B be the n x m matrix whose columns are grad gi(a), that is B = (ogi(a)) OX;
i
=
1, ... , n; j
=
1, ... , m.
Then since h THLh ~ 0 for all non-zero h satisfying the constraints, h 0 is a solution of the following problem in the variables hl> h 2 , ••• , hn: minimize G(h) = h TA h subject to So, from Theorem 2.6, we have Lagrange multipliers J1 1 , Jlz, •.. , Jlm such that grad(G(h)
+ hTBp) = 0,
which gives 2Ah
and
+ Bp =
0
(2.19)
30
OPTIMIZATION IN IR"
These are n + m equations for the n we can write them in matrix form
+m
unknowns h 0 and Jl, and
(2.20) where the 0 indicates them x m zero matrix and 0 the zero (n + m) vector. But h 0 =F 0 so the matrix equation (2.20) can only hold if
B)= 0.
det(A
BT 0
If this determinant is non-zero, then there is no non-zero h that minimizes G(h) and so h TA h ;:::: 0 implies h TA h > 0, strict inequality. We have now proved an important result about our original problem. If hTHLh;:::: 0 at a for all h =F 0 satisfying the constraints and, in addition HL
det
(~gjr
axi
agj axi
=FO
at a,
(2.21)
0
then hT HLh > 0 and we have a sufficient condition for a local minimum. The matrix in (2.21) is called a bordered Hessian. A point a at which grad L = 0 and (2.21) holds is called a non-degenerate critical point of the constrained problem. We have now proved the following theorem:
Theorem 2. 7. A necessary and sufficient condition for the nondegenerate critical point a to minimize f(x) subject to the constraints gix) = 0, j = 1, 2, ... , m is that h T HL h ;:::: 0 for all non-zero tangent vectors h. The corresponding result for a local maximum is of course that hTHLh:::::; 0. • Example 2.3 (continued). Determine whether the critical points already found for this problem are non-degenerate and hence find the local minima and maxima.
DISTINGUISHING MAXIMA FROM MINIMA
31
Solution 6xt HL =
+ 2At + 2), 2
(
0
6x 2
0
0
+ 2At + 2A 2 0
2x 1 ) gradgt = ( 2x 2 2x 3
,
grad g 2 = (
2xt- 2) 2x 2
2x 3
At critical point (i), aT = (1, 0, 2), At = A2 = -~ and the determinant of the bordered Hessian is -384 =f. 0 so the critical point is non-degenerate and we can use Theorem 2.7 to test for a local maximum or minimum. The tangent to the constraint at this point is rT = (1, J-l, 2) so the tangent vectors h at aT = (1, 0, 2) are hT c.= (Q, {l, 0):
hTHLh
~ ~ 0)(~ -~ ~)(~) ~ -6~' (0
Thus critical point (i) is a local maximum. At critical point (ii), aT= (1, 0, -2), At= -~, A2 =-~and the determinant of the bordered Hessian is 384 =1- 0, so we can use Theorem 2.7. The tangent
vectors to the constraint are again of the form h T = (0, J-l, 0), giving h 1JfLh = 6J1 2 • Thus critical point (ii) is a local minimum. The remaining critical points are easily dealt with once we observe that the problem is symmetric in x 2 and x 3 • Thus aT == (1, 2, 0) is a local maximum and aT = (1, -2, 0) is a local mnnmum. Exercises 2.3 1. Find the critical points of the following constrained optimization problems and check that they are non-degenerate. Determine the local minima and maxima. (i) f(x) = XtX 2 + X 2 X 3 + X3 Xt subject to x 1 + x 2 + x 3 = 1. (ii) f(x) =xi + x~ + x~ subject to Xt + x 2 + x 3 = 3. (iii) f(x) = XtX 2 + X 2 X 3 + X3X1 subject to xi + x~ = x 3 •
32 2.
OPTIMIZATION IN
The following constrained problem has four critical points, two of which are non-degenerate. Show that one of these is a maximum and the other a minimum:
f(x) =xi+
3.
~"
x~
+ 3xi-
3x~-
8
subject to xf + x~ = 16. By applying the fundamental inequality, f(a + sh)- f(a) ~ 0 for all points in the tangent space of the constraint, show that neither of the degenerate points can be a maximum or a minimum. Find the local maxima and minima of the following problems (a) by introducing two Lagrange multipliers; (b) by using the constraints to eliminate one or more of the variables: (i) f(x) = X1X2X3 subject to x 3 =xi- x~ and x 3 = xi + x~ - 8. (ii) f(x) = 2x 1 + 2x 2 + x 3 subject to xi + 2x~ + 4x~ = 1 and (x 1 - 1) 2 + 2x~ + 4x~ = 2.
3
The calculus of variations
The fixed end-point problem As we saw in Chapter 1, the calculus of variations is concerned with the optimization of functionals. There we used (x, y) for the coordinates of a point in the plane andy= y(x) to represent the equation of a plane curve. In order to set up a uniform notation for the remainder of the book we shall re-name our variables, letting t be the independent variable and x the dependent; so (t, x) represents a general point and x = x(t) the equation of a curve. The simplest problem that we can pose is that of finding, out of all the curves that join two fixed points, the equation of the curve that minimizes a given functional. Let (t 0 , x 0 ) and (t 1 , x 1 ) be a pair of fixed points and x = x(t) be any reasonably well-behaved curve defined on t 0 ~ t ~ t 1 and passing through the given end-points. Our functional will be the integral from t 0 to t 1 of a given function f(t, x(t), dx/dt). For our purposes this is a function of three variables and we assume that f is differentiable with respect to each of them as many times as we require. For convenience we use the notation x for dx/dt. The problem can now be posed:
Problem 3.1 Minimize J[x] =
I
ll
f(t, x, x) dt
(3.1)
to
with x(t 0 ) = x 0 , x(t 1 ) = x 1 • In order to show that a curve x = x*(t) is a minimizing curve for Problem 3.1 we need to show that J[y]
~
J[x*]
(3.2)
for all continuous y = y(t) satisfying the end conditions, with equality only when y and x* coincide. Condition (3.2) is necessary and sufficient for x*(t) to be the solution of Problem 3.1, but it is not particularly useful; it gives no hint as to how x*(t) might be found and requires that each candidate x*(t) be tested against every possible y(t). We cannot solve Problem 3.1 by using (3.2) directly,
34
THE CALCULUS OF VARIATIONS
but we can deduce from it useful information about the behaviour of minimizing curves. In order to make progress we set aside for the moment consideration of sufficient conditions and concentrate on deriving necessary conditions from (3.2). The advantages of this approach are twofold; firstly, in order to prove necessity we can examine nicely behaved subclasses of the class of all continuous y(t), secondly a necessary condition is likely to lead to a method of constructing a minimizing curve. In Chapter 1 we saw an example for which the minimizing curve was piecewise differentiable, so perhaps we should look for our minimizing curve amongst the set of all piecewise differentiable (D1 ) curves joining the end-points. Such curves will have to be considered in our theory at some stage but for the moment, in order to be able to derive a useful condition without technical difficulty, we shall assume that the x(t) are twice continuously differentiable and satisfy x(t 0 ) = x 0 , x(t 1 ) = x 1 . Such curves will be called admissible. Thus our problem becomes one of finding amongst the set of twice continuously differentiable (C2 ) curves that join the given end-points, the curve that gives a local minimum to the functional defined in (3.1). If such an admissible minimizing curve exists, then (3.2) must be satisfied for all y(t) that are admissible and close to x*(t). We must now clarify what we mean by saying that two curves are close to each other. To this end we define the concepts of weak and strong variations.
Definition. A weak variation. Let x*(t) be a minimizing curve and y(t) an admissible curve. Then if there exist small numbers 8 1 and 8 2 such that lx*(t) - y(t)l
Figure 3.2 A strong variation
partial derivatives off in the expansion are to be evaluated on the minimizing curve. When we substitute our series into the definition of !J.J we obtain
!J.J = e \'t + e2 Vz + O(e 3 ) where
\'t
=
f
11
to
and
v; =
(
. ar) dt ax + 11 --: ax
11 ar -
( axazr
1 I~~ 1122 2 to
-
(3.3)
azj + lj2 -azj) + 211!J ----; . dt 2 ax ax
ax
and the partial derivatives are evaluated on the minimizing curve. The integral \'t is called the first variation of J because e \'t is the first-order change in J consequent upon our weak variation x* + 211. The integral Vz is called the second variation and contains all the terms in !J.J involving second-order partial derivatives of f. All the remaining terms of the expansion of !J.J, each of which is of degree 3 or more in e, have been gathered together in the term O(e 3 ) in
37
THE FIXED END-POINT PROBLEM
(3.3). If x* is a minimizing curve then it is necessary that !iJ ~ 0 for all admissible 17(t). Thus for all17(t). Now 8 may be positive or negative, so if we divide by two inequalities: for
8
(3.4) 8
we obtain
>0 }
and
(3.5)
for
8
< 0.
Now let 8 ~ 0 and we see that we must have both v;_ 2 0 and v;_ ~ 0. That is, a necessary condition for x* to be minimizing is that v;_ = 0 for all admissible 17(t). Written out in full this condition is that
f' 1
(
to
of . of) dt=O 17-+11--: OX OX
(3.6)
for all admissible 17(t). We can turn this into a more useful result by integrating by parts to remove the term in~· Since
f
'1
to
and 17(t 0 )
of
• 11-dt = OX
[11-of]' - Jt 11-d (of) dt 1
1
OX
to
dt OX
to
= 17(t 1) = 0 we can re-write (3.6)
f' {of_ox _ i ( ox8~)} 1
10
11
dt
as
dt =
o
(3.7)
for all admissible 17(t). Now examine the integrand of (3.7) and in particular the expression inside the curly brackets. Both terms are evaluated on the minimizing curve x* and do not involve the variation 17(t). Furthermore, since x* is C2 , both terms are continuous. Thus the expression between the brackets is a continuous function oft which we will call g(t). Condition (3.7) simply says that if g(t) arises from a minimizing curve, then multiplying it by any admissible 17(t) and integrating from t 0 to t 1 must always produce zero. One is tempted to infer that g(t) = 0 at each point on a minimizing curve. We can show that this is indeed the case by constructing a proof by contradiction. That is, assume that our inference is false, so g(t) =1= 0
38
THE CALCULUS OF VARIATIONS
at some point in [t 0 , t 1 ], while
I
t,
to
yt(t)g(t) dt = 0
for all admissible yt(t). Assume that g(t) is positive at t = c in [t 0 , t 1 ]. Since g(t) is continuous it will be positive in some neighbourhood [a, PJ oft= c, where a< c 0.
to
But this integral should be zero for all admissible yt(t). Thus g(t) cannot be positive at t = c. Similarly we can prove that g(t) cannot be negative at t = c. Hence we must have g(t) = 0 at every point of [to, t1J. We now have a first necessary condition for a weak local minimum. We state our result in the following theorem.
Theorem 3.1. In order that x = x*(t) should be a solution, in the class of C2 functions, to Problem 3.1, it is necessary that
~ _r!_(Of)
ox
at each point of x = x*(t).
dt
ox
=
0
(3.8)
•
This differential equation is called the Euler-Lagrange equation. It is also a necessary condition for a local maximum as is seen by returning to (3.4), reversing the inequality sign and following the argument through. For this reason the solutions of the Euler-Lagrange equation are called extremals. It should be noted that if we enlarge the class of admissible curves to include once continuously differentiable ( C1) curves, we still find that the Euler--Lagrange equation must be satisfied along a minimizing curve. For a proof of this see L.A. Pars, An Introduction to the Calculus of Variations, Heinemann, London, 1962.
39
THE FIXED END-POINT PROBLEM
Example 3.1. Find the extremal of J[x] = x(l) = 0, x(2) = 3.
fi x t
2 3
dt, given that
Solution. We have to= 1, tl = 2, x 0 = 0, x 1 = 3 and f(t, The Euler-Lagrange equation for this problem is o-
X,
x) =
x2 t3 •
i_ (2xt 3 ) = o, dt
giving :it 3 = constant. On integrating we find
k
X=-+ [ t2
for some constants
k and l.
When we apply the end conditions we find that l = - k = 4. Hence the extremal is 4 x=4--2 . t
Exercises 3.1 Find the extremal for each of the following fixed-end point problems: 1.
f x: 2
' 1
2.
J
t
dt with x(1) = 2, x(2) = 17.
1l/2
t'
(x 2
--
x2 -
2x sin t) dt with x(O)
= 1, x(n/2) = 2.
' 0
3.
(x 2
+ 2x sin t) dt with x(O) = x(n) = 0.
Example 3.2. Let us now find the extremals of the brachistochrone problem that we formulated in Chapter 1. Apart from its purely historical interest it is an ideal problem on which to demonstrate two useful techniques. Solution.
We are required to minimize
__ Jt'
el[x] -
to
0 lies on the curve x
=
t- 5.
Finding minimizing curves We now turn to the question of how to determine whether the extremal we have found is a minimizing curve. Recall that if our curve is to minimize J, then inequality (3.2) must hold. Let us return to Example 3.1 and examine the sign of 11J = J[y]- J[x*], where y = y(t) is any C1 curve joining the fixed end points and x* = 4 - 4/t 2 is the extremal. Then writing y = 4 - 4/t 2 + ry(t), where ry(1) = ry(2) = 0, we can calculate 11J directly:
11J = =
I (~ ~Yt3 Iz (16~ ~ 2 t3) 2
+
+
dt _ .[
(~Yt 3 dt
dt = [16ryJi
+
1~ 2t3 2
dt
since ry(1) = ry(2) = 0. Thus (3.16)
47
FINDING MINIMIZING CURVES
Note that we have not assumed that rJ(t) and ~(t) are small. Thus J[y] - J[x*] 2: 0 for all y(t) that are cl and pass through the end-points, so our extremal minimizes J in the class of C1 curves. Consider now what our calculation gives if we allow ~(t) to have a finite number of discontinuities in [1, 2], so that y(t) is continuous with a finite number of' corners' at which the slope of the tangent is discontinuous. We have to split the interval up into a finite number of sub-intervals on which ~(t) is continuous, but we still end up with !1J 2: 0. Thus our extremal also minimizes in the class of D 1 curves that join the end-points. This is not a freak result but an example of a general truth which can be embodied in the following theorem.
Theorem 3.2. If x = x*(t) is a minimizing curve in the class of C1 curves, then it is also a minimizing curve in the wider class of D 1 curves. Proof. Suppose the theorem is false so that x*(t) minimizes in class C1 but not in class D1 . Then there exists x = z(t) in D1 such that J[z] < J[x*]. Since the integrand f(t, x, x) of J[x] is bounded on any sub-interval of [t 0 , t 1 ], we can construct, by 'rounding off' the corners, a C1 curve x = w(t) on which the value of J[w] is as close as we please to J[z]. That is IJ[w] - J[z]l < e for any positive e. A judicious choice of e will enable us to deduce that J[w] < J[x*], thus contradicting the fact that x* is minimizing in C1 . To this end take e = (J[x*]- J[z])/2. Then IJ[w]- J[z]l < (J[x*]- J[z])/2, which implies that J[w] < (J[x*] + J[z])/2 = J[x*] - e. Thus a curve that is minimizing in C1 must also minimize in D 1 • • Example 3.5. Find the minimizing curve for x(O) = 0, x(2) =
J3.
J5 (x- 1) x
2 2
dt with
Solution. The integrand is independent oft, so we can use (3.9) to obtain the equation of the extremal. Thus x 2 (x 2 - 1) = constant on an extremal. This integrates to give k
+ x 2 = (t + l) 2 •
(3.17)
The end conditions give l = -i, k = / 6 , so the extremal is x 2 = t2 - t/2. But there is clearly something wrong here; when 0 < t < ! the right-hand side is negative. Further investigation reveals that the extremal is a hyperbola and the two given end
48
THE CALCULUS OF VARIATIONS
points are not on the same branch. Thus there is apparently no extremal. However, common sense tells us that there is a minimizing curve. The integrand is always positive, so if we can construct a curve for which J = 0, it will give J its minimum value. Consider the following D1 curve:
x=O X=
t
0 ::;; t ::;; 2 -
+ j3- 2,
2-
j3
j3 ::;; t ::;; 2.
(3.18)
It is continuous and piecewise differentiable with a corner at t = 2 - j3. It joins the given end-points and along it J = 0. This curve must be a minimizing curve. Example 3.5 is such that it is possible to make a shrewd guess about what sort of curve might be minimizing (we clearly needed a curve with x = 1 on one section and x = 0 on the other). What we must do now is investigate carefully the necessary conditions for a D1 curve to be a minimizing curve.
Theorem 3.3. In order that a D1 curve be a minimizing curve for the fixed end-point problem it is necessary that (i) (1.1.)
the Euler-Lagrange equation is satisfied between corners and between a corner and an end point;
of .
.
ox LS contmuous at a corner;
-
... ) f (111
-
. of . contmuous . x - LS at a corner.
ox
Proof. Let x*(t) be a D1 curve passing through the fixed endpoints and minimizing J[x]. For simplicity assume that it has just one corner at t = 1:, t 0 < 1: < t 1 • The proof for curves with several corners is easily constructed once the results for one corner have been established. Our minimizing curve can then be written in the form
t0
::;;
t ::;; 1:
1:::;; t :=;;tl
(3.19)
where x 1 (1:) = x 2 (1:) but x1 (1:) =f. x2 (1:). Note that since we only wish to establish necessary conditions we need not consider the most general variation on x*(t); suitably chosen special variations will suffice.
49
FINDING MINIMIZING CURVES
xl1\
''
''
'
'
' ''
''
' '' '' 'B
A
Figure 3.6 Weak variations of type (i)
(i) Consider the class of weak variations
y- {
x 1 (t), Xz(t)
t 0 s; t s; -r
+ SIJ(t),
'r
s; t s; tl
where 17(-r) = IJ(t 1 ) = 0, as shown in Figure 3.6. All these curves are identical to x*(t) in t 0 s; t s; -r. Thus
flJ =
I
ll
r
f(t, Xz
+ SIJ, X2 + sfT) dt-
It, r
f(t, x 2, i
2)
dt
=sIt' (atax 11 + axa~ t7) dt + O(sz), t
where the derivatives are evaluated on x*(t). We then need the same argument as that used in the proof of Theorem 3.1 to deduce that the Euler-Lagrange equation must be satisfied on x*(t) in -r s; t s; t 1 . The proof that the same must be true in t 0 s; t s; -r is left to the reader.
50
THE CALCULUS OF VARIATIONS
x!l\
I, ' I'
I\ I \ I
I I
\
\
'\ \ I
I I I
I I I
'
'
'' '
''
'
I
I I
'
I
''
I
'
I I I I
I
; I
I
''
'
'
I
I
A
'
''
B
Figure 3. 7 Weak variations of type (ii)
(ii) Consider weak variations of the form
y
=
{x
+ ~>111(t), x 2 (t) + sry 2 (t),
t 0 :S: t :S: r
1 (t)
T
:S: t S t1
where 111(t0 ) = ry 2 (t 1 ) = 0 and 1] 1 (r) = ry 2 (r) as shown in Figure 3.7. Such curves if they have a discontinuity in their slope will have it at t = r. Now calculate !l.J, bearing in mind that the Euler-Lagrange equation must hold along each of the two sections of x*(t), to obtain
!l.J =
8
of (t, x 1 , x.1 ) ]' + 8 [ 1] 2 ----:of (t, X 2 , x.2 ) [ 1] 1 -~ OX
to
OX
Jtl + 0(8 ). 2
(3.20)
r
We now require that the first variation vanish (see inequalities (3.5) and the argument that follows them) so that (3.20) leads to
~~ (r, x 1 (r), .X 1 (r))- ~~ (r, x 2 (r), .X 2 (r)) = 0,
FINDING MINIMIZING CURVES
51
x/1\
(-
t:
1:
I
,' i '
....
,
\
\
\
I
I
I I I
I I I I
I I
B
A
L----------------~-4--------------->
r r'
Figure 3.8 Weak variations of type (iii)
since 171(t 0 ) = 172(t 1 ) = 0 and 111(r) = 172(r). Thus of/ox is continuous at t = r even though i(t) is discontinuous there. (iii) Consider weak variations
y- {
t0
x 1 (t), Xz(t)
+ ery(t),
:::;
t :::; r'
r' :::; t :::; t 1 ,
where ry(t 1 ) = 0 and r' = r + ~r, where ~r is small as shown in Figure 3.8. To construct this type of variation we have simply extended the curve x = x 1 (t) for a short distance beyond t = r. Note that y must be continuous at t = r' so that x 1 (r') = x 2 (r') + ery(r'), which to first order in e gives (3.21)
We now calculate
~J
and require that its linear part be zero. Thus
52
THE CALCULUS OF VARIATIONS
Now use (3.21) and the fact that obtain the required condition f(-r, X1(-r), xt(-r))- X1(-r)
of /ox
is continuous at t
=
-r to
;~ (r, x1(-r), x1(r))
= f(-r, Xz(-r), Xz(-r))- X2 (-r) ;~ (-r, X2 (r), x 2 (-r)).
•
Example 3.5 (continued). We saw earlier that the curve defined by (3.18) gave J its smallest possible value, namely zero. The general solution of the Euler-Lagrange equation is a twoparameter (k and l, see equation (3.17)) family of curves, none of which passes through both (0, 0) and (2, j3). However, there are two other solutions, x = constant and x = 1. Thus the curve defined by (3.18) is such that the Euler--Lagrange equation is satisfied between corners. Parts (ii) and (iii) of Theorem 3.3 (usually called the 'corner conditions') are also satisfied by (3.18). To see this, let p 1 and p 2 respectively be the slope just before and just after a corner (r, A), A= x(-r). Then the corner conditions require that 2A 2 (p 1 - 1) = 2A 2 (p 2 - 1) and A2 (p 1 - 1)(p 1 - 3) = A2 (p 2 - 1)(p 2 - 3). Both of these will be satisfied for p 1 =I= p 2 if A= 0 and not otherwise. The D 1 minimizing curve must therefore consist of an x = 0 section and an x = 1 section as we already suspected. Before looking at one final example, let us pause and gather together what we have discovered so far. The Euler-Lagrange equation is crucial since minimizing curves on which xis continuous must satisfy it at each point; minimizing curves with a finite number of points of discontinuity of x satisfy it at every point except the points of discontinuity. Theorem 3.2 tells us that if there is a minimizing curve in cl then we need look no further because there can be no D 1 curve that is better. If we are forced to look for a D1 minimizing curve then Theorem 3.3 must be satisfied, and when this is done we will need further evidence before we can be sure we have a minimizing curve; Theorem 3.3 is necessary but not sufficient. In Example 3.5 we found a D 1 curve satisfying Theorem 3.3 and we also knew by inspection that J? 0. Since this D 1 curve gave J the value zero it was indeed a minimizing curve. Care needs to be taken in any application of 'fheorem 3.2; it requires that the extremal exist and be minimizing in C1 • This needs a sufficiency condition. At the moment the only one we have is the rather unwieldy condition that I'!J 2 0 for all C1 variations. We used this to good effect in Example 3.5 because the calculation of
53
FINDING MINIMIZING CURVES
11J for general rt(t) was easy to carry out and led to an expression whose sign was unambiguous. This is not often the case and what is needed is a condition that is easier to handle. The theory of sufficiency conditions in the calculus of variations is very subtle and we shall look at some aspects of it later in this chapter. Example 3.6 Minimize
Solution.
t
2
x2 (1 - x) 2 dt
with
x(O) = 0,
x(2) = 1.
The Euler-Lagrange equation is
o~ ~ (2x(1 dt
x) 2
-
2x 2 (1 - x)) = o
or
x =constant
so the extremals are straight lines x = kt + l. (When the integrand f of J is a function of x only, the extremals are always straight lines because the Euler-Lagrange equation is always of the form g(x) =constant, for some function g, and this has solutions x =constant.) The extremal that fits the end conditions is x = t/2, but this does not minimize in C1 . (Calculate 11J and verify that it is not positive for all rt(t).) We now invoke Theorem 3.3 and look for a D1 curve minimizing J. Each section must be a straight line. The corner conditions require that x(1- x)(1- 2x) and .X 2 (1- x)(3x- 1) be continuous at a corner. If p 1 ,p2 denote the slopes on either side of a discontinuity we see that the only values satisfying both corner conditions are p 1 = 0, p 2 = 1 or p 1 = 1, p 2 = 0. All that remains is to fit the sections together so that the resulting curve passes through the end-points. There are many ways in which this can be done. Figure 3.9 shows two possibilities. We now need to find out if these curves minimize J. The integrand of J is always positive and the minimum possible value of J is zero. On any section of the D 1 curves we have constructed we either have x = 0 or x = 1; that is J = 0 for each of these curves. Thus we can minimize J but the minimizing curve is not unique. Exercises 3.4 1. By considering an admissible variation y = x*(t) + rt(t), where rt(t) is C1 and not necessarily small, show that the extremals found in Exercise 3.1(1) and 3.1(3) are minimizing curves.
54
THE CALCULUS OF VARIATIONS
x/1\
c
A
B
Figure 3.9 Two minimizing curves with corners
2.
Show that x = t/2 is an extremal for t2 (x2 - 1)2 dt with x(O) = 0, x(2) = 1. Calculate the corresponding value of J. Find a D 1 curve that satisfies the conditions of Theorem 3.3 and that gives J the value zero. Is this curve unique?
lsoperimetric problems We now consider the problem of finding a curve which minimizes a given functional while giving another functional an assigned value. The name isoperimetric is given to such problems for historical reasons; the first such problem to be considered was that of finding, out of the class of all closed curves of the same length (that is, of equal perimeter), the curve that maximizes the enclosed area.
ISOPERIMETRIC PROBLEMS
Problem 3.3.
Minimize
J[x] =
f
55
tl
f(t, x, x) dt
to
with x(t 0 ) = x 0 , x(t 1 ) = x 1 subject to the integral constraint
I=
f
tl
g(t,x,x)dt=c,
to
where cis a given constant. Our experience with constrained optimization problems in Chapter 2 might tempt us to introduce a Lagrange multiplier;, and consider the problem of minimizing
f
tl
(f
+ lig) dt.
to
The Euler-Lagrange equations for this functional would give us a general extremal involving two arbitrary constants and the unknown Lagrange multiplier li. Then we can find the three unknown constants by applying the end conditions and the constraint I= c. It may seem astonishing that such a simple device as a Lagrange multiplier, introduced first for problems in !Rn, should also work for the minimization of functionals subject to integral constraints. But it does. The proof, however, is rather tricky.
Theorem 3.4. In order that x = x*(t) be a solution of Problem 3.3 (the isoperimetric problem) it is necessary that it should be an extremal of
f
tl
(f(t, x, x)
+ lig(t, x, x)) dt
to
for a certain constant li.
Proof. We will take our admissible curves to be C2 and consider weak variations of the form y = x*(t) + eO"(t), where a is small. We are only interested in varied curves that pass through the endpoints and satisfy the constraint. In order to construct a class of weak variations that satisfy the constraint we choose to write O"(t) in the form O"(t) = rxYJ(t) + {J((t), where rx, f3 are constant and YJ(t), ((t) vanish at the end-points. The functions YJ(t) and ((t) are arbitrary and independent; that is, there is no constant k such that YJ(t) = k((t)
56
THE CALCULUS OF VARIATIONS
for all t. Now take any such pair 17, (and apply the constraint
I=
I
t!
g(t,x*+ea,x*+eo)dt=c.
(3.22)
to
Since I is constant, its total variation !!I= 0 and, in particular, its first variation must be zero,
I
(rx17 + fJO
tl (
to
a ) dt = o,
a
a!+ (rx~ + fJ() a!
where the derivatives are evaluated on the minimizing curve. Integrating by parts gives
I
ll
to
(ag
(ag)) dt
d (rx17 + fJO - - - ~ ax dt ax
=
0.
Let L(h) denote
so that we have
I
(rx17 + fJOL(g) dt =
11
o
(3.23)
to
Since x* is not an extremal of I, L(g) =I= 0, so for each pair IJ, (the constants rx and fJ are related by (3.23). Now consider !!J for such a variation. Since x* is a minimizing curve, the first variation must be zero, which after integration by parts gives
I
t
(rx17 + {J()L(f) dt
1
0
=
(3.24)
to
where rx and fJ must satisfy (3.23). Eliminating rx and fJ between (3.23) and (3.24) gives n~ L({)IJ dt
J:~ L(f)( dt
s:~ L(g)17 dt
s:~ L(g)' dt
for any independent pair of C2 functions. This can only be true if both sides are equal to a constant -A., say. Hence x* must be such that
I
ll
to
L(f
+ A.g)17 dt =
0
57
ISOPERIMETRIC PROBLEMS
for all admissible ry(t). Then the same argument that was used in the proof of Theorem 3.1 leads to the necessary condition that
j_ (f + Ag) -
8x
idt (~ (f + Ag)) = ,ax
on x*(t),
0
•
so that it is an extremal of J;~ (f + Ag) dt. Example 3. 7. Minimize J = to the constraint x dt = l.
J6
J6 x
2
dt with x(O)
2, x(l) = 4 subject
=
Solution. Theorem 3.4 tells us to find the extremals of 2 + h) dt. The Euler-Lagrange equation is A - d(2:i:)/dt = 0 which has solutions x = At 2 /4 + kt + l. The end conditions give l = 2, k = 2 - .1/4. We find ll by applying the constraint
J6 (;i:
tl {~
t 2 + ( 2-
~)t + 2} dt =
1.
This gives A= 48, so the required extremal is x
=
12t 2
-
lOt+ 2.
Example 3.8. Consider the problem of finding, amongst the set of curves of length rc joining (0, 0) to (2, 0), the curve which has maximum area between it and the t axis. That is maximize
I:
x dt when
t
2
(1 +
x2 ) 1 i 2 dt = rc
and x(O) = x(2) = 0.
Solution. Introduce a Lagrange multiplier A and obtain the extremals of J6 (x + A(l + x2 ) 1 12) dt. The integrand is independent of t so the extremals must satisfy
x +
1(
IL
1 •2
·2)1/2 _
1+ x
k
x2 ) 112 -_
AX
(l +
,
constant.
Putting x = tan x and following the method used in Example 3.2 leads to the following parametric representation of the extremals: X=
k- A cos
x,
t
=
l +A sin
x
(3.25)
where k and l are constants. The extremals are circles. Let x0 , x 1 , be the values of x at (0, 0) and (2, 0) respectively. The end conditions then give k =A cos Xo =A cos X1
(3.26)
l= -Asinx 0 =2-Asinx 1 .
(3.27)
58
THE CALCULUS OF VARIATIONS
J5
The constraint requires that sec x dt = n. This can be transformed into an integral with respect to X· Thus
f
x' A dx
=
or
n
xo
A(X 1
-
xo) = n.
(3.28)
Now (3.26) implies that Xo = ±x 1 . But Xo = x1 is incompatible with the constraint equation (3.28), so Xo = - x1 and (3.28) then gives x1 = - Xo = n/2),. Equation (3.27) leads to a transcendental equation for A
l
= sin( 2;).
This has only two finite solutions, A= ± 1. Choosing A= 1leads to a solution lying in the fourth quadrant; the solution we require corresponds to A = -1, for which x1 = - Xo = - n/2, k = 0 and l = 1. This is of course the top half of the circle, centre (1, 0), through (0, 0) and (2, 0).
Exercises 3.5. 1. Find the extremal of Si5 x2 dt with x(O) = 0, x(2) = 1 subject to the constraint x dt = 2. 2. Find the extremals of So x2 dt with x(O) = 0, x(n) = 0 subject to So x 2 dt = n/2. Show that there is an infinite set of extremals. Evaluate the functional on a typical extremal.
J5
Sufficiency conditions
When looking for local minima of a function of one variable g(x) we first find the points at which the first derivative is zero, g'(x) = 0; the corresponding result in the calculus of variations is the vanishing of the first variation ~' which leads to the condition that the minimizing curve must be a solution of the EulerLagrange equation. In the search for a minimum for g(x), when we have found a point x =a for which g'(a) = 0 we then examine the sign of the second derivative at x =' a. If g"(a) > 0, then x = a gives the function a local minimum; the conditions g'(a) = 0, g"(a) > 0 are sufficient to ensure a minimum at x =a. The corresponding result in the calculus of variations would involve the sign of the second variation Vz (see (3.3)). Vz is simply the coefficient of a2 in the calculation of !'!..J for a weak variation. Unfortunately ~ = 0, Vz > 0 does not constitute sufficient proof that we have found a minimizing curve as the following example shows.
59
FIELDS OF EXTREMALS
Example 3.9 Minimize J[x] =
f ~dt 1
o
x(O)
X
= 0,
x(1)
= 1.
Solution. The extremal is x = t and gives J = 1. For a weak variation y = t + B1J, with 17(0) = 17(1) = 0, !lJ =
L 1
((1
+ e~)- 1
- 1) dt
= t1 ( -e~ + e2~2 = e2
e3~3 + ... ) dt
L1 ~2 dt + O(e3).
So the second variation is positive and we might hope that x = t minimizes J. But consider the path
y- { Then
J[y] =
f
3t,
0:::;; t:::;; -5;,
-t + 2,
-5::::;; t:::;;
112
0
j- dt
+
J 1
(
1.
-1) dt = -j-.
1/2
This curve, which has a corner at t = !, gives J a value less than 1. It is not a D 1 minimizing curve because the corner conditions (see Theorem 3.3) cannot be satisfied for this problem. It is possible to construct many curves joining (0, 0) to (1, 1) for which J < 1. Thus although 11; > 0 for x = t it is not a true minimizing curve. The fact that the sign of 11; is not crucial for distinguishing minimizing curves means that we have to adopt a much subtler approach using the concept of a field of extremals. We will base our investigation on this and on the Hilbert Integral.
Fields of extremals In order to find sufficient conditions for a minimum we have to return to inequality (3.2). That is, we have to show that
!lJ = J[y]- J[x*] > 0
60
THE CALCULUS OF VARIATIONS
x/1\
B ' '
'
'' '
/
//
//
'
' /
//
//
' ' ' ·------·-----A ---~L_.J~ //,."'
'
~~~~-;;:>
,"'//
/
'
Figure 3.10
The field of extremals x = t
+l
for all curves y = y(t) that satisfy the end conditions. We must not restrict ourselves to weak variations and we must include the possibility that y(t) may be piecewise differentiable (recall Example 3.9). In Example 3.1 a direct calculation of !l.J gave us an expression that was easily seen to be positive, even for y in D 1 . Our extremal was indeed minimizing. This is not what usually happens; !l.J is, in general, the integral of an extremely complicated function involving both 11 and l'j, and we will not be able to make any statement about its sign for a general choice of IJ. What we must do is express !l.J in a new form; a form whose sign is easier to ascertain. Before attempting to do this we must examine a concept that is fundamental to any discussion of sufficiency conditions; this is the idea of a field of extremals. The general solution of the Euler-Lagrange equation for a given functional depends on two arbitrary constants; it is a two-parameter family of curves. Imposing the given end conditions will determine the two parameters and give us the particular extremal we require. Thus in Example 3.9 the general extremal is x = kt + land the values k = 1, l = 0 give us the extremal through (0, 0) and (1, 1). Now consider the one-parameter family of extremals x = t +las shown in Figure 3.10. As well as having x = t as a member, this family of curves
61
FIELDS OF EXTREMALS
gives a simple cover to the plane; through each point passes one and only one member of the family. This is what is known as a field of extremals. Beeause only one member of the family passes through each point there is a unique value for the slope p(t, x) at each point. (In this simple example p(t, x) = 1 everywhere.) Thus associated with this field of extremals is a slope function p(t, x) which, since it is the slope of an extremal, satisfies the identity at(t, x,p) _ i_ (at (t x )) ax dt ap ' ,p
=0
(3.29)
at each (t, x). In any given problem there may be more than one family of extremals that forms a field; for example in the problem we are considering here x = kt is a one-parameter family with x = t as a member. This too gives a simple cover to the plane except at the origin. For this family p(t, x) = xjt, which is well-behaved except at the origin. If there is a choice of field we choose the most convenient one_
Example 3.10. Find a suitable field of extremals for Example 3.1 and verify that (3.29) is satisfied by the corresponding slope function.
Solution.
The general solution of the Euler-Lagrange equation is k/t 2 + l and the extremal satisfying the end conditions is x = 4 - 4/t 2 • A one-parameter family of extremals containing this is x = l - 4/t 2 • We are interested in the behaviour of this family for t ~ 1. Figure 3.11 shows that the family gives a simple cover that includes x = 4 - 4/t 2 , so it constitutes a field whose slope function is p(t, x) = 8/t 3 . To verify (3.2B) we write f(t, x,p) = p 2 t 3 and calculate the derivatives involved. Now a[jax = 0 and a[jap = 2pt 3 • Thus (3.~m) gives x
=:
d 3 d 0 --- (2pt ) = --- (16) = 0 dt dt
as required.
Exercises 3.6 Find a suitable field of extremals for each of the following problems. In each case verify that (3.29) is satisfied by the corresponding slope function.
62
THE CALCULUS OF VARIATIONS
xli\
' '
'
'
' '
Figure 3.11
The field of extremals x
x(l) = 0,
x(2) x(O)
=
=
l - 4jt 2
1.
= 0, x(2) = 2.
Hilbert's invariant integral In Figure 3.12, t(!* is the extremal x = x*(t). Suppose that we have found a suitable field with slope function p(t, x). Let t(!, x = x(t) be any other curve joining the end-points and lying in the region covered by the field. We place no further restrictions on t(! other than that the following integral be defined along it: K[x]=
I
t, (
ar
Note that when integral is J[x*].
t(!
)
f(t,x,p)+(x-p)-(t,x,p) dt
ap
to
coincides with
t(!*,
so that
(3.30)
x = p(t, x*),
this
HILBERT'S INVARIANT INTEGRAL
63
xli\
Figure 3.12
C(J
is a path for Hilbert's invariant integral
We can write (3.30) as a line integral K[x]
=I
'if?
u dt
+ v dx,
where
(3.31)
w(t, x) =
f'(t,
of'
x, p)- p -- (t, x, p),
op
v(t, x)
of'
op (t, x, p).
=-
The remarkable thing about this integral is that it is invariant; it does not depend on the path Cfl but only on the end-points. To show this we need to demonstrate that oujox- ovjot = 0 at each point and this can be shown (see Exercises 3. 7) to reduce to the requirement that (3.29) hold at each point of the field, which is, of course, the case. So integral (3.30) is independent of the path. This, together with the fact that K[x*] = J[x*], enables us to write 11.J in a new and more useful form 11.J = J[x]- J[x*] = J[x]- K[x*] = J[x]- K[x].
64
THE CALCULUS OF VARIATIONS
Both integrals are evaluated along((}, so we can combine them: t1.J
=It'~ (f(t, x, x)- f(t, x,p)- (x- p) ~o( (t, x,p)) dt.
(3.32)
In (3.32), x refers to the slope of((} at (t, x) and p is the slope of the field at (t, x). The integrand is usually denoted by E(t, x, x, p) and is called the Weierstrass excess function. We have made no approximations in deriving this form of t1.J and our restrictions on the behaviour of the varied curve ((} are not stringent; we merely require that the integrals exist for the curve((} and that it lie within the region covered by the field. We can now state a sufficient condition for the extremal to be a minimizing curve. Theorem 3.5 (The Weierstrass condition). In order that the extremal((}* give a strong minimum to J[x] it is sufficient that (i) ((}* is a member of a field of extremals, (ii) E(t, x, x, p) ::?: 0 for all points (t, x)
lying sufficiently close to((}*, and arbitrary values of x.
Proof. If {(}* is a member of a field of extremals, then for any varied curve ((} lying entirely in the region covered by the field, we can write t1.J =
I
tt
E(t, x, x,p) dt.
to
If condition (ii) is satisfied, then AJ;:::: 0 for all ((} sufficiently close to ((}*. The slope of((} at a given value oft need not be close to the slope of((}* at that value oft, so((}* gives .J a strong minimum. • Note that this theory allows((} to have discontinuities in its slope. Example 3.11. Show that the shortest distance between two points is a straight line. That is with x(O)
=
0,
x(l)
=
1.
Solution. The extremals are x = kt + land the end conditions give k = 1, l = 0, so the required extremal is x = t. In order to apply the Weierstrass condition (Theorem 3.5) we need to embed x = t in a field of extremals. There are a number of ways of doing this; for example the family x = t +lis a field (with slope functionp(t, x) = 1)
65
HILBERT'S INVARIANT INTEGRAL
and so is the family x
=
kt (with slope function p(t, x)
E(t, x, x,p) = f(t, x, x)- f(t, x,p)- (x- p)
=
xjt). Now
~~ (t, x,p)
where p = p(t, x). For this problem
E(t, x, x,p)
= =
+ x2)1'2- (1 + p2)1f2- p(x- p)(1 + p2)-1f2 c this can certainly be done. For instance u(t) = 0 will effect the transfer in time T = (1/oc) ln(a/c). However, we will not be able to maintain the level at x = c for t > T unless there are values of u(t) for which - occ + u(t) = 0, Vt ~ T, that is unless m ~ occ. Thus for a> c the system will be controllable provided m ~ occ. Now consider a< c. We can transfer the system from x(O) =a to x(T) = c provided x > 0, Vt ~ T. For this to be possible we must have m ~ occ and we can sustain the level at x = c by setting u = occ fort~ T. Now that we know that the system is controllable when m ~ occ we can pose an optimal control problem. To do this we must associate with the system a suitable cost functional. Thus we could measure the cost simply by the length of time spent in getting from x '"'a to x = c and look for the control regime that minimizes it. This is the so-called time-optimal control problem. We will discuss the theory of time-optimal control of linear systems in a later chapter. Fortunately we do not need anything apart from common sense to solve the time-optimal control problem for the very simple system we are examining here. Suppose a < c and m ~ cxc so the system is controllable. To reach x ''" c in the shortest possible time we need to make x as large as possible at each t and this means we must take u(t) = m for all t ~ T. A similar argument shows that for a > c the time-optimal control is u(t) = 0 for all t ~ T. Note that in both cases the time-optimal control takes its values on the boundary of the set of allowed values of u(t); that is, although it was free to take its value at any t from the interior of 0 ~ u(t) : rt
= m(b- s)/K, where s is as yet undetermined.
t > rt'
.
x2
K
==- =>
m
K
x 2 = -- t +constant,
m
but x 2 = s at t = 11 so K
x 2 = -- t- b + 2s. m At t = t 1 , x 2 = 0, so t 1 = m(b - 2s)/K, where s is determined by finding where the Cfrl- path through (a, b) intersects OQ. This then will be the procedure for solving linear time-optimal control problems. There is never any need to use H = 0 to calculate
92
OPTIMAL CONTROL 1: THEORY
the arbitrary constants that arise from the co-state equations. We only need them if we are going to use H = 0 to find t 1 • Since we can find t 1 directly from the state equations we can avoid using H = 0. It just happens to be the case that, provided we have got the control sequence and the time of switch exactly right, the H = 0 condition is automatically satisfied. This is not true for all optimal control problems but it is for linear time-optimal control. The co-state variables also appear to play a mysterious role in the problem since we seem to be able to ignore them. A moment's reflection will show that their role is absolutely crucial. The key sentence in the above solution is '1/1 2 =B-At and this can have at most one zero'. We then know that the only possible optimal control sequences are -K/m followed by Kjm (or vice versa) and all that remains is to find where the switch has to take place.
Example 4.3. The glucose problem. This problem was discussed earlier in the chapter. The level of glucose is governed by the state equation x1 = -1Xx 1 + u, where the control u = u(t) satisfies the constraint 0 :::; u :::; m. The level is to be controlled from x 1 =a at t = 0 to x 1 =cat some time T in such a manner that
J=
for udt.
is minimized. Find the optimal control and the corresponding value ofJ.
Solution. Note that both the initial state a and the final state c must be positive. (There is no such thing as a negative amount of glucose.) Also m ~ IXC, otherwise the system is not controllable. H= -u
+ 1/1 1(-1Xx 1 + u) =
-1Xx 11/1 1
+ u(ljl1
-1).
Now H is linear in u and 0 :::; u :::; m, so the maximum of H with respect to u is for when 1/1 1 < 1, u
= u = {:
when 1/1 1 > 1.
The co-state variable 1/1 1 satisfies
.
aH
1/1 1 = - -
ox1
= lXI/I 1
THE PONTRYAGIN MAXIMUM PRINCIPLE
93
so 1/1 1 = Ae"1 and the switching function 1/1 1 - 1 = Ae"1 - 1 can only have a zero in t > 0 if 0 < A < 1. For A :2: 1 it is positive for all t > 0, giving ii = m; for A ::;; 0 it is negative for all t > 0 giving ii = 0. In all three cases the control that maximizes H is piecewise constant and at t = 0 we must have either ii = 0 or ii = m. We can now rule out the possibility that the optimal control has a switch. Suppose u(O) = 0. Then at t = 0, H = -aaA = 0. Now a =I 0 so A must be zero and there can be no switch because 1/1 1 - 1 == -1 Vt. Suppose u(O) = m at t = 0, then H = -m + A(m- aa) = 0, so A = m/(m- aa). Either m < aa, giving A < 0 or m > aa, giving A> 1, so there can be no switch for this case either. The optimal control is ii = 0 Vt or ii = m Vt. The corresponding state equation where ii = 0 or m integrates to give
When we apply the end conditions we obtain
a= B
+ ii/a,
c = Be-"T
+ ii/a
so that B =a- ii/a,
aa} . T = -1 ln {ii-_--a u- ac
There are two cases. Here we are decreasing the glucose level. The control (i) > ii = m gives T < 0 and clearly cannot transfer the level to its requilred final value x 1 =c. However, the control u = 0 gives T == 1/a ln(a/c) > 0. This control takes the system to x 1 = c in a finite time and the corresponding value of J is zero. That is, the system can get to x 1 = c unaided (Zi = O) and in doing so incurs no cost. (ii) a < c. Here we have to increase the glucose level. The control u = 0 gives T a negative value and is useless. The control u =" m gives a finite positive value to T (since m :2: ac > aa). Thus the only control that satisfies the maximum principle and transfers the system from x 1 =a to x 2 = c when a< cis ii = m with corresponding cost
a c.
aa).
J == m ln(m a m-ac
94
OPTIMAL CONTROL 1: THEORY
The optimal strategy is now clear. If a> c put u = 0 and if a< c put u = m. The reader might wish to try the minimum time problem for this system. It turns out that the optimal control is the same. Exercises 4.1 1. For Example 4.2, suppose the initial state (a, b) lies above POQ and lett= Y/ be the time at which the optimal control switches from - Kjm to Kjm. (i) Show that at t = Y/ the system is at (l, s), where
l = m(b 2 (ii)
+ 2Kajm)/4K,
s = -(2Kl/m) 112 •
Apply the condition H = 0 at t = 0, t = Y/ and t = t 1 and deduce that
B = m(b- s)/Ks,
A= 1/s, Y/ =
m(b- s)jK,
t1
= m(b- 2s)/K.
Calculate H as a function of t in the two time intervals [0, ry], [ry, t 1 ] and hence verify that H = 0 Vt in [0, t 1 ]. The system x1 = -x 1 + u, where u = u(t) is not subject to a constraint, is to be controlled from x 1 (0) = 1 to x 1 (t 1 ) = 2, where t 1 is unspecified, in such a way that (iii)
2.
1 J =2
3.
It, (xi + u
2)
0
dt
is minimized. Find the optimal control. The system .X 1 = x 1 + u, where u = u(t) is not subject to a constraint, is to be controlled from x 1 (0) =a to x 1 (t 1 ) = b, where a and b are specified but t 1 is not, in such a way that J =
I'
(2xi
+ 2ux 1 +
u 2 ) dt
is minimized. Find the optimal control. An outline proof of Theorem 4.1. The proof of Theorem 4.1 is long and at times rather intricate. A full account is given in Chapter 6, which also contains proof's of the other theorems that are essential to the development of the proof of the maximum principle. This section aims to give an overview of the method of proof and should be read carefully before embarking on the detailed exposition in Chapter 6. We are assuming that an optimal control exists and we are looking, in the first instance, for a local result; the optimal control
THE PONTRYAGIN MAXIMUM PRINCIPLE
95
gives a smaller value to J than any other control that is close to it. We will consider controls that are small perturbations (or variations in the vocabulary of Chapter 3) of u*(t). We then calculate the effect of these small changes on the value of J. It is clear that we must allow the controls we use to be discontinuous and bounded, so we consider admissible controls that are piecewise continuous and take their values from some prescribed region U in the control space. Note that since we are looking for a necessary condition we need not consider the class of all piecewise continuous variations; a suitably chosen subclass of the admissible controls is all that is required. The reader should be familiar with this fact; it is basic to most of the proofs in Chapter 3. The proof is based on the geometric fact (see Figure 4 . 1) that, given the existence of an optimal control that takes the system to the prescribed final state x 1 , there can be no other control whose trajectory in augmented state space hits CB at a point below D. To be more precise, consider the entire line passing through C and Band let l denote the semi-infinite section of it that lies below D. Then no point on l can be reached by any admissible control. If we vary the optimal control slightly, either by changing its value in a small sub-interval of [t 0 , t 1 ] or by changing the time of transfer by some small amount, the system will go, not to x 1 , but to some point close to it in the augmented state space. In this way we can generate a so-called set of varied end-points with position vectors of the form x
=
x*(t 1 )
+ Ax
where Ax is small. The next step in the proof is to show, for a suitable subclass of the admissible controls, that this set of varied end·points, which we shall call E, is convex. The definition of a convex set is simple. Suppose a set Kin !Rn contains the two points P and Q, then K is a convex set if and only if all the points on the straight line segment PQ also lie in K for all possible choices of P and
'L'
giving us the control sequence { -1, + 1}. On the other hand if S(t 0 ) > 0, then
u* = sgn S = {
1, -1
and we have the sequence {1, -1}.
t>-r
•
These are the only controls that satisfy the maximum principle. It is from controls of this type that we must select the optimal control that takes our system from x(t 0 ) = x 0 to x(t 1 ) = x 1 in minimum time t 1 - t 0 • Control problems of this type, time-optimal control of a linear system, are particularly easy to deal with because it can be shown that for such problems the Pontryagin maximum principle is not only necessary (as we show in the proof of Theorem 4.1) but also sufficient. Thus if we can find a control sequence of the type discussed in Lemma 5.1 that takes the system from x 0 to x 1 , then it must be the optimal control.
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
107
For such linear systems we can actually solve a very general control problem; we can construct (when it exists) the optimal control from any initial point to the origin. Furthermore, we can express the optimal solution in a particularly elegant way. Before we move on to tackle a series of examples the reader is advised to return to Chapter 4 and examine once more the solution to Example 4.2 (the truck problem). Example 5.1. .X1
The system = -3x 1
+ 2x 2 + 5u,
(5.9)
is to be controlled from a general initial state to the origin in minimum time. Find the optimal control when the control satisfies the constraint Jul ::::; 1.
Solution. { 1 (x,
We apply Theorem 4.1. We have
u) = -3x 1 + 2x 2 + 5u, {2 (x, u) = 2x 1
-
3x 2
and {0 (x, u) = 1.
Then
+ 1/1 1 ( -3x 1 + 2x 2 + 5u) + l/1 2 (2x 1 - 3x 2 ) = -1 + 1/1 1 ( -3x 1 + 2x2 ) + l/1 2 (2x 1 - 3x 2 ) + 5l/f 1 u
H = -1
and
The function H is maximized as a function of u, Jul::::; 1, by u* = sgn(51/1 1 ) = ± 1. The corresponding solutions of the state equations (5.9) are the trajectories of the autonomous linear system u* = ±1.
The eigenvalues of
(-3 2) 2
-3
are -1 and -5. They are real so Lemma 5.1 can be used to deduce that the optimal control must be piecewise constant, taking the values + 1 or -1, and can have at most one discontinuity. Since the eigenvalues are both negative the trajectories are those of a stable node. For u* = 1 the singularity is at the solution of
108
OPTIMAL CONTROL II: APPLICATIONS
-3x 1 + 2x 2 + 5 = 0, 2x 1 - 3x2 = 0. That is at x 1 = 3, x 2 = 2. For u* = -1 the singularity is at x 1 = -3, x 2 = -2. In both cases there are straight-line trajectories of slope + 1 and -1. Let ((I+ be a typical path corresponding to u* = 1 and ((1- be a typical path corresponding to u* = -1. The trajectories are sketched in Figure 5.1. In order to reach the origin in minimum time the phase point (x 1 , x 2 ) must either travel along a ((I+ path or a ((1- path and can switch from one to the other at most once. This observation, together with the application of a little common sense, enables us to construct the minimum-time path to the origin for any initial state. The argument is as follows. The system must arrive at 0 on a ((I+ path or a ((1- path. The slope of these paths at 0 is dx 2 /dx 1 = 0/5u* = 0. The two paths to 0 are shown in Figure 5.2. If the initial state of the system happens to lie on either P+ 0 or Q_O then the optimal control sequence must be {1} or { -1} respectively. The system goes directly to 0 without a switch. Now consider any initial state W lying above P+OQ _ as shown in Figure 5.3. It cannot be taken to 0 on a((/+ path because of the singularity at P; the state of the system would simply be attracted towards P (Figure 5.1). However, the singularity for the ((/trajectories lies on the other side of P+OQ _ and any such path starting at W would intersect P+ 0 which is the u* = 1 path to the origin. So there is a control sequence { -1, 1} with one switch which takes an initial state W to 0. This must be the time optimal control for initial states lying above P+OQ -· A similar argument shows that the time-optimal control for initial states below P+OQ _ must be {1, -1} and we can write down the optimal synthesis u* = { 1 -1
below P+OQ_ and on P+O above P+OQ _ and on OQ _.
Figure 5.4 shows some typical optimal paths. Note that every initial state can be controlled to the origin in minimum time. Complete control of this system is possible because the system itself is inherently stable; the uncontrolled system has a stable node at 0 so the system has a natural tendency to move towards 0. As we shall see in the next example, unstable systems are not as tractable.
Example 5.2.
The system
109
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
------~~~~~--~----~~------
7 Xl
u* =-1
Figure 5.1
Trajectories for Example 5.1. Stable nodes at P and Q
110
OPTIMAL CONTROL II: APPLICATIONS
----------------~.~~------------------? XI
Q•
. p
Q
Q_
Figure 5.2 The two paths to 0
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
111
.w
.p ------------~~~----------------? XI
Q· Q_
Figure 5.3 Initial state above P + OQ _
where the control u = u(t) satisfies the constraint lui s 1, is to be controlled to the origin in minimum time. Show that the only possible control sequences are {1}, {-1}, { -1, 1}, {1, -1} and obtain the optimal synthesis.
Solution.
Now
+ t/1 1 (3x 1 + 2x 2 + 5u) + t/1 2 (2x 1 + 3x2 ) -1 + t/1 1 (3x 1 + 2x 2 ) + t/1 2 (2x 1 + 3x 2 ) + 5t/1 1 u
H = -1 =
where His linear in u with luis 1 soH is maximized by u* = sgn(5t/J 1 ) = ± 1. The corresponding trajectories satisfy
u*
=
±1.
112
OPTIMAL CONTROL II: APPLICATIONS
Q_
Figure 5.4 Typical optimal paths for Example 5.1
The eigenvalues of
are 1 and 5. They are real so Lemma 5.1 can be used to deduce that the time-optimal control is piecewise constant, taking only the values 1 or -1, and can have at most one switch. Note that the eigenvalues are both positive so the trajectories are those of an unstable node. The two sets of trajectories are shown in Figure 5.5. The straight-line trajectories have slope ± 1. When u* = 1 the singularity is at (- 3, 2); when u* = -1 the singularity is at (3, - 2). For both families the trajectory that passes through 0 does so with zero slope. Once again
xl
Q
Figure 5. 7 The region for which control to 0 is possible
for ever. L is useful only as part of the boundary of the region in which control to 0 is possible. Thus a ~- trajectory lying below L and above QO intersects PO at an ordinary point, a switch of the control to u* = 1 would then send the system to 0 along part of PO. If however Wlies above L then the~- path through it does not intersect PO and such initial states cannot be controlled to 0 with one switch. Thus the only initial states that can be driven to the origin by the control sequence { -1, 1} are those lying between PO and L. Similarly, the only states that can be controlled to 0 by the sequence {1, -1} are those lying between QO and r+. Outside the region bounded by L and r+ (shown in Figure 5. 7) there can be no time-optimal path to the origin. In fact there is no control that will take the system to 0 in any manner if the initial state lies outside this region. This is a consequence of the fact that for time-optimal control of linear systems the Pontryagin maximum principle is sufficient as well as necessary. Inside the region bounded by the curves r _ and r+ the optimal
116
OPTIMAL CONTROL II: APPLICATIONS
synthesis must be u* = {
-1
1
above POQ and on QO below POQ and on PO.
The system is uncontrollable outside the region bounded by r_ and r+. Note that if we were allowed to use larger values of u-that is, if the constraint on u were lui ~ 16, say-then the region in which an optimal control existed would be larger but it would still be finite and most initial states could not be controlled to 0.
Example 5.3. system with
lui~
Let us try to solve the time-optimal problem for the
1.
Solution. The system has real eigenvalues -2 and 4 so the singularity is a saddle point. Lemma 5.1 again tells us what the optimal control must be. The corresponding trajectories satisfy
x2 = 3x 1 + x 2 -
5u*
where u* = ± 1. The singularity for re'+ paths is at (1, 2) and is at ( -1, -2) for the re'- paths. There are straight-line trajectories with slope ±1 and at the origin both families of trajectories have slope t· Figure 5.9 shows P+O, there'+ path to 0 and Q _0, there'- path to 0. If the initial state lies on P+ 0 or Q _ 0, then minimum time to the origin is attained by the control sequence {1} or { -1} respectively. Consider the control sequence { -1, 1}. The phase point reaches 0 by travelling along part of P+ 0 and the previous section must have been along are'- path that intersects P+O. Now examine Figure 5.10. No rt'- path lying below CQD can intersect P+ 0 so no initial point that is below CQD can be controlled to the origin by the control sequence { -1, 1}. Figure 5.10 shows some typical C(J- paths that intersect P+ 0. We see from this diagram that the only initial states that can reach 0 using the control sequence { -1, 1} are those lying in the region between CQD and Q _OP+. Similarly the only initial states that can reach 0 with the control sequence {1, -1} are those lying in the region between APB and Q _OP+. No initial states outside the infinite strip between APB and CQD can be taken to 0 by a control sequence satisfying the Pontryagin maximum principle and
117
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS A
~~--~---------? xl
u* = 1 B
c
----~--~,----------~~----+-------~:> XI
Q
D
Figure 5.8
Trajectories for Example 5.3. Saddle points at P and Q
118
OPTIMAL CONTROL II: APPLICATIONS
Q_ • p
~~--~~~---';3>
xl
Figure 5.9 The two paths to 0 A
ill x 2
Q_
c
'
' '
' ''
D',_
Figure 5.10
({!-
paths intersecting P+O
119
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
'
'
Q_
c -~'-:-,,-~---_,.L-r'-----;,L--7fL--7'-----;/--t"---~--"-;,-·--p
xl
D
Figure 5.11
Typical paths to 0 in the region for which control is possible
for such states there is no optimal control to the origin. Inside the infinite strip we have the usual time-optimal synthesis
u* =
{
-1 +1
below Q_OP+ and on Q_O above Q_OP+ and on P+O.
Figure 5.11 shows typical paths to 0 and the switching curve Q __ OP+.
Example 5.4, Solve the problem of time-optimal control to the origin for the system when the control satisfies the constraint lui :s; 1. Note that ad- be= 0 so the uncontrolled system does not have an isolated singularity at the origin. In fact there is a line of singularities on x 2 = 0. When ad - be = 0 the shape of the trajectories cannot be deduced immediately from the eigenvalues. Fortunately systems with ad- be= 0 can always be solved by elementary methods.
120
OPTIMAL CONTROL II: APPLICATIONS
Solution
+ l/1 1 (x 2 + u) + 1/1 2 (-x 2 + u) -1 + VJ1Xz -l/lzXz + (1/Jt + 1/Jz)u
H= -1 =
H is maximized as a function of u by
u* = sgn(l/l 1
+ 1/1 2 )
=
± 1.
The co-state equations are:
.
aH
l/11
= ----- = 0
oxl
.
aH
1/Jz
= --
= 1/1 2
ox 2
=
le 1
= 1/11 = k
= --l/11 +VIz= -k + 1/Jz
+ k.
Thus S = 1/1 1 + 1/1 2 = le 1 + 2k. This clearly has at most one zero so the optimal control can have at most one switch. The corresponding trajectories satisfy
x2 =
-x 2 + u*,
u*
=
± 1.
These we can solve by elementary methods. Note that there are no singularities. The slope, dx 2 /dx 1 = ( -x 2 + u*)/(x 2 + u*), is zero on x 2 = u*, infinite on x 2 = - u* and takes the value 1 on x 2 = 0. Note that x 2 = u* is a trajectory of the system. We can find the shape of the trajectories by examining the sign of the slope in the three regions x 2 > 1, 1 > x 2 > -1 and x 2 < -1. They are sketched in Figure 5.12. We could of course integrate the equations directly:
x2 =
-
x2
+ u * = x 2 = ae- 1 + u * .
So Eliminating t gives
x1 + x2
=
u*- 2u* lnlx 2
-
u*i +constant.
But it is easier to find the shape of the trajectories directly from the differential equations. We can now deduce the solution in the usual way. The optimal
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
121
---+--------?? X!
u*
=1
~·--------~L----0?
X!
Figure 5.12
Trajectories for Example 5.4
122
OPTIMAL CONTROL II: APPLICATIONS
Figure 5.13
Typical optimal paths for Example 5.4
synthesis is shown in Figure 5.13: u* = {
where
or-
is the
r-or+ and on r-o below r-or+ and on r+o path to 0 and or+ is the ~+ path to
-1 1
~-
above
0.
Exercises 5.1 1. Solve the problem of time-optimal control to the origin for each of the following (i)
xl
(ii)
xl
(iii)
Xl
(iv)
2.
x1
= = = =
xl
+ 2x2,
x 2 = 4x 1
-
x 2 + u,
x2,
x2 = -2xl- 3x2
Xz,
x2 = -x2
3x 1 - X2,
:i: 2 = -x 1
+ u,
+ u, + 3x 2 + u,
For the truck problem (Example 4.2)
lui::::; 1. lui::::; 1. lui::::; 1. lui::::; 1.
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
(a)
(b)
find the time-optimal control from (1, 1) to ( -1, 0) and calculate the minimum time. Find also the time-optimal control and minimum time from ( -1, 0) to (1, 1). Comment on your answers. suppose now that the constraint is changed to 0 ::::;; u ::::;; 1 (so that the rocket motor fires in one direction only). Show that a time-optimal control to the origin can only be constructed if the initial state lies in the region
x 2 < 0, x 1
3.
:2': x~/2.
Show also that if the initial state lies outside this region then the system cannot be controlled to the origin. The system
xl = x2,
4.
123
x2 = xl
+ u, lui::::;; 1
is to be controlled from x 0 to x 1 in minimum time. Show that the optimal control can take only the values 1 or - 1 and that it can switch at most once. Given that x 0 = ( -i, 0) and x 1 = (!, O) show that the switch takes place at (0, J3/2) and find the time at which the switch takes place. Show that the minimum time of transfer is 2 arcsinh Solve the problem of time-optimal control to the origin for the system
J3.
lui::::;; 1 5.
in the cases (i) a = 0, (ii) a = -1. Solve the problem of time-optimal control to the origin for the system
lui::::;; 1. 6. The system it.s to be controlled from (1, O) to (0, 0) in minimum time. Show that the optimal control can only take the values 1 or -- 1 and that it can switch at most once. Show that the minimum time of transfer is 2ln(e 112 7.
+ (e- 1) 1 12 ).
For each of the following non-linear systems, consider the problem of time-optimal control to the origin. Determine whether
124
OPTIMAL CONTROL II: APPLICATIONS
an optimal control can be constructed and describe the optimal trajectories (if they exist):
= xt i 2 = u, lui ::::; 1 (ii) i 1 = x 2 , i 2 = --x~ + u, lui::::; 1. (i)
i
1
Systems with complex eigenvalues If the system matrix A has complex eigenvalues, then so does the co-state matrix --AT. The controls that maximize H are still piecewise constant, taking the value 1 or -1 according to the sign of S = ll/1 1 + ml/1 2 , but now S has many zeros. Constructing the time-optimal synthesis for such systems is more difficult than the real-eigenvalue case but the reward is always a solution that is both intriguing and spectacular. When a system has complex eigenvalues its uncontrolled (u = 0) behaviour is oscillatory. The introduction of the control cannot alter this; the optimal control simply manipulates this natural behaviour. We found in the case of real eigenvalues that we could find the optimal synthesis without actually solving the co-state equations. Once we knew that there could be at most one switch we could complete the solution just by examining the behaviour of the trajectories corresponding to u = 1 and u = -1. Here, however, we must find l/1 1 and l/1 2 so that we can find out exactly how S behaves. The following examples show how to find the optimal synthesis for a system with complex eigenvalues. There are three types of behaviour depending on whether the eigenvalues are imaginary, have positive real parts or negative real parts. Example 5.5
(Eigenvalues imaginary).
The system
is to be controlled from a general initial state to the ongm in minimum time. The control u = u(t) satisfies the constraint lui s 1. Find the optimal synthesis.
Solution. H= -1 + l{l 1 x 2 u* = sgn l/1 2 = ± 1, where
+ l/1 2 (-x 1 + u) and His maximized by and
The eigenvalues of the system matrix A are ;t eigenvalues of -AT are q = ± i. Thus
S
=
l/1 2
=
k sin(t + l),
=
± i and the (5.10)
125
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
4-~--~~~--~-~
XI
u* =-1
u* = 1
Figure 5.14 Trajectories for Example 5.5. The singularities are centres
where k and l are arbitrary constants, and the zeros of S are at t = -l ± nrc, where n is an integer or zero. The state equations become where u* = 1 or -1 depending on the sign of S. When u* have the family of trajectories
x 1 -1 = acos(t + oc),
x 2 = a sin(t
= 1 we
+ oc)
where a and oc are arbitrary constants, or (x 1
-
1)2
+ x~ = a 2 ,
a family of circles, centred at the singularity (1, 0), that are traversed clockwise as t increases, as shown in Figure 5.14. Let C(J+ denote a typical member of this family. When u* = -1 we have the family x1
+1=
a cos(t
+ oc),
x 2 = a sin(t
+ oc)
centred at ( -1, 0). Let C(J- denote a typical member of this family. The optimal path to 0 must consist of alternate arcs of C(J+ and C(J- with the points at which the changeover takes place determined by the zeros of S. The time interval between zeros is rc (see (5.10)) and in this time a point on any C(J+ or C(J- trajectory sweeps out a semicircular arc. The origin is reached on either C(Ji or C(J!, the semicircular arcs of radius 1 shown in Figure 5.15. M is the point ( -1, 0), N is (1, O) and P 1 is (2, 0).
126
OPTIMAL CONTROL II: APPLICATIONS
Figure 5.15 The two semicircular paths to the origin
The optimal strategy for an initial state lying on Cfl{ (or Cfil) is to stay on it until the origin is reached. Any other initial state must adopt a strategy that eventually takes it onto a Cfl- path that intersects Cfl{, or a Cfl+ path that intersects Cfi!. We will discuss in detail those optimal paths whose final section is all or part of Cfl{. A path that switches onto Cfl{ at P1 at time t = r must have had its previous switch at t = r - n and traversed the top half of the Cfl- path through PI> which is a circle centre ( -1, 0) and radius 3. Now consider a path that switches onto Cfl{ at some point Q. In r - n :::;;; t :::;;; r it must have been on the Cfl- through Q and its previous switch was at R, where RQ is the diameter. We need to construct the switching curve which is the locus of switching points. We must determine the locus of R given that Q is any point on Cfl{. It can be seen from Figure 5.16 that if Lis ( -3, 0), M is ( -1, 0) and N is (1, 0), then the triangles MRL and MQN are congruent, so LR = NQ = 1 and the locus of R is the top half of the circle of radius 1 centred at L. Similarly, optimal paths switching onto Cfi!
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
127
----->-----
Figure 5.16
Finding the locus of R
must have had their previous switch on the lower half of the circle of radius 1 centred at (3, 0). To find the rest of the switching curve we continue to trace baekwards from 0. It is clear that a path switching at R will have had its previous switch on the lower half of the. circle radius 1 centred at (5, O) and so on. The switching curve 1 consists of a series of semicircles, each of radius 1 as shown in Figure 5.17. Figure 5.18 shows the switching curve and some typical optimal paths. It can be seen that the optimal synthesis is u*
=
{
-1 +1
above
r and on C6'1
below
r
and on C6' t .
This problem has physical significance. If we eliminate x 2 from the state equations we obtain &\ + x 1 = u(t), a particular case of the forced, simple-harmonic oscillator equation x1 + w 2 x 1 = u(t). If a spring has one end fixed so that it hangs in a vertical line and a particle P is attached to the lower end of the spring then P
128
OPTIMAL CONTROL II: APPLICATIONS
Figure 5.17 The switching curve is a series of semicircular loops
------>------
-->-'------L---_L_-----'------t---,-----,---,----.-------
> XI
--- XI
u* =-1
Figure 5.19
Trajectories for Example 5.6. Stable foci at M and N
TIME-OPTIMAL CONTROL OF LINEAR SYSTEMS
131
and '{!- paths. Since these are spirals the switching curve promises to be rather complicated. First we establish how much spiral is swept out in a time interval of length n. Put u* = 1 in equations (5.11) and write them in polar coordinates (see Figure 5.19) x 1 - 1 = r cos e, x 2 = r sin e. Then (5.11) becomes
of~+
x1 = f cos e -
rB sin e = - rk cos
e + r sin e
and
x
2 =
f sine+ re cos
e=
--r cos e- rk sin e.
Eliminating f then gives (J = -1 so in time n a point on the spiral sweeps clockwise through an angle n. Now consider the arcs that pass through the origin. There is no loss of generality if we say that the'{!+ to 0 reaches it at t = 0. If S == 0 at t = 0 then the previous switch was at t = - n when the system was at P1 (see Figure 5.20). If S =1- 0 at t = 0 then the previous switch must have been at some point Q on'{!{ (between 0 and P1 ).
Figure 5.20
The two paths to 0. M is ( -1, 0), N is (1, 0)
132
OPTIMAL CONTROL II: APPLICATIONS
The arc 𝒞₁⁺ can be described in terms of the parameter σ:

x₁ = 1 − e^{−kσ} cos σ,  x₂ = e^{−kσ} sin σ,  σ ∈ [−π, 0].

The arc 𝒞₁⁻ is exactly the same shape. To obtain it from 𝒞₁⁺ we simply translate N to M and then rotate 𝒞₁⁺ through π. It is helpful to note that Figures 5.20 and 5.15 are rather alike, so we expect the solution of this problem to go through in an analogous way to that of Example 5.5. When we trace back through an angle π along the 𝒞⁻ through Q we reach the previous switch point R. We can then show, using the equations of the trajectories, that the locus of R as Q varies on 𝒞₁⁺ is the spiral arc 𝒞₂⁻, given parametrically with σ ∈ [−π, 0]. The arc 𝒞₂⁻ is 'similar' to 𝒞₁⁺. If we imagine 𝒞₁⁺ rotated through π and N translated to U(−1 − 2e^{kπ}, 0), and we then stretch the distance of every point from U by a factor e^{kπ}, we obtain 𝒞₂⁻. That is, using the parametric representations of the two curves, we can show that UR = e^{kπ} NQ.
Figure 5.21 Finding the locus of R. U is (−1 − 2e^{kπ}, 0)
Figure 5.22 The switching curve Γ. The loops grow in size
We can now construct the remaining loops of the switching curve Γ. Since each loop is a stretched version of its predecessor, they become extremely large as we move out from the origin. It is not possible to give a diagram in which the scale is accurate, but Figure 5.22 gives a general impression. The optimal synthesis is

u* = −1 above Γ and on 𝒞₁⁻,  u* = +1 below Γ and on 𝒞₁⁺.
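Because each loop is obtained from its predecessor by a rotation and a stretch, it is easier to generate Γ numerically than to draw it to scale. The sketch below is a hedged illustration: the state equations are assumed to take the form ẋ₁ = −kx₁ + x₂ + ku, ẋ₂ = −x₁ − kx₂ + u, which reproduces the polar form found above. It traces the curve by integrating backwards from the origin, reversing u* every π:

    import numpy as np

    k = 0.2  # assumed value of k, for illustration only

    def f(x, u):
        # Assumed state equations consistent with the polar calculation:
        # singularity at (u, 0), eigenvalues -k +/- i.
        return np.array([-k*x[0] + x[1] + k*u, -x[0] - k*x[1] + u])

    def integrate_backwards(x, u, T=np.pi, n=2000):
        # Simple RK4 in reverse time: trace where each switch came from.
        h = -T/n
        for _ in range(n):
            k1 = f(x, u); k2 = f(x + 0.5*h*k1, u)
            k3 = f(x + 0.5*h*k2, u); k4 = f(x + h*k3, u)
            x = x + (h/6)*(k1 + 2*k2 + 2*k3 + k4)
        return x

    # Alternate u* = +1 and u* = -1 every pi, starting from the origin;
    # the end of each reversed arc lies on the next loop of Gamma.
    x, u = np.zeros(2), 1.0
    for loop in range(4):
        x = integrate_backwards(x, u)
        print(f"loop {loop+1}: switch point ({x[0]:+.3f}, {x[1]:+.3f})")
        u = -u

The growth of the printed switch points from loop to loop shows the stretch factor e^{kπ} at work.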
Example 5.7 (Eigenvalues with positive real part). We wish to find the time-optimal control to 0 for the system.

Solution. The system matrix A has eigenvalues k ± i. The usual arguments (see Examples 5.5 and 5.6) lead to the conclusion that we must have u* = +1 or −1, with switches at the zeros of S = −kψ₁ + ψ₂, which is of the form S = c e^{−kt} sin(t + γ) and has zeros at t = −γ ± nπ. The trajectories 𝒞⁺ and 𝒞⁻ corresponding to
Figure 5.23 The trajectories for Example 5.7. Unstable foci at M and N
Figure 5.24 The switching curve Γ. The loops shrink in size
u* = 1 and u* = −1 are

x₂ = −a e^{kt} sin(t + α).  (5.13)

They spiral away from the singularity (u*, 0) in a clockwise manner. We can show that θ̇ = −1, so that the trajectories sweep clockwise through an angle π in a time interval of length π. The switching curve is constructed in the same way as in Example 5.6, but this time the loops shrink as we move away from 0. Figure 5.24 shows the case kπ = ln(5/3), a value chosen so as to give a 'shrink factor' of 0.6. This presents us with an intriguing question: does the switching curve disappear? To answer this, it is enough to examine the switching curve in x₁ ≥ 0. The loops 𝒞₁⁺ and 𝒞₂⁻ can be represented parametrically:
𝒞₁⁺: x₁ = 1 − e^{kσ} cos σ,  x₂ = e^{kσ} sin σ,  σ ∈ [−π, 0]

𝒞₂⁻: x₁ = 1 + 2e^{−kπ} − e^{−k(π−σ)} cos σ,  x₂ = e^{−k(π−σ)} sin σ,  σ ∈ [−π, 0].
The (n + 1)th loop can be shown to be

x₁ = 1 + 2∑_{p=1}^{n} e^{−pkπ} − e^{−k(nπ−σ)} cos σ,  x₂ = e^{−k(nπ−σ)} sin σ,  σ ∈ [−π, 0].
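Whether the shrinking loops 'use up' the whole positive axis can be probed numerically. A minimal sketch (using the loop formula above, with shrink factor e^{−kπ} = 0.6) sums the geometric series of loop centres:

    # With shrink factor exp(-k*pi) = 0.6 (i.e. k*pi = ln(5/3)), the loop
    # centres 1 + 2*sum_{p=1}^{n} exp(-p*k*pi) approach a finite limit, so
    # the shrinking loops accumulate rather than marching off to infinity.
    s = 0.6                                   # exp(-k*pi)
    for n in range(1, 8):
        centre_shift = 1 + 2*sum(s**p for p in range(1, n + 1))
        print(n, centre_shift)
    print("limit:", 1 + 2*s/(1 - s))          # 1 + 2*(0.6/0.4) = 4

So the switching curve does not disappear: the loops pile up towards a finite point on the axis.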
The optimal synthesis is

u* = −1 in H₁,  u* = +1 in H₃,  u* = 0 in H₂ ∪ H₄.

H₁ includes Γ⁻0, H₃ includes 0Γ⁺, H₂ includes K⁺0 and H₄ includes 0K⁻. Note that the optimal control switches off on part of the route to 0 and uses no fuel while it drifts towards x₁ = 0 on an x₂ = constant path.
Figure 5.35 Optimal synthesis for Example 5.12
Exercises 5.4

1. For the system discussed in Example 5.10:
(i) Take an initial point lying above x₁ = x₂ in the first quadrant and show that there is an infinite number of singular optimal controls to the origin. (Start by considering piecewise constant functions with one discontinuity.)
(ii) Prove that for any initial state (a, b) the minimum time to the origin is max{a, b}.
(iii) Find the optimal control and minimum time from (a, b) to any point (c, d).

2. The system ẋ₁ = x₂, ẋ₂ = u, |u| ≤ 1 is to be controlled from (1, 1) at t = 0 to (0, 0) at t = 4 in such a way that

J = ∫₀⁴ |u| dt

is minimized. Show that the optimal control has three phases (i) u = −1,
(ii) u = 0, (iii) u = 1, which occur in that order. Sketch the corresponding trajectories. Calculate the times at which the two switches that separate the phases occur, and show that the minimum value of J is 4 − √3. (A symbolic check of this value appears after these exercises.)

3. The system, with |u| ≤ 1,
is to be controlled from x₁ = 1, x₂ = 1 at t = 0 to x₁ = x₂ = 0 at t = 5 in such a way that J is minimized. Show that the optimal control sequence is {−1, 0, 1}. Find the times at which the switches take place and calculate the minimum value of J.

4. The system ẋ₂ = u, |u| ≤ 1 is to be controlled from x₁ = ½, x₂ = 0 at t = 0 to x₁ = −½, x₂ = 0 at some time T in such a way that

J = ∫₀ᵀ (4 + |u|) dt

is minimized. Find the optimal control. Determine the times of the switches, the value of T and the minimum value of J.
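As a check on Exercise 2 (not a substitute for working it), the switch times can be found symbolically. A sketch, assuming the three-phase structure stated in the exercise:

    import sympy as sp

    # Phases u = -1 on (0, t1), u = 0 on (t1, t2), u = +1 on (t2, 4),
    # from (1, 1) to (0, 0); structure as stated in Exercise 5.4(2).
    t1 = sp.symbols('t1', positive=True)
    t2 = 5 - t1                       # from matching x2 across the drift phase
    x1_end_phase1 = 1 + t1 - t1**2/2  # x1 after the u = -1 arc from (1, 1)
    x2_drift = 1 - t1                 # constant x2 on the u = 0 phase
    x1_start_phase3 = (4 - t2)**2/2   # u = +1 arc arriving at the origin at t = 4
    eq = sp.Eq(x1_end_phase1 + x2_drift*(t2 - t1), x1_start_phase3)
    sols = sp.solve(eq, t1)
    t1_star = [s for s in sols if s < sp.Rational(5, 2)][0]
    J = t1_star + (4 - t2.subs(t1, t1_star))  # fuel used on the powered phases
    print(sp.simplify(t1_star), sp.simplify(J))  # (5 - sqrt(3))/2,  4 - sqrt(3)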
Problems where the cost depends on x(t₁)

There are situations in which the appropriate form for the cost functional is not an integral but a function of the final state of the system. For example, in the problem of the infusion of glucose (see Example 4.3) one could ask for the rate of infusion that would maximize the level x₁(t₁) of glucose in a fixed time t₁ − t₀. We would then have J = −x₁(t₁) to be minimized. Further, if we wished to dispense with an explicit constraint on the value of u(t) but wanted to ensure that the optimal control took sensible (non-infinite)
values, we could include a term such as

∫_{t₀}^{t₁} u² dt

in the cost functional, so that

J = −x₁(t₁) + ∫_{t₀}^{t₁} u² dt.
We now state the general problem of this type for an n-dimensional system with m control variables. As in Chapter 4, the functions fᵢ, i = 0, …, n are continuously differentiable with respect to all their arguments, and the scalar function g(x(t)) is continuously differentiable with respect to its arguments.

Problem 5.2. The system ẋ = f(x, u) is to be controlled, during the fixed time interval t₀ ≤ t ≤ t₁, from a given initial state x₀ in such a way that
J = g(x(t₁)) + ∫_{t₀}^{t₁} f₀(x, u) dt
is minimized. Find the optimal control.

We can transform this into a problem suitable for an application of the Pontryagin theorem. Introduce a new cost variable x₀, where

ẋ₀ = ∑_{i=1}^{n} (∂g/∂xᵢ) fᵢ + f₀ = ∑_{i=1}^{n} (∂g/∂xᵢ) ẋᵢ + f₀,

so that, taking x₀(t₀) = 0,

x₀(t₁) = g(x(t₁)) − g(x(t₀)) + ∫_{t₀}^{t₁} f₀ dt,

so x₀(t₁) = J − g(x(t₀)). Now x(t₀) = x₀, so g(x(t₀)) is a known constant. Thus minimizing J is equivalent to minimizing x₀(t₁). (Recall Chapter 4, where the cost variable x₀ satisfied ẋ₀ = f₀ and x₀(t₁) = J.) We can now apply Pontryagin's theorem. We have

H = ψ₀[f₀ + ∑_{i=1}^{n} (∂g/∂xᵢ) fᵢ] + ∑_{i=1}^{n} ψᵢ fᵢ.
As usual we can set ψ₀ = −1, and the co-state equations are ψ̇ᵢ = −∂H/∂xᵢ, i = 1, …, n. We could in principle simply solve the problem as it stands, but the co-state equations would be very complicated. We can avoid these difficulties by rearranging H and introducing a new set of variables which we will call the pseudo co-state variables λᵢ, i = 1, …, n. Now

H = −f₀ + ∑_{i=1}^{n} (ψᵢ − ∂g/∂xᵢ) fᵢ.

Set

λᵢ = ψᵢ − ∂g/∂xᵢ,

so that we have

H = −f₀ + ∑_{i=1}^{n} λᵢ fᵢ = H′,

say. Note that this form of H shows no explicit dependence on the function g. We can then show (see Exercise 5.5(1)) that

λ̇ᵢ = ∂f₀/∂xᵢ − ∑ⱼ λⱼ ∂fⱼ/∂xᵢ = −∂H′/∂xᵢ,

so that the λᵢ look like co-state variables corresponding to H′. The problem is much easier to solve in this form. It only remains to determine the end condition at t = t₁. Since x(t₁) is free, the transversality condition is ψᵢ(t₁) = 0, i = 1, …, n. (See the Corollary to Theorem 4.3.) Thus at t = t₁ we must have λᵢ(t₁) = −∂g/∂xᵢ, i = 1, …, n, where the derivatives are evaluated at the end point.

To solve Problem 5.2 we write

H′ = −f₀ + ∑ⱼ λⱼ fⱼ

and maximize it as a function of u. The pseudo co-state variables λᵢ satisfy

λ̇ᵢ = −∂H′/∂xᵢ,  i = 1, …, n.

The end conditions are x(t₀) = x₀ and λᵢ(t₁) = −∂g/∂xᵢ, i = 1, 2, …, n. Let us now apply this to a variant of the glucose problem (Example 4.3).
Example 5.13. The system ẋ₁ = −αx₁ + u is to be controlled from x₁ = 0 at t = 0 to some state x₁(t₁) at fixed time t₁ in such a way that

J = −x₁(t₁) + ∫₀^{t₁} u² dt

is minimized. Find the optimal control. Note that there is no explicit constraint on u(t), but since J contains the integral of u² there is an effective constraint on the values u(t) can take.

Solution. H′ = −u² + λ₁(−αx₁ + u) and λ̇₁ = αλ₁, so the pseudo co-state variable is λ₁ = Ae^{αt}. To maximize H′ we need u* = λ₁/2, so u* = Ae^{αt}/2. The corresponding equation for x₁ is ẋ₁ = −αx₁ + Ae^{αt}/2, with solution

x₁ = Be^{−αt} + Ae^{αt}/4α.
Now x₁(0) = 0 so B = −A/4α, and at t = t₁ we must have λ₁(t₁) = −∂g/∂x₁, so Ae^{αt₁} = 1. The optimal control is therefore u* = e^{α(t−t₁)}/2.

For n > 3 the Riccati approach gives us more equations to solve, but is probably better for numerical computation because we simply have to integrate backwards from known values at t = t₁. Once K is determined the optimal control can be written
u* = R⁻¹(BᵀK − Qᵀ)x.

Example 5.14. The system ẋ₁ = u₁, ẋ₂ = u₂ is to be controlled in 0 ≤ t ≤ 3 in such a way that

J = ½{x₁²(t₁) + x₂²(t₁)} + ½ ∫₀³ (u₁² + u₂²) dt

is minimized. Find the optimal control.
Solution. For this example A = 0, B = I, S = I, P = 0, Q = 0 and R = I. Equation (5.27) then gives

K̇ = −K²

with K = −I at t = 3. Let K = [p q; q r] and write down the three differential equations for p, q and r:

ṗ = −p² − q²,  q̇ = −q(p + r),  ṙ = −q² − r².

These three non-linear equations are easy to solve. The solution q = 0 satisfies the equation for q and the end condition q = 0. The remaining equations then separate into two independent equations ṗ = −p², ṙ = −r², which integrate to give

p = 1/(t − l₁),  r = 1/(t − l₂).

At t = 3, p = r = −1, so l₁ = l₂ = 4 and

K = I/(t − 4).

The optimal control is then

u* = x/(t − 4),

where x satisfies

ẋ = x/(t − 4),  or  ẋᵢ = xᵢ/(t − 4),  i = 1, 2,

with solution x = −x₀(t − 4)/4, so that the optimal control takes the constant value u* = −x₀/4.
Of course we could just solve the original two-point boundary value problem.
We require uᵢ* = λᵢ to maximize H′, and the corresponding equations for x and λ are

ẋᵢ = λᵢ,  λ̇ᵢ = 0,

with end conditions x(0) = x₀, λ(3) = −x(3). Thus λ is a constant vector, and the corresponding solution for x is

xᵢ(t) = −t xᵢ(3) + xᵢ(0),  i = 1, 2.

At t = 3 these give 4xᵢ(3) = xᵢ(0), so u* = −x₀/4.
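The earlier remark that the Riccati approach suits numerical computation is easy to illustrate: a minimal sketch integrating K̇ = −K² backwards from K(3) = −I recovers the closed form K = I/(t − 4):

    import numpy as np

    # Integrate the Riccati equation of Example 5.14, Kdot = -K^2,
    # backwards from K(3) = -I and compare with K = I/(t - 4).
    n = 3000
    ts = np.linspace(3.0, 0.0, n + 1)   # backwards in t
    h = ts[1] - ts[0]                   # negative step
    K = -np.eye(2)

    def Kdot(K):
        return -K @ K

    for _ in range(n):                  # RK4 steps from t = 3 down to t = 0
        k1 = Kdot(K); k2 = Kdot(K + 0.5*h*k1)
        k3 = Kdot(K + 0.5*h*k2); k4 = Kdot(K + h*k3)
        K = K + (h/6)*(k1 + 2*k2 + 2*k3 + k4)

    print(K)                    # close to I/(0 - 4) = -0.25 * I
    print(np.eye(2)/(0.0 - 4.0))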
The steady-state Riccati equation

This is the equation K must satisfy in the case in which t₁ is infinite. J will be divergent unless x and u go to zero for large t, so we take S = 0 and consider carefully what happens to K as t₁ → ∞ for
J = ½ ∫_{t₀}^{t₁} (xᵀPx + 2xᵀQu + uᵀRu) dt.
Put τ = t₁ − t, so that K satisfies

dK/dτ = (KB − Q)R⁻¹(BᵀK − Qᵀ) + KA + AᵀK − P,
a differential equation for K as a function of τ, with end condition K = 0 at τ = 0 since S = 0. Now take any fixed value of t and let t₁ go to infinity. Then τ goes to infinity, and either K is finite in the limit or it goes to infinity with τ. An infinite K will give infinite values to x and u and a divergent integral for J. The problem is meaningful only if K remains finite as τ → ∞. This can only happen if dK/dτ = 0. Thus to minimize
J = ½ ∫_{t₀}^{∞} (xᵀPx + 2xᵀQu + uᵀRu) dt

we need K satisfying the algebraic equation

(KB − Q)R⁻¹(BᵀK − Qᵀ) + KA + AᵀK − P = 0.
This is the steady-state Riccati equation and it has in general two solutions for K. We expect that one of these will give a sensible
solution for x. Recall that the optimal solution for x must satisfy

ẋ = Ax + Bu*

with

u* = R⁻¹(BᵀK − Qᵀ)x,

so that

ẋ = [A + BR⁻¹(BᵀK − Qᵀ)]x.  (5.28)

Now x → 0 as t → ∞, so the eigenvalues of the matrix on the right-hand side must all have negative real parts. This criterion will determine the appropriate K.
Example 5.15. The system ẋ₁ = x₂, ẋ₂ = u is to be controlled from a given initial state so that

J = ½ ∫_{t₀}^{∞} (x₂² + u²) dt

is minimized. Find the optimal control.
Solution.

A = [0 1; 0 0],  B = [0; 1],  P = [0 0; 0 1],  Q = 0

and R = 1, a scalar. Let K = [p q; q r].
The steady-state Riccati equation is

KBBᵀK + KA + AᵀK − P = 0,

which gives

[q² qr; qr r²] + [0 p; 0 q] + [0 0; p q] − [0 0; 0 1] = 0.

Thus

q² = 0,  qr + p = 0,  r² + 2q − 1 = 0,

so p = q = 0 and r = ±1, leading to two solutions for K. Now consider equation (5.28), which gives the corresponding equation for x. Since Q = 0 and R⁻¹ = 1, we require the eigenvalues of A + BBᵀK to have negative real parts. If K = [0 0; 0 1] the eigenvalues are 0 and 1, giving an unstable solution for x. If K = [0 0; 0 −1] the eigenvalues are 0 and −1, and this is the solution we choose. Note however that this is not, strictly speaking, a stable solution with x → 0. The equation for x is

ẋ = [0 1; 0 −1]x,  or  ẋ₁ = x₂,  ẋ₂ = −x₂,

with solution x₂ = Ae⁻ᵗ → 0 and x₁ = −Ae⁻ᵗ + B → B. However, x₁ does not appear in J, so the cost integral is convergent. The condition that all the eigenvalues of the equation for x have negative real parts is seen here to be slightly too stringent; some of the eigenvalues can have zero real part provided this does not cause J to be divergent.
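The selection of the appropriate root can also be seen dynamically: integrating the equation for K as a function of τ from the end condition K = 0, the solution settles on the root with r = −1. A minimal sketch for Example 5.15:

    import numpy as np

    # dK/dtau = K B B^T K + K A + A^T K - P from K = 0 (end condition);
    # for Example 5.15 this converges to the chosen solution diag(0, -1).
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    P = np.array([[0.0, 0.0], [0.0, 1.0]])

    def f(K):
        return K @ B @ B.T @ K + K @ A + A.T @ K - P

    K = np.zeros((2, 2))
    h = 1e-3
    for _ in range(20000):   # tau from 0 to 20; forward Euler is adequate here
        K = K + h*f(K)

    print(K)                                    # approx [[0, 0], [0, -1]]
    print(np.linalg.eigvals(A + B @ B.T @ K))   # eigenvalues approx 0 and -1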
Exercises 5.6

1. The system ẋ₁ = u is to be controlled from x₁(0) = x₀ so that

J = ½x₁²(1) + ½ ∫₀¹ u² dt

is minimized.
(i) Find and solve the Riccati equation and hence determine the optimal control.
(ii) Write down and solve the two-point boundary value problem that determines the optimal control.
2. The system ẋ₁ = x₂, ẋ₂ = −x₁ + u is to be controlled so that

J = ½ ∫₀^{∞} (x₁² + u²) dt
The calculus of variations revisited It is clear that the calculus of variations and the theory of optimal
control are closely related. In Chapter 4 we tried to use the classical theory on an optimal control problem which had no constraints on the control variable. Our investigation led us to the Pontryagin maximum principle. The reader will recall that the methods of the calculus of variations were not adequate to prove the Pontryagin maximum principle because they required too much smoothness for the admissible functions, and that the proof (as outlined in the
THE CALCULUS OF VARIATIONS REVISITED
171
latter part of Chapter 4 and given in detail in Chapter 6) used admissible controls that were piecewise continuous and bounded. It follows that the results we have derived in the theory of optimal control must also apply to the classical problems. In the following seetions we derive the necessary conditions of the calculus of variations from the Pontryagin maximum principle and its related theorems.
The fixed end-point problem (Problem 3.1)

We are required to minimize

J[x] = ∫_{t₀}^{t₁} f(t, x, ẋ) dt

with x(t₀) = x₀, x(t₁) = x₁. We can treat ẋ as a control variable and write ẋ = u(t). Rename the variables t and x by setting t = x₁ and x = x₂. Then the state variables x₁ and x₂ satisfy the differential equations

ẋ₁ = 1,  ẋ₂ = u,

with fixed end conditions. The cost functional is

J = ∫_{t₀}^{t₁} f(x₁, x₂, u) dt.
We now apply the Pontryagin maximum principle:

H = −f(x₁, x₂, u) + ψ₁ + ψ₂u

with

ψ̇₁ = −∂H/∂x₁ = ∂f/∂x₁  (5.29)

and

ψ̇₂ = −∂H/∂x₂ = ∂f/∂x₂.  (5.30)

There are no constraints on the values that u(t) can take, so to maximize H as a function of u we need

∂H/∂u = −∂f/∂u + ψ₂ = 0  (5.31)

and

∂²H/∂u² = −∂²f/∂u² ≤ 0.  (5.32)
If ẋ is continuous in t₀ ≤ t ≤ t₁, then we can differentiate (5.31) with respect to t and use (5.30) to eliminate ψ̇₂. We obtain

d/dt (∂f/∂u) = ∂f/∂x₂.  (5.33)

But u = ẋ and x₂ = x, so (5.33) is simply the Euler–Lagrange equation, which holds on an extremal between corners. Equation (5.32) is the Legendre necessary condition for a minimum,

∂²f/∂ẋ² ≥ 0 on an extremal.
We can also deduce the corner conditions that must hold at each point of discontinuity of ẋ. First note that ψ₁ and ψ₂ are solutions of the co-state equations and hence are continuous throughout t₀ ≤ t ≤ t₁, even at points where ẋ is discontinuous. Now (5.31) holds on an extremal, so

∂f/∂u = ∂f/∂ẋ = ψ₂.

Thus ∂f/∂ẋ is continuous, even at a corner. The second corner condition follows from the fact that H is constant on an optimal path. This, together with the continuity of ψ₁, gives us that −f(x₁, x₂, u) + ψ₂u is continuous. When we write this in the original variables we see that
f − ẋ ∂f/∂ẋ is continuous, even at a corner.

The transversality condition (Problem 3.2)

Consider the case in which x(t₀) = x₀ but x(t₁) lies on a given curve x = c(t) at t = t₁. Let t = x₁, x = x₂, ẋ = u as before. Then for this control problem Theorem 4.3 says that

(i) H = 0 on an optimal path;
(ii) (ψ₁, ψ₂)ᵀ is perpendicular to the tangent to the target at the optimal end-point.

These conditions give us

−f + ψ₁ + ψ₂u = 0  at  t = t₁

and

ψ₁ + ψ₂ ċ = 0.

Eliminate ψ₁ to give

−f + ψ₂(u − ċ) = 0.  (5.34)

Now ψ₂ = ∂f/∂u = ∂f/∂ẋ and dc/dx₁ = ċ, so in the original notation equation (5.34) becomes

f + (ċ − ẋ) ∂f/∂ẋ = 0  at  t = t₁.
The isoperimetric problem (Problem 3.3)
We are required to minimize
f
J =
,
f(t,
X,
i) dt
to
subject to the integral constraint
I=
f
tl
g(t, x, x) dt
= c,
to
where c is a given constant. The end-points are fixed. Let t = Xu x = x 2 , x = u and introduce a further state variable x 3 where with x 3 (t 0 ) = 0, X 3 (t 1 ) = c. We now have an optimal control problem with fixed end-points and three state equations. We solve this in the usual way:
H = -f(x 1 , x 2 , u)
+ t/11 + t/Jzu + tj;3g(x1, Xz,
u)
with
~1 = ~ -- t/13 ag
(5.35)
t/1 3 ag
(5.36)
ax1
~ 2 =_at _ ax2 ~3 = 0.
ax1 ax2
(5.37)
To maximize H we require

∂H/∂u = −∂f/∂u + ψ₂ + ψ₃ ∂g/∂u = 0  (5.38)

and ∂²H/∂u² ≤ 0. We then show, using (5.36) and (5.38), that

∂f/∂x₂ − ψ₃ ∂g/∂x₂ = d/dt (∂f/∂u − ψ₃ ∂g/∂u)  (5.39)

and note from (5.37) that ψ₃ is a constant. When we write (5.39) in the original notation we find that

∂(f − ψ₃g)/∂x = d/dt [∂(f − ψ₃g)/∂ẋ],

where ψ₃ is a constant that can be determined by using the constraint equation. This is of course the familiar Lagrange multiplier rule for problems with integral constraints. Thus all the necessary conditions of the calculus of variations follow immediately from the Pontryagin maximum principle.
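The multiplier rule lends itself to symbolic checking. A sketch using SymPy; the integrands f and g here are illustrative choices, not taken from the text (minimizing arc length subject to a fixed area under the curve):

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    # Form f - psi3*g, as in the Lagrange multiplier rule derived above,
    # and ask SymPy for its Euler-Lagrange equation. The choice of f and g
    # is an assumption made purely for illustration.
    t = sp.symbols('t')
    x = sp.Function('x')
    psi3 = sp.symbols('psi3')           # the constant multiplier
    f = sp.sqrt(1 + x(t).diff(t)**2)    # integrand of J: arc length
    g = x(t)                            # constraint integrand: area under curve
    eq = euler_equations(f - psi3*g, x(t), t)[0]
    print(sp.simplify(eq))

The resulting equation is the Euler–Lagrange equation for f − ψ₃g, exactly as the argument above predicts.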
6
Proof of the maximum principle of Pontryagin

We now turn to the proof of Theorem 4.1, the Pontryagin maximum principle. The reader may find it helpful to read the outline proof in Chapter 4 before starting this chapter.
Theorem 4.1 (The Pontryagin maximum principle). Let u*(t) be an admissible control with corresponding path x* = (x₁*, x₂*) that transfers the system from x₀ at time t = t₀ to x₁ at some unspecified time t₁. Then in order that u* and x* be optimal (that is, minimize J) it is necessary that there exist a non-trivial vector ψ = (ψ₀, ψ₁, ψ₂)ᵀ satisfying equations (4.4),

ψ̇ᵢ = −∂H/∂xᵢ,  i = 0, 1, 2,

and a scalar function

H(ψ, x, u) = ψ₀f₀(x, u) + ψ₁f₁(x, u) + ψ₂f₂(x, u)

such that

(i) for every t in t₀ ≤ t ≤ t₁, H attains its maximum with respect to u at u = u*(t);
(ii) H(ψ*, x*, u*) = 0 and ψ₀ ≤ 0 at t = t₁, where ψ*(t) is the solution of (4.4) for u = u*(t).

Furthermore it can be shown that H(ψ*(t), x*(t), u*(t)) = constant and ψ₀(t) = constant, so that H = 0 and ψ₀ ≤ 0 at each point on an optimal trajectory. ∎

There are a number of subsidiary theorems that need to be proved before the maximum principle can be established. These preliminaries are essential to a full understanding of the proof of Theorem 4.1. To prove part (i) we have to show that the set of varied end-points E is convex and then use a result from the theory of convex sets to show that there is a hyperplane of support to E at the optimal end-point. Theorems 6.1 and 6.2 establish the relevant result for convex sets in ℝ² and ℝ³. The proofs are entirely elementary and can be extended to cover convex sets in ℝⁿ.
The existence of a hyperplane of support to E at the optimal end-point D gives us a condition that must hold at the final time t₁ on the optimal path. Now we wish to show that the maximum principle holds at any t ∈ [t₀, t₁] for which u*(t) is continuous. To deduce this we need Theorem 6.3. The reader will no doubt have noticed that part (i) says that H is to be maximized for every t ∈ [t₀, t₁], whereas we have just excluded the (finite number of) points at which u* is discontinuous. There is a delicate technical difficulty which will be resolved in due course. What we will establish in the first instance is that part (i) holds at every t ∈ [t₀, t₁] at which u* is continuous. Part (ii) is easy to prove, and Theorem 4.1 goes on to say that ψ₀ and H are constant on an optimal trajectory. It is very easy to show that ψ₀ is constant, but showing that H is constant on an optimal path presents serious difficulties. Recall that u* can have a finite number of discontinuities in [t₀, t₁]. At such points H and its partial derivatives may not be defined. There is no hope of establishing the constancy of H by showing that dH/dt = 0 on an optimal path; the symbol dH/dt has no meaning in this situation. Fortunately, with the help of Theorems 6.4 and 6.5, we can show that H is constant in the intervals between the points of discontinuity of u*. Since H may not be defined at these discontinuities, we have to go on to prove that H is equal to the same constant in each of these intervals. To show this we need to prove Theorem 6.6, which says that there is a function h(t), continuous on the whole of [t₀, t₁], which is equal to H(ψ*(t), x*(t), u*(t)) whenever the latter is defined (that is, at points of continuity of u*(t)). From this result we can deduce that H is indeed constant on an optimal path and that part (i) of Theorem 4.1 holds for all t ∈ [t₀, t₁].
Convex sets in ℝⁿ

What follows is not intended to be a comprehensive treatment of the theory of convex sets. There is, however, an important property of convex sets that is crucial to the proof. First let us define a convex set.

Definition. Let K be a set in ℝⁿ and let A and B be any two points in K. Then K is convex if and only if K also contains all the points on the straight line segment joining A and B.
Note that K is required to contain only that portion of the straight line through A and B that lies between A and B. That is, if a and b are the position vectors of A and B respectively, then all points with position vectors

λa + (1 − λ)b,  for 0 ≤ λ ≤ 1,

must lie in K if K is convex. Note that λ = 1 and λ = 0 correspond to the end-points A and B. Points corresponding to values of λ in 0 < λ < 1 are called the interior points of AB. Clearly ℝⁿ itself is convex. In ℝ², the disc x₁² + x₂² ≤ 1 is convex, whereas the circle x₁² + x₂² = 1 is not. The test for a convex set is simply whether we can join any two of its points by a straight line without leaving the set. The removal of a single point can wreck convexity. For example, consider the disc x₁² + x₂² ≤ 2 in ℝ² from which the origin has been removed. The resulting set is not convex, since we cannot, for instance, join (−1, −1) to (1, 1) without passing through the origin. Note that we can remove points from the edge of the disc without destroying convexity. This observation leads to our next definition.
Definition. Let K be a convex set and let A, B and C be distinct points in ℝⁿ. If all the interior points of the segment AB belong to K but none of the interior points of BC belong to K, then B is said to be a boundary point of K. Thus a boundary point of a convex set need not be a member of the set. For example, the convex sets x₁² + x₂² < 1 and x₁² + x₂² ≤ 1 have the same boundary, namely x₁² + x₂² = 1, but one of them contains none of its boundary points and the other contains all of them. We now come to the result we need for the proof of the Pontryagin theorem. Since we are proving the latter in ℝ³, we shall state the result for ℝ³ and discuss its generalization to ℝⁿ later.
Theorem 6.1. Let K be a convex set in ℝ³ and D a boundary point of K. If there is a half-line l, starting at D and extending to infinity, such that none of its points (except perhaps D) belongs to K, then there exists a plane Π through D that divides ℝ³ in such a way that K lies entirely in one half-space and the half-line l lies entirely in the other. ∎

The plane Π is said to be a plane of support to K at the boundary point D. The whole of K is confined to one of the half-spaces into
which Π divides ℝ³, and the half-line l must lie in the other. The plane of support need not be unique. Let us first prove the corresponding theorem in ℝ².

Theorem 6.2.
Let K be a convex set in ℝ² and D a boundary point of K. If there is a half-line l, starting at D and extending to infinity, such that none of its points (except perhaps D) lies in K, then there exists a line q through D that divides ℝ² in such a way that K lies in one half-plane and the half-line l lies in the other.
Proof. In Figure 6.1, ADC is the straight line such that DC is the half-line l. None of the points of l (except perhaps D) belong to K. Let P and Q be points in K, one to the right of AD and the other to the left of AD, as shown in Figure 6.1. θ(P) is the angle between DA and DP measured in a clockwise sense from AD. Similarly, θ(Q) is the angle between DA and DQ measured in an anticlockwise sense from AD. Both these angles must satisfy the inequality 0 ≤ θ < π. Now define

α = sup θ(Q)  and  β = sup θ(P);

then 0 ≤ α, β < π. If α + β ≤ π we can draw a line q through D such that all of K lies on one side of it. It remains to dispose of the possibility α + β > π. We will show that this possibility leads to a contradiction. We can choose P and Q, P from the points of K that lie to the right of ADC and Q from those on the left, such that θ(P) + θ(Q) > π. It then follows (see Figure 6.3) that the line segment PQ intersects l and therefore contains a point not in K. So K is not convex. Since K is given to be convex, we can only conclude that this case cannot arise, and the theorem is proved. ∎

Proof of Theorem 6.1. First we take the intersection of K with any plane Σ containing the half-line l. This gives us a convex set K′ in ℝ², with a boundary point D as before, which is known not to contain any of the points of l. Theorem 6.2 therefore applies, and we can draw a line q through D such that all of K′ lies above q. None of the points of K′ lie below q. Now, for clarity, set up a coordinate system with the origin at D, the y axis along q, the x axis perpendicular to Σ and the z axis in the plane Σ, as shown in Figure 6.4. We can then say that no
Figure 6.3 Case 2
Figure 6.4 Proof of Theorem 6.1: the coordinate system
Figure 6.5 (a) The point P; (b) the point Q
points with x = 0, z < 0 can lie in K. In other words, no point in the lower half of the plane Σ can lie in the convex set K. Figure 6.5 shows two typical points of K on opposite sides of Σ. Let P lie in K with x > 0 and construct the unique half-plane Γ₁ that lies in x ≥ 0 and contains P and the y axis. Let θ(P) be the angle between Γ₁ and the top half of Σ (that is, the half-plane x = 0, z ≥ 0). For Q in K with x < 0 construct the corresponding half-plane
Figure 6.6 The line segment PQ
Γ₂, and let θ(Q) be the angle between Γ₂ and the top half of Σ. Both these angles satisfy the inequality 0 ≤ θ < π. Define α = sup θ(Q), β = sup θ(P); then 0 ≤ α, β < π.
Case 1: α + β ≤ π. We can always draw at least one plane through D that cuts ℝ³ into two halves, one containing the whole of K. The argument is the same as that in Theorem 6.2.

Case 2: α + β > π. In Figure 6.6 we have chosen P and Q from K so that θ(P) + θ(Q) > π. Now consider the line segment PQ. It must intersect the plane Σ at a point with z < 0. Such a point, we have already shown, cannot lie in K if K is convex as postulated. Thus Case 2 cannot arise and Theorem 6.1 is proved. ∎

The reader will have noted that once the result is established for ℝ² we can prove it for ℝ³ using precisely the same argument. The result will generalize to ℝⁿ in the same way; essentially it is an exercise in the method of induction. Once we have n > 3 the plane of support is no longer a plane in the ordinary sense but a linear manifold of dimension n − 1, usually called the hyperplane of support. The proof for ℝⁿ is left as an interesting exercise for the reader.

The linearized state equations

We shall be considering controls that are small perturbations of the optimal control. These will lead to small changes in the augmented state vector. We need to know how such small disturbances propagate as t increases. To do this we linearize the
Figure 6.7 The optimal path in augmented state space
augmented state equations

ẋᵢ = fᵢ(x₁, x₂, u),  i = 0, 1, 2.  (6.1)

Suppose that at some time t = τ any perturbation in the optimal control ceases, so that for t > τ the control in equations (6.1) is the optimal control u*(t). However, at t = τ the state of the system will not lie on the optimal path but at some nearby point x*(τ) + α(τ), where the three components of α(τ) are small. We need to know how this perturbation α(τ) develops in t > τ. The state of the system at any time t > τ must satisfy equations (6.1), so

ẋᵢ* + α̇ᵢ = fᵢ(x* + α, u*),  i = 0, 1, 2,

and to first order the perturbation α(t) satisfies the linearized perturbation equations

α̇ᵢ = ∑_{j=0}^{2} (∂fᵢ/∂xⱼ) αⱼ,  i = 0, 1, 2,  (6.2)

where the derivatives ∂fᵢ/∂xⱼ are evaluated on the optimal path at each t > τ. We can write this more compactly in matrix form:

α̇ = A(t)α,  t > τ,  (6.3)

where A(t) = (∂fᵢ/∂xⱼ).
When α(τ), the perturbation at t = τ, is known, the perturbation to the optimal path for any t > τ is found by solving (6.3) and applying the initial condition at t = τ. The solution can be written in terms of the fundamental solution matrix Φ(t). This is a 3 × 3 invertible matrix whose columns form a set of three linearly independent solutions of (6.3). Once Φ(t) has been found we can write the solution that satisfies a given initial condition α(τ) = c, say, in the form

α(t) = Φ(t)Φ(τ)⁻¹c.  (6.4)
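Equation (6.4) translates directly into computation: integrate the matrix equation Φ̇ = A(t)Φ from Φ(τ) = I and apply the result to c. In the sketch below, A(t) is an arbitrary illustrative matrix, not one from the text:

    import numpy as np

    def A(t):
        # An illustrative time-varying coefficient matrix (an assumption,
        # chosen only to make the sketch runnable).
        return np.array([[0.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0],
                         [-0.1*np.sin(t), -0.2, -0.3]])

    def step(M, t, h):
        # One RK4 step of M' = A(t) M; columns are independent solutions.
        k1 = A(t) @ M; k2 = A(t + h/2) @ (M + h/2*k1)
        k3 = A(t + h/2) @ (M + h/2*k2); k4 = A(t + h) @ (M + h*k3)
        return M + (h/6)*(k1 + 2*k2 + 2*k3 + k4)

    tau, T, n = 1.0, 4.0, 3000
    h = (T - tau)/n
    Phi = np.eye(3)          # Phi(tau) = I, so Phi(t) Phi(tau)^-1 = Phi(t)
    for i in range(n):
        Phi = step(Phi, tau + i*h, h)

    c = np.array([0.01, -0.02, 0.005])   # the perturbation alpha(tau)
    print(Phi @ c)                       # alpha(T), by equation (6.4)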
We will need this result later when we construct the set of varied end-points E. Now let us turn to an important result involving α(t) and the co-state vector ψ(t). Recall that

ψ̇ᵢ = −∂H/∂xᵢ = −∑_{j=0}^{2} (∂fⱼ/∂xᵢ) ψⱼ,  i = 0, 1, 2.

Note that the coefficients of ψⱼ on the right-hand side depend on our choice of the control function u(t). The case we are interested in is when the control is optimal, in which case the derivatives are evaluated on the optimal path, so that

ψ̇ = −A(t)ᵀψ,  (6.5)

where A(t) is the matrix in equation (6.3). We can now prove the following useful result.
Theorem 6.3. ψ(t)ᵀα(t) is constant along an optimal path in t > τ.
Proof. We simply calculate the time derivative and use equations (6.3) and (6.5). Thus

d/dt (ψᵀα) = (dψᵀ/dt)α + ψᵀ(dα/dt) = (−A(t)ᵀψ)ᵀα + ψᵀA(t)α = −ψᵀA(t)α + ψᵀA(t)α = 0. ∎
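Theorem 6.3 is easy to confirm numerically: integrate (6.3) and (6.5) side by side and watch ψᵀα. A sketch, again with an arbitrary illustrative A(t):

    import numpy as np

    def A(t):
        # Same illustrative matrix as before (an assumption for the sketch).
        return np.array([[0.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0],
                         [-0.1*np.sin(t), -0.2, -0.3]])

    def rhs(t, y):
        # y = [alpha; psi]: alpha' = A(t) alpha, psi' = -A(t)^T psi.
        alpha, psi = y[:3], y[3:]
        return np.concatenate([A(t) @ alpha, -A(t).T @ psi])

    y = np.array([0.01, -0.02, 0.005, 1.0, -0.5, 0.25])  # values at t = tau
    t, h = 1.0, 0.001
    for i in range(3000):
        k1 = rhs(t, y); k2 = rhs(t + h/2, y + h/2*k1)
        k3 = rhs(t + h/2, y + h/2*k2); k4 = rhs(t + h, y + h*k3)
        y = y + (h/6)*(k1 + 2*k2 + 2*k3 + k4)
        t += h
        if i % 1000 == 0:
            print(y[3:] @ y[:3])   # psi^T alpha: the same value each time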
The behaviour of H on an optimal path

Theorem 4.1 asserts that H is constant on an optimal path. In establishing this we encounter a number of difficulties. If we wish to use the derivative of a function to deduce some new result, we need to be sure that the derivative exists. Now H and its derivatives
may not be defined at points of discontinuity of the control, so we cannot establish the constancy of H by calculating dH/dt. Once we have shown that the maximum principle holds at points of continuity of u*(t), we can deduce that H on an optimal path is constant in the intervals between the discontinuities of u*. Theorems 6.4 and 6.5 will form an essential part of the argument. Note carefully that they involve the ratio (f(s) − f(t))/(s − t) for s ≠ t, so we are not asserting that the derivative f′(t) exists.
Theorem 6.4. Let f be a continuous function on the closed interval [a, b] with the property that given t ∈ [a, b] and ε > 0, ∃ δ > 0 such that

(f(s) − f(t))/(s − t) ≤ ε  whenever 0 < |s − t| < δ.  (6.6)

Then f(b) ≤ f(a).

Proof. Suppose, on the contrary, that f(b) > f(a), and set

M = (f(b) − f(a))/(b − a) > 0,  g(s) = f(s) − M(s − a).

Since g is continuous on the closed interval [a, b] it attains its infimum at some point c ∈ [a, b]. There are three cases: (i) c = a; (ii) a < c < b; (iii) c = b.

(i) g(s) ≥ g(a) ⇒ f(s) − M(s − a) ≥ f(a) ⇒ (f(s) − f(a))/(s − a) ≥ M > 0 for all s ∈ (a, b], which contradicts (6.6) for t = a, ε = M/2.

(ii) For s ∈ (c, b] we have g(s) ≥ g(c), so (f(s) − f(c))/(s − c) ≥ M > 0, which contradicts (6.6) for t = c, ε = M/2.

(iii) g(s) ≥ g(b) ⇒ f(s) − M(s − a) ≥ f(b) − M(b − a) = f(a), since M(b − a) = f(b) − f(a), so again (f(s) − f(a))/(s − a) ≥ M > 0 for s ∈ (a, b], a contradiction.

Each case leads to a contradiction, so f(b) ≤ f(a). ∎
Theorem 6.5. Let f be a continuous function on the closed interval [a, b] with the property that given t ∈ [a, b] and ε > 0, ∃ δ > 0 such that

(f(s) − f(t))/(s − t) ≥ −ε  whenever 0 < |s − t| < δ.

Then f(b) ≥ f(a).

Answers and hints for the exercises

As in (a), ψ₂ has at most one zero, so the optimal control sequences are {0, 1} and {1, 0}. The corresponding trajectories are x₂² = 2x₁ + k for u* = 1 and x₂ = l for u* = 0. When u* = 0 the phase plane has a line of singularities on
x₂ = 0. The trajectories are traversed from left to right in x₂ > 0 and in the opposite sense in x₂ < 0. The origin cannot be reached on a u* = 0 path, so the only optimal control sequence is {0, 1}. The only route to 0 is the lower half of x₂² = 2x₁. This is intersected from the right by u* = 0 paths, so for initial points in x₂ < 0, x₁ ≥ x₂²/2 the optimal control sequence is {0, 1}. No initial points outside this region can have an optimal control to the origin. We now show that a point outside this region cannot be controlled to 0. Consider first all initial points with x₂ > 0. Now ẋ₂ = u and u ≥ 0, so x₂ cannot decrease and can never be brought to zero. Now consider an initial point in x₂ < 0, x₁ < x₂²/2. If u(t) = 0 then the system will drift off to infinity, but any other control satisfying the constraint 0 ≤ u ≤ 1 will take the system along a path that intersects x₂ = 0 to the left of 0. From here we cannot drift along to the origin using u* = 0, because x₂ = 0 is not a trajectory but a line of singularities. Any other choice of u sends both x₁ and x₂ to infinity. Thus the system cannot be controlled to 0 in x₂ < 0, x₁ < x₂²/2. This result, that the system cannot be controlled to 0 in any manner outside x₂ < 0, x₁ ≥ x₂²/2, is to be expected. Theorem 4.1 is sufficient for time-optimal control of a linear system; if there is a control that takes the system to 0 then there must be a time-optimal control, and it satisfies the theorem. In this example there is no optimal control outside x₂ < 0, x₁ ≥ x₂²/2; thus there cannot be any control to 0 outside this region.

3. The controls that maximize H are u* = sgn ψ₂. The system eigenvalues are λ = ±1 and we call upon Lemma 5.1. The optimal control sequences are {−1, 1} and {1, −1}. The trajectories correspond to those of a saddle point. In fact they are two families of rectangular hyperbolas

x₂² = x₁² + 2u*x₁ + k,  u* = ±1.
When we sketch the u* = ±1 paths through (½, 0) and (−½, 0) in Figure A.2 we see that we need the control sequence {1, −1}. This gives an arc of

(x₁ + 1)² − x₂² = ¼

followed by an arc of

(x₁ − 1)² − x₂² = ¼.

They intersect at x₁ = 0, x₂ = −√3/2, and this is where the switch takes place.
Figure A.2 The optimal path ACB for Exercise 5.1(3)
To find the minimum time, observe that on AC, ẋ₂ = x₁ + 1 = (x₂² + ¼)^{1/2}. The time T taken to go from A to C is therefore

T = ∫₀^{√3/2} dx₂/(x₂² + ¼)^{1/2} = sinh⁻¹ √3.

By symmetry the time taken along ACB is 2 sinh⁻¹ √3.
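A quick numerical check of the time along AC:

    import numpy as np
    from scipy.integrate import quad

    # Evaluate the integral above and compare with arcsinh(sqrt(3)).
    val, _ = quad(lambda x2: 1.0/np.sqrt(x2**2 + 0.25), 0.0, np.sqrt(3)/2)
    print(val, np.arcsinh(np.sqrt(3)))   # both approx 1.3170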
By symmetry the time taken along ACB is 2 sinh - l J3. 4. (i) a = 0. Similar to Example 5.2. Controllable only m an infinite strip. (ii) a= -l The problem is degenerate. The singularities are (- u* /2, u*). The path to 0 for both families is along the straight line x 2 + 2x 1 = 0. The system can only be controlled to 0 if its initial state lies on the segment of x 2 + 2x 1 = 0 that lies between the singularities. 5. The problem is degenerate. The eigenvalues are 2 and 4 so the u* = ± 1 paths are those of an unstable node. The singularities are (- u* /2, u* /2). Both families reach 0 along part of x 1 = x 2 . The system can only be controlled to 0 if its initial state lies on the segment of x 1 = x 2 that lies between the singularities.
6. u* = sgn ψ₂ = sgn(Beᵗ + A). There is one switch. The corresponding trajectories are

x₁ = −Ce⁻ᵗ + u*t + D,  x₂ = Ce⁻ᵗ + u*.

For optimal control from (1, 0) to (0, 0) we need the control sequence {−1, 1}. The u* = −1 path through (1, 0) is

ln(x₂ + 1) = x₁ + x₂ − 1.

The u* = 1 path through (0, 0) is

ln(1 − x₂) = −x₁ − x₂.

They intersect when x₂ = −(1 − 1/e)^{1/2}. On the u* = −1 section, ẋ₂ = −x₂ − 1, so the time of the switch is

τ₁ = ln(e^{1/2}/(e^{1/2} − (e − 1)^{1/2})).

On the final section ẋ₂ = −x₂ + 1, so the time between the switch and arrival at 0 is

τ₂ = ln(1 + (1 − 1/e)^{1/2}).

The minimum time is

τ₁ + τ₂ = 2 ln(e^{1/2} + (e − 1)^{1/2}).
7. (i) We require u* = sgn ψ₂, where ψ̇₁ = 0, ψ̇₂ = −3x₂²ψ₁. We cannot solve these, but since ψ₁ = constant and x₂² ≥ 0 the switching function is monotonic. Thus the optimal control has at most one switch. This is all we need to know. The corresponding trajectories are

x₂⁴ = 4u*x₁ + constant.

The switching curve Γ is defined by

x₂⁴ = −4x₁, x₂ > 0, for x₁ < 0;  x₂⁴ = 4x₁, x₂ < 0, for x₁ > 0.

The optimal synthesis is

u* = −1 above Γ,  u* = +1 below Γ.

(ii) We require u* = sgn ψ₂, where ψ̇₁ = 0, ψ̇₂ = −ψ₁ + 2x₂ψ₂. Thus ψ₁ = k and ψ̇₂ = −k + 2x₂ψ₂. The equation for the switching function is of the form

ψ̇₂ = −k + h(t)ψ₂,

with solution involving the constants k and l; this can have at most one zero. The optimal control has at most one switch. The switching curve Γ is

ln(x₂² + 1) = −2x₁ in x₂ > 0, x₁ < 0;  ln(x₂² + 1) = 2x₁ in x₂ < 0, x₁ > 0.

For r > 1: for example, r = 2 needs two switches, r = 4 requires eight.
(ii) The actual time taken is 1/r times the solution to the scaled problem.
First choose ω₁ and ω₂ such that the number of switches is the same. Then try a pair of values for which the number of switches is significantly different.
5. 0 ≤ u ≤ 1. We need u* = 0 for ψ₂ < 0 and u* = 1 for ψ₂ > 0. The switching function is ψ₂ = A sin(t + β) as usual. The corresponding trajectories are circles centred at (1, 0) for u* = 1 and circles centred at the origin for u* = 0. The only route to the origin is the lower half of (x₁ − 1)² + x₂² = 1, x₂ < 0. Any phase point switching onto this semicircle does so from a circle x₁² + x₂² = k² on which it has travelled for time π, sweeping out exactly a half-circle. We continue to trace back from the origin and thus generate the switching curve Γ. It is very similar to that of Example 5.5, apart from the fact that the only loop that can be traversed is 𝒞₁⁺. The loop (x₁ + 1)² + x₂² = 1, x₂ > 0 is simply a locus of switches from u* = 1 to u* = 0. The synthesis is u* = 0 above Γ, u* = 1 below Γ. Consider the initial state (3, 1). We need three switches to take
the phase point to the origin. From (3, 1) the control is off and the circle x₁² + x₂² = 10 is traversed until Γ is met at (3, −1); the time taken is 2 tan⁻¹(1/3). The control is then switched on and the u* = 1 circle through (3, −1) is traversed until it meets the loop (x₁ + 1)² + x₂² = 1 in x₂ > 0. It does so at (−1, 1). The time taken is π. At (−1, 1) the control is switched off and the circle x₁² + x₂² = 2 traversed until the loop 𝒞₁⁺ is hit at (1, −1). The time taken is π. At (1, −1) the control is switched back on and the phase point goes to (0, 0) along 𝒞₁⁺. It sweeps out an angle π/2 in doing so, and this is the time taken on the last section. The minimum time from (3, 1) to (0, 0) is therefore 2 tan⁻¹(1/3) + 5π/2. This is considerably larger than the minimum time when the constraint is |u| ≤ 1. However, note that control to the origin from any initial point is still possible with 0 ≤ u ≤ 1. Contrast this with the same calculation for the truck problem (Exercise 5.1(2b)).

Exercises 5.3

1. Example 5.8 deals with the time-optimal control of this system to x₂ = 0 and to x₂ = 0, x₁ ≥ 0. We already know that the optimal paths are parabolas and that ψ₁ = A, ψ₂ = B − At.
(i) The target is x₂ = 0, |x₁| ≤ k. Controls to interior points of the target are known. The interesting points are the ends of the target, x₂ = 0, x₁ = ±k. The solution goes through in the same way as Example 5.8(ii). The system can be optimally controlled everywhere, as shown in Figure A.5. The synthesis is
u* = −1 above PABQ,  u* = +1 below PABQ.

(ii) The target is x₁ = 0, |x₂| ≤ k. Figure A.6 shows the switching curve PABQ. PA is the 𝒞⁻ path through A; BQ is the 𝒞⁺ path through B. The synthesis is

u* = −1 above PABQ,  u* = +1 below PABQ.

2. ẋ₁ = x₂, ẋ₂ = −x₁ + u, |u| ≤ 1. The optimal trajectories are circles. The time interval between switches is π. Detailed solutions are too lengthy to be included.
Figure A.5 The synthesis for Exercise 5.3(1(i))

Figure A.6 The synthesis for Exercise 5.3(1(ii))
Figure A.7 Exercise 5.3(2(i))
The method consists of first determining how typical interior points can be reached by u* = ±1 paths and carefully applying the transversality condition to trace the path backwards in time. Next one examines the end points by the same technique. Any ambiguities, inconsistencies and non-uniqueness have to be carefully examined before a coherent picture emerges.

(i) The target is x₂ = 0, x₁ ≥ 0. The switching curve is PQOTUV, where QO and TU are quarter circles of radius one centred at (−1, 0) and (3, 0) respectively; PQ is x₁ = −1, x₂ ≥ 1 and UV is x₁ = 3, x₂ < −1. Two switches at most are required. Some typical optimal paths are shown in Figure A.7.

(ii) The target is x₂ = 0, |x₁| ≤ ½. Each section of the switching curve consists of two quarter circles, of radii 1½ and ½, with a common centre at (1, 0), (3, 0) and so on, joined by a straight line segment as shown. Figure A.8 shows the switching curve and a typical optimal path.

(iii) The target is x₁² + x₂² = 1. The optimal synthesis is shown in Figure A.9. The loops of the switching curve are all semicircles of radius 1.