Calculus of Variations and Optimal Control Theory: A Concise Introduction (Instructor Solution Manual, Solutions) [1 ed.] 0691151873, 9780691151878


English Pages 59 Year 2012


Author / Uploaded
Daniel Liberzon


Solutions to Exercises

D. Liberzon, Calculus of Variations and Optimal Control Theory

See the last page for the list of all exercises along with page numbers where they appear in the book.

Chapter 1

1.1 The answer is no. Counterexample: on the $(x_1, x_2)$-plane, consider the function $f(x) = x_1(1 + x_1) + x_2(1 + x_2)$. Let $D$ be the union of the closed first quadrant $\{(x_1, x_2) : x_1 \ge 0,\ x_2 \ge 0\}$ and some curve (e.g., a circular arc) directed from the origin into the third quadrant. The origin $x^* = (0, 0)$ is clearly not a local minimum, because $f(x^*) = 0$ but $f$ is negative for small negative values of $x_1$ and $x_2$. However, it is easy to check that the listed conditions are satisfied, because the feasible directions are $\{(d_1, d_2) : d_1 \ge 0,\ d_2 \ge 0\}$ and we have

\[ \nabla f(x^*) = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad\text{and}\quad \nabla^2 f(x^*) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}. \]
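As a quick numerical sanity check of the counterexample in 1.1 (a sketch, not part of the original solution), the Python snippet below evaluates the gradient at the origin, tests the first-order condition on a few hand-picked feasible directions, and samples f along the diagonal x1 = x2 = -t. The sample directions, and the use of the diagonal as a stand-in for the arc entering the third quadrant, are illustrative assumptions.

```python
def f(x1, x2):
    return x1 * (1 + x1) + x2 * (1 + x2)

def grad_f(x1, x2):
    return (1 + 2 * x1, 1 + 2 * x2)

# Gradient at the origin; the Hessian of f is 2*I everywhere.
g = grad_f(0.0, 0.0)

# First-order condition grad f . d >= 0 on some feasible directions d >= 0
# (these particular directions are arbitrary samples).
directions = [(1, 0), (0, 1), (1, 1), (0.3, 2.0)]
first_order_ok = all(g[0] * d1 + g[1] * d2 >= 0 for d1, d2 in directions)

# f is nevertheless negative at points approaching the origin from the
# third quadrant (the diagonal stands in for the arc in D).
values_along_diagonal = [f(-t, -t) for t in (0.1, 0.01, 0.001)]

print(g, first_order_ok, values_along_diagonal)
```

All sampled values of f are negative while the first-order test passes, which is exactly the failure of local minimality described above.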

1.2 Example: on the $(x_1, x_2)$-plane, let $h_1(x) = x_1^2 - x_2$ and $h_2(x) = x_2$. Then $D$ consists of the unique point $x^* = (0, 0)$, which is automatically a minimum of any function $f$ over $D$. The gradients are

\[ \nabla h_1(x^*) = \begin{pmatrix} 0 \\ -1 \end{pmatrix} \quad\text{and}\quad \nabla h_2(x^*) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

and they are linearly dependent, hence $x^*$ is not a regular point. It remains to choose any function $f$ whose gradient at $x^*$ is not proportional to $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$; e.g., $f(x) = x_1 + x_2$ works. See also Example 3.1.1 on pp. 279–280 in [Ber99].

Another example, a little more complicated but also more interesting, is to consider, on the $(x_1, x_2)$-plane, the functions $h_1(x) = x_2$ and $h_2(x) = x_2 - g(x_1)$ where

\[ g(x_1) = \begin{cases} x_1^2 & \text{if } x_1 > 0 \\ 0 & \text{if } x_1 \le 0 \end{cases} \]

Then $D = \{x : x_1 \le 0,\ x_2 = 0\}$. The point $x^* = (0, 0)$ is not a regular point, and we can again easily choose $f$ for which the necessary condition fails. The interesting thing about this example is that the tangent space to $D$ at $x^*$ is not even a vector space: it is a ray pointing to the left.

1.3 Let’s do it for 2 constraints; then it will be obvious how to handle an arbitrary number of constraints. For $d_1, d_2, d_3 \in \mathbb{R}^n$, consider the following map from $\mathbb{R}^3$ to itself:

\[ F : \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} \mapsto \begin{pmatrix} f(x^* + \alpha_1 d_1 + \alpha_2 d_2 + \alpha_3 d_3) \\ h_1(x^* + \alpha_1 d_1 + \alpha_2 d_2 + \alpha_3 d_3) \\ h_2(x^* + \alpha_1 d_1 + \alpha_2 d_2 + \alpha_3 d_3) \end{pmatrix}. \]

The Jacobian of $F$ at $(0, 0, 0)$ is

\[ \begin{pmatrix} \nabla f(x^*) \cdot d_1 & \nabla f(x^*) \cdot d_2 & \nabla f(x^*) \cdot d_3 \\ \nabla h_1(x^*) \cdot d_1 & \nabla h_1(x^*) \cdot d_2 & \nabla h_1(x^*) \cdot d_3 \\ \nabla h_2(x^*) \cdot d_1 & \nabla h_2(x^*) \cdot d_2 & \nabla h_2(x^*) \cdot d_3 \end{pmatrix}. \]

Arguing exactly as in the notes, we know that this Jacobian must be singular for any choice of $d_1, d_2, d_3$. Since $x^*$ is a regular point and so $\nabla h_1(x^*)$ and $\nabla h_2(x^*)$ are linearly independent, we can choose $d_1$ and $d_2$ such that the lower left $2 \times 2$ submatrix

\[ \begin{pmatrix} \nabla h_1(x^*) \cdot d_1 & \nabla h_1(x^*) \cdot d_2 \\ \nabla h_2(x^*) \cdot d_1 & \nabla h_2(x^*) \cdot d_2 \end{pmatrix} \]

is nonsingular (for example, using the Gram-Schmidt orthogonalization process: choose $d_1$ aligned with $\nabla h_1(x^*)$ and $d_2$ in the plane spanned by $\nabla h_1(x^*)$ and $\nabla h_2(x^*)$ to be orthogonal to $d_1$). Since the Jacobian is singular, its top row must be a linear combination of the bottom two rows, which are linearly independent by construction:

\[ \nabla f(x^*) \cdot d_i = \lambda_1^* \nabla h_1(x^*) \cdot d_i + \lambda_2^* \nabla h_2(x^*) \cdot d_i, \qquad i = 1, 2, 3. \]

Note that the coefficients $\lambda_1^*$ and $\lambda_2^*$ are uniquely determined by our choice of $d_1$ and $d_2$, and do not depend on the choice of $d_3$. In other words, we have

\[ \nabla f(x^*) \cdot d_3 = \lambda_1^* \nabla h_1(x^*) \cdot d_3 + \lambda_2^* \nabla h_2(x^*) \cdot d_3 \qquad \forall\, d_3 \in \mathbb{R}^n \]

from which it follows that $\nabla f(x^*) = \lambda_1^* \nabla h_1(x^*) + \lambda_2^* \nabla h_2(x^*)$.

1.4 This is Problem 3.1.3 in [Ber99], page 292 (an easier version appears earlier as Problem 1.1.8, page 19). The function being minimized is $f(x) = |x - y| + |x - z|$. Writing $|x - y|$ as $((x - y)^T (x - y))^{1/2}$, and similarly for $|x - z|$, it is easy to compute that

\[ \nabla f(x^*) = \frac{x^* - y}{|x^* - y|} + \frac{x^* - z}{|x^* - z|}. \]

By the first-order necessary condition for constrained optimality, this vector must be aligned with the normal vector $\nabla h(x^*)$. Geometrically, the fact that the two unit vectors appearing in the above formula sum up to a constant multiple of $\nabla h(x^*)$ means that the angles they make with it are equal.
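The gradient formula in 1.4 can be checked against central finite differences. The sketch below does this in the plane; the points x, y, z and the step size h are arbitrary illustrative choices, not data from the exercise.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def f(x, y, z):
    """f(x) = |x - y| + |x - z| for points given as lists of coordinates."""
    return norm([a - b for a, b in zip(x, y)]) + norm([a - b for a, b in zip(x, z)])

def grad_formula(x, y, z):
    """Sum of the two unit vectors (x - y)/|x - y| and (x - z)/|x - z|."""
    dy = [a - b for a, b in zip(x, y)]
    dz = [a - b for a, b in zip(x, z)]
    ny, nz = norm(dy), norm(dz)
    return [p / ny + q / nz for p, q in zip(dy, dz)]

def grad_numeric(x, y, z, h=1e-6):
    """Central finite-difference approximation of the gradient of f in x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp, y, z) - f(xm, y, z)) / (2 * h))
    return g

# Arbitrary test points (assumed for illustration only).
x, y, z = [0.3, -0.2], [1.0, 2.0], [-1.5, 0.5]
ga, gn = grad_formula(x, y, z), grad_numeric(x, y, z)
max_err = max(abs(a - b) for a, b in zip(ga, gn))
print(ga, gn, max_err)
```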

1.5, 1.6 These follow easily from the definitions of the first and second variation by writing down the Taylor expansion of $g(y(x) + \alpha\eta(x))$ around $\alpha = 0$ inside the integral:

\[ J(y + \alpha\eta) = \int_0^1 g(y(x) + \alpha\eta(x))\,dx = \int_0^1 \Big( g(y(x)) + g'(y(x))\,\alpha\eta(x) + \frac{1}{2} g''(y(x))\,\alpha^2 \eta^2(x) + o(\alpha^2) \Big)\,dx. \]

The first variation is the coefficient of $\alpha$, namely $\delta J|_y(\eta) = \int_0^1 g'(y(x))\eta(x)\,dx$. The second variation is

\[ \delta^2 J|_y(\eta) = \frac{1}{2} \int_0^1 g''(y(x))\,\eta^2(x)\,dx. \]

This example also appears in Section 5.5 of [AF66].

1.7 Let $V = C^0([0, 1], \mathbb{R})$ with the 0-norm $\|\cdot\|_0$, let $A = \{y \in V : y(0) = y(1) = 0,\ \|y\|_0 \le 1\}$, and let $J(y) = \int_0^1 y(x)\,dx$. It is easy to see that $A$ is bounded, that $J$ is continuous, and that $J$ does not have a global minimum over $A$, because the infimum value of $J$ over $A$ is $-1$ but it’s not achieved for any continuous curve. What’s not obvious is that $A$ is closed, because to show this we must show that if a sequence of continuous functions $\{y_k\}$ converges to some function $y$ in 0-norm then the limit $y$ is also continuous. The proof of this goes as follows. To show continuity of $y$, we must show that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that when $|x_1 - x_2| < \delta$ we have $|y(x_1) - y(x_2)| < \varepsilon$. Let $k$ be large enough so that $\|y_k - y\|_0 \le \varepsilon/3$, and let $\delta$ be small enough so that $|y_k(x_1) - y_k(x_2)| < \varepsilon/3$ whenever $|x_1 - x_2| < \delta$ (using continuity of $y_k$). This gives

\[ |y(x_1) - y(x_2)| \le |y(x_1) - y_k(x_1)| + |y_k(x_1) - y_k(x_2)| + |y_k(x_2) - y(x_2)| < \varepsilon \]

and we are done. See also [Rud76, p. 150, Theorem 7.12] or [AF66, p. 103, Theorem 3-11] or [Kha02, p. 655] or [Sut75, p. 120, Theorem 8.4.1].
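To see concretely why the infimum −1 in 1.7 is approached but not attained, one can look at a minimizing sequence. The sketch below uses trapezoid-shaped curves y_k(x) = −min(1, kx, k(1−x)); this particular family is our own illustrative choice (it is not specified in the solution). Each y_k lies in A and has cost J(y_k) = −(1 − 1/k).

```python
def y_k(x, k):
    """Trapezoid-shaped element of A: zero at x = 0 and x = 1, equal to -1 in between."""
    return -min(1.0, k * x, k * (1.0 - x))

def J(y, n=20000):
    """Midpoint-rule approximation of the integral of y over [0, 1]."""
    h = 1.0 / n
    return sum(y((i + 0.5) * h) for i in range(n)) * h

# Costs approach -1 from above as k grows, but never reach it.
costs = {k: J(lambda x, k=k: y_k(x, k)) for k in (2, 10, 100)}
print(costs)
```

The pointwise limit of these curves equals −1 on the open interval (0, 1) and 0 at the endpoints, so it is discontinuous and does not belong to A.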

Chapter 2

2.1 Students submitted several interesting examples of variational problems, including:

• Choose a curve connecting two given points which minimizes the surface area obtained by rotating the curve around the x-axis. The formulation ends up being very similar to the brachistochrone. One can also minimize the enclosed volume, or minimize drag during horizontal motion (the latter problem was studied by Newton).
• Among curves enclosing a given area, minimize the curve length. (Motivation: minimize the cost of building a fence.) This is “dual” to Dido’s problem.
• Drive a car to a destination, or make an airplane reach a desired cruising altitude, while minimizing fuel consumption.
• Consider a boat moving at constant speed across a river with fixed current. Choose a path to cross the river in minimal time.
• Find a shortest path (geodesic) connecting two points on a curved surface (such as a sphere).
• Given a source of energy (desirable or dangerous) which decays inversely proportionally to the square of the distance from this source (or according to some other law), choose a curve connecting two given points which minimizes/maximizes the energy integral.

Another interesting problem (from p. 156 of the Russian book on calculus of variations by Lavrentiev and Lyusternik): determine the closed curve such that an airplane flying along this curve, with fixed wind direction and velocity, encloses maximal area in a given time.

2.2 This exercise is on page 341 in [Jur96]. The curve $y \equiv 0$ is a weak minimum because it gives zero cost, and for any other curve close enough to it in the sense of the 1-norm, $y'(x)$ is close to 0 for all $x$ and the cost is nonnegative as long as $|y'(x)| \le 1$.

On the other hand, we can construct curves that are close to 0 in the sense of the 0-norm but are “spiky” like in the example right before this exercise. For these curves $|y'|$ can be arbitrarily large and we can make the cost approach $-\infty$. So $y \equiv 0$ is not a strong minimum. The same construction can be done around any other curve, hence there is no strong minimum.

2.3 Let us write down the first-order Taylor expansion (mean value theorem) for $L(x, y(x) + \alpha\eta(x), y'(x) + \alpha\eta'(x))$ as a function of $\alpha$ around $\alpha = 0$:

\[ L(x, y(x) + \alpha\eta(x), y'(x) + \alpha\eta'(x)) = L(x, y(x), y'(x)) + L_y(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x))\,\alpha\eta(x) + L_{y'}(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x))\,\alpha\eta'(x) \]

where $\theta \in [0, \alpha]$.

Notation: Functions without arguments are supposed to be evaluated at $(x, y(x), y'(x))$. When a function appears with an overbar, this indicates that it is evaluated at an “intermediate” point $(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x))$. Also, in this problem we don’t want to retain the parameter $\alpha$, so we set $\alpha = 1$ in the above expression:

\[ L(x, y + \eta, y' + \eta') = L(x, y, y') + \bar{L}_y \eta + \bar{L}_{y'} \eta' = L(x, y, y') + L_y \eta + L_{y'} \eta' + (\bar{L}_y - L_y)\eta + (\bar{L}_{y'} - L_{y'})\eta' \]

Integrating, we get

\[ J(y + \eta) = J(y) + \int_a^b (L_y \eta + L_{y'} \eta')\,dx + \int_a^b \big( (\bar{L}_y - L_y)\eta + (\bar{L}_{y'} - L_{y'})\eta' \big)\,dx \]

The first integral above matches (2.14), so it is the first variation $\delta J|_y(\eta)$. The second integral is the higher-order term that we need to investigate; let us call it $\Delta$. We have

\[ |\Delta| \le (b - a) \max_{a \le x \le b} |L_y(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x)) - L_y(x, y(x), y'(x))| \cdot |\eta(x)| + (b - a) \max_{a \le x \le b} |L_{y'}(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x)) - L_{y'}(x, y(x), y'(x))| \cdot |\eta'(x)| \]

If we divide this by $\|\eta\|_1 = \max_{a \le x \le b} |\eta(x)| + \max_{a \le x \le b} |\eta'(x)|$, we get

\[ \frac{|\Delta|}{\|\eta\|_1} \le (b - a) \max_{a \le x \le b} |L_y(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x)) - L_y(x, y(x), y'(x))| + (b - a) \max_{a \le x \le b} |L_{y'}(x, y(x) + \theta\eta(x), y'(x) + \theta\eta'(x)) - L_{y'}(x, y(x), y'(x))| \]

and it is clear that as $\|\eta\|_1$ goes to 0 this ratio approaches 0, because $y + \theta\eta$, $y' + \theta\eta'$ approach $y$, $y'$ (uniformly over $x$) and $L$ is $C^1$, hence its partials are uniformly continuous on the finite interval $[a, b]$. This shows that indeed $\Delta = o(\|\eta\|_1)$. The answer to the last question is no, because $\eta'$ appears in $\Delta$ and so if we divide $|\Delta|$ by $\|\eta\|_0 = \max_{a \le x \le b} |\eta(x)|$ and let this 0-norm go to 0, the ratio is not guaranteed to be small.

2.4 This is called the DuBois-Reymond Lemma; see, e.g., pp. 63–64 in [Mac05], also p. 180 in [Lue69]. Define $c$ to be the average value of $\xi$, i.e.,

\[ c := \frac{\int_a^b \xi(z)\,dz}{b - a} \]

Define a new function $\bar{\xi}$ by

\[ \bar{\xi}(x) := \xi(x) - c \]

For any $\eta$ satisfying $\eta(a) = \eta(b) = 0$ we have

\[ \int_a^b \bar{\xi}(x)\eta'(x)\,dx = \int_a^b \xi(x)\eta'(x)\,dx - \underbrace{c \int_a^b \eta'(x)\,dx}_{= c(\eta(b) - \eta(a)) = 0} = \int_a^b \xi(x)\eta'(x)\,dx = 0 \]

so $\bar{\xi}$ satisfies the same condition as $\xi$. Now, define

\[ \eta(x) := \int_a^x \bar{\xi}(z)\,dz \]

For this $\eta$, we first have $\eta(a) = 0$ and

\[ \eta(b) = \int_a^b \bar{\xi}(z)\,dz = \int_a^b \xi(z)\,dz - c(b - a) = 0 \]

by our definition of $c$, so this $\eta$ is admissible. Next,

\[ \int_a^b \bar{\xi}(x)\eta'(x)\,dx = \int_a^b (\bar{\xi}(x))^2\,dx \]

and this must be 0, so $\bar{\xi} \equiv 0$ (since $\bar{\xi}$ is continuous), which implies that $\xi \equiv c$.

2.5 This result is classical; see, e.g., [Mac05], [SW97], [You80]. The Lagrangian is $L = \dfrac{\sqrt{1 + (y')^2}}{\sqrt{y}}$. By the “no x” result, the quantity

\[ L_{y'} y' - L = \frac{(y')^2}{\sqrt{y}\sqrt{1 + (y')^2}} - \frac{\sqrt{1 + (y')^2}}{\sqrt{y}} = -\frac{1}{\sqrt{y}\sqrt{1 + (y')^2}} \]

must be constant, so we have

\[ (1 + (y')^2)\,y = C \]

for some constant $C$. Make the substitution $y'(x) = \tan\alpha$ (this is valid as long as $y'$ doesn’t take the same value twice). Then $y = C/(1 + \tan^2\alpha) = C\cos^2\alpha$ and so $\dfrac{dy}{d\alpha} = -2C\cos\alpha\sin\alpha$. Next, using a well-known trig identity,

\[ \frac{dx}{d\alpha} = \frac{dx}{dy}\frac{dy}{d\alpha} = -2C\cos^2\alpha = -C(\cos 2\alpha + 1) =: -C(\cos\beta + 1) \]

In terms of $\beta := 2\alpha$ we have $\dfrac{dx}{d\beta} = -\dfrac{C}{2}(\cos\beta + 1)$, which after integration gives

\[ x = -\frac{C}{2}(\sin\beta + \beta) + D \]

where $D$ is another constant. Letting $a := D - C\pi/2$, $c := C/2$, and $\theta := \pi - \beta$, we finally arrive at the equations

\[ x = c(\theta - \sin\theta) + a \quad\text{and}\quad y = C\cos^2\alpha = \frac{C}{2}(1 + \cos\beta) = c(1 - \cos\theta) \]

which match (2.7).
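As a sanity check on the final formulas of 2.5, the sketch below verifies numerically that the cycloid x = c(θ − sin θ) + a, y = c(1 − cos θ) satisfies the first integral (1 + (y')²) y = C with C = 2c. The values of c, a, and the sample angles are arbitrary illustrative choices; the slope is computed as (dy/dθ)/(dx/dθ).

```python
import math

def cycloid(theta, c=1.5, a=0.2):
    """Return (x, y, dy/dx) on the cycloid at parameter value theta."""
    x = c * (theta - math.sin(theta)) + a
    y = c * (1 - math.cos(theta))
    # dy/dx = (dy/dtheta) / (dx/dtheta) = sin(theta) / (1 - cos(theta))
    slope = math.sin(theta) / (1 - math.cos(theta))
    return x, y, slope

c = 1.5
first_integral = []
for theta in (0.3, 1.0, 2.0, 3.0, 5.0):
    _, y, slope = cycloid(theta, c=c)
    first_integral.append((1 + slope ** 2) * y)

# Every entry should equal C = 2c up to rounding.
print(first_integral)
```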

2.6 This is solved in [GF63, Chapter 3] (although it relies on the general formula for the variation of a functional) and in [SW77, Chapter 3]. Let $y : [a, x_f] \to \mathbb{R}$ be an optimal curve. Since the terminal point is not fixed, let the terminal point of the perturbed curve be $x_f + \alpha\Delta x$ and let the perturbed curve itself on this new interval be $y + \alpha\eta$. (Note that this is similar to our later derivation of the Weierstrass-Erdmann corner conditions.) The cost of the perturbed curve is

\[ J(y + \alpha\eta) = \int_a^{x_f + \alpha\Delta x} L(x, y(x) + \alpha\eta(x), y'(x) + \alpha\eta'(x))\,dx \]

The first variation, which is the derivative of this with respect to $\alpha$ at $\alpha = 0$, is

\[ \delta J|_y(\eta) = \int_a^{x_f} (L_y \eta + L_{y'} \eta')\,dx + L(x_f, y(x_f), y'(x_f))\Delta x \]
\[ = \int_a^{x_f} \Big( L_y \eta - \frac{d}{dx} L_{y'}\,\eta \Big)\,dx + L_{y'}\eta \Big|_a^{x_f} + L(x_f, y(x_f), y'(x_f))\Delta x \qquad \text{(by parts)} \]

Since perturbations $\eta$ preserving the terminal point ($\Delta x = 0$) are still allowed, the usual Euler-Lagrange equation must still hold and so the integral equals 0. Also, $\eta(a) = 0$. We are left with

\[ L_{y'}(x_f, y(x_f), y'(x_f))\,\eta(x_f) + L(x_f, y(x_f), y'(x_f))\Delta x = 0 \tag{7.39} \]

But $\eta(x_f)$ and $\Delta x$ are related, because the terminal point of the perturbed curve must still be on the curve $y = \varphi(x)$:

\[ y(x_f + \alpha\Delta x) + \alpha\eta(x_f + \alpha\Delta x) = \varphi(x_f + \alpha\Delta x) \]

Let’s differentiate this relation with respect to $\alpha$ and set $\alpha = 0$:

\[ y'(x_f)\Delta x + \eta(x_f) = \varphi'(x_f)\Delta x \]

Solving this for $\eta(x_f)$ and plugging the result into (7.39) gives the desired transversality condition

\[ L(x_f, y(x_f), y'(x_f)) + L_{y'}(x_f, y(x_f), y'(x_f))\,(\varphi'(x_f) - y'(x_f)) = 0 \]

($\Delta x$ is eliminated since it is arbitrary). For the case of the length functional, the transversality condition reduces to $1 + \varphi'(x_f)\,y'(x_f) = 0$, which is still orthogonality (of the tangent vectors $(1, \varphi')$ and $(1, y')$ at the terminal point).
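The orthogonality statement at the end of 2.6 can be illustrated on a toy instance of the length functional. In the sketch below, the starting point (0, 0) and the target line φ(x) = 2 − 0.5x are hypothetical choices made only for this example. Extremals of the length functional are straight lines y = kx, so we scan over the slope k, pick the shortest segment reaching the target, and check that 1 + φ'(x_f) y'(x_f) is approximately zero.

```python
import math

PHI_SLOPE = -0.5   # hypothetical target curve: phi(x) = 2 + PHI_SLOPE * x

def length_to_target(k):
    """Length of the segment y = k*x from the origin to where it meets phi."""
    x_f = 2.0 / (k - PHI_SLOPE)          # solves k*x = 2 + PHI_SLOPE*x
    return math.sqrt(1 + k * k) * x_f

# Brute-force scan over candidate slopes k in [0.5, 3.5].
slopes = [0.5 + 0.001 * i for i in range(3001)]
best_k = min(slopes, key=length_to_target)

# Transversality condition for the length functional: 1 + phi' * y' = 0.
transversality_residual = 1 + PHI_SLOPE * best_k
print(best_k, transversality_residual)
```

The scan returns a slope close to 2, the negative reciprocal of the target line's slope, in agreement with the transversality condition.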

2.7 We see immediately from the second canonical equation that in the “no y” case the momentum is constant. In the “no x” case, we compute $dH/dx$ and see that it is 0 because of the two canonical equations and because $H_{y'}$ is 0 (as noted also in (2.31) right after the exercise), hence the Hamiltonian is constant.

2.8 We just need to work with $q \in \mathbb{R}^{3N}$ (the vector of generalized coordinates). The kinetic energy is the sum over all particles, and the same holds for the momentum. The Hamiltonian equals the total energy of all particles. Interaction forces between particles are not a problem because they are absorbed into the potential function (i.e., they are conservative) as long as they depend on distance only. For more information, see Sections 10 and 13 in [Arn89].

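2.9 The map is

\[ F : \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} \mapsto \begin{pmatrix} J(y + \alpha_1\eta_1 + \alpha_2\eta_2) \\ C(y + \alpha_1\eta_1 + \alpha_2\eta_2) \end{pmatrix} \]

Its Jacobian at $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$ is

\[ \begin{pmatrix} \delta J|_y(\eta_1) & \delta J|_y(\eta_2) \\ \delta C|_y(\eta_1) & \delta C|_y(\eta_2) \end{pmatrix} \]

If $y$ is not an extremal of $C$, then we can pick $\eta_1$ such that $\delta C|_y(\eta_1) \ne 0$. Then there exists a $\lambda^*$ such that $\delta J|_y(\eta_1) + \lambda^* \delta C|_y(\eta_1) = 0$. Arguing exactly as in the finite-dimensional case, we can show that the Jacobian is singular. Since its bottom left entry is nonzero, the top row must then equal $-\lambda^*$ times the bottom row. Thus $\delta J|_y(\eta_2) + \lambda^* \delta C|_y(\eta_2) = 0$ for arbitrary $\eta_2$, which gives the claim (with the help of integration by parts and Lemma 2.1 exactly as in our original derivation of the Euler-Lagrange equation).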

2.10 These results are classical; see, e.g., [GF63], [Mac05].

Dido: $L = y$, $M = \sqrt{1 + (y')^2}$. The Euler-Lagrange equation for $L + \lambda M$ gives

\[ \frac{d}{dx} \frac{\lambda y'}{\sqrt{1 + (y')^2}} = 1 \]

hence

\[ \frac{\lambda y'}{\sqrt{1 + (y')^2}} = x + c \]

for some constant $c$. Squaring both sides and rearranging terms, we get $(\lambda^2 - (x + c)^2)(y')^2 = (x + c)^2$, or

\[ y' = \pm \frac{x + c}{\sqrt{\lambda^2 - (x + c)^2}} \]

This can be integrated:

\[ y = \mp \sqrt{\lambda^2 - (x + c)^2} + d \]

where $d$ is another constant. From this we finally get $(y - d)^2 + (x + c)^2 = \lambda^2$, which is the equation of a circle.

Catenary: $L = y\sqrt{1 + (y')^2}$, $M = \sqrt{1 + (y')^2}$. Let us use the “no x” result:

\[ (L + \lambda M)_{y'}\, y' - (L + \lambda M) = (y + \lambda)\Big( \frac{(y')^2}{\sqrt{1 + (y')^2}} - \sqrt{1 + (y')^2} \Big) = -\frac{y + \lambda}{\sqrt{1 + (y')^2}} \]

must be constant, so $y + \lambda = c\sqrt{1 + (y')^2}$ for some constant $c$. Squaring and solving for $(y')^2$ we have

\[ (y')^2 = \frac{(y + \lambda)^2}{c^2} - 1 \]

or

\[ \frac{dy}{\sqrt{(y + \lambda)^2 - c^2}} = \frac{dx}{c} \]

Integrating this (using an integration table) gives

\[ \cosh^{-1}\Big( \frac{y + \lambda}{c} \Big) = \frac{x + d}{c} \]

where $d$ is another constant. Therefore

\[ y = c\cosh\Big( \frac{x + d}{c} \Big) - \lambda \]

which is the catenary curve (2.3) modulo parallel translation in the $(x, y)$-plane.
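The catenary formula in 2.10 can be checked directly against the first integral y + λ = c√(1 + (y')²). The sketch below does this at a few sample points; the constants c, d, λ and the sample abscissas are arbitrary illustrative values (with c > 0).

```python
import math

# Arbitrary illustrative constants (c must be positive here).
c, d, lam = 0.8, 0.3, 1.2

def y(x):
    return c * math.cosh((x + d) / c) - lam

def dy(x):
    # Derivative of y: c * sinh((x + d)/c) * (1/c)
    return math.sinh((x + d) / c)

# Residual of the first integral y + lambda - c*sqrt(1 + (y')^2).
residuals = [y(x) + lam - c * math.sqrt(1 + dy(x) ** 2)
             for x in (-1.0, 0.0, 0.5, 2.0)]
print(residuals)
```

The residuals vanish up to rounding because cosh² − sinh² = 1.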

2.11 $L = T - U = \frac{1}{2} m(\dot{x}^2 + \dot{y}^2) - mgy$, $M = x^2 + y^2 - \ell^2$. Applying the Euler-Lagrange equation to $L + \lambda(t)M$ componentwise (i.e., separately for $x$ and for $y$), we easily get the equations

\[ m\ddot{x} = 2\lambda(t)x \]
\[ m\ddot{y} = 2\lambda(t)y - mg \]

It is difficult to obtain the equations of motion from these equations directly since we don’t know what $\lambda(t)$ is. On the other hand, if we make the substitution $x/y = \tan\theta$ then after some calculations $\lambda(t)$ cancels out and we recover (2.55).

2.12 We use the notation introduced in the solution to Exercise 2.3. The $o(\alpha^2)$ term can be written as

\[ \frac{1}{2} \int_a^b \big( (\bar{L}_{yy} - L_{yy})\eta^2 + 2(\bar{L}_{yy'} - L_{yy'})\eta\eta' + (\bar{L}_{y'y'} - L_{y'y'})(\eta')^2 \big)\,dx \cdot \alpha^2 \]

We can integrate the middle term by parts to bring this expression into the desired form, with

\[ \bar{P} := \frac{1}{2}(\bar{L}_{y'y'} - L_{y'y'}), \qquad \bar{Q} := \frac{1}{2}(\bar{L}_{yy} - L_{yy}) - \frac{1}{2}\frac{d}{dx}(\bar{L}_{yy'} - L_{yy'}) \]

Arguing as in the solution to Exercise 2.3, we can show that $\bar{P}$ and $\bar{Q}$ are of order $o(\|\alpha\eta\|_1)$, from which the claim follows, but not $o(\|\alpha\eta\|_0)$. Alternatively, if we want to avoid integration by parts, we can go one step further in the Taylor expansion and express the $o(\alpha^2)$ term in terms of third-order partial derivatives:

\[ \frac{1}{6} \int_a^b \big( \bar{L}_{yyy}\eta^3 + 3\bar{L}_{yyy'}\eta^2\eta' + 3\bar{L}_{yy'y'}\eta(\eta')^2 + \bar{L}_{y'y'y'}(\eta')^3 \big)\,dx \cdot \alpha^3 \]

We can then define

\[ \bar{P} := \frac{\alpha}{2}\bar{L}_{yy'y'}\eta + \frac{\alpha}{6}\bar{L}_{y'y'y'}\eta', \qquad \bar{Q} := \frac{\alpha}{6}\bar{L}_{yyy}\eta + \frac{\alpha}{2}\bar{L}_{yyy'}\eta' \]

and these still have the required properties.

2.13 This is discussed (with some proof details omitted) in Section 27 of [GF63]; see also Exercise 9.38 in [Mac05]. The Euler-Lagrange equation for $v$ is

\[ L_y(x, y + v, y' + v') = \frac{d}{dx} L_{y'}(x, y + v, y' + v') \tag{7.40} \]

In the notation of the solution to Exercise 2.3 (after setting $\alpha = 1$ there and with $v$ in place of $\eta$) the left-hand side of (7.40) is

\[ L_y + \bar{L}_{yy} v + \bar{L}_{yy'} v' \tag{7.41} \]

As for the right-hand side of (7.40), we first write

\[ L_{y'}(x, y + v, y' + v') = L_{y'} + \bar{L}_{yy'} v + \bar{L}_{y'y'} v' \]

and then, taking $\frac{d}{dx}$, we get

\[ \frac{d}{dx} L_{y'} + \frac{d}{dx}\bar{L}_{yy'}\, v + \bar{L}_{yy'} v' + \frac{d}{dx}\bar{L}_{y'y'}\, v' + \bar{L}_{y'y'} v'' \tag{7.42} \]

On the other hand, the desired equation (2.68), in view of the definitions of $P$ and $Q$, is

\[ L_{yy} v - \frac{d}{dx} L_{yy'}\, v - \frac{d}{dx} L_{y'y'}\, v' - L_{y'y'} v'' \approx 0 \tag{7.43} \]

where $\approx$ is equality up to higher-order terms that we need to study. Writing down that (7.41) and (7.42) are equal, and also that $y$ satisfies the Euler-Lagrange equation, we get some cancellations and indeed arrive at (7.43), where the higher-order terms on the right-hand side are

\[ (L_{yy} - \bar{L}_{yy}) v + \Big( \frac{d}{dx}\bar{L}_{yy'} - \frac{d}{dx} L_{yy'} \Big) v + \Big( \frac{d}{dx}\bar{L}_{y'y'} - \frac{d}{dx} L_{y'y'} \Big) v' + (\bar{L}_{y'y'} - L_{y'y'}) v'' \]

As in the previous exercises, we see that these are of order $o(\|v\|)$, but the norm must be the 2-norm due to the presence of $v''$ (even though [GF63] seems to imply that this should work with the 1-norm).

2.14 With the action Lagrangian, we have $L_{\dot{q}\dot{q}} = m I_{3 \times 3}$, which is positive definite. Also, if the time interval is sufficiently small, the presence of conjugate points on it can be ruled out. Thus the second-order sufficient condition applies (after a proper change of notation) and shows that extremals are automatically minima.

Chapter 3

3.1 Close to the corner point we are changing the derivative: in the interval $[c, c + \alpha\Delta x]$ between the old and the new corner point we have $\|y_\alpha - y\|_1 \approx |y_1' - y_2'|$. So, $y_\alpha$ is not close to $y$ with respect to the 1-norm. But to derive the first W-E corner condition we can work with $\Delta x = 0$. Since $\Delta y$ can still be arbitrary, continuity of $L_{y'}$ still follows (see the last display in the proof).

3.2 We have $L = (y')^3$, $L_{y'} = 3(y')^2$, $y' L_{y'} - L = 2(y')^3$. Since this is the “no x” case, the quantity $2(y')^3$, and hence $y'$ itself, must be constant along each smooth piece, so the piecewise $C^1$ extremals must be piecewise linear. Moreover, to satisfy both W-E corner conditions, the slope cannot change at the corners, so they must be straight lines. The only line satisfying the boundary conditions is $y \equiv 0$. It is neither a weak nor a strong minimum (this is easy to see by constructing perturbations). This example appears on pp. 210 and 220 in [Mac05].
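One way to construct the perturbations mentioned in 3.2 is with an asymmetric tent function. The sketch below assumes, for illustration only, that the interval is [0, 1] with boundary conditions y(0) = y(1) = 0 (the solution does not restate them), and takes η to be a tent with its peak at x = 1/3. Since η' is piecewise constant, the cost J(αη), the integral of (αη')³, is a two-term sum proportional to α³, so it changes sign with α while the 1-norm of αη goes to 0.

```python
def J_tent(alpha, peak=1.0 / 3.0, height=1.0):
    """Cost of y = alpha * eta for L = (y')**3, eta a tent with eta(0) = eta(1) = 0."""
    left_slope = alpha * height / peak
    right_slope = -alpha * height / (1.0 - peak)
    # y' is piecewise constant, so the integral of (y')**3 is a two-term sum.
    return peak * left_slope ** 3 + (1.0 - peak) * right_slope ** 3

# With the default tent this equals 6.75 * alpha**3.
costs = [J_tent(a) for a in (-0.1, -0.01, 0.01, 0.1)]
print(costs)
```

Negative values of α give curves arbitrarily close to 0 in the 1-norm with negative cost, so y ≡ 0 is not a weak minimum (and hence not a strong one).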

3.3 The first W-E corner condition follows immediately from the integral form of the Euler-Lagrange equation: since $L_{y'}$ is an integral, it must be continuous.

The Weierstrass necessary condition was stated for non-corner points. However, by taking the limit from the left/right approaching a corner point and using continuity of $L_{y'}$, we easily see that it also holds at corner points in the sense that $E(x, y(x), y'(x^\pm), w) \ge 0$.

To show the second W-E corner condition at a corner point $x$, let us apply the above Weierstrass necessary condition at $x^+$ with $w := y'(x^-)$. We get $E(x, y(x), y'(x^+), y'(x^-)) \ge 0$. Recall that $E(x, y, z, w) = L(x, y, w) - L(x, y, z) - (w - z)L_{y'}(x, y, z)$. By continuity of $L_{y'}$, we have $L_{y'}(x, y(x), y'(x^+)) = L_{y'}(x, y(x), y'(x^-))$, and so we can rewrite the previous inequality as

\[ y'(x^-)L_{y'}(x, y(x), y'(x^-)) - L(x, y(x), y'(x^-)) \le y'(x^+)L_{y'}(x, y(x), y'(x^+)) - L(x, y(x), y'(x^+)) \]

Interchanging the roles of $x^+$ and $x^-$, we arrive at the same inequality with the opposite sign. Thus it must actually be equality, which is exactly the second W-E corner condition.

Finally, to show Legendre’s condition, write the second-order Taylor expansion of $L$:

\[ L(x, y(x), w) = L(x, y(x), y'(x)) + (w - y'(x))L_{y'}(x, y(x), y'(x)) + \frac{1}{2}(w - y'(x))^2 L_{y'y'}(x, y(x), v) \]

where $v$ is between $y'(x)$ and $w$. The Weierstrass necessary condition then implies that $L_{y'y'}(x, y(x), v) \ge 0$. In the limit as $w \to y'(x)$ we obtain Legendre’s condition. Note that here we’re only using the Weierstrass necessary condition locally, i.e., for $w$ close to $y'$, which confirms that it holds for weak minima as well. Also, corner points can be handled as before by working with $x^\pm$. See also [Mac05], pp. 216–217.

3.4 The first set of hypotheses is: $f$ is continuous in $t$ and $u$ and $C^1$ in $x$; $f_x$ is continuous in $t$ and $u$; and $u(\cdot)$ is piecewise continuous in $t$. Consider $\bar{f}$ as in (3.20). It is piecewise continuous in $t$ because (continuous)∘(piecewise continuous) is piecewise continuous. We also clearly have that $\bar{f}$ is $C^1$ in $x$ and $\bar{f}_x$ is piecewise continuous in $t$. Therefore, existence and uniqueness of solutions is guaranteed for $\dot{x} = \bar{f}(t, x)$ and hence for the control system under every piecewise continuous control.

The second set of hypotheses is: $f$ is continuous in $t$ and $u$; $u(\cdot)$ is piecewise continuous in $t$; and we have, locally in $(t, x, u)$, the Lipschitz property

\[ |f(t, x_1, u) - f(t, x_2, u)| \le L|x_1 - x_2| \tag{7.44} \]

If $t$ is bounded then, since $u$ is piecewise continuous, $u(t)$ is also bounded. Thus we have, locally in $(t, x)$,

\[ |\bar{f}(t, x_1) - \bar{f}(t, x_2)| = |f(t, x_1, u(t)) - f(t, x_2, u(t))| \le L|x_1 - x_2| \]

by (7.44). The rest is as before.

Further relaxation: $f$ and $f_x$ can be only piecewise continuous in $t$. On the other hand, continuity of $f$ in $u$ cannot be relaxed to piecewise continuity, because (piecewise continuous)∘(piecewise continuous) is not necessarily piecewise continuous (just think of a composite function $f \circ g$ where $g$ is constant and equal to a value of discontinuity of $f$). See [Kha02, Son98, AF66] for more information (specific places in these books are given in the Notes and References).

3.5 The Hamiltonian is $H = p_1 u_1 + \cdots + p_n u_n - L(t, x_1, \ldots, x_n, u_1, \ldots, u_n)$. The adjoint equations are $\dot{p}_i^* = L_{x_i}|_*$. We have $H_{u_i}|_* = p_i^* - L_{\dot{x}_i}|_*$ and this must be 0, hence $p_i^* = L_{\dot{x}_i}|_*$. Comparing this with the adjoint equation, we get $\frac{d}{dt} L_{\dot{x}_i}|_* = L_{x_i}|_*$, which are the Euler-Lagrange equations.

3.6 For simplicity, let’s start with the case $n = 2$, $k = 1$. In this case $\dot{x}_1 = f(x_1, x_2, u)$, $\dot{x}_2 = u$, $L = L(x_1, x_2, \dot{x}_1, \dot{x}_2)$, $H = p_1 f(x_1, x_2, u) + p_2 u - L(x_1, x_2, f(x_1, x_2, u), u)$, and we must have (omitting stars)

\[ 0 = H_u = p_1 f_u + p_2 - L_{\dot{x}_1} f_u - L_{\dot{x}_2} \tag{7.45} \]

The adjoint equation is

\[ \dot{p}_1 = -H_{x_1} = -p_1 f_{x_1} + L_{x_1} + L_{\dot{x}_1} f_{x_1} \]
\[ \dot{p}_2 = -H_{x_2} = -p_1 f_{x_2} + L_{x_2} + L_{\dot{x}_1} f_{x_2} \]

Define

\[ \lambda^* = p_1^* - L_{\dot{x}_1}|_* \]

The augmented Lagrangian is

\[ L(x_1, x_2, \dot{x}_1, \dot{x}_2) + (p_1 - L_{\dot{x}_1})(\dot{x}_1 - f(x_1, x_2, \dot{x}_2)) =: \mathcal{L} \]

Calculating its partial derivatives:

\[ \mathcal{L}_{x_1} = L_{x_1} - (p_1 - L_{\dot{x}_1}) f_{x_1}, \qquad \mathcal{L}_{x_2} = L_{x_2} - (p_1 - L_{\dot{x}_1}) f_{x_2}, \]
\[ \mathcal{L}_{\dot{x}_1} = L_{\dot{x}_1} + p_1 - L_{\dot{x}_1} = p_1, \qquad \mathcal{L}_{\dot{x}_2} = L_{\dot{x}_2} - (p_1 - L_{\dot{x}_1}) f_u = p_2 \]

where the last equality follows from (7.45). This gives

\[ \frac{d}{dt}\mathcal{L}_{\dot{x}_1} = \dot{p}_1 = -p_1 f_{x_1} + L_{x_1} + L_{\dot{x}_1} f_{x_1} = \mathcal{L}_{x_1} \]

which is the first Euler-Lagrange equation, and next,

\[ \frac{d}{dt}\mathcal{L}_{\dot{x}_2} = \dot{p}_2 = -p_1 f_{x_2} + L_{x_2} + L_{\dot{x}_1} f_{x_2} = \mathcal{L}_{x_2} \]

which is the second Euler-Lagrange equation. In the general case, the definition of the Lagrange multipliers is

\[ \lambda_i^* = p_i^* - L_{\dot{x}_i}|_*, \qquad i = 1, \ldots, k \]

The calculations are similar but more tedious. They can be found in Chapter 5 of [PBGM62].

3.7 Straightforward calculation.

3.8 a) $H = p^T A x + p^T B u - x^T Q x - u^T R u$. The adjoint equation is $\dot{p} = -H_x = -A^T p + 2Qx$. Since $0 = H_u = B^T p - 2Ru$, the formula for the optimal control is $u^* = \frac{1}{2} R^{-1} B^T p^*$.

b) The Hessian of $H$ is

\[ \nabla^2 H = -\begin{pmatrix} 2Q & 0 \\ 0 & 2R \end{pmatrix} \]

The second variation is $\delta^2 J|_{u^*}(\xi) = \int_{t_0}^{t_1} (\eta^T Q \eta + \xi^T R \xi)\,dt$ and this is positive for each nonzero control perturbation $\xi$. In fact, due to the linear-quadratic structure of the problem, here the perturbed trajectory depends linearly on $\alpha$: $x = x^* + \alpha\eta$ (clear, e.g., from the variation of constants formula), and there are in fact no terms of order higher than $\alpha^2$ in $J(u^* + \alpha\xi) - J(u^*)$. So, positivity of the second variation allows us to conclude that $u^*$ is indeed a minimum, and in fact a global one. See [AF66], Corollary 5-2 and Exercise 5-13, pp. 270-271, and solution on p. 763. More on this class of problems in Chapter 6.
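The formula u* = (1/2) R^{-1} B^T p* from part a) of 3.8 can be checked in the scalar case by brute force. In the sketch below, the numbers a, b, q, r and the sample values of x and p are arbitrary illustrative choices; we scan a grid of controls centered at the candidate u* and confirm that the Hamiltonian H = pax + pbu − qx² − ru² is largest there.

```python
def H(x, p, u, a=-1.0, b=2.0, q=1.0, r=0.5):
    """Scalar version of H = p*A*x + p*B*u - x*Q*x - u*R*u (illustrative coefficients)."""
    return p * a * x + p * b * u - q * x * x - r * u * u

def u_star(p, b=2.0, r=0.5):
    """Candidate maximizer from H_u = b*p - 2*r*u = 0."""
    return 0.5 * b * p / r

x, p = 0.7, -1.3                      # arbitrary sample state and costate
u_opt = u_star(p)

# Brute-force check on a grid of controls centered at u_opt.
grid = [u_opt + 0.01 * k for k in range(-300, 301)]
u_best = max(grid, key=lambda u: H(x, p, u))
print(u_opt, u_best)
```

Because r > 0 makes H strictly concave in u, the stationary point is the unique maximizer, which is what the scan reports.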

Chapter 4

4.1 A solution can be found in [SW97]. First, note that the brachistochrone is a free-time, fixed-endpoint problem. So, we should apply the maximum principle for the Basic Fixed-Endpoint Control Problem. The Hamiltonian is $H = p_1 u_1 \sqrt{y} + p_2 u_2 \sqrt{y} - p_0 \equiv 0$ along the optimal trajectory. It is maximized by choosing the control vector to be proportional to the adjoint vector and satisfying the magnitude constraint:

\[ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \frac{1}{\sqrt{p_1^2 + p_2^2}} \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} \]

The adjoint equation is

\[ \dot{p}_1 = 0 \]
\[ \dot{p}_2 = -\frac{p_1 u_1 + p_2 u_2}{2\sqrt{y}} = -\frac{|p|}{2\sqrt{y}} \]

We see that $p_1$ is a constant. If it is 0, then $u_1$ is also identically 0 and we get a vertical downward motion. Otherwise, $u_1$ is never 0 and we can parameterize the curves⁴ by $x$. We have $y'(x) = p_2/p_1$ and

\[ y''(x) = \frac{1}{p_1}\frac{dp_2}{dx} = \frac{1}{p_1}\frac{\dot{p}_2}{\dot{x}} = -\frac{|p|^2}{2p_1^2\, y} \]

From this we get the equation

\[ 1 + (y')^2 + 2yy'' = 0 \]

which is the Euler-Lagrange equation for the brachistochrone, and we already know that its solutions are cycloids. The issue of existence of optimal curves still needs to be settled, and this can be done using the existence results in Section 4.5.

⁴ In calculus of variations, we tacitly assumed that $y = y(x)$ is a well-defined (single-valued) function; here it is part of the proof, which is an advantage of the optimal control approach.
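The Hamiltonian maximization step in 4.1 can be illustrated by brute force. The sketch below scans unit controls u = (cos t, sin t) and compares the maximizer of the u-dependent part of H with the normalized adjoint vector p/|p|; the constant term involving p0 is dropped because it does not affect the maximizer. The values of p1, p2, and y are arbitrary illustrative choices.

```python
import math

def control_term(p1, p2, angle, y):
    """The u-dependent part of H for the unit control u = (cos(angle), sin(angle))."""
    return (p1 * math.cos(angle) + p2 * math.sin(angle)) * math.sqrt(y)

p1, p2, y = 0.6, -1.1, 2.0            # arbitrary sample adjoint vector and height

# Scan the unit circle in steps of 0.1 degree.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=lambda t: control_term(p1, p2, t, y))

norm_p = math.hypot(p1, p2)
u_scan = (math.cos(best), math.sin(best))
u_formula = (p1 / norm_p, p2 / norm_p)
max_value = control_term(p1, p2, best, y)
print(u_scan, u_formula, max_value, norm_p * math.sqrt(y))
```

The maximal value agrees with |p|√y up to the grid resolution, consistent with the identity p1 u1 + p2 u2 = |p| used in the adjoint equation.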


4.2 This is Exercise 5-21 in [AF66, p. 344].

4.3 Otherwise we can, starting at $y(t_3)$, apply the time-shifted version of the optimal control piece $u^*_{[t_2, t^*]}$ and get a lower cost. More formally, if $t_3 = t_2 + \tau$, then consider

\[ \bar{u}(t) := u^*(t - \tau), \qquad t \in [t_3, t^* + \tau] \]

Then the corresponding trajectory, starting at $x = x^*(t_2)$ at $t = t_3$, hits $S'$ at time $t^* + \tau$ with cost lower than $y^*(t^*)$, a contradiction. Time-independence of the dynamics and the cost is important here:

\[ \int_{t_2 + \tau}^{t^* + \tau} L(x, u)\,dt = \int_{t_2}^{t^*} L(x, u)\,dt \]

In particular, no other trajectory starting from $y^*(t_1) = (x^{0,*}(t_1), x^*(t_1))$ can hit $S'$ below $y^*(t^*)$.


4.4 Apply w1 on the interval (b − ε(a1 + a2 )b, b − εa2 b] and w2 on the interval (b − εa2 b, b]. We get y(b) = y ∗ (b) + εΦ∗ (b, b − εa2 b)νb−εa2 b (w1 )a1 + ενb2 (w2 )a2 + o(ε) Simply by continuity (and since b is not a discontinuity of u∗ ) we can write Φ∗ (b, b − εa2 b) = I + O(ε) and νb−εa2 b (w1 ) = νb (w1 ) + O(ε). Hence y(b) = y ∗ (b) + ενb (w1 )a1 + ενb2 (w2 )a2 + o(ε) and we’re done. See [PBGM62].

4.5 The warping map (let’s call it $F$) is continuous because the terminal points depend continuously on the perturbation parameters which parameterize the ball $B_\varepsilon$; see [PBGM62]. For any $\alpha \in (0, \varepsilon)$, the $o(\varepsilon)$ terms satisfy $|o(\varepsilon)| < \alpha\varepsilon$ if $\varepsilon$ is small enough. For an arbitrary $z$ in the $(1 - \alpha)\varepsilon$ ball, we want to find a $y$ in $B_\varepsilon$ such that $F(y) = z$, which is equivalent to $y = y - F(y) + z$. The map $G(y) := y - F(y) + z$ is continuous and maps $B_\varepsilon$ to itself, hence it has a fixed point by Brouwer’s fixed point theorem. See [Sus00], Lemma 5.3.1 ($r \leftrightarrow \varepsilon$, $\rho \leftrightarrow o(\varepsilon)$), also [LM67].

4.6 For $t' > t$,

\[ \lim_{t' \to t} \frac{m(x^*(t'), p^*(t')) - m(x^*(t), p^*(t))}{t' - t} \]
\[ \ge \lim_{t' \to t} \frac{H(x^*(t'), p^*(t'), u^*(t)) - H(x^*(t), p^*(t), u^*(t))}{t' - t} \qquad \text{(now the arguments of } u^* \text{ are the same!)} \]
\[ = \langle H_x|_*, \dot{x}^*(t) \rangle + \langle H_p|_*, \dot{p}^*(t) \rangle = \langle H_x|_*, H_p|_* \rangle + \langle H_p|_*, -H_x|_* \rangle = 0 \]

If we use $u^*(t')$ instead of $u^*(t)$, we can check that the inequality is flipped: $\frac{dm}{dt}\big|_t \le 0$. We know that $m$ must be differentiable as a function of time ⇒ the time derivative is 0.

4.7

\[ \dot{p}^* = -(f_x)^T|_*\, \bar{p}^* + K_{xx}|_*\, f|_* + (f_x)^T|_*\, K_x|_* - K_{xx}|_*\, f|_* = -(f_x)^T|_*\, p^* = -H_x|_* \]

For the second question, the reason is the same as the one behind (4.33): the above is a homogeneous LTV system, and the terminal condition is nonzero (we assumed this).

4.8 We can eliminate the terminal cost by adding $\langle K_x, f \rangle$ to the running cost. In the new problem, the Hamiltonian will be

\[ H = \langle \bar{p}, f \rangle + p_0 (L + \langle K_x, f \rangle) + p_{n+1} = \langle \bar{p} + p_0 K_x, f \rangle + p_0 L + p_{n+1}. \]

The transversality condition for this new problem is $\langle \bar{p}(t_f), d \rangle = 0$ for all $d \in T_{x_f} S_1$. Now we can go back to the original problem by defining $p(t) := \bar{p}(t) + p_0 K_x(x(t))$, which gives the transversality condition

\[ \langle p^*(t_f) - p_0^* K_x(x_f), d \rangle = 0 \qquad \forall\, d \in T_{x_f} S_1 \]

where $p_0^* \le 0$. The canonical equations and the Hamiltonian maximization condition are the same as before. For the Hamiltonian we have $H|_*(t) = -\int_t^{t_f} H_t|_*(s)\,ds$ and the boundary condition $H|_*(t_f) = 0$. Theorem 5-11 in [AF66] is close but not exactly the same, and doesn’t have full details.


4.9 Perturbations of the initial points lead to infinitesimal directions of the form $(0, d_1)$ in $y$-space, which when propagated up to time $t^*$ end up in the terminal cone, so we have (propagating back to initial time) $\langle p^*(t_0), d_1\rangle \le 0$. On the other hand, at the final time we have as before $\langle p^*(t^*), d_2\rangle \ge 0$, or $\langle -p^*(t^*), d_2\rangle \le 0$. Here, the vector $d = (d_1, d_2)$ is tangent to $S_2$. Combining, and using the fact that $-d$ is also in the tangent space, we get (4.46). Note that we cannot pass from inequality to equality before combining the two inequalities into one, since $d_1$ and $d_2$ may be related; e.g., $(d_1, -d_2)$ might not be in the tangent space to $S_2$.


4.10 The difference with the example in the text is that now we'll have the transversality condition $p_2^*(t_f) = 0$. Since $p_2^*$ is a nonzero linear function of time, this means that $p_2^*(t) \ne 0$ for $t < t_f$. Therefore, the optimal control is constant (doesn't switch). For $x_0 < 0$ it is $u^* \equiv 1$ and for $x_0 > 0$ it is $u^* \equiv -1$. In terms of car motions, this means: apply maximal acceleration/braking to turn around if necessary and slam into the wall at 0. This is [Sus00], "hard landing", Handout 7, p. 5.


4.11 The adjoint equation is $\dot p_1 = p_2$, $\dot p_2 = -p_1$. So, $u^*(t) = \operatorname{sgn}(c_1 \sin t + c_2 \cos t)$ for some constants $c_1, c_2$. We see that if we normalize by $\sqrt{c_1^2 + c_2^2}$ and use the identity $\sin^2 t + \cos^2 t = 1$, we can simplify the expression to $u^*(t) = \operatorname{sgn}(\sin(t + d))$ where $d = \arctan(c_2/c_1)$. So the control is bang-bang, and the switches occur $\pi$ seconds apart. When $u = 1$, we have $\dot x_1 = x_2$, $\dot x_2 = -x_1 + 1$ and the trajectories are circles centered at $(1, 0)$. When $u = -1$, we have $\dot x_1 = x_2$, $\dot x_2 = -x_1 - 1$ and we get circles centered at $(-1, 0)$. For further details, see [PBGM62, §5] or [Kno81, p. 22].
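The phase-shift step in 4.11 can be checked numerically. The sketch below is only an illustration: the function names and the sample constants are hypothetical, and it uses `atan2(c2, c1)` for the phase, which coincides with $\arctan(c_2/c_1)$ when $c_1 > 0$ and is my choice for handling the sign of $c_1$ in general.

```python
import math

def switching_signal(c1, c2, t):
    """Argument of sgn in u*(t) = sgn(c1 sin t + c2 cos t)."""
    return c1 * math.sin(t) + c2 * math.cos(t)

def amplitude_phase(c1, c2):
    """Return (R, d) such that c1 sin t + c2 cos t = R sin(t + d)."""
    R = math.hypot(c1, c2)
    d = math.atan2(c2, c1)  # equals arctan(c2/c1) when c1 > 0
    return R, d

def switch_times(c1, c2, t_max):
    """Zeros of sin(t + d) in [0, t_max], i.e. t = k*pi - d for integer k."""
    _, d = amplitude_phase(c1, c2)
    k = math.ceil(d / math.pi)
    times = []
    while k * math.pi - d <= t_max:
        times.append(k * math.pi - d)
        k += 1
    return times
```

Consecutive entries returned by `switch_times` differ by $\pi$, which is the "switches occur $\pi$ seconds apart" claim.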


4.12 We have
\[
\langle p^*(t^*), e^{A(t^* - t)} B u^*(t)\rangle = \max_{u \in U} \langle p^*(t^*), e^{A(t^* - t)} B u(t)\rangle.
\]
By controllability, the vector $\nu^*(t) := \langle p^*(t^*), e^{A(t^* - t)} B\rangle$ is not identically 0. The unit ball is strictly convex, so $u^*$ is unique for almost all $t$ and is on the boundary of the ball. In fact, it's given by the formula $u^*(t) = \nu^*(t)/|\nu^*(t)|$. See Section 10.3 in [Son98].
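The formula $u^* = \nu^*/|\nu^*|$ is just the maximizer of a linear functional over the Euclidean unit ball. As a small illustrative sketch (the helper names and the sample vector are hypothetical, and the ball is taken to be the closed Euclidean unit ball), one can compare it against randomly sampled points of the ball:

```python
import math
import random

def inner(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def unit_ball_maximizer(nu):
    """Maximizer of <nu, u> over |u| <= 1 when nu != 0: u = nu / |nu|."""
    norm = math.sqrt(inner(nu, nu))
    if norm == 0.0:
        raise ValueError("nu = 0: every u in the ball is a maximizer")
    return [v / norm for v in nu]

def random_point_in_ball(dim, rng):
    """Rejection-sample a point of the closed unit ball."""
    while True:
        u = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if inner(u, u) <= 1.0:
            return u
```

The exception for `nu = 0` mirrors the role of controllability in the argument: the maximizer is only determined at times where $\nu^*(t) \ne 0$.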

4.13 Answer:
\[
\dot\varphi_i = \langle p^*, [f, g_i]\rangle|_* + \sum_{j \ne i} \langle p^*, [g_j, g_i]\rangle|_*\, u_j^*.
\]


4.14 Repeating the calculation in the text, we arrive at the condition that $[g, (\operatorname{ad} f)^m(g)]$ should be a linear combination of $(\operatorname{ad} f)^i(g)$, $i = 0, \dots, m+1$:
\[
[g, (\operatorname{ad} f)^m(g)](x) = \sum_{i=0}^{m+1} \alpha_i(x)\,(\operatorname{ad} f)^i(g)(x)
\]
such that $|\alpha_{m+1}(x)| < 1$ for all $x$. See [Sus79, Sus83] for details.


4.15 The following example is taken from [Sus83]. Consider the time-optimal problem in $\mathbb{R}^3$:
\[
\dot x_1 = x_2, \qquad \dot x_2 = u, \qquad \dot x_3 = x_1^2, \qquad |u| \le 1.
\]
Goal: $(x_1(0), x_2(0), -J^*) \mapsto (0, 0, 0)$, where $J^*$ is the optimal cost in Fuller's problem. Then, we have exactly one trajectory that meets the boundary conditions $\Rightarrow$ it's automatically time-optimal, and it has infinitely many switches. So, the property stated earlier for time-optimal controls in $\mathbb{R}^2$ does not extend to higher dimensions.


4.16 This is Example 3.5 in [BP07]. We can reach points arbitrarily close to $(0, 0)$, of the form $(0, \varepsilon)$, but $(0, 0) \notin R^t$ because $\varepsilon > 0$ always ($x_1$ cannot remain at 0). The problem of minimizing $x_2(t_f)$ has no solution. For the convex case, we get a larger reachable set which will now be closed by Filippov's theorem. In fact, this larger reachable set will be exactly the closure of the original one, i.e., the original one is dense in the new one. This is a special instance of the so-called relaxation theorem. Note, however, that for linear systems the reachable set is closed without convexification of $U$; see, e.g., [BP07, p. 67].


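Chapter 5

5.1 Solution from [YZ99]:
\[
V(t, x) \le J(t, x, u) = \int_t^{t+\Delta t} L\,ds + J(t + \Delta t, x(t + \Delta t), u) \qquad (u \text{ is arbitrary}).
\]
Take inf over $u$ on the right-hand side, noting that $\inf_{u_{[t,t_1]}} = \inf_{u_{[t,t+\Delta t]}} \inf_{u_{[t+\Delta t,t_1]}}$ and the second infimum can be moved inside because the integral doesn't depend on it. This gives $V(t, x) \le \overline V(t, x)$, where $\overline V$ denotes the right-hand side of the principle of optimality.

Solution from [BP07]: For arbitrary $\varepsilon > 0$, there exists $u_{[t,t+\Delta t]}$ such that
\[
\int_t^{t+\Delta t} L\,ds + V(t + \Delta t, x(t + \Delta t)) \le \overline V + \varepsilon
\]
and there exists $u_{[t+\Delta t,t_1]}$ such that
\[
J(t + \Delta t, x(t + \Delta t), u_{[t+\Delta t,t_1]}) \le V(t + \Delta t, x(t + \Delta t)) + \varepsilon.
\]
Taking $u$ to be the concatenation of these two controls, we get $J(t, x, u) \le \overline V + 2\varepsilon$, hence $V = \inf J$ must be $\le \overline V$.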


5.2 The value function $V$ will not depend on $t$, which simplifies the HJB equation (left-hand side becomes 0). Solving HJB directly is hard. However, we can find $V$ from the solution based on the maximum principle in Section 4.4.1, by calculating the time it takes to reach the origin along the optimal trajectory from an arbitrary initial condition. In fact, since $|\dot x_2| = |u| = 1$, this time equals the total vertical distance traveled along the trajectory. This calculation is done in Example 7.3 in [BP07], p. 146. The formula for the value function is
\[
V(x_1, x_2) =
\begin{cases}
x_2 + 2\sqrt{x_1 + \tfrac12 x_2^2} & \text{if } x_1 \ge -\tfrac12 |x_2| x_2 \\[4pt]
-x_2 + 2\sqrt{-x_1 + \tfrac12 x_2^2} & \text{if } x_1 < -\tfrac12 |x_2| x_2
\end{cases}
\]

It can be directly verified that this V solves the HJB equation.
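One way to carry out this verification numerically is sketched below. It assumes the parking example is the double integrator $\dot x_1 = x_2$, $\dot x_2 = u$, $|u| \le 1$ with running cost 1, so that the stationary HJB equation reads $0 = 1 + V_{x_1} x_2 - |V_{x_2}|$; the function names and the finite-difference step are illustrative choices, not part of the text.

```python
import math

def V(x1, x2):
    """Candidate value function (minimal time to the origin) from the formula above."""
    if x1 >= -0.5 * abs(x2) * x2:
        return x2 + 2.0 * math.sqrt(x1 + 0.5 * x2 * x2)
    return -x2 + 2.0 * math.sqrt(-x1 + 0.5 * x2 * x2)

def hjb_residual(x1, x2, h=1e-6):
    """1 + V_x1 * x2 - |V_x2|, with partials from central differences.

    This equals 1 + min over |u| <= 1 of (V_x1 * x2 + V_x2 * u). It should
    vanish wherever V is differentiable, i.e. off the switching curve.
    """
    V1 = (V(x1 + h, x2) - V(x1 - h, x2)) / (2.0 * h)
    V2 = (V(x1, x2 + h) - V(x1, x2 - h)) / (2.0 * h)
    return 1.0 + V1 * x2 - abs(V2)
```

The check is only meaningful at points away from the switching curve $x_1 = -\tfrac12 |x_2| x_2$, where $V$ fails to be differentiable and the finite differences would straddle the two branches.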



5.3 This is a simple manipulation of things already mentioned. Since $\widehat V$ is the cost for the given control, it must satisfy the HJB equation with $\hat u$ plugged in, as in the second line of (5.13). Since it also satisfies the HJB equation with the inf, we conclude that $\hat u$ satisfies the H-maximization condition. Now we can invoke the sufficient condition. Alternatively, we can just follow the proof of the sufficient condition (the first part is not needed, just the second part).


5.4 Sufficiency consists of three conditions. First, the infinite-horizon version of the HJB equation:
\[
0 = \inf_{u \in U} \bigl\{ L(t, x, u) + \langle \widehat V_x(t, x), f(t, x, u)\rangle \bigr\}
\]
(no boundary condition). Second, the minimization condition
\[
L(t, \hat x(t), \hat u(t)) + \langle \widehat V_x(t, \hat x(t)), f(t, \hat x(t), \hat u(t))\rangle = \min_{u \in U} \bigl\{ L(t, \hat x(t), u) + \langle \widehat V_x(t, \hat x(t)), f(t, \hat x(t), u)\rangle \bigr\}.
\]
And third, we must assume that $\widehat V(x(t)) \to 0$ as $t \to \infty$ along every trajectory $x(\cdot)$ for which the infinite-horizon cost $\int_{t_0}^{\infty} L\,dt$ is bounded. This last condition is unpleasant (it's not easy to check) but without it the proof doesn't work. With this condition, however, the proof works almost exactly as before. Alternatively, we can assume that $\widehat V(0) = 0$ and all bounded-cost trajectories converge to 0 (although this is stronger and not any easier to check).

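5.5
\[
\begin{aligned}
\dot p^* &= -\frac{d}{dt}\bigl(V_x|_*\bigr) = -V_{tx}|_* - \underbrace{V_{xx}|_*}_{\text{matrix}} \cdot\, \dot x^* \\
&= \text{(swapping partials)}\ -V_{xt}|_* - V_{xx}|_* \cdot f|_* \\
&= -\frac{\partial}{\partial x}\Bigl(\underbrace{V_t + \langle V_x, f\rangle}_{=-L \text{ by HJB}}\Bigr)\Big|_* + (f_x)^T|_*\, \underbrace{V_x|_*}_{=-p^*} \\
&= L_x|_* - (f_x)^T|_*\, p^*
\end{aligned}
\]
and this is the correct adjoint equation.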


5.6 A function with infinite slope (non-Lipschitz) like $x^{1/3}$, or a function like $v(x) = x \sin(1/x)$, both considered at $x = 0$. Or a function of two variables with a saddle point.
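For the second example, the following short computation may help. It assumes the usual definitions of $D^+v$ and $D^-v$ through the lim sup and lim inf of the first-order difference quotient, and sets $v(0) = 0$; both are my reading of the conventions rather than something restated in this solution.

```latex
\[
\frac{v(y) - v(0) - \xi y}{|y|} = \operatorname{sgn}(y)\,\bigl(\sin(1/y) - \xi\bigr),
\qquad v(y) = y \sin(1/y),\quad v(0) = 0 .
\]
\[
\limsup_{y \to 0} \frac{v(y) - v(0) - \xi y}{|y|} = \max\{1 - \xi,\ 1 + \xi\} = 1 + |\xi| > 0,
\qquad
\liminf_{y \to 0} \frac{v(y) - v(0) - \xi y}{|y|} = \min\{-1 - \xi,\ -1 + \xi\} = -1 - |\xi| < 0 .
\]
```

So no $\xi$ satisfies the lim sup $\le 0$ condition or the lim inf $\ge 0$ condition, and under these definitions both $D^+v(0)$ and $D^-v(0)$ are empty.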


5.7 This is proved in [BP07]. Fix an arbitrary pair $(t_0, x_0)$. We need to show that for every $C^1$ test function $\varphi = \varphi(t, x)$ such that $\varphi - V$ attains a local maximum at $(t_0, x_0)$ we have the inequality
\[
\varphi_t(t_0, x_0) + \inf_{u \in U} \bigl\{ L(t_0, x_0, u) + \langle \varphi_x(t_0, x_0), f(t_0, x_0, u)\rangle \bigr\} \le 0.
\]
The negation of this statement is that there exists a $C^1$ function $\varphi$ satisfying
\[
\varphi(t_0, x_0) = V(t_0, x_0), \qquad \varphi(t, x) \le V(t, x) \quad \forall\, (t, x) \text{ near } (t_0, x_0)
\]
such that for all controls $u$ we have
\[
\varphi_t(t_0, x_0) + L(t_0, x_0, u) + \langle \varphi_x(t_0, x_0), f(t_0, x_0, u)\rangle > \Theta > 0.
\]
Taking $(t_0, x_0)$ as the initial condition, let us consider an arbitrary control $u(\cdot)$ and the corresponding state trajectory $x(\cdot)$ on a small interval $[t_0, t_0 + \Delta t]$. We will now see that the rate at which the value function decreases during this time interval is too slow to be consistent with the principle of optimality. As long as we pick $\Delta t$ to be sufficiently small, we have
\[
\begin{aligned}
V(t_0 + \Delta t, x(t_0 + \Delta t)) - V(t_0, x_0)
&\ge \varphi(t_0 + \Delta t, x(t_0 + \Delta t)) - \varphi(t_0, x_0) = \int_{t_0}^{t_0 + \Delta t} \frac{d}{dt} \varphi(t, x(t))\,dt \\
&= \int_{t_0}^{t_0 + \Delta t} \bigl( \varphi_t(t, x(t)) + \langle \varphi_x(t, x(t)), f(t, x(t), u(t))\rangle \bigr)\,dt \\
&> -\int_{t_0}^{t_0 + \Delta t} L(t, x(t), u(t))\,dt + \Theta \Delta t
\end{aligned}
\]
hence

V (t0 , x0 )
0 will do, or can assume Q(t) > 0 for all t, or observability (as discussed in Section 6.2.4).

6.3 The cost-to-go from time $t$ for control $u$ is
\[
\begin{aligned}
&\int_t^{t_1} \bigl( x^T Q x + u^T R u \bigr)\,dt + x^T(t_1) M x(t_1) \\
&= \int_t^{t_1} \bigl( x^T Q x + u^T R u \bigr)\,dt + x^T(t_1) P(t_1) x(t_1) \\
&= \int_t^{t_1} \bigl( x^T Q x + u^T R u \bigr)\,dt + x^T(t_1) P(t_1) x(t_1) - x^T(t) P(t) x(t) + x^T(t) P(t) x(t) \\
&= \int_t^{t_1} \Bigl( x^T Q x + u^T R u + \frac{d}{dt}\bigl( x^T P x \bigr) \Bigr)\,dt + x^T(t) P(t) x(t) \\
&\overset{\dot x = Ax + Bu}{=} \int_t^{t_1} \bigl( x^T Q x + u^T R u + x^T (\dot P + P A + A^T P) x + x^T P B u + u^T B^T P x \bigr)\,dt + x^T(t) P(t) x(t) \\
&\overset{\text{RDE}}{=} \int_t^{t_1} \bigl( x^T P B R^{-1} B^T P x + u^T R u + x^T P B u + u^T B^T P x \bigr)\,dt + x^T(t) P(t) x(t) \\
&= \int_t^{t_1} (B^T P x + R u)^T R^{-1} (B^T P x + R u)\,dt + x^T(t) P(t) x(t)
\end{aligned}
\]

and now the optimal control and optimal cost-to-go are obvious.
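Spelled out, and assuming as usual that $R$ is positive definite, the last expression can be read as follows (a brief sketch of the step left to the reader):

```latex
\[
(B^T P x + R u)^T R^{-1} (B^T P x + R u) \ge 0,
\quad \text{with equality if and only if } u = -R^{-1} B^T P x ,
\]
\[
u^*(s) = -R^{-1} B^T P(s)\, x^*(s), \quad s \in [t, t_1],
\qquad
\text{optimal cost-to-go} = x^T(t) P(t) x(t) .
\]
```

The term $x^T(t) P(t) x(t)$ does not depend on the control, so minimizing the cost-to-go amounts to making the nonnegative integrand vanish.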

6.4 The solution of the ARE is
\[
P = \begin{pmatrix} \pm\sqrt{3} & 1 \\ 1 & \pm\sqrt{3} \end{pmatrix};
\]
we need to pick the one with the plus signs to get a positive semidefinite matrix. It is not very hard to solve the RDE either by using the Hamiltonian matrix as in Exercise 6.1 or just by computer simulation. The answer to the last question is no (as long as $M \ge 0$ of course); even though in our analysis so far we did use the fact that $M = 0$ (see, e.g., the monotonicity argument), the independence of the limit from $M$ follows from the proof of uniqueness of the solution of the ARE in Theorem 6.1.
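A solution of the ARE can be checked by substituting it into $PA + A^T P + Q - P B R^{-1} B^T P = 0$. The sketch below does this in plain Python. The problem data are NOT restated in this solution, so the values used here (double integrator, $Q = I$, $R = 1$) are an assumption on my part, chosen because they are consistent with the stated $P$; the helper names are hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def are_residual(P, A, B, Q, r):
    """P A + A^T P + Q - P B B^T P / r for symmetric P and scalar weight r > 0."""
    PA = matmul(P, A)
    AtP = matmul(transpose(A), P)
    PB = matmul(P, B)                    # n x 1 column
    quad = matmul(PB, transpose(PB))     # P B B^T P, using P = P^T
    n = len(P)
    return [[PA[i][j] + AtP[i][j] + Q[i][j] - quad[i][j] / r
             for j in range(n)] for i in range(n)]

# Assumed data (not given in this solution): double integrator, Q = I, R = 1.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
s = math.sqrt(3.0)
P_plus = [[s, 1.0], [1.0, s]]
P_minus = [[-s, 1.0], [1.0, -s]]
```

Under these assumed data both sign choices give a zero residual, but only `P_plus` has positive diagonal entries and positive determinant, which matches the remark about picking the plus signs.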


6.5 Controllability was used to show the cost is bounded, but stabilizability gives that too. Indeed, picking a control that makes the closed-loop system exponentially stable, we get a bounded cost. Observability was used twice: First, when proving closed-loop stability, we needed the property y → 0 ⇒ x → 0—detectability gives that too. Second, in statement 1, to prove P > 0 we needed the property y ≡ 0 ⇒ x ≡ 0—detectability doesn’t give this, so we can only get P ≥ 0. Uniqueness remains valid for P ≥ 0 as we said in our uniqueness proof. See [KS72] for details.


6.6 It's straightforward to derive that the closed-loop system is
\[
\dot x^* = -\sqrt{a^2 + \frac{b^2 q}{r}}\; x^*
\]
from which all claims follow.
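The closed-loop formula can be reproduced from the scalar ARE. The sketch below assumes the setting is the scalar system $\dot x = ax + bu$ with cost $\int (q x^2 + r u^2)\,dt$, which is my inference from the formula rather than something stated here; the function names are hypothetical.

```python
import math

def scalar_are_solution(a, b, q, r):
    """Positive root of 2 a P + q - b^2 P^2 / r = 0 (needs b != 0, r > 0, q >= 0)."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def closed_loop_pole(a, b, q, r):
    """Coefficient a - b^2 P / r of the closed loop under u = -(b P / r) x."""
    return a - b * b * scalar_are_solution(a, b, q, r) / r
```

For example, `closed_loop_pole(1.0, 2.0, 3.0, 4.0)` corresponds to $-\sqrt{1 + 4 \cdot 3 / 4}$. Letting `r` grow (expensive control) drives the pole toward $-|a|$, while letting `r` shrink (cheap control) makes it arbitrarily negative.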

List of exercises

1.1 (sufficiency), 9
1.2 (regular point), 14
1.3 (IFT for multiple constraints), 15
1.4 (law of reflection), 16
1.5 (first variation example), 21
1.6 (second variation example), 22
1.7 (infinite dimensional compactness for Weierstrass theorem), 24
2.1 (example of variational problem), 31
2.2 (weak vs. strong minimum), 34
2.3 (norm-based first variation), 36
2.4 (DuBois-Reymond lemma), 41
2.5 (brachistochrone), 42
2.6 (variable terminal point, transversality conditions), 44
2.7 (two special cases via Hamilton's equations), 45
2.8 (principle of least action for N particles), 50
2.9 (IFT for calculus of variations with constraints), 54
2.10 (Dido and catenary), 55
2.11 (pendulum via Lagrange multipliers), 58
2.12 (error term for 2nd-order sufficiency), 60
2.13 (Jacobi equation is variational equation for Euler-Lagrange), 65
2.14 (check principle of least action), 67
3.1 (W-E conditions for weak vs. strong minimum), 76
3.2 (broken extremals), 76
3.3 (Weierstrass implies W-E and Legendre), 80
3.4 (existence and uniqueness of solutions for control system), 85
3.5 (recovering Euler-Lagrange from MP), 94
3.6 (recovering Lagrange multipliers from MP), 94
3.7 (second variation), 95
3.8 (second-order conditions for LQR), 98
4.1 (brachistochrone using MP), 103
4.2 (example illustrating main proof steps), 107
4.3 (principle of optimality), 108
4.4 (stacking two needle perturbations), 116
4.5 (Brouwer's fixed point theorem), 120
4.6 (Hamiltonian is constant), 125
4.7 (adjoint equation for terminal cost), 133
4.8 (maximum principle for general case), 134
4.9 (transversality with initial set), 134
4.10 (hard landing), 137
4.11 (harmonic oscillator), 138
4.12 (bang-bang principle for unit ball), 140
4.13 (derivative of switching function: multiple inputs), 143
4.14 (ruling out singularity), 145
4.15 (Fuller in 3-D), 147
4.16 (nonclosed reachable set), 150
5.1 (principle of optimality for value function: the other half), 161
5.2 (HJB for parking example), 164
5.3 (different take on sufficiency), 167
5.4 (sufficiency for infinite horizon), 167
5.5 (deriving adjoint equation from HJB), 169
5.6 (empty $D^+ v(x)$ and $D^- v(x)$), 174
5.7 (V is viscosity supersolution of HJB), 178
5.8 (sufficiency via viscosity), 178
6.1 (solving RDE via Hamiltonian system), 184
6.2 ($P(t)$ is symmetric positive (semi)definite), 187
6.3 (direct derivation of optimal cost and control), 187
6.4 (example of solving ARE), 192
6.5 (stabilizability and detectability), 198
6.6 (cheap vs. expensive control), 198