English Pages 183 [200] Year 1986
Differential Calculus
A. AVEZ
NUNC COGNOSCO EX PARTE
THOMAS J. BATA LIBRARY, TRENT UNIVERSITY
Digitized by the Internet Archive in 2019 with funding from Kahle/Austin Foundation
https://archive.org/details/differentialcalc0000avez
Differential Calculus A. AVEZ Université Pierre et Marie Curie France
Translated by D. EDMUNDS University of Sussex
UK
A Wiley-Interscience Publication
JOHN WILEY AND SONS Chichester · New York · Brisbane · Toronto · Singapore
Originally published under the title Calcul Differentiel by A. Avez © Masson, Paris 1983. Copyright © 1986 by John Wiley & Sons Ltd. All rights reserved. No part of this book may be reproduced by any means, or transmitted, or translated into a machine language without the written permission of the publishers.
Library of Congress Cataloging-in-Publication Data: Avez, A. (André) Differential calculus. Translation of Calcul differentiel. 'A Wiley-Interscience publication.' Bibliography: p. Includes index. 1. Calculus, Differential. I. Title. QA304.A8413 1986 515.3 85-20177 ISBN 0 471 90873 8
British Library Cataloguing in Publication Data: Avez, André Differential calculus 1. Calculus, Differential I. Title 515.3'3 QA304 ISBN 0 471 90873 8
Printed and bound in Great Britain
To the memory of Renée Gallai and Gaston Roux, to Jean Houlle. Three people who taught me most of the useful things that I know. Two scholars the like of whom we shall not see again.
Contents

Foreword
Notation

1. The concept of a derivative
   Preliminaries
   1.1 Definitions and examples
   1.2 Rules for calculation

2. Mean-value theorems
   Preliminaries
   2.1 Mean-value theorems
   2.2 Converse of §1.2.2
   2.3 Necessary and sufficient condition for a map to be of class $C^1$
   2.4 A criterion for uniform convergence
   2.5 Sard's theorem
   2.6 Integration of regulated functions, and the fundamental theorem of integral calculus

3. The concept of diffeomorphism, and the solution of equations
   Preliminaries
   3.1 Diffeomorphisms
   3.2 The inverse function theorem
   3.3 The implicit function theorem
   3.4 An application of the implicit function theorem

4. Higher-order derivatives
   4.1 Successive derivatives, and Schwarz' theorem
   4.2 Rules for calculation
   4.3 Taylor's formula
   4.4 Taylor series and analyticity

5. The exponential function, and linear differential equations with constant coefficients
   5.1 Definitions of the exponential
   5.2 Properties of the exponential
   5.3 One-parameter group of linear automorphisms
   5.4 Homogeneous linear differential equations with constant coefficients
   5.5 Explicit calculation of solutions
   5.6 Homogeneous linear differential equations of order n with constant coefficients
   5.7 Bounded or periodic solutions of $x' = Ax$

6. The integral product, and linear differential equations
   Preliminaries
   6.1 The integral product
   6.2 Homogeneous linear differential equations
   6.3 Linear differential equations with a forcing term
   6.4 Linear differential equations of order n

7. Vector fields and differential equations
   7.1 Vector fields and autonomous differential equations
   7.2 Existence and uniqueness of integral curves
   7.3 Dependence on initial conditions
   7.4 Complete vector fields
   7.5 One-parameter groups of diffeomorphisms

8. Conjugacy and local coordinates
   Preliminaries
   8.1 $C^k$-conjugacy and coordinates
   8.2 Local representation of a differentiable map
   8.3 The Morse-Palais lemma
   8.4 Linearization of vector fields

9. Differentiable submanifolds
   9.1 Differentiable submanifolds
   9.2 The tangent space
   9.3 Differentiable maps

10. Calculus of variations
   10.1 Free extrema and restricted extrema
   10.2 Second-order conditions for an extremum
   10.3 Spaces of curves, and the Euler-Lagrange equations
   10.4 The nature of the Euler-Lagrange equation
   10.5 Effect of a differentiable map
   10.6 Invariance of a lagrangian
   10.7 Emmy Noether's theorem

Appendix A: Banach spaces and multilinear maps
Appendix B: The Banach fixed-point theorem
Appendix C: Newton's method
Appendix D: Global inverse function theorems
Appendix E: Reduction of linear endomorphisms
Appendix F: Linear differential equations with periodic coefficients
Appendix G: Existence of solutions of differential equations and their dependence on initial data
Appendix H: Simplicity of SO(3)

Bibliography
Index
Foreword

Several years ago the course committee of the Université Pierre et Marie Curie laid down a minimum programme for the unit of instruction of differential calculus:

   The notion of derivative; $C^r$ and $C^\infty$ classes
   Symmetry of the second derivative
   The Taylor formula in several variables
   Implicit functions; the case of constant rank
   Differential equations: the theorem of existence and dependence on initial conditions and upon parameters, in the lipschitzian case.

This book covers the above programme, and a little more. The appendices are of two kinds. Some recall concepts which are, strictly speaking, outside differential calculus and which are included to make the text self-contained, while the others deal with topics which may be omitted at a first reading.

Bibliographical references are given in full detail in the body of the text if they are used only once. If not, the author's name is given (rather than a number), and the alphabetical list at the end of the book gives the corresponding work.

While this book can be used in connection with study for the Differential Calculus certificate, it is intended to be an introduction to more advanced works, such as that by R. Abraham and J. Marsden, to which it owes a great deal.

Finally, the author wishes to thank Professor Paul Malliavin, who knew how to overcome the author's laziness, and without whom this book would never have seen the light of day. He also thanks Professor B. El Mabsout, who read the manuscript and prevented certain errors. The author also apologizes to his wife and sons for having deprived them of time which normally would have been devoted to family life.
Notation

$\mathbf{R}_+$: set of positive real numbers
$[a, b]$: set of real numbers $t$ such that $a \le t \le b$
$]a, b[$: set of real numbers $t$ such that $a < t < b$
sup: least upper bound
max $(a, \dots)$: the greatest of the numbers $a, \dots$
$f^{-1}$: inverse (or reciprocal) of the bijection $f$
$\mathrm{id}_E$: identity map of $E$ to $E$
$\bar{A}$: closure of the set $A$
exp $(t)$: exponential of $t$
ch: hyperbolic cosine
sh: hyperbolic sine
v.s.: vector space
$E^*$: dual of the vector space $E$
$E_c$: complexification of the real vector space $E$
$\mathcal{L}(E, F)$: set of all continuous linear maps of a vector space $E$ to a vector space $F$
End$(E)$: $\mathcal{L}(E, E)$
GL$(E)$: group of continuous linear bijections of a vector space $E$ onto itself
SL$(n)$: unimodular group of $\mathbf{R}^n$
$E \oplus F$: direct sum of vector spaces $E$ and $F$
Ker $(f)$: kernel of a linear map $f$
det $(f)$: determinant of a linear endomorphism $f$
tr $(f)$: trace of a linear endomorphism $f$
${}^t f$: transpose of a linear map $f$
$\|x\|_V$: norm in a normed linear space $V$
$B(a, r)$ or $B_r(a)$: open ball with centre $a$ and radius $r$
$\langle\,,\rangle$: scalar or hermitian product
$f^*$: adjoint of a linear map $f$
$S^n$: n-dimensional unit sphere
O$(n)$: orthogonal group of $\mathbf{R}^n$
$Df(a)$: derivative of $f$ at $a$
$Df$: derivative map of $f$
$f'$: derivative of the function $f$
$T_a f$: tangent map of $f$ at $a$
grad $f$: gradient of the function $f$
$D_r f$: partial derivative with respect to the rth variable
$C^k$: class $C^k$
$C^k(U; F)$: set of all maps of class $C^k$ from $U$ to $F$
$\|f\|_{C^r}$: $C^r$ norm of $f$
$\mathrm{Diff}^r(E)$: group of all $C^r$ diffeomorphisms from $E$ to $E$
$f_* X$: image of the vector field $X$ under the derivative map
$T_m M$: tangent space at $m$ to the submanifold $M$
1 The concept of a derivative

This chapter is devoted to the definition of a derivative, and to those of its elementary properties which make no use of the ideas of a complete space or of integration.

PRELIMINARIES

In this chapter the vector spaces are normed spaces $E$ over the field $K = \mathbf{R}$ or $\mathbf{C}$. Appendix A recalls some of the properties of these spaces which are used here. In particular the norm of $E$ will be denoted by $\|\cdot\|_E$, or by $\|\cdot\|$ if there is no ambiguity. When several spaces occur in the same statement, it is understood that they are all over the same field.
1.1 DEFINITIONS AND EXAMPLES

A function $f: \mathbf{R} \to \mathbf{R}$ is differentiable at $a \in \mathbf{R}$ if there is a real number $f'(a)$ such that $\lim_{h \to 0} [f(a + h) - f(a)]/h = f'(a)$. This definition makes no sense for a map $f: \mathbf{R}^n \to \mathbf{R}^m$, $n > 1$ (how can we divide by a vector $h$ of $\mathbf{R}^n$?); but it may be reformulated thus: $\lim_{h \to 0} |f(a + h) - f(a) - f'(a)h| / |h| = 0$, and this gives rise to the following generalization.
Definition 1.1

Let $U$ be a non-empty open subset of a normed v.s. $E$ and let $f$ be a map of $U$ to a normed v.s. $F$. We say that $f$ is differentiable at $a \in U$ if there is a continuous linear map $L \in \mathcal{L}(E; F)$ such that
$$\lim_{h \to 0} \|f(a + h) - f(a) - Lh\|_F / \|h\|_E = 0.$$
This assumes that $\|h\|$ is so small that $a + h \in U$.

Let $k \in E$ be arbitrarily given, and let $t$ be real. Since $U$ is open, $a + tk \in U$ if $|t|$ is small enough. Replacing $h$ by $tk$ in the definition, we see that if $L$ exists it is unique and is given by $L(k) = \lim_{t \to 0} [f(a + tk) - f(a)]/t$. It is simply to ensure the uniqueness of $L$ that $U$ is assumed to be open.
We call $L$ the derivative of $f$ at $a$ and denote it by $Df(a)$; it is an element of $\mathcal{L}(E; F)$. $Df(a)k$ is called the derivative of $f$ in the direction of the vector $k$. We have
$$Df(a)k = \frac{d}{dt} f(a + tk)\Big|_{t=0} \tag{1.1}$$
where the right-hand side is the usual derivative of $t \in \mathbf{R} \mapsto f(a + tk) \in F$ at $t = 0$.

Recall that a real-valued function $g$ of a real variable $t$, defined on a neighbourhood of zero, is said to be $o(t)$ if $\lim_{t \to 0} g(t)/t = 0$. Thus $f(a + h) = f(a) + Df(a)h + o(\|h\|)$, and $f(a) + Df(a)h$ is seen to be an approximation to $f$ of the first order in $h$ in the neighbourhood of $a$.

Remarks
(a) The definition of the derivative depends on the norms of $E$ and $F$. Let us replace them by equivalent norms (see §A.2.4). $U$ remains open since the topologies of $E$ and $F$ are unchanged. It is easy to see that $f$ remains differentiable and that its derivative at $a$ is still $Df(a)$. If, in particular, $E$ and $F$ are finite-dimensional, differentiability does not depend on the norm, for they are all equivalent (§A.2.5), and the hypothesis of continuity of $Df(a)$ is redundant, for every linear map is continuous (§A.2.5).

(b) We know that if $E = K$, $\mathcal{L}(E; F)$ may be identified with $F$ by the canonical isomorphism $L \in \mathcal{L}(E; F) \mapsto L \cdot 1 \in F$ (Appendix, Theorem A.8). Thus $Df(a)$ is identified with $Df(a) \cdot 1$, which is simply the usual derivative $f'(a)$ since
$$Df(a) \cdot 1 = \frac{d}{dt} f(a + t)\Big|_{t=0} = f'(a).$$

(c) If $F = K$, $Df(a)$ is a continuous linear functional. If, in addition, the norm of $E$ arises from an inner product $\langle\,,\rangle$, then $E^* = \mathcal{L}(E; K)$ may be identified with $E$ by the canonical isomorphism $x \in E \mapsto \langle x, \cdot \rangle \in E^*$, and $Df(a)$ is the image of a vector $\operatorname{grad} f(a)$, called the gradient of $f$ at $a$ and characterized by $\langle \operatorname{grad} f(a), v \rangle = Df(a)v$ for all $v$ in $E$. Note that $\operatorname{grad} f(a)$ depends on the inner product chosen, while $Df(a)$ does not.

(d) Jacobian matrix. If $E = K^n$, $F = K^m$, the matrix of the linear map $Df(a)$ in the canonical bases is a matrix with $m$ rows and $n$ columns. It is called the Jacobian matrix of $f$ at $a$. We shall learn later how to determine its elements.

(e) A map which is differentiable at $a$ is continuous at $a$ since, as $Df(a)$ is continuous, $\lim_{h \to 0} f(a + h) = f(a)$. (The converse is false: consider $t \in \mathbf{R} \mapsto |t|$ at $t = 0$.)

(f) The notion of derivative may be extended to affine spaces, and this possibility is of great importance in physics and mechanics. Let $U$ be an open subset of the affine space $E$ with associated v.s. $\vec{E}$ which is normed, and let $f$ be a map of $U$ to an affine space $F$ with normed associated v.s. $\vec{F}$. We say that $f$ is differentiable at $a \in U$ if there exists $L \in \mathcal{L}(\vec{E}; \vec{F})$ such that
$$f(a + h) = f(a) + Lh + o(\|h\|).$$
By means of a choice of fixed origins in $E$ and $F$ this notion reduces to that for v.s., and we shall confine ourselves to this to simplify the notation (the arrows may now be omitted).
Definitions 1.2

If $f: U \to F$ is continuous at each point of the open subset $U$ of $E$, we say that $f$ is of class $C^0$, or is $C^0$, in $U$, and it will be convenient to set $D^0 f = f$.

If $f$ is differentiable at each point of $U$, we say that $f$ is differentiable on $U$. In this case, the mapping $Df$ of the open set $U$ to the normed v.s. $\mathcal{L}(E; F)$, defined by $x \mapsto Df(x)$, is called the derivative of $f$, and it will be convenient to set $D^1 f = Df$. If $Df$ is continuous for the topologies of $U$ and $\mathcal{L}(E; F)$ defined by their norms, we say that $f$ is continuously differentiable, or of class $C^1$, or simply $C^1$, on $U$.

If $A$ is a subset of $E$, not necessarily open, we say that $f: A \to F$ is differentiable (resp. $C^1$) on $A$ if $f$ is the restriction to $A$ of a differentiable (resp. $C^1$) map from an open set containing $A$ to $F$.

Examples
(a) Let $E = F = \mathbf{R}^2$, normed by $\|(x, y)\| = |x| + |y|$, and let $f: E \to F$ be defined by $f(x, y) = (x + y, xy)$. If $a = (a_1, a_2)$ and $h = (h_1, h_2)$, we have $f(a + h) = f(a) + (h_1 + h_2, h_1 a_2 + h_2 a_1) + (0, h_1 h_2)$. Since $\|(0, h_1 h_2)\| = |h_1 h_2| \le (|h_1| + |h_2|)^2 = \|h\|^2$, then $(0, h_1 h_2) = o(\|h\|)$. Thus $f$ is differentiable at $a$ and $Df(a)h = (h_1 + h_2, h_1 a_2 + h_2 a_1)$; hence its Jacobian matrix at $a$ is
$$\begin{pmatrix} 1 & 1 \\ a_2 & a_1 \end{pmatrix}.$$

(b) By §A.1.3, the space $E = C^1([0, 1])$ of functions $u: [0, 1] \to \mathbf{R}$ which have a continuous derivative is normed by $\|u\|_{C^1} = \|u\|_{C^0} + \|u'\|_{C^0}$. The space $F = C^0([0, 1])$ of continuous functions $v: [0, 1] \to \mathbf{R}$ is normed by $\|v\|_{C^0}$. Let $f: E \to F$ be the mapping which to each $u \in E$ associates the function $f(u): t \in [0, 1] \mapsto u'(t) + t u^2(t)$. With an obvious abuse of notation, if $u, h \in E$, then $f(u + h) = f(u) + Lh + th^2$, where $Lh = h' + 2tuh$. Clearly $L$ is linear, and it is continuous since
$$\|Lh\|_{C^0} \le \|h'\|_{C^0} + 2\|u\|_{C^0}\|h\|_{C^0} \le [1 + 2\|u\|_{C^0}]\,\|h\|_{C^1}.$$
On the other hand, $\|th^2\|_{C^0} \le (\|h\|_{C^0})^2 \le (\|h\|_{C^1})^2$ and so $th^2 = o(\|h\|_{C^1})$. Thus $f$ is differentiable on $E$ and $Df(u)h = h' + 2tuh$.
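Example (a) lends itself to a quick numerical check. The following Python sketch is an illustration only: the base point, the direction and the helper names (`norm1`, `remainder_ratio`) are assumptions, not part of the text. It evaluates the quotient $\|f(a + h) - f(a) - Df(a)h\| / \|h\|$ in the norm $|x| + |y|$ for shrinking $h$, which should tend to zero.

```python
# Example (a): f(x, y) = (x + y, x*y) on R^2 with the norm |x| + |y|.

def f(x, y):
    return (x + y, x * y)

def Df(a, h):
    """Candidate derivative at a applied to h: (h1 + h2, h1*a2 + h2*a1)."""
    a1, a2 = a
    h1, h2 = h
    return (h1 + h2, h1 * a2 + h2 * a1)

def norm1(v):
    """The norm |x| + |y| used in Example (a)."""
    return sum(abs(c) for c in v)

def remainder_ratio(a, h):
    """||f(a + h) - f(a) - Df(a)h|| / ||h||."""
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    lin = Df(a, h)
    rem = tuple(fah[i] - fa[i] - lin[i] for i in range(2))
    return norm1(rem) / norm1(h)

a = (2.0, -3.0)          # arbitrary base point (assumption of this sketch)
direction = (1.0, 0.5)   # arbitrary direction (assumption of this sketch)

ratios = []
for n in range(1, 6):
    t = 10.0 ** (-n)
    h = (t * direction[0], t * direction[1])
    ratios.append(remainder_ratio(a, h))
    print(f"||h|| = {norm1(h):.1e}   ratio = {ratios[-1]:.3e}")
```

The printed ratios shrink in proportion to $\|h\|$, in agreement with the bound $|h_1 h_2| \le \|h\|^2$.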
1.2 RULES FOR CALCULATION

1.2.1 Derivative of a continuous linear map

A continuous linear map $f \in \mathcal{L}(E; F)$ is differentiable on $E$ and its derivative is $Df(a) = f$ for all $a \in E$. For $f(a + h) = f(a) + fh$ [the $o(\|h\|)$ term reduces to zero].

1.2.2 Derivative of a continuous affine map

More generally, let $f: E \to F$ be a continuous affine map: $f(x) = Lx + b$, where $L \in \mathcal{L}(E; F)$, $b \in F$. Then $f$ is differentiable on $E$ and $Df(a) = L$. If, in particular, $f$ is constant its derivative is zero. We shall see that the converse is true if $f$ is defined on a connected open set $U$. It is false if $U$ is not connected (take $E = F = \mathbf{R}$, $U$ = disjoint union of non-empty intervals $A$ and $B$, $f = 0$ on $A$ and $f = 1$ on $B$).
1.2.3 Continuous bilinear maps

Let $E_1$, $E_2$ and $F$ be three normed v.s., and let $b: E_1 \times E_2 \to F$ be a continuous bilinear map (§A.3). Then $b$ is differentiable on $E_1 \times E_2$ and the value at $h = (h_1, h_2) \in E_1 \times E_2$ of its derivative at $a = (a_1, a_2)$ is $Db(a)h = b(h_1, a_2) + b(a_1, h_2)$.

In order to prove this, since $b(a + h) = b(a) + b(h_1, a_2) + b(a_1, h_2) + b(h_1, h_2)$, it is enough to show that $b(h_1, h_2) = o(\|h\|)$. But this follows from
$$\|b(h_1, h_2)\| \le \|b\|\,\|h_1\|\,\|h_2\| \le \|b\|\,[\|h_1\| + \|h_2\|]^2 \le \|b\|\,\|h\|^2.$$

The above can be generalized. Let $E_1, \dots, E_n$ and $F$ be normed v.s., and let $f \in \mathcal{L}(E_1, \dots, E_n; F)$. Then $f$ is differentiable at every point $a = (a_1, \dots, a_n)$ and the value at $h = (h_1, \dots, h_n)$ of its derivative is
$$Df(a)h = \sum_{k=1}^{n} f(a_1, \dots, a_{k-1}, h_k, a_{k+1}, \dots, a_n).$$
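The bilinear rule can be illustrated with a concrete choice of $b$. The sketch below assumes $b$ to be the vector product on $\mathbf{R}^3$ (one possible continuous bilinear map; the points `a` and `h` and the helper names are likewise assumptions) and checks that $b(a + h) - b(a) - Db(a)h$ coincides with $b(h_1, h_2)$, the term shown above to be $o(\|h\|)$.

```python
# Bilinear rule Db(a)h = b(h1, a2) + b(a1, h2), with b = vector product on R^3.

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def Db(a, h, b=cross):
    """Derivative of the bilinear map b at a = (a1, a2), applied to h = (h1, h2)."""
    (a1, a2), (h1, h2) = a, h
    return add(b(h1, a2), b(a1, h2))

# Arbitrary sample data (assumptions of this sketch).
a = ((1.0, 2.0, 3.0), (-1.0, 0.5, 2.0))
h = ((0.01, -0.02, 0.03), (0.02, 0.01, -0.01))

increment = sub(cross(add(a[0], h[0]), add(a[1], h[1])), cross(*a))
remainder = sub(increment, Db(a, h))   # b(a + h) - b(a) - Db(a)h
second_order = cross(*h)               # b(h1, h2)

print("remainder   :", remainder)
print("b(h1, h2)   :", second_order)
```

Up to rounding, the two printed vectors agree, so the error of the linear approximation is exactly the second-order term $b(h_1, h_2)$.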
1.2.4 Trace of a differentiable map

Let $f: E \to F$ be a differentiable map between normed v.s. A vector subspace $E'$ of $E$ is normed by $\|x\|_{E'} = \|x\|_E$ if $x \in E'$. Then the restriction $f|_{E'}$ of $f$ to $E'$ is differentiable, and its derivative is the restriction to $E'$ of the derivative of $f$. The proof is obvious.

Consider a special case of the above. Suppose $E$ is a Hilbert space and let its norm be induced by its inner product $\langle\,,\rangle$. If $E'$ is a closed subspace of $E$, every $x$ in $E$ has an orthogonal projection $px$ on $E'$ (projection theorem). Suppose that $f$ is a scalar-valued function ($F = K$). Then the gradient of $f|_{E'}$ is the projection on $E'$ of the gradient of $f$. In fact, for every $h \in E'$ we have
$$\langle p \operatorname{grad} f(a), h \rangle = \langle \operatorname{grad} f(a), h \rangle = Df(a)h = D f|_{E'}(a)h = \langle \operatorname{grad} f|_{E'}(a), h \rangle.$$
1.2.5 Linearity of the derivative

Let $U$ be a non-empty open subset of a normed v.s. $E$, and let $f$ and $g$ be two maps of $U$ to a normed v.s. $F$. If $k, k' \in K$, define $kf + k'g: U \to F$ by $(kf + k'g)(x) = kf(x) + k'g(x)$ for all $x \in U$. It is easy to see that if $f$ and $g$ are differentiable at $a \in U$, so is $kf + k'g$ and that
$$D(kf + k'g)(a) = kDf(a) + k'Dg(a).$$
The set of all maps which are differentiable at $a$ (resp. on $U$) is thus a v.s. So is the set of all $C^1$ maps of $U$ to $F$; we denote it by $C^1(U; F)$.

1.2.6 Derivative of a composite map (the chain rule)
Let $E$, $F$ and $G$ be three normed v.s. Let $f$ be a map of an open subset $U$ of $E$ to $F$, and suppose that it is differentiable at $a \in U$. Let $g$ be a map of an open subset of $F$, which contains $f(U)$, to $G$. Assume that $g$ is differentiable at $b = f(a)$. Then $g \circ f$ is differentiable at $a$, and
$$D(g \circ f)(a) = Dg(f(a)) \circ Df(a)$$
where the right-hand side is the composition of the continuous linear maps $Dg(b)$ and $Df(a)$.

Proof. Set $y = f(a + h)$ and $b = f(a)$. By hypothesis we have
$$g(y) = g(b) + Dg(b)(y - b) + r(y - b)$$
where $\lim_{y \to b} \|r(y - b)\| / \|y - b\| = 0$. Taking account of the linearity of $Dg(b)$ and the hypothesis $y - b = Df(a)h + s(h)$, where $\lim_{h \to 0} \|s(h)\| / \|h\| = 0$, we have
$$(g \circ f)(a + h) - (g \circ f)(a) - Dg(b) \circ Df(a)h = A$$
with $A = Dg(b)s(h) + r(y - b)$. Let us show that $A = o(\|h\|)$. First, $\|Dg(b)s(h)\| \le \|Dg(b)\|\,\|s(h)\|$ shows that $Dg(b)s(h) = o(\|h\|)$. Next, given any $\varepsilon > 0$, we can choose $\|h\|$ so small that $\|y - b\| = \|Df(a)h + s(h)\| \le (\|Df(a)\| + \varepsilon)\|h\|$. Thus $\|y - b\| / \|h\| \le \|Df(a)\| + \varepsilon$. On the other hand, the continuity of $f$ at $a$ implies that $y \to b$ if $h \to 0$. Hence $r(y - b) = o(\|h\|)$, and so $A = o(\|h\|)$. □

Tangent map
Suppose that $f: U \subset E \to F$ is differentiable on $U$. The tangent map $Tf: U \times E \to F \times F$ is defined by $Tf(a, h) = (f(a), Df(a)h)$. Here is its geometrical interpretation. Take a differentiable curve in $E$, with origin $a$, that is, a differentiable map $c: \mathbf{R} \to E$ such that $c(0) = a$. We may think of $c(t)$ as the position at time $t$ of a point moving in $E$. Let $c'(0) = h$ be its velocity vector at the origin. By the preceding theorem the curve $f \circ c: \mathbf{R} \to F$, the image of the curve $c$ under the map $f$, is differentiable and its tangent vector at the origin is:
$$(f \circ c)'(0) = D(f \circ c)(0) \cdot 1 = Df(c(0)) \circ Dc(0) \cdot 1 = Df(a)c'(0) = Df(a)h.$$
Thus to the pair consisting of a point on a curve and the velocity vector at this point, $Tf$ associates the pair consisting of the image of the point and the velocity vector of this image. It is easy to see that the last theorem may be written as $T(g \circ f) = Tg \circ Tf$; $T$ is thus a covariant functor. This is not the case for $D$, as $D(g \circ f) = (Dg \circ f) \circ Df$ and not $Dg \circ Df$.
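The identity $T(g \circ f) = Tg \circ Tf$ can be tested numerically on a small example. In the sketch below, $f(x, y) = (x + y, xy)$ is the map of Example (a) in §1.1, while $g(u, v) = v \sin u$, the point `a`, the vector `h` and the helper `directional_fd` are assumptions chosen for illustration. The velocity component of $Tg(Tf(a, h))$ is compared with a central difference quotient of $t \mapsto (g \circ f)(a + th)$.

```python
import math

def f(p):
    x, y = p
    return (x + y, x * y)

def Df(p, h):
    """Df(p)h for f(x, y) = (x + y, x*y)."""
    x, y = p
    return (h[0] + h[1], h[0] * y + h[1] * x)

def g(q):
    u, v = q
    return math.sin(u) * v

def Dg(q, k):
    """Dg(q)k for g(u, v) = v*sin(u)."""
    u, v = q
    return v * math.cos(u) * k[0] + math.sin(u) * k[1]

def Tf(p, h):
    """Tangent map of f: (point, velocity) -> (image point, image velocity)."""
    return (f(p), Df(p, h))

def Tg(q, k):
    return (g(q), Dg(q, k))

def directional_fd(F, p, h, t=1e-6):
    """Central difference approximation of d/dt F(p + t*h) at t = 0."""
    plus = F(tuple(pi + t * hi for pi, hi in zip(p, h)))
    minus = F(tuple(pi - t * hi for pi, hi in zip(p, h)))
    return (plus - minus) / (2 * t)

a = (0.7, -1.2)   # arbitrary point (assumption)
h = (0.3, 0.4)    # arbitrary velocity (assumption)

value, velocity = Tg(*Tf(a, h))                    # (Tg o Tf)(a, h)
fd = directional_fd(lambda p: g(f(p)), a, h)       # numerical D(g o f)(a)h

print("chain rule       :", velocity)
print("finite difference:", fd)
```

The two printed numbers agree to within the accuracy of the difference quotient.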
1.2.7 Mapping into a direct sum

Let $F = F_1 \oplus \dots \oplus F_n$ be a direct sum of normed v.s. (§A.1.4). Denote by $p_r: F \to F_r$ the projection $(x_1, \dots, x_n) \mapsto x_r$ and by $i_r: F_r \to F$ the injection which associates with $x_r$ the vector all of whose components are zero except for the rth, which equals $x_r$. A map $f$ of an open subset $U$ of a normed v.s. $E$ to $F$ may be written as $x \mapsto (f_1(x), \dots, f_n(x))$, where $f_r(x)$ is the rth component of $f(x)$. Thus:
$$f_r = p_r \circ f, \qquad f = \sum_{r=1}^{n} i_r \circ f_r. \tag{1.2}$$

Theorem 1.1

The map $f$ is differentiable at $a \in U$ if, and only if, $f_1, \dots, f_n$ are too. Under these conditions $Df(a) = (Df_1(a), \dots, Df_n(a))$; that is:
$$Df(a) = \sum_{r=1}^{n} i_r \circ Df_r(a).$$
It follows that $f$ is $C^1$ on $U$ if, and only if, $f_1, \dots, f_n$ are too.

Proof. This is an immediate consequence of the chain rule and equation (1.2). □

Corollary. If $f: x \in \mathbf{R}^n \mapsto (f_1(x), \dots, f_m(x)) \in \mathbf{R}^m$ is differentiable at $a$, the rth row of its Jacobian matrix (§1.1) at $a$ is the Jacobian matrix at $a$ of $f_r: \mathbf{R}^n \to \mathbf{R}$. □

1.2.8 Leibnitz' formula
Let $E$, $F_1$, $F_2$ and $G$ be normed v.s., let $U$ be an open subset of $E$, let $f: U \to F_1$ and $g: U \to F_2$ be maps which are differentiable at $a \in U$ (resp. $C^1$ on $U$), and let $b: F_1 \times F_2 \to G$ be a continuous bilinear map. Define $p = b(f, g): U \to G$ by $x \mapsto b(f(x), g(x))$. Then $p$ is differentiable at $a$ (resp. $C^1$ on $U$) and $Dp(a)h = b(Df(a)h, g(a)) + b(f(a), Dg(a)h)$ for all $h \in E$.

Proof. Since $p$ is the composite map $x \mapsto (f(x), g(x)) \mapsto b(f(x), g(x))$, the theorem follows from the chain rule and from §§1.2.3 and 1.2.7:
$$Dp(a)h = Db(f(a), g(a)) \circ D(f, g)(a)h = Db(f(a), g(a))\,(Df(a)h, Dg(a)h) = b(Df(a)h, g(a)) + b(f(a), Dg(a)h). \qquad \square$$

Special cases

If $E = K$, Remark (b) in §1.1 shows that $Df(a)h$ may be identified with the product $h f'(a)$ of the scalar $h \in K$ and the derivative vector $f'(a)$. Similarly $Dg(a)h = h \cdot g'(a)$. Leibnitz' formula then becomes, on setting $h = 1$:
$$p'(a) = b(f'(a), g(a)) + b(f(a), g'(a)).$$

If $F_1 = F_2 = G = K$ and $b(u, v) = uv$, we thus have the formula for the derivative of a product $fg$ of scalar-valued functions: $(fg)' = f'g + fg'$. If $F_1 = F_2 = G$ is the space $\mathbf{R}^3$, oriented and endowed with its usual scalar product, and if $b$ is the vector product, we obtain $(f \wedge g)'(a) = f'(a) \wedge g(a) + f(a) \wedge g'(a)$. An analogous formula may be obtained for the scalar product.

1.2.9 Map defined on a direct sum

Partial derivatives
Let $E$ be the direct sum $E_1 \oplus \dots \oplus E_n$ of normed v.s., and let $U$ be an open subset of $E$. A map $f$ of $U$ to a normed v.s. $F$ may then be written as $(x_1, \dots, x_n) \mapsto f(x_1, \dots, x_n)$; it is a function of the $n$ variables $x_1 \in E_1, \dots, x_n \in E_n$. Let $a = (a_1, \dots, a_n) \in E$ be given. The map
$$x_r \in E_r \mapsto f(a_1, \dots, a_{r-1}, x_r, a_{r+1}, \dots, a_n) \in F \tag{1.3}$$
(the rth component $a_r$ is replaced by $x_r$) is the composite map $f \circ I_r$, where $I_r: x_r \mapsto (a_1, \dots, a_{r-1}, x_r, a_{r+1}, \dots, a_n)$ is clearly a continuous map of $E_r$ to the direct sum $E_1 \oplus \dots \oplus E_n$. Hence $f \circ I_r$ is defined on the open subset $I_r^{-1}(U)$ of $E_r$, which contains $a_r$.

If the mapping in equation (1.3), that is $f \circ I_r$, is differentiable at $a_r$, we call its derivative at $a_r$ the partial derivative of $f$ with respect to the rth variable at the point $a$, and denote it by $D_r f(a)$; it is an element of $\mathcal{L}(E_r; F)$.

If $E_1 = \dots = E_n = K$, $D_r f(a)$ is called, more classically, the partial derivative with respect to the rth variable at the point $a$. This is the usual derivative $g'(a_r)$ of the vector-valued function of a scalar variable
$$x \mapsto g(x) = f(a_1, \dots, a_{r-1}, x, a_{r+1}, \dots, a_n)$$
at the point $x = a_r$. It is also denoted by $(\partial f / \partial x_r)(a)$.

For example, if $f: \mathbf{R}^2 \to \mathbf{R}$ is defined by $f(x, y) = \sin(x^2 y)$, then $D_1 f(a, b)$ is the derivative at $a$ of $x \mapsto \sin(x^2 b)$; that is, $2ab \cos(a^2 b)$. Similarly, $D_2 f(a, b)$ is the derivative at $b$ of $y \mapsto \sin(a^2 y)$; that is, $a^2 \cos(a^2 b)$.

Let us now return to the general case and its notation.
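The two partial derivatives of $f(x, y) = \sin(x^2 y)$ computed above may be compared with difference quotients. In this sketch the evaluation point and the helper `central_difference` are assumptions made for illustration; each partial derivative is obtained by freezing one variable, exactly as in equation (1.3).

```python
import math

def f(x, y):
    return math.sin(x * x * y)

def D1f(a, b):
    """Partial derivative with respect to the first variable: 2ab cos(a^2 b)."""
    return 2 * a * b * math.cos(a * a * b)

def D2f(a, b):
    """Partial derivative with respect to the second variable: a^2 cos(a^2 b)."""
    return a * a * math.cos(a * a * b)

def central_difference(g, x, step=1e-6):
    """Central difference approximation of g'(x) for a scalar function g."""
    return (g(x + step) - g(x - step)) / (2 * step)

a, b = 1.3, -0.7   # arbitrary point (assumption of this sketch)

num_D1 = central_difference(lambda x: f(x, b), a)   # x -> sin(x^2 b) at x = a
num_D2 = central_difference(lambda y: f(a, y), b)   # y -> sin(a^2 y) at y = b

print("D1 f(a, b):", D1f(a, b), " numerical:", num_D1)
print("D2 f(a, b):", D2f(a, b), " numerical:", num_D2)
```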
Theorem 1.2

If $f$ is differentiable at $a$, then each of its partial derivatives $D_r f(a)$, $r = 1, \dots, n$, exists; and if $h = (h_1, \dots, h_n) \in E$ we have:
$$Df(a)h = \sum_{r=1}^{n} D_r f(a)h_r.$$

Proof. If $i_r: E_r \to E$ is the injection $x_r \mapsto (0, \dots, 0, x_r, 0, \dots, 0)$ already considered in §1.2.7, we have $I_r(x_r) = a + i_r(x_r - a_r)$. Thus $I_r$ is an affine map, and hence $DI_r(a_r) = i_r$ (§1.2.2). The first part now follows from the chain rule, and
$$D_r f(a) = D(f \circ I_r)(a_r) = Df(a) \circ i_r. \tag{1.4}$$
But, if $p_r: E \to E_r$ is the projection on the rth component already considered in §1.2.7, clearly $f = f \circ \sum_{r=1}^{n} i_r \circ p_r$. The chain rule and equation (1.4) give:
$$Df(a) = \sum_{r=1}^{n} Df(a) \circ i_r \circ p_r = \sum_{r=1}^{n} D_r f(a) \circ p_r$$
and thus:
$$Df(a) \cdot h = \sum_{r=1}^{n} D_r f(a)h_r. \qquad \square$$
Remarks

(a) The existence of partial derivatives does not imply differentiability. For example, if we take $f: \mathbf{R}^2 \to \mathbf{R}$ to be defined by:
$$f(x, y) = \begin{cases} xy(x^2 + y^2)^{-1/2} & \text{if } x^2 + y^2 \ne 0 \\ 0 & \text{if } x = y = 0 \end{cases}$$
the function is zero for $x = 0$ (resp. $y = 0$), and so $D_1 f(0, 0) = D_2 f(0, 0) = 0$. If $f$ were differentiable at $(0, 0)$, its derivative at that point would be zero, by Theorem 1.2, and we would have $f(h_1, h_2) = o(\|h\|)$, where $\|h\| = (h_1^2 + h_2^2)^{1/2}$. However, if $h = (3t, 4t)$, we have $f(3t, 4t)/\|h\| = 12/25$, which does not tend to zero with $t$.

Furthermore, a function $f$ may have partial derivatives at a point without being continuous there. Consider at $(0, 0)$ the function
$$f(x, y) = \begin{cases} xy(x^2 + y^2)^{-1} & \text{if } x^2 + y^2 \ne 0 \\ 0 & \text{if } x = y = 0. \end{cases}$$
The reason for this 'pathology' is that the existence of partial derivatives at $(0, 0)$ guarantees the existence of tangents at the origin to the curves $x \mapsto f(x, 0)$ and $y \mapsto f(0, y)$, which are the intersection of the surface with equation $z = f(x, y)$ with the planes $y = 0$ and $x = 0$, but it says nothing about the behaviour of the surface outside these planes.

(b) If $f$ is $C^1$ on $U$, so are its partial derivatives, since equation (1.4) shows that $D_r f(a) = Df(a) \circ i_r$ depends continuously on $a$. In Chapter 2 we shall prove a converse to this.
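Both counterexamples of Remark (a) can be explored numerically. The sketch below (function names `f1`, `f2` and the sample values of $t$ are assumptions) evaluates the quotient $f(3t, 4t)/\|h\|$ for the first function, and the values of the second function along the diagonal $x = y$, for several small $t$.

```python
import math

def f1(x, y):
    """xy / sqrt(x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.sqrt(x * x + y * y)

def f2(x, y):
    """xy / (x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

ts = (1e-1, 1e-3, 1e-6)   # sample parameters tending to zero (assumption)

# First function: the difference quotients along the axes vanish ...
axis_quotients = [f1(t, 0.0) / t for t in ts] + [f1(0.0, t) / t for t in ts]
# ... yet f1(h)/||h|| stays at 12/25 along h = (3t, 4t).
ratios = [f1(3 * t, 4 * t) / math.hypot(3 * t, 4 * t) for t in ts]

# Second function: along the diagonal the value stays at 1/2, although f2(0, 0) = 0.
diagonal_values = [f2(t, t) for t in ts]

print("axis quotients :", axis_quotients)
print("f1(3t,4t)/||h||:", ratios)
print("f2(t, t)       :", diagonal_values)
```

The quotient remains at $12/25$ however small $t$ is, so the first function is not differentiable at the origin; the second function takes the value $1/2$ arbitrarily close to the origin and is therefore not even continuous there.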
1.2.10 Calculation of the Jacobian matrix

Theorem 1.3

If $f: x = (x_1, \dots, x_n) \in \mathbf{R}^n \mapsto (f_1(x), \dots, f_m(x)) \in \mathbf{R}^m$ is differentiable on an open set containing $a$, the element at the ith row and rth column of its Jacobian matrix at $a$ is $D_r f_i(a)$.

Proof. By §1.2.7, the ith row of the Jacobian matrix at $a$ is the Jacobian matrix at $a$ of $f_i: \mathbf{R}^n \to \mathbf{R}$. By §1.2.9, the Jacobian matrix of $f_i$ at $a$ is $(D_1 f_i(a), \dots, D_n f_i(a))$.

We can give an alternative proof. If $(e_i)$ is the canonical basis of $\mathbf{R}^n$, consider the differentiable curve $c: t \in \mathbf{R} \mapsto a + te_r \in \mathbf{R}^n$ and the composite map $(f \circ c)(t) = f(a + te_r) = (f_1(a + te_r), \dots)$. On applying the chain rule and Remark (b) in §1.1, and then setting $t = 0$, we have:
$$Df(a)e_r = \frac{d}{dt} f(a + te_r)\Big|_{t=0} = (D_r f_1(a), \dots, D_r f_m(a)). \qquad \square$$
The following theorem demonstrates an important practical application, the changing of variables.
Theorem 1.4

Let $g_1, \dots, g_m: \mathbf{R}^n \to \mathbf{R}$ be differentiable at $a$, and let $f: \mathbf{R}^m \to \mathbf{R}$ be differentiable at $b = (g_1(a), \dots, g_m(a))$. Define $F: \mathbf{R}^n \to \mathbf{R}$ by $F(x) = f(g_1(x), \dots, g_m(x))$. Then:
$$D_i F(a) = \sum_{r=1}^{m} D_r f(b)\,D_i g_r(a).$$

Proof. If $g: \mathbf{R}^n \to \mathbf{R}^m$ is defined by $g(x) = (g_1(x), \dots, g_m(x))$, then $F = f \circ g$. The chain rule gives $DF(a) = Df(g(a)) \circ Dg(a) = Df(b) \circ Dg(a)$. It is now enough to note that the Jacobian matrix of $F$ at $a$ is equal to the product of the Jacobian matrices of $f$ at $b$ and of $g$ at $a$, and that these matrices are given by Theorem 1.3. □
Example

We calculate the partial derivatives of the map $F: \mathbf{R}^2 \to \mathbf{R}$ defined by $F(x, y) = f(h(x), s(y))$, where $f$, $h$ and $s$ are differentiable. We reduce the problem to the preceding theorem by setting $g_1(x, y) = h(x)$ and $g_2(x, y) = s(y)$. Thus:
$$F(x, y) = f(g_1(x, y), g_2(x, y))$$
and
$$D_1 F(x, y) = D_1 f(u)\,D_1 g_1(x, y) + D_2 f(u)\,D_1 g_2(x, y)$$
$$D_2 F(x, y) = D_1 f(u)\,D_2 g_1(x, y) + D_2 f(u)\,D_2 g_2(x, y)$$
where for shortness we have put $u = (g_1(x, y), g_2(x, y))$. But $D_1 g_1(x, y) = h'(x)$, $D_1 g_2(x, y) = 0$, $D_2 g_1(x, y) = 0$ and $D_2 g_2(x, y) = s'(y)$. Thus $D_1 F(x, y) = D_1 f(u)h'(x)$ and $D_2 F(x, y) = D_2 f(u)s'(y)$.
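The formulas $D_1 F = D_1 f(u)h'(x)$ and $D_2 F = D_2 f(u)s'(y)$ can be checked on a concrete instance. The choices $f(u, v) = u^2 v$, $h(x) = \sin x$ and $s(y) = e^y$ below are hypothetical, picked only to have something to compute with; the evaluation point and the helper `central_difference` are likewise assumptions.

```python
import math

# Hypothetical concrete choices for f, h and s (not taken from the text).
def f(u, v):
    return u * u * v

def D1f(u, v):
    return 2 * u * v

def D2f(u, v):
    return u * u

def h(x):
    return math.sin(x)

def dh(x):
    return math.cos(x)

def s(y):
    return math.exp(y)

def ds(y):
    return math.exp(y)

def F(x, y):
    return f(h(x), s(y))

def D1F(x, y):
    """D1 F(x, y) = D1 f(u) h'(x) with u = (h(x), s(y))."""
    return D1f(h(x), s(y)) * dh(x)

def D2F(x, y):
    """D2 F(x, y) = D2 f(u) s'(y) with u = (h(x), s(y))."""
    return D2f(h(x), s(y)) * ds(y)

def central_difference(g, x, step=1e-6):
    return (g(x + step) - g(x - step)) / (2 * step)

x0, y0 = 0.8, -0.3   # arbitrary point (assumption)

num_D1F = central_difference(lambda x: F(x, y0), x0)
num_D2F = central_difference(lambda y: F(x0, y), y0)

print("D1 F:", D1F(x0, y0), " numerical:", num_D1F)
print("D2 F:", D2F(x0, y0), " numerical:", num_D2F)
```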
2 Mean-value theorems

This chapter is devoted to the generalization of the classical mean-value theorem for scalar-valued functions. Among other applications, the fundamental theorem of integral calculus is proved.

PRELIMINARIES

Let $f$ be a real-valued function defined and continuous on an interval $[a, b]$ and differentiable on $]a, b[$. The classical mean-value theorem says that there exists $c \in\, ]a, b[$ such that $f(b) - f(a) = (b - a)f'(c)$. In geometrical terms, this means that there is a point $(c, f(c))$ of the graph of $f$ where the tangent is parallel to the line joining the end-points $(a, f(a))$ and $(b, f(b))$ of this graph.

Now if, under the same hypothesis, $f$ is a function with values in $\mathbf{R}^n$, $n \ge 2$, the above geometrical property may not hold: think of the graph of a 'corkscrew', where the tangent at a point cannot be parallel to a line joining two points of the graph. An example is $f: [0, \pi/2] \to \mathbf{R}^2$ defined by $f(t) = (\cos t, \sin t)$.

Let us give another interpretation of the mean-value theorem. If $f: [a, b] \to F$ is a continuous map to a normed v.s. $F$, differentiable on $]a, b[$, we may think of $f(t)$ (resp. $f'(t)$) as the position (resp. velocity) at time $t$ of a point moving in $F$. The speed is $\|f'(t)\|$. Suppose that a second point starts from $f(a)$ at the same time as the first, that it describes a straight line and that its speed $g'(t)$ is, at each instant, at least equal to that of the first point. It is intuitively clear that the second point will recede from the point of departure $f(a)$ more quickly than the first. Let us make this rigorous.

2.1 MEAN-VALUE THEOREMS

2.1.1 General results
Theorem 2.1 Let a and b be real numbers such that a< b, let F be a normed v.s., and let /: [ a, b] - F and g: [ a, b] - R be maps which are continuous on [ a, b] and 11
12 differentiable on ] a, b [ . Suppose that 11 (/) ' (t) 11 ~ g ' (t) for a < t < b. Then llf(b)- f(a) 11 ~ g(b) - g(a).
Proof. We shall show that if $u$ and $v$ are real numbers such that $a < u < v < b$, then $\|f(v) - f(u)\| \leq g(v) - g(u)$. Since $f$ and $g$ are continuous at $a$ and $b$, the theorem will follow by letting $u$ approach $a$ and $v$ approach $b$.
Suppose the result is false. Then $\|f(v) - f(u)\| - [g(v) - g(u)] = M > 0$. Divide $[a_0, b_0] = [u, v]$ at its mid-point $m = (u + v)/2$. The triangle inequality shows that on one of the intervals $[u, m]$ and $[m, v]$ we have:
\[ \|f(v) - f(m)\| - [g(v) - g(m)] \geq M/2 \quad\text{or}\quad \|f(m) - f(u)\| - [g(m) - g(u)] \geq M/2. \]
Denote by $[a_1, b_1]$ one of these intervals on which the inequality holds. Repeat the procedure by dividing $[a_1, b_1]$ at its mid-point, etc. We obtain a sequence of intervals $[a_n, b_n]$ such that $a_0 \leq \dots \leq a_n \leq b_n \leq \dots \leq b_0$, $|b_n - a_n| = (v - u)/2^n$, and $\|f(b_n) - f(a_n)\| - [g(b_n) - g(a_n)] \geq M/2^n$.
It follows that $a_n$ and $b_n$ converge to the same point $w \in [u, v] \subset\, ]a, b[$ and that
\[
\begin{aligned}
M/2^n &\leq \|f(b_n) - f(w)\| + \|f(w) - f(a_n)\| - [g(b_n) - g(w)] - [g(w) - g(a_n)] \\
&\leq \|Df(w)(b_n - w)\| + o(b_n - w) + \|Df(w)(w - a_n)\| + o(w - a_n) \\
&\qquad - [g'(w)(b_n - w) + o(b_n - w)] - [g'(w)(w - a_n) + o(w - a_n)] \\
&\leq \|Df(w)\|\,(|b_n - w| + |w - a_n|) - g'(w)\,(|b_n - w| + |w - a_n|) + o(b_n - w) + o(w - a_n) \\
&= [\|Df(w)\| - g'(w)]\,(b_n - a_n) + o(b_n - a_n).
\end{aligned}
\]
Divide both sides by $b_n - a_n = (v - u)/2^n$ and let $n$ tend to $\infty$. We obtain $M/(v - u) \leq \|Df(w)\| - g'(w) \leq 0$, which contradicts $M > 0$. □
The intuition which led to Theorem 2.1 makes us guess that pauses, followed by departures in new directions, are permitted in the movement of the point $f(t)$. This follows from a stronger theorem, the proof of which can be found in texts by Cartan and Dieudonné (see the Bibliography).
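As a purely numerical illustration (not part of the original text), the following sketch checks Theorem 2.1 on the quarter-circle example $f(t) = (\cos t, \sin t)$ on $[0, \pi/2]$, with the assumed comparison function $g(t) = t$, so that $\|f'(t)\| = 1 = g'(t)$. The helper names are hypothetical. It also shows why the equality form of the mean-value theorem fails here: every vector $(b - a)f'(c)$ has norm $b - a$, whereas the chord has norm $\sqrt{2}$.

```python
import math

def f(t):
    # Quarter-circle path from the text: f(t) = (cos t, sin t).
    return (math.cos(t), math.sin(t))

def f_prime(t):
    # Velocity of the moving point.
    return (-math.sin(t), math.cos(t))

def norm(v):
    # Euclidean norm on R^2.
    return math.hypot(v[0], v[1])

a, b = 0.0, math.pi / 2

# Chord f(b) - f(a) joining the end-points of the path.
chord = (f(b)[0] - f(a)[0], f(b)[1] - f(a)[1])

# Left-hand side of Theorem 2.1: ||f(b) - f(a)||.
lhs = norm(chord)

# Right-hand side with the assumed g(t) = t: g(b) - g(a) = b - a.
rhs = b - a

# The hypothesis ||f'(t)|| <= g'(t) = 1 holds (with equality) on a sample grid.
speed_ok = all(
    abs(norm(f_prime(a + (b - a) * i / 100)) - 1.0) < 1e-12 for i in range(101)
)

print(lhs, rhs, lhs <= rhs)
```

Since $\sqrt{2} < \pi/2$, no $c$ can satisfy $f(b) - f(a) = (b - a)f'(c)$, but the inequality of Theorem 2.1 still holds.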
Theorem 2.2
Let $F$ be a normed v.s., and let $f: [a, b] \to F$ and $g: [a, b] \to \mathbf{R}$ be continuous. Suppose that for all $t \in [a, b]$, except perhaps for a countable set of points, $f$ and $g$ are differentiable and $\|f'(t)\| \leq g'(t)$. Then $\|f(b) - f(a)\| \leq g(b) - g(a)$. □
Corollary 2.1
Let $f: [a, b] \to F$ be a continuous map into a normed v.s. such that $f'(t)$ exists for all $t \in\, ]a, b[$. Suppose there is a constant $k$ such that $\|f'(t)\| \leq k$ for all $t \in\, ]a, b[$. Then $\|f(b) - f(a)\| \leq k(b - a)$.
Proof. This can be shown by taking $g(t) = kt$ in Theorem 2.2. □
We shall now assume that $f$ is defined on an open subset $U$ of a normed v.s. $E$, which need no longer be $\mathbf{R}$.
Corollary 2.2
If $f: U \to F$ is differentiable on $U$, and if the line segment $[a, b] = \{(1 - t)a + tb : 0 \leq t \leq 1\}$, with end-points $a, b \in U$, is contained in $U$, then:
\[ \|f(b) - f(a)\| \leq \sup_{0 \leq t \leq 1} \|Df[(1 - t)a + tb]\| \, \|b - a\|. \]
Proof. The chain rule, applied to $t \in [0, 1] \mapsto h(t) = f[(1 - t)a + tb]$, gives $h'(t) = Df[(1 - t)a + tb](b - a)$. Thus
\[ \|h'(t)\| \leq \sup_{0 \leq t \leq 1} \|Df[(1 - t)a + tb]\| \, \|b - a\|, \]
and it is enough to apply Corollary 2.1, replacing $a$ by $0$, $b$ by $1$, $f$ by $h$ and $k$ by $\sup_{0 \leq t \leq 1} \|Df[(1 - t)a + tb]\| \, \|b - a\|$. □
2.1.2 Convexity
We say that a subset $U$ of a v.s. is convex if, for all $a, b \in U$, the line segment joining these points lies in $U$.
Theorem 2.3
Let $U$ be a convex open subset of a normed v.s. $E$, and let $f$ be a differentiable map of $U$ to a normed v.s. $F$. If there exists $k \geq 0$ such that $\|Df(u)\| \leq k$ for all $u \in U$, then:
\[ \|f(b) - f(a)\| \leq k \|b - a\| \quad \text{for all } a, b \in U. \]
Proof. This is an immediate consequence of Corollary 2.2. □
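The inequality of Corollary 2.2 can be checked numerically on a small example. The sketch below is illustrative only: the map $f(x, y) = (\sin x + y^2, xy)$, the end-points and the helper names are assumptions chosen for the illustration, and the supremum over the segment is approximated by sampling, so the computed bound is only an estimate of the true one.

```python
import math

def f(x, y):
    # Hypothetical differentiable map R^2 -> R^2 used for the illustration.
    return (math.sin(x) + y * y, x * y)

def Df(x, y):
    # Jacobian matrix of f at (x, y), stored as two rows.
    return ((math.cos(x), 2.0 * y), (y, x))

def op_norm(m):
    # Euclidean operator norm of a 2x2 matrix:
    # square root of the largest eigenvalue of m^T m.
    (a, b), (c, d) = m
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    lam = 0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q))
    return math.sqrt(lam)

def mean_value_bound(a, b, samples=2000):
    # Approximates sup_{0<=t<=1} ||Df((1-t)a + tb)|| * ||b - a|| by sampling t.
    sup = 0.0
    for i in range(samples + 1):
        t = i / samples
        x = (1 - t) * a[0] + t * b[0]
        y = (1 - t) * a[1] + t * b[1]
        sup = max(sup, op_norm(Df(x, y)))
    return sup * math.hypot(b[0] - a[0], b[1] - a[1])

a, b = (0.0, 0.0), (1.0, 2.0)
fa, fb = f(*a), f(*b)
lhs = math.hypot(fb[0] - fa[0], fb[1] - fa[1])  # ||f(b) - f(a)||
rhs = mean_value_bound(a, b)                    # sampled right-hand side
print(lhs <= rhs)
```

Sampling gives a lower estimate of the supremum, so a comfortable margin between the two sides is what makes the check meaningful here.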
Corollary 2.3
Let $U$ be a convex open subset of a normed v.s. $E$, and let $f$ be a differentiable map of $U$ to a normed v.s. $F$. Then for all $a, b, c \in U$, we have:
\[ \|f(b) - f(a) - Df(c)(b - a)\| \leq \sup_{u \in U} \|Df(u) - Df(c)\| \, \|b - a\|. \]
Proof. Apply Corollary 2.2 to $u \mapsto f(u) - Df(c)u$, the derivative of which is $Df(u) - Df(c)$. □
Let us now move on to applications of the preceding results.
2.2 CONVERSE OF §1.2.2
Theorem 2.4
Let $U$ be a connected open subset of a normed v.s. $E$, and let $f$ be a differentiable map of $U$ to a normed v.s. $F$. If $Df(u) = 0$ for all $u$ in $U$, then $f$ is constant.
Proof. Fix $a$ in $U$ and denote by $B$ the set of all points $b \in U$ such that $f(b) = f(a)$. Since $f$ is continuous, $B$ is closed in $U$. On the other hand, every $x$ in $B$ is the centre of an open ball contained in $U$ and of positive radius. This ball is convex. By Theorem 2.3 (with $k = 0$), $f(y) = f(x) = f(a)$ for all $y$ in this ball. Hence $B$ is also open in $U$. As $B$ is non-empty and $U$ is connected, $B = U$. □
2.3 NECESSARY AND SUFFICIENT CONDITION FOR A MAP TO BE OF CLASS $C^1$
Let us recall §1.2.9. Let $U$ be an open subset of the direct sum $E = E_1 \oplus \dots \oplus E_n$ of normed v.s., and let $f$ be a map of $U$ to a normed v.s. $F$. We have seen that, if $f$ is $C^1$ in $U$, then the partial derivatives $D_kf: U \to \mathcal{L}(E_k; F)$ are continuous in $U$. We shall prove the converse.
Theorem 2.5
With the above notation, suppose that the partial derivatives exist at each point $x = (x_1, \dots, x_n)$ of $U$ and that the maps $D_kf: U \to \mathcal{L}(E_k; F)$ are continuous at $a \in U$. Then $f$ is $C^1$ at $a$.
Proof. By the formula $Df(a)h = \sum_{k=1}^{n} D_kf(a)h_k$ of §1.2.9, it is enough to show that $Df(a)$ exists. Define
\[ g(x) = f(x) - f(a) - \sum_{k=1}^{n} D_kf(a)(x_k - a_k). \]
Then $D_kg(x) = D_kf(x) - D_kf(a)$. Since the partial derivatives are continuous at $a$, given any $\varepsilon > 0$, there exists $r > 0$ such that $\|D_kg(x)\| \leq \varepsilon$, $k = 1, \dots, n$, if $x$ is in the open ball with centre $a$ and radius $r$. Since $g(a) = 0$, it follows from Theorem 2.3 that:
\[
\begin{aligned}
\|g(x)\| &\leq \sum_{k=1}^{n} \|g(x_1, \dots, x_k, a_{k+1}, \dots, a_n) - g(x_1, \dots, x_{k-1}, a_k, \dots, a_n)\| \\
&\leq \sum_{k=1}^{n} \varepsilon \|x_k - a_k\| \leq n\varepsilon \|x - a\|.
\end{aligned}
\]
As $\varepsilon$ is arbitrary:
\[ f(x) = f(a) + \sum_{k=1}^{n} D_kf(a)(x_k - a_k) + o(\|x - a\|). \]
□
2.4 A CRITERION FOR UNIFORM CONVERGENCE
Theorem 2.6
Let $U$ be a connected open subset of a normed v.s. $E$, and let $f_n: U \to F$ be a sequence of differentiable maps into a Banach space $F$. Suppose that: (1) there is a point $a$ of $U$ such that the sequence $f_n(a)$ converges; (2) the sequence $Df_n: U \to \mathcal{L}(E; F)$ converges uniformly on each bounded subset of $U$ to a map $g: U \to \mathcal{L}(E; F)$. Then for each $x \in U$, the sequence $f_n(x)$ converges to a limit, denoted by $f(x)$. This convergence is uniform on each bounded convex subset of $U$. Finally, $f = \lim_{n \to \infty} f_n$ is differentiable and $Df = g$.
Proof. Let $B$ be a ball, with centre $a$ and radius $r > 0$, contained in $U$. By Corollary 2.2, for all $x \in B$ we have:
\[ \|f_p(x) - f_q(x) - [f_p(a) - f_q(a)]\| \leq \sup_{u \in B} \|Df_p(u) - Df_q(u)\| \, \|x - a\|. \tag{2.1} \]
Since the sequence $f_n(a)$ converges and the sequence $Df_n(u)$ is uniformly convergent on $B$, the sequence $f_n(x)$ is a Cauchy sequence in $F$. As $F$ is complete, $\lim_{n \to \infty} f_n(x) = f(x)$ exists. This reasoning also shows that if $f_n$ converges at one point of an open ball in $U$, it converges uniformly on that ball. The set of points $u$ of $U$ at which $f_n(u)$ converges is thus open and closed in $U$. Since it contains $a$ and $U$ is connected, it coincides with $U$. In other words, $\lim_{n \to \infty} f_n(u) = f(u)$ exists for all $u \in U$.
Now denote by $B$ a bounded convex subset of $U$, of diameter $d = \sup_{x, y \in B} \|x - y\|$. By Corollary 2.2, inequality (2.1) still holds if $a, x \in B$, and thus:
\[ \|f_p(x) - f_q(x)\| \leq \|f_p(a) - f_q(a)\| + d \, \sup_{u \in B} \|Df_p(u) - Df_q(u)\|. \]
This shows that $f_n$ converges uniformly on $B$.
Consider inequality (2.1) again and let $p$ tend to $+\infty$. Since $\lim Df_n = g$, we have:
\[ \|f(x) - f(a) - [f_q(x) - f_q(a)]\| \leq \sup_{u \in B} \|g(u) - Df_q(u)\| \, \|x - a\|. \]
Since the convergence of $Df_q$ is uniform on $B$, given any $\varepsilon > 0$, there exists $N > 0$ such that $q > N$ implies that $\sup_{u \in B} \|g(u) - Df_q(u)\| < \varepsilon$. On the other hand, given any such $q$, there exists $r' \leq r$ such that $\|x - a\| \leq r'$ implies that:
\[ \|f_q(x) - f_q(a) - Df_q(a)(x - a)\| \leq \varepsilon \|x - a\|. \]
The triangle inequality and the preceding inequalities now show that:
\[ \|f(x) - f(a) - g(a)(x - a)\| \leq 3\varepsilon \|x - a\|. \]
Thus $f$ is differentiable at $a$ and $Df(a) = g(a)$. □
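A small numerical illustration of Theorem 2.6 may help. The sequence below, $f_n(x) = x + \sin(nx)/n^2$ on the bounded interval $[-3, 3]$, is an assumption chosen for the example and does not come from the text: here $Df_n(x) = 1 + \cos(nx)/n$ converges uniformly to $g = 1$, and $f_n$ converges uniformly to $f(x) = x$, whose derivative is indeed $g$. Suprema are estimated on a finite grid.

```python
import math

def f_n(n, x):
    # Hypothetical sequence of differentiable functions R -> R.
    return x + math.sin(n * x) / n**2

def df_n(n, x):
    # Derivative of f_n with respect to x.
    return 1.0 + math.cos(n * x) / n

def sup_on_grid(func, lo=-3.0, hi=3.0, samples=6001):
    # Grid estimate of sup |func| on the bounded interval [lo, hi].
    return max(abs(func(lo + (hi - lo) * i / (samples - 1))) for i in range(samples))

errors = []
for n in (1, 10, 100):
    err_f = sup_on_grid(lambda x: f_n(n, x) - x)       # distance from f_n to f(x) = x
    err_df = sup_on_grid(lambda x: df_n(n, x) - 1.0)   # distance from Df_n to g = 1
    errors.append((n, err_f, err_df))

for n, err_f, err_df in errors:
    print(n, err_f, err_df)
```

Both columns shrink as $n$ grows, bounded by $1/n^2$ and $1/n$ respectively.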
We deduce from this the following theorem.
Theorem 2.7
Let $U$ be a connected open subset of a normed v.s. $E$, and let $f_n$ be a sequence of differentiable maps of $U$ to a Banach space $F$. Suppose that: (1) there is a point $a$ of $U$ such that the series $\sum f_n(a)$ converges; (2) the series $\sum Df_n$ converges uniformly on every bounded subset of $U$ to a map $S: U \to \mathcal{L}(E; F)$. Then for each $x$ in $U$ the series $\sum f_n(x)$ converges to a limit, denoted by $h(x)$. This convergence is uniform on each bounded convex subset of $U$. Lastly, $h$ is differentiable and $Dh = S$. □
2.5 SARD'S THEOREM
Definition 2.1
Let $f: U \to \mathbf{R}^n$ be a differentiable map of an open subset $U$ of $\mathbf{R}^p$ to $\mathbf{R}^n$. We say that $a \in U$ is a critical point of $f$ if the rank of the linear map $Df(a)$ is less than $n$.
Definition 2.2
Give $\mathbf{R}^n$ its usual scalar product. This scalar product defines the usual norm and distance. A displacement of $\mathbf{R}^n$ is an affine map of $\mathbf{R}^n$ into $\mathbf{R}^n$ which preserves this distance. Let $a_1, \dots, a_n \in \mathbf{R}$ and $h_1 > 0, \dots, h_n > 0$ be given. Put:
\[ P = [a_1, a_1 + h_1] \times \dots \times [a_n, a_n + h_n]. \]
We shall understand by a box in $\mathbf{R}^n$ any image of $P$ under a displacement. The volume of a box is, by definition, $h_1 \cdots h_n$.
Definition 2.3
A subset $E$ of $\mathbf{R}^n$ is of measure zero if, given any $\varepsilon > 0$, there is a covering of $E$ by boxes the sum of whose volumes is less than $\varepsilon$.
It is left as an exercise for the reader to show that the union of a countable family $E_n$, $n = 1, 2, \dots$, of sets of measure zero is itself of measure zero (hint: cover $E_n$ by boxes the sum of whose volumes is less than $\varepsilon/2^n$).
Theorem 2.8
Let $f: U \to \mathbf{R}^n$ be a $C^1$ map of an open subset $U$ of $\mathbf{R}^p$ to $\mathbf{R}^n$. Then the image $f(C)$ of the set $C$ of critical points of $f$ is a set of measure zero in $\mathbf{R}^n$.
Proof. The proof is difficult if $p > n$ (see J. Milnor). We shall limit ourselves to the case $p = n$.
Step 1: Cover $\mathbf{R}^n$ by cubes with sides of length 1 and with vertices with integer coordinates; $\mathbf{R}^n$ is a countable union of such cubes. By Definition 2.3, it is enough to show that the image under $f$ of the set of critical points contained in one of these cubes is of measure zero. Application of a translation enables us to assume that this cube is the unit cube $I = [0, 1]^n$. It remains to prove that $f(C \cap I)$ is of measure zero.
Step 2: Let $x \in C \cap I$. Since $\operatorname{rank} Df(x) < n$, $Df(x)\mathbf{R}^n$ is a subspace of dimension less than or equal to $n - 1$. Hence there exists a hyperplane $P$, passing through $f(x)$ and containing all vectors of the form $f(x) + Df(x)z$, $z \in \mathbf{R}^n$. Fix $\varepsilon > 0$ and let $y$ belong to the open ball $B(x, \varepsilon)$ with centre $x$ and radius $\varepsilon$. By Corollary 2.3 we have:
\[ \|f(y) - f(x) - Df(x)(y - x)\| \leq b(\varepsilon) \|y - x\| \]
where $b(\varepsilon) = \sup_{u \in B(x, \varepsilon)} \|Df(x) - Df(u)\|$ tends to zero with $\varepsilon$, since $Df$ is continuous and thus uniformly continuous on the compact set $I$. As $f(x) + Df(x)(y - x) \in P$, we see that the distance of $f(y)$ from the hyperplane $P$ is dominated by $\varepsilon b(\varepsilon)$. Thus $f[B(x, \varepsilon)]$ lies between two hyperplanes parallel to $P$ and whose distance to $P$ is $\varepsilon b(\varepsilon)$. On the other hand, Corollary 2.2 shows that $\|f(y) - f(x)\| \leq \sup_I \|Df(u)\| \, \|y - x\|$. It follows that $f(y)$ lies in the ball with centre $f(x)$ and radius $\alpha\varepsilon$, where $\alpha = \sup_I \|Df(u)\|$.
To sum up, $f[B(x, \varepsilon)]$ lies in a right cylinder, whose base is the intersection of $P$ with the ball of centre $f(x)$ and radius $\alpha\varepsilon$, and whose height is $2\varepsilon b(\varepsilon)$. It is clear that $f[B(x, \varepsilon)]$ is contained in a box in $\mathbf{R}^n$, whose sides parallel to $P$ are of length $2\alpha\varepsilon$ and whose side perpendicular to $P$ is of length $2\varepsilon b(\varepsilon)$. The volume of such a box is $2^n \alpha^{n-1} \varepsilon^n b(\varepsilon)$.
Step 3: Divide each of the $n$ sides of the unit cube into $k$ equal parts. We obtain $k^n$ cubes of side $1/k$. Each of them is contained in a ball of radius $\varepsilon = \sqrt{n}/k$ (Pythagoras' theorem!). Some of them will intersect the set $C$ of critical points. Their image under the map $f$ is thus contained, by step 2, in a box of volume:
\[ 2^n \alpha^{n-1} (\sqrt{n}/k)^n \, b(\sqrt{n}/k). \]
It follows that $f(C)$ is contained in the union of at most $k^n$ boxes, the sum of whose volumes, $2^n$ ...
... $> 0$ such that $\|u\| < \varepsilon$ implies that $\|Df(u) - Df(0)\| < \delta$. Take $x$ and $x + y$ in the open ball $B(0, \varepsilon)$ with centre $0$ and radius $\varepsilon$, and suppose that $f(x) = f(x + y)$. Since $Df(0) = \mathrm{id}_E$, the fundamental theorem of integral calculus (Theorem 2.10) implies that:
\[
\begin{aligned}
0 = \|f(x + y) - f(x)\| &= \Bigl\| \int_0^1 Df(x + ty)y \, dt \Bigr\| \\
&= \Bigl\| y + \int_0^1 [Df(x + ty) - Df(0)]y \, dt \Bigr\| \\
&\geq \|y\| - \delta \|y\| = (1 - \delta)\|y\|.
\end{aligned}
\tag{3.1}
\]
Thus $y = 0$ and $f$ is injective on $B(0, \varepsilon)$. Let us now show that the image of the ball $B(0, \varepsilon/2)$ under $f$ contains a ball $B(0, r)$, provided that $0 < r < \varepsilon(1 - \delta)/4$.
Step 3: Let $z \in B(0, r)$ be given. The closed ball $\bar{B}(0, \varepsilon/2)$ of the finite-dimensional space $E$ is compact. The continuous function $x \in \bar{B}(0, \varepsilon/2) \mapsto \|f(x) - z\|$ then attains its infimum at a point $x_0$. We claim that $x_0$ is an interior point of this ball. If $\|x\| = \varepsilon/2$, the fundamental theorem of integral calculus and $Df(0) = \mathrm{id}_E$ imply that:
\[ \|f(x)\| = \Bigl\| x + \int_0^1 [Df(tx) - Df(0)]x \, dt \Bigr\| \geq \|x\| - \delta\|x\| = (1 - \delta)\|x\| = \varepsilon(1 - \delta)/2 > 2r. \]
Thus $\|f(x) - z\| \geq \|f(x)\| - \|z\| > r > \|z\| = \|f(0) - z\| \geq \|f(x_0) - z\|$. The infimum therefore cannot be attained on the boundary of $\bar{B}(0, \varepsilon/2)$.
Step 4: Let us show that $f(x_0) = z$. Set $y = k[f(x_0) - z]$, where $k < 0$ and $|k|$ is small enough for $\|x_0 + y\| < \varepsilon/2$ [this is possible, for $x_0 \in B(0, \varepsilon/2)$]. The fundamental theorem of integral calculus shows that
\[ \|f(x_0 + y) - f(x_0) - y\| = \Bigl\| \int_0^1 [Df(x_0 + ty) - Df(0)]y \, dt \Bigr\| \leq \delta \|y\| \]
and hence:
\[
\begin{aligned}
\|f(x_0 + y) - z\| &\leq \|f(x_0 + y) - f(x_0) - y\| + \|f(x_0) - z + y\| \\
&\leq \delta\|y\| + (1 + k)\|f(x_0) - z\| = [1 + k - \delta k] \, \|f(x_0) - z\|.
\end{aligned}
\]
Since $1 + k - \delta k < 1$, the definition of $x_0$ shows that $f(x_0) = z$.
Step 5: Set $I = B(0, \varepsilon/2) \cap f^{-1}[B(0, r)]$. As $f$ is continuous, this is an open neighbourhood of $0$. By what is above, $f: I \to J = f(I)$ is a bijection. We shall prove that the inverse map is continuous.
If $y, y + k \in J$, there exist $x, x + h \in I$ such that $y = f(x)$ and $y + k = f(x + h)$. By equation (3.1) we have $\|f(x + h) - f(x)\| \geq (1 - \delta)\|h\|$, and thus:
\[ \|f^{-1}(y + k) - f^{-1}(y)\| = \|h\| \leq \|k\|/(1 - \delta). \]
□
To sum up, $f: I \to J$ is a homeomorphism. It remains to prove that $f^{-1}$ is differentiable and that its derivative at $y$ is $\{Df[f^{-1}(y)]\}^{-1}$. We shall not do this here, for the proof is no more difficult in infinite dimensions and will be given in §3.2.2.
Remark
Step 4 above may be simplified. First we prove a lemma which will later be useful for other purposes.
Lemma 3.1
Let $A$ be a subset of a normed v.s. $E$. If $f: A \to \mathbf{R}$ is differentiable (see Definition 1.2) and has a minimum at an interior point $a$ of $A$, then $Df(a) = 0$.
Proof. Let $h \in E$. Since there is an open set containing $a$ and contained in $A$, $a + th \in A$ for $t \in \mathbf{R}$, if $|t|$ is small enough. The differentiable function $t \mapsto f(a + th) \in \mathbf{R}$ thus has a minimum at $t = 0$. Hence $0 = (d/dt)f(a + th)|_{t=0} = Df(a)h$, and $Df(a) = 0$, since $h$ is arbitrary. □
Since $E$ is finite-dimensional, all norms on $E$ are equivalent (see §A.2.5). We may thus assume that there is an inner product on $E$ and that $\langle x, x \rangle$ defines the square of the norm. The function $x \mapsto \langle f(x) - z, f(x) - z \rangle$ therefore has a minimum when $x = x_0$. Its derivative at $x_0$ is zero. We thus obtain (see §1.2.8): $\langle Df(x_0)h, f(x_0) - z \rangle = 0$ for all $h \in E$. Since $Df(x_0)$ is invertible, $f(x_0) = z$.
Remark
The proof above has two drawbacks. First, it uses the local compactness of $E$, which is finite-dimensional (see J. Dieudonné, p. 106, for the converse: a locally compact normed space is finite-dimensional). Secondly, it does not give an algorithm which enables us to approximate the solution efficiently. The proof which follows is free of these defects.
3.2.2 Proof of the inverse function theorem
Now $E$ and $F$ are arbitrary Banach spaces. The proof requires several steps.
Lemma 3.2
(Inversion of an isomorphism between Banach spaces). The set $GL(E; F)$ of all isomorphisms from $E$ on to $F$ is an open subset of $\mathcal{L}(E; F)$, and the map $J: u \mapsto u^{-1}$ of $GL(E; F)$ to $GL(F; E)$ is continuous.
Proof. We may assume that $E = F$. For if $v \in GL(E; F)$ (assumed non-empty, for otherwise everything that follows would be trivial), the map $u \mapsto v^{-1} \circ u$ of $\mathcal{L}(E; F)$ into $\mathcal{L}(E; E)$ is continuous and $GL(E; F)$ is simply the inverse image of $GL(E; E)$ under this map.
Let $u \in GL(E; E)$ and $h \in \mathcal{L}(E; E)$. We shall prove that $u + h \in GL(E; E)$ if $\|h\| < 1/\|u^{-1}\|$. For simplicity, denote by $1$ the identity map $\mathrm{id}_E$. Since $u$ is invertible and $u + h = u \circ [1 + u^{-1} \circ h]$, it is enough to prove (set $v = -u^{-1} \circ h$) that $1 - v$ is invertible if $\|v\| = \|u^{-1} \circ h\| \leq \|u^{-1}\| \, \|h\| < 1$. To do this the formula $(1 - x)^{-1} = 1 + x + x^2 + \dots$, where $x \in \mathbf{R}$, $|x| < 1$, suggests consideration of the sequence $X_0 = 1$, $X_1 = 1 + v$, ..., $X_n = 1 + v + \dots + v^n$. This is a Cauchy sequence in $\mathcal{L}(E; E)$, for (see §A.2.2) we have
\[ \|X_{p+q} - X_p\| = \|v^{p+q} + \dots + v^{p+1}\| \leq \|v\|^{p+1} + \dots + \|v\|^{p+q}, \]
which tends to zero as $p \to +\infty$, since $\|v\| < 1$. Since $\mathcal{L}(E; E)$ is complete (see Theorem A.2), $X_n$ converges to $X$. But $(1 - v) \circ X_n = 1 - v^{n+1}$, composition is continuous (see §A.3.1) and $v^{n+1} \to 0$ as $n \to +\infty$; thus $(1 - v) \circ X = 1$ and $X$ is the desired inverse of $1 - v$.
Let us retain the notation and show that $J$ is continuous. Since $(u + h)^{-1} = (1 - v)^{-1} \circ u^{-1}$:
\[ J(u + h) - J(u) = [(1 - v)^{-1} - 1] \circ u^{-1} = \lim_{n \to \infty} (v + \dots + v^n) \circ u^{-1}. \]
Since
\[ \|v + \dots + v^n\| \leq \|v\| + \dots + \|v\|^n \leq \|v\|/(1 - \|v\|) \leq \|h\| \, \|u^{-1}\|/(1 - \|u^{-1} \circ h\|), \]
we have $\lim_{h \to 0} \|J(u + h) - J(u)\| = 0$. □
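The partial sums $X_n = 1 + v + \dots + v^n$ used in the proof of Lemma 3.2 can be computed directly in finite dimensions. The sketch below is an illustration only: it works with $2 \times 2$ real matrices, and the matrix `v` and the helper names are assumptions chosen so that $\|v\| < 1$. It checks that $(1 - v) \circ X_n$ is close to the identity for large $n$.

```python
def matmul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(a, b):
    # Entrywise sum of two 2x2 matrices.
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def neumann_inverse(v, terms=60):
    # Partial sum X_n = 1 + v + ... + v^n with n = terms;
    # it approximates (1 - v)^{-1} when ||v|| < 1.
    x = [row[:] for row in IDENTITY]
    power = [row[:] for row in IDENTITY]
    for _ in range(terms):
        power = matmul(power, v)   # v^k
        x = matadd(x, power)       # accumulate the partial sum
    return x

# Hypothetical v with small entries, so that ||v|| < 1.
v = [[0.2, -0.1], [0.3, 0.1]]
x = neumann_inverse(v)

# 1 - v, written out explicitly.
one_minus_v = [[1.0 - v[0][0], -v[0][1]], [-v[1][0], 1.0 - v[1][1]]]

# (1 - v) o X_n = 1 - v^{n+1}, which should be very close to the identity.
product = matmul(one_minus_v, x)
print(product)
```

The residual $(1 - v) \circ X_n - 1 = -v^{n+1}$ decays geometrically, which is the estimate used in the proof.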
Proof of Theorem 3.1
Step 1: As in §3.2.1, we may assume that $a = 0$, $f(a) = 0$, $E = F$ and $Df(0) = \mathrm{id}_E$.
Step 2: Set $g(x) = x - f(x)$. We have $g(0) = 0$, $Dg(0) = 0$. Since $Dg$ is continuous, there exists $r > 0$ such that if $x \in B(0, 2r)$ then $\|Dg(x)\| \leq 1/2$. The mean-value theorem (Theorem 2.3), applied in the ball $B(0, 2r)$, which is convex, shows that:
\[ \|g(x)\| = \|g(x) - g(0)\| \leq \|x\|/2 \]
...
... $> 0$ and $f(a, b) = 0$, there exists an interval $A$ which contains $a$ and an interval $B$ which contains $b$, such that to each $x \in A$ there corresponds a unique $y \in B$ such that $f(x, y) = 0$. This defines a function $x \in A \mapsto y = g(x) \in B$ such that $f(x, g(x)) = 0$. In the present case $g(x) = (1 - x^2)^{1/2}$.
We could have associated to the number $a$ another number $c$, here equal to $-b$, such that $f(a, c) = 0$. We would then obtain another function $h$, here equal to $-g$, such that $f(x, h(x)) = 0$. We say that each of these functions $g$ and $h$ is defined implicitly by the equation $f(x, y) = 0$. If we had chosen $|a| = 1$, it would have been impossible to find such a function defined on an open interval containing $a$.
More generally, consider $m$ equations $f_i(x, y) = 0$, $i = 1, \dots, m$, in $m$ unknowns $y = (y_1, \dots, y_m)$, depending on $n$ parameters $x = (x_1, \dots, x_n)$. Assume that $f_i(a, b) = 0$, $i = 1, \dots, m$, for $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_m)$. Under what conditions can we make correspond to each $x$ near to $a$, a unique $y$ near to $b$, satisfying the $m$ equations $f_i(x, y) = 0$? The following theorem gives a simple criterion which enables us to answer this question.
Theorem 3.2
(Implicit function theorem) Let $E$ and $F$ be Banach spaces, let $U$ be an open subset of $E$, $V$ an open subset of $F$, and let $f$ be a $C^1$ map of $U \times V$ in a Banach space $G$. Assume that at $(a, b) \in U \times V$ the partial derivative $D_2f(a, b) \in \mathcal{L}(F; G)$ is an isomorphism of $F$ on $G$. Then there exist a neighbourhood $A$ of $a$, a neighbourhood $W$ of $f(a, b)$ and a unique $C^1$ map $g_1: A \times W \to V$ such that, for all $(x, w) \in A \times W$, we have $f(x, g_1(x, w)) = w$.
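The one-dimensional example that precedes the theorem can be explored numerically. The sketch below assumes that the example equation is $f(x, y) = x^2 + y^2 - 1 = 0$, which is consistent with $g(x) = (1 - x^2)^{1/2}$ but is not stated in the surviving text. It solves $f(x, y) = 0$ for $y$ near $b$ by Newton's method in the variable $y$; this is a choice made for the illustration, not the method of the proof. The helper names are hypothetical.

```python
import math

def f(x, y):
    # Assumed example equation: the unit circle.
    return x * x + y * y - 1.0

def d2f(x, y):
    # Partial derivative of f with respect to y; it must be invertible
    # (non-zero) at (a, b) for the theorem to apply.
    return 2.0 * y

def implicit_g(x, y_start, steps=50):
    # Newton iteration in y, started from y_start, for fixed x.
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / d2f(x, y)
    return y

a, b = 0.6, 0.8                      # a point with f(a, b) = 0 and D_2 f(a, b) != 0
xs = [a + dx for dx in (-0.1, -0.05, 0.0, 0.05, 0.1)]

upper = [implicit_g(x, b) for x in xs]    # branch through (a, b): y = g(x)
lower = [implicit_g(x, -b) for x in xs]   # branch through (a, -b): y = -g(x)

for x, yu, yl in zip(xs, upper, lower):
    print(x, yu, yl)
```

Starting from $b$ recovers the branch $g$, starting from $-b$ recovers $-g$; at $|a| = 1$ the derivative `d2f` vanishes and the iteration is not defined, in line with the remark in the text.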
... $V \to E \oplus G$, defined by ...
... $> 0$, the expression '$f$ is $n$ times differentiable on $U$' and the derivative $D^nf(a)$ of order $n$ of $f$ at $a \in U$. To this end we recall that the normed v.s. of continuous linear maps of $E$ to the normed v.s. $\mathcal{L}^{n-1}(E; F)$ of continuous $(n - 1)$-linear maps of $E \times \dots \times E$ ($n - 1$ times) to $F$ is canonically isomorphic to $\mathcal{L}^n(E; F)$ (see Theorem A.9).
We say that $f$ is $n$ times differentiable at $a \in U$ if there is an open neighbourhood $U' \subset U$ of $a$ in which $f$ is $(n - 1)$ times differentiable, and the map $u \in U' \mapsto D^{n-1}f(u)$ of $U'$ to $\mathcal{L}^{n-1}(E; F)$ is differentiable at $a$. The derivative of $D^{n-1}f$ at $a$ is denoted by $D^nf(a)$ and is called the derivative of order $n$ of $f$ at $a$. It is an element of $\mathcal{L}(E; \mathcal{L}^{n-1}(E; F)) = \mathcal{L}^n(E; F)$ and its value at $(h_1, \dots, h_n) \in E^n$ is denoted by $D^nf(a)(h_1, \dots, h_n)$. This is the same as $(D(D^{n-1}f)(a)h_1)(h_2, \dots, h_n)$.
It is left to the reader to verify that this definition is equivalent to the following: $f$ is $n$ times differentiable at $a$ if it is differentiable in an open neighbourhood $U' \subset U$ of $a$ and $Df: U' \to \mathcal{L}(E; F)$ is $(n - 1)$ times differentiable at $a$.
If $f$ is $n$ times differentiable at each point of $U$, we say that it is $n$ times differentiable on $U$. In this case, $u \mapsto D^nf(u)$ defines a map $D^nf: U \to \mathcal{L}^n(E; F)$ which is called the derivative of order $n$ of $f$. If the map $D^nf$ is continuous on $U$, we say that $f$ is of class $C^n$ on $U$. It is left to the reader to verify the equivalence of this definition with the following: $f$ is of class $C^n$ on $U$ if $f$ is of class $C^1$ on $U$ and $Df$ is of class $C^{n-1}$ on $U$. Note that since a map which is differentiable at a point is continuous at that point, an $n$ times differentiable map is of class $C^{n-1}$.
4.1.4 The space $C^n(U; F)$
The linearity of differentiation shows that the $C^n$ maps $f: U \to F$ form a v.s.; we denote this by $C^n(U; F)$. In particular, $C^0(U; F)$ is the space of continuous maps from $U$ to $F$. The intersection $\bigcap_{n > 0} C^n(U; F)$ will be written as $C^\infty(U; F)$; its elements are maps $f: U \to F$ which are of class $C^n$ for every integer $n > 0$. We say that such an element is of class $C^\infty$ or, what amounts to the same, is infinitely differentiable, that is, $n$ times differentiable for all $n$.
4.1.5 Generalization of Schwarz' theorem
Theorem 4.2
If $f: U \subset E \to F$ is $n$ times differentiable at $a \in U$, then $D^nf(a) \in \mathcal{L}^n(E; F)$ is a symmetric $n$-linear map. In other words, for all $h_1, \dots, h_n \in E$ and every permutation $s$ of the integers $\{1, 2, \dots, n\}$, we have:
\[ D^nf(a)(h_1, \dots, h_n) = D^nf(a)(h_{s(1)}, \dots, h_{s(n)}). \]
Proof. The question arises only when $n \geq 2$, and Theorem 4.1 answers it when $n = 2$. Thus we suppose that $n \geq 3$ and prove the theorem by induction, supposing it true for $n - 1$. By hypothesis, $D^nf(a) = D(D^{n-1}f)(a)$. On the other hand, $D^{n-1}f$ has values in the subspace $\mathcal{L}_s^{n-1}(E; F)$ of $\mathcal{L}^{n-1}(E; F)$ consisting of symmetric $(n - 1)$-linear maps. Thus $D^nf(a)h_1 = D(D^{n-1}f)(a)h_1 \in \mathcal{L}_s^{n-1}(E; F)$ for $h_1 \in E$, and:
\[ D^nf(a)(h_1, \dots, h_n) = (D^nf(a)h_1)(h_2, \dots, h_n) \]
is a symmetric function of $h_2, \dots, h_n$. If the permutation $s$ leaves the number 1 fixed, the theorem is proved. If the permutation $s$ does not leave the number 1 fixed, it is the product of permutations leaving 1 fixed and a permutation which interchanges 1 and 2, leaving $3, \dots, n$ fixed. The theorem will thus be proved if we can prove that $D^nf(a)(h_1, h_2, \dots, h_n)$ does not change when $h_1$ and $h_2$ are interchanged. But $D^nf(a) = D^2(D^{n-2}f)(a)$; thus, by applying Theorem 4.1 to $D^{n-2}f$, we find:
\[ (D^nf(a)h_1)h_2 = D^2(D^{n-2}f)(a)(h_1, h_2) = D^2(D^{n-2}f)(a)(h_2, h_1) = (D^nf(a)h_2)h_1. \]
□
4.1.6 Calculation of $D^nf(a)$
Let $f: U \subset E \to F$ be $n$ times differentiable at $a \in U$. Repeated application of equation (1.1) gives the following generalization of the result in §4.1.2:
\[ D^nf(a)(h_1, \dots, h_n) = \frac{d}{dt_n} \cdots \frac{d}{dt_1} f\Bigl(a + \sum_{i=1}^{n} t_ih_i\Bigr)\Bigr|_{t_1 = \dots = t_n = 0}, \tag{4.2} \]
where $h_1, \dots, h_n \in E$. This formula reduces the calculation of $D^nf(a)$ to that of the successive derivatives of a function of $n$ real variables and with values in $F$.
The expression on the right-hand side of equation (4.2) generalizes the notion of the derivative of $f$ in the direction of a vector $h$ of $E$. Denote by $d^nf(a)$ the operator which maps $(h_1, \dots, h_n) \in E^n$ to this expression on the right-hand side (we sometimes call this the Gateaux derivative of order $n$ of $f$ at $a$). We shall now establish a generalization of Theorem 2.5.
Theorem 4.3
Assume that: (a) $d^nf(x)$ exists in an open neighbourhood $U$ of $a$; (b) $d^nf(x) \in \mathcal{L}^n(E; F)$ for all $x$ in $U$; (c) $x \in U \mapsto d^nf(x)$ is continuous. Then $f$ is of class $C^n$ at $a$, and $d^nf(a) = D^nf(a)$.
Proof. We proceed by induction on $n$. If $n = 1$, $t \mapsto f(x + th)$ is differentiable at $t = 0$ for all $x \in U$ and all $h \in E$, and its derivative $df(x)h$ depends continuously on $x$. Set $x = a + sh$, where $0 \leq s \leq 1$ and $\|h\|$ is so small that $x \in U$. Then $t \mapsto f(a + sh + th)$ is differentiable at $t = 0$ and its derivative $df(a + sh)h$, which is simply the derivative with respect to $s$ of $s \mapsto f(a + sh)$, is continuous. The fundamental theorem of integral calculus thus shows that:
\[ f(a + h) - f(a) = \int_0^1 df(a + sh)h \, ds. \]
Hence:
\[
\|f(a + h) - f(a) - df(a)h\| = \Bigl\| \int_0^1 [df(a + sh) - df(a)]h \, ds \Bigr\| \leq \|h\| \int_0^1 \|df(a + sh) - df(a)\| \, ds,
\]
which is $o(\|h\|)$ as $x \mapsto df(x)$ is continuous at $a$. Thus $Df(a)$ exists, it is equal to $df(a)$ and $Df$ is therefore continuous at $a$.
Suppose the theorem has been proved for $n - 1$. By hypothesis and by the fundamental theorem of integral calculus:
\[
\begin{aligned}
[D^{n-1}f(a + h_n) - D^{n-1}f(a)](h_1, \dots, h_{n-1}) &= [d^{n-1}f(a + h_n) - d^{n-1}f(a)](h_1, \dots, h_{n-1}) \\
&= \int_0^1 (d^nf(a + sh_n)h_n)(h_1, \dots, h_{n-1}) \, ds.
\end{aligned}
\]
It follows that:
\[
\begin{aligned}
\|[D^{n-1}f(a + h_n) - D^{n-1}f(a) - d^nf(a)h_n](h_1, \dots, h_{n-1})\| &= \Bigl\| \int_0^1 [d^nf(a + sh_n) - d^nf(a)](h_1, \dots, h_n) \, ds \Bigr\| \\
&\leq \|h_1\| \cdots \|h_n\| \int_0^1 \|d^nf(a + sh_n) - d^nf(a)\| \, ds.
\end{aligned}
\]
Since $d^nf$ is continuous at $a$, the integral tends to zero with $\|h_n\|$. Thus $D^{n-1}f(a + h_n) - D^{n-1}f(a) - d^nf(a)h_n = o(\|h_n\|)$; hence $D^nf(a)$ exists and equals $d^nf(a)$, and is therefore continuous at $a$. □
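Formula (4.2) lends itself to a quick numerical check for $n = 2$. In the sketch below, the function $f(x, y) = x^2y + \sin y$, the point $a$ and the directions $h_1$, $h_2$ are assumptions chosen for the illustration. The mixed derivative in $(t_1, t_2)$ at the origin is approximated by a central finite difference and compared with the bilinear form given by the Hessian matrix, computed by hand.

```python
import math

def f(x, y):
    # Hypothetical smooth function R^2 -> R.
    return x * x * y + math.sin(y)

def hessian(x, y):
    # Second partial derivatives of f, computed by hand.
    return ((2.0 * y, 2.0 * x), (2.0 * x, -math.sin(y)))

def second_derivative_exact(a, h1, h2):
    # D^2 f(a)(h1, h2) = h1^T H(a) h2.
    H = hessian(*a)
    return sum(h1[i] * H[i][j] * h2[j] for i in range(2) for j in range(2))

def second_derivative_fd(a, h1, h2, eps=1e-4):
    # Central-difference approximation of
    # d/dt2 d/dt1 f(a + t1 h1 + t2 h2) at t1 = t2 = 0, as in (4.2) with n = 2.
    def F(t1, t2):
        return f(a[0] + t1 * h1[0] + t2 * h2[0], a[1] + t1 * h1[1] + t2 * h2[1])
    return (F(eps, eps) - F(eps, -eps) - F(-eps, eps) + F(-eps, -eps)) / (4.0 * eps * eps)

a = (1.0, 0.5)
h1, h2 = (1.0, 2.0), (-1.0, 3.0)

exact = second_derivative_exact(a, h1, h2)
approx = second_derivative_fd(a, h1, h2)
swapped = second_derivative_fd(a, h2, h1)   # symmetry, as in Theorem 4.2
print(exact, approx, swapped)
```

The last value illustrates the symmetry of $D^2f(a)$: interchanging $h_1$ and $h_2$ leaves the result unchanged up to discretization error.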
4.2 RULES FOR CALCULATION
4.2.1 Continuous linear maps
If $f$ is a continuous linear map of a normed v.s. $E$ to a normed v.s. $F$, we know (§1.2.1) that its derivative is constant. Hence $D^nf = 0$ for $n \geq 2$ and $f$ is of class $C^\infty$.
4.2.2 Continuous bilinear maps
Let $E_1$, $E_2$ and $F$ be normed v.s. and let $b: E_1 \times E_2 \to F$ be a continuous bilinear map. We know that $Db$ is differentiable and that $D^2b$ is a constant. Hence $D^nb = 0$ for $n \geq 3$ and $b$ is of class $C^\infty$.
4.2.3 Leibniz' rule
Let $E$, $F_1$, $F_2$ and $G$ be normed v.s., let $U$ be an open subset of $E$, let $f: U \to F_1$ and $g: U \to F_2$ be maps which are of class $C^k$ and let $b: F_1 \times F_2 \to G$ be a continuous bilinear map. As in §1.2.8, define a map $p: U \to G$ by $x \mapsto b(f(x), g(x))$. Then $p$ is of class $C^k$.
Proof. Suppose that $f$ and $g$ are of class $C^1$. Then $p$ is differentiable at $a \in U$, and if $h \in E$ we have:
\[ Dp(a)h = b(Df(a)h, g(a)) + b(f(a), Dg(a)h). \tag{4.3} \]
Thus $Dp$ is continuous and $p$ is of class $C^1$.
We shall prove the theorem by induction, supposing it true for $k - 1$. Suppose that $f$ and $g$ are of class $C^k$. The map $v: \mathcal{L}(E; F_1) \times F_2 \to \mathcal{L}(E; G)$, which to each $A \in \mathcal{L}(E; F_1)$ and each $l \in F_2$ makes correspond the continuous linear map $h \mapsto b(A(h), l)$ of $E$ to $G$, is continuous and bilinear. Moreover, $Df$ and $g$ are of class $C^{k-1}$. Since the theorem holds for $k - 1$, it follows that $x \mapsto v(Df(x), g(x)) = b(Df(x)\,\cdot\,, g(x))$ is of class $C^{k-1}$. We can show in the same way that $x \mapsto b(f(x), Dg(x)\,\cdot\,)$ is of class $C^{k-1}$. Hence, by equation (4.3), $Dp$ is of class $C^{k-1}$ and $p$ is of class $C^k$. □
The proof shows that in the statement of the theorem the phrase 'of class $C^k$' may be replaced by '$k$ times differentiable'.
4.2.4 Higher-order derivatives of a product
If $f$ and $g$ are scalar-valued functions defined on an open set where they are of class $C^n$, the last rule shows that their product is of class $C^n$. Applying $n$ times the rule for differentiation of a product, we see that the $n$th derivative is of the form:
\[ (fg)^{(n)} = \sum_{q=0}^{n} A_q^n f^{(n-q)} g^{(q)} \]
where the coefficients $A_q^n$ are constants independent of $f$ and $g$. Choose $f(x) = e^{ax}$, $g(x) = e^{bx}$, where $a$ and $b$ are scalars. After simplification the preceding formula becomes $(a + b)^n = \sum_q A_q^n a^{n-q} b^q$. Since $a$ and $b$ are arbitrary, the coefficients $A_q^n$ are those of the binomial formula: $A_q^n = C_n^q = n!/((n - q)!\,q!)$. Hence:
\[ (fg)^{(n)} = \sum_{q=0}^{n} \frac{n!}{(n - q)!\,q!} f^{(n-q)} g^{(q)}. \]
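The identification of the coefficients with those of the binomial formula can be replayed numerically. The sketch below follows the choice $f(x) = e^{ax}$, $g(x) = e^{bx}$ made in the text, evaluated at $x = 0$, where $f^{(n-q)}(0) = a^{n-q}$ and $g^{(q)}(0) = b^q$, while $(fg)^{(n)}(0) = (a + b)^n$. The integer values of $a$ and $b$ and the function name are assumptions chosen so that the comparison is exact.

```python
import math

def leibniz_sum(n, a, b):
    # Right-hand side of the product rule at x = 0 for f = exp(a x), g = exp(b x):
    # sum over q of C(n, q) * f^(n-q)(0) * g^(q)(0) = sum C(n, q) a^(n-q) b^q.
    return sum(math.comb(n, q) * a ** (n - q) * b ** q for q in range(n + 1))

a, b = 2, -3
# Each entry: (n, Leibniz sum, n-th derivative of exp((a + b) x) at 0).
checks = [(n, leibniz_sum(n, a, b), (a + b) ** n) for n in range(8)]
for n, lhs, rhs in checks:
    print(n, lhs, rhs)
```

With integer $a$ and $b$ the two columns agree exactly, since no rounding is involved.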
...
... where $D_s(D_rf)(x) \in \mathcal{L}(E_s; \mathcal{L}(E_r; F))$. Combining this result with (d/dt)$D_rf$(a + th ... and equation (4.6) we obtain:
\[ D^2f(a)(h^{(1)}, h^{(2)}) = \sum_{r,s=1}^{n} D_s(D_rf)(a)(h_s^{(2)}, h_r^{(1)}). \tag{4.7} \]
We put $D_s(D_rf)(a) = D^2_{sr}f(a)$; this is an element of $\mathcal{L}(E_s; \mathcal{L}(E_r; F))$, which we know from Theorem A.9 to be canonically isomorphic to $\mathcal{L}(E_s, E_r; F)$. We call it a partial derivative of order 2 of $f$ at $a$. By Schwarz' theorem, $D^2f(a)$ is a symmetric bilinear form. Equation (4.7) thus shows that:
\[ \sum_{r,s} D^2_{sr}f(a)(h_s^{(2)}, h_r^{(1)}) = \sum_{r,s} D^2_{sr}f(a)(h_s^{(1)}, h_r^{(2)}). \]
Interchanging the summation indices $r$ and $s$ on the right-hand side, we obtain:
\[ \sum_{r,s} D^2_{sr}f(a)(h_s^{(2)}, h_r^{(1)}) = \sum_{r,s} D^2_{rs}f(a)(h_r^{(1)}, h_s^{(2)}). \]
Since the $h_r^{(1)}$ and $h_s^{(2)}$ are arbitrary, it follows that $D^2_{rs}f(a)(h_r^{(1)}, h_s^{(2)}) = D^2_{sr}f(a)(h_s^{(2)}, h_r^{(1)})$.
The map $D^2_{rs}f(a) \in \mathcal{L}(E_r, E_s; F)$ is thus the composition of the map:
\[ (h_r^{(1)}, h_s^{(2)}) \in E_r \times E_s \mapsto (h_s^{(2)}, h_r^{(1)}) \in E_s \times E_r \]
and the map $D^2_{sr}f(a) \in \mathcal{L}(E_s, E_r; F)$. In particular, $D^2_{rr}f(a)$ is a symmetric bilinear map. However, one should not think that $D^2_{rs}f(a) = D^2_{sr}f(a)$: these are not even elements of the same space if $s \neq r$.
If $f$ is of class $C^k$, $k \geq 2$, equation (4.7) is easily generalized:
\[ D^kf(a)(h^{(1)}, \dots, h^{(k)}) = \sum \cdots \tag{4.8} \]
with an obvious notation.
4.2.9 Case where $E = \mathbf{R}^n$ or $\mathbf{C}^n$
Let us take $E_1 = \dots = E_n = \mathbf{R}$ (or $\mathbf{C}$) in the last paragraph. Then $\mathcal{L}(E_r; F)$ may be canonically identified with $F$, by Theorem A.8; thus $\mathcal{L}(E_s; \mathcal{L}(E_r; F))$ may be identified with $\mathcal{L}(E_s; F)$, that is, with $F$ again. It follows from the last paragraph that $D^2_{sr}f(a)$ and $D^2_{rs}f(a)$ may be identified with the same element of $F$. In this particular case we may thus write $D^2_{sr}f(a) = D^2_{rs}f(a)$, often written
\[ \frac{\partial^2 f}{\partial x_r \, \partial x_s}(a). \]
If, in addition, $F = \mathbf{R}^m$ (or $\mathbf{C}^m$), and if $f$ is given by its components in the canonical basis [$f(x) = (f_1(x), \dots, f_m(x))$], the components of the bilinear form $D^2f(a)$, defined on $\mathbf{R}^n \times \mathbf{R}^n$ and with values in $\mathbf{R}^m$, are $(\partial^2 f_i/\partial x_r \, \partial x_s)(a)$, $i = 1, \dots, m$, $1 \leq r, s \leq n$, in canonical bases. More generally, the components of $D^kf(a)$ in the canonical bases are $D^k_{r_1, \dots, r_k}f_i(a)$, often written $(\partial^k f_i/\partial x_{r_1} \cdots \partial x_{r_k})(a)$, $i = 1, \dots, m$, $1 \leq r_1, \dots, r_k \leq n$.
4.3 TAYLOR'S FORMULA
We intend to extend to 'order $n$' the fundamental theorem of integral calculus:
\[ f(a + h) - f(a) = \int_0^1 Df(a + th)h \, dt \]
which uses the first derivative of $f$.
Lemma 4.1
If $u$ is an $(n + 1)$-times differentiable function of a real variable $t$ with values in a Banach space $F$, then:
\[ D[u(t) + (1 - t)Du(t) + \dots + (1/n!)(1 - t)^nD^nu(t)] = (1/n!)(1 - t)^nD^{n+1}u(t). \]
Proof. We apply Leibniz's rule (§1.2.8) with $E = \mathbf{R}$, $F_1 = \mathbf{R}$, $F_2 = F$, $f(t) = (1 - t)^k$, $g(t) = D^ku(t)$, $0 \leq k \leq n$ and, as a continuous bilinear map $b: F_1 \times F_2 = \mathbf{R} \times F \to F$, the product of $r \in \mathbf{R}$ by $y \in F$. The proof follows by 'telescoping'.
Lemma 4.2
If $u$ is a $C^{n+1}$ function of a real variable $t$, defined on an open set containing $[0, 1]$ and with values in a Banach space $F$, then:
\[ u(1) - u(0) - u'(0) - \tfrac{1}{2}u''(0) - \dots - (1/n!)u^{(n)}(0) = \int_0^1 ((1 - t)^n/n!) \, u^{(n+1)}(t) \, dt. \]
...
... $> 0$, there is therefore a $\delta > 0$ such that $\|Dg(h)\| \leq \varepsilon\|h\|^{n-1}$ if $\|h\| < \delta$. By the mean-value theorem we obtain $\|g(h)\| = \|g(h) - g(0)\| \leq \varepsilon\|h\|^n$, and so $\|g(h)\| = o(\|h\|^n)$. □
The same result could have been deduced from the last theorem, but at the cost of too strong a hypothesis: '$f$ has a derivative of order $n + 1$ which is bounded on a neighbourhood of $a$'. The estimate of Theorem 4.9 gives an $n$th-order generalization of the inequality which defines the first-order derivative:
\[ \|f(a + h) - f(a) - Df(a)h\| = o(\|h\|). \]
It says that
\[ f(a) + Df(a)h + \dots + (1/n!)D^nf(a)(h)^n \]
approximates $f(a + h)$ in a neighbourhood of $a$ to order $n$ in $\|h\|$. We may ask if this property characterizes the expansion. The next section gives a first answer.
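The meaning of 'approximates to order $n$' can be observed numerically in the simplest setting $E = F = \mathbf{R}$. The sketch below takes $f = \exp$, an assumption made for the illustration because all of its derivatives at $a$ equal $e^a$, and computes the ratio of the remainder to $|h|^n$ for decreasing $h$; the function name is hypothetical.

```python
import math

def taylor_poly(a, h, n):
    # f(a) + Df(a)h + ... + (1/n!) D^n f(a)(h)^n for f = exp,
    # whose derivatives at a are all equal to exp(a).
    return sum(math.exp(a) * h ** k / math.factorial(k) for k in range(n + 1))

a, n = 0.3, 3
ratios = []
for h in (1e-1, 1e-2, 1e-3):
    remainder = abs(math.exp(a + h) - taylor_poly(a, h, n))
    ratios.append(remainder / abs(h) ** n)   # should tend to 0 with h

print(ratios)
```

The ratios decrease roughly in proportion to $h$, in agreement with a remainder of size $o(|h|^n)$ (here in fact of order $|h|^{n+1}$).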
4.3.3 Uniqueness of Taylor's formula
Theorem 4.10
Let $E$ and $F$ be Banach spaces, and let $f$ be a $C^n$ map of an open subset $U$ of $E$ to $F$. Suppose there are symmetric maps $A_k \in \mathcal{L}^k(E; F)$, $k = 0, \dots, n$, such that for some $a \in U$:
\[ f(a + h) - \sum_{k=0}^{n} (1/k!)A_k(h)^k = o(\|h\|^n). \]
Then $D^kf(a) = A_k$ for $k = 0, \dots, n$.
Proof. Put $B_k = (1/k!)(A_k - D^kf(a))$. Taylor's formula (§4.3.2) and the assumed relation give, by subtraction term by term:
\[ \sum_{k=0}^{n} B_k(h)^k = o(\|h\|^n). \tag{4.9} \]
Letting $h \to 0$ in equation (4.9) we obtain $B_0 = 0$. Suppose that $B_0 = \dots = B_k = 0$ for $k < n$. Then equation (4.9) gives:
\[ -B_{k+1}(h)^{k+1} = \sum_{i=k+2}^{n} B_i(h)^i + o(\|h\|^n). \]
Thus:
\[ \|B_{k+1}(h)^{k+1}\| \leq \sum_{i=k+2}^{n} \|B_i\| \, \|h\|^i + o(\|h\|^n). \]
Dividing both sides by $\|h\|^{k+1}$ and letting $h \to 0$ along each direction, we obtain $B_{k+1}(h)^{k+1} = 0$ for all $h$, and so $B_{k+1} = 0$ since $B_{k+1}$ is symmetric. All the $B_i$ are thus zero. □
This result often enables us to identify easily the successive derivatives of $f$ when we know that $f$ is of class $C^n$. Not only is the Taylor series to order $n$ unique, but its property of approximating $f(a + h)$ to $o(\|h\|^n)$ almost characterizes it. This is shown by the following theorem, due to R. Abraham and J. Robbin (Transversal Mappings and Flows, Benjamin, 1967).
4.3.4 Converse of Taylor's formula
Theorem 4.11
Let $f$ be a map from an open subset $U$ of a normed v.s. $E$ to a normed v.s. $F$. We make the following hypotheses:
(a) There are continuous maps $a_j: U \to \mathcal{L}^j(E; F)$, $j = 0, 1, \dots, n$, such that $a_j(x)$ is symmetric for all $x$ in $U$.
(b) Let $x \in U$ and $h \in E$ be such that $x + h$ lies in an open ball with centre $x$ and contained in $U$. Put:
\[ R_n(x, h) = f(x + h) - \sum_{k=0}^{n} (1/k!)a_k(x)(h)^k \tag{4.10} \]
where $(h)^k = (h, \dots, h)$ ($k$ times), and assume that for all $x_0 \in U$, $\|R_n(x, h)\|/\|h\|^n \to 0$ if $(x, h) \to (x_0, 0)$.
Then $f$ is of class $C^n$ and $D^kf = a_k$ for $k = 0, 1, \dots, n$.
Proof (E. Nelson). The case $n = 1$. By hypothesis, $f(x+h) = a_0(x) + a_1(x)h + R_1(x,h)$. Since $\|R_1(x,h)\| = o(\|h\|)$, we have $a_0(x) = f(x)$; thus $f(x+h) = f(x) + a_1(x)h + o(\|h\|)$. Hence $a_1 = Df$ and, since $a_1$ is continuous, $f$ is of class $C^1$.
We shall next prove the theorem for an arbitrary $n$ by assuming it true for $n-1$. Since $(1/n!)a_n(x)(h)^n + R_n(x,h) = R_{n-1}(x,h)$ is such that:
\[ \|R_{n-1}(x,h)\|/\|h\|^{n-1} \le \left[(1/n!)\|a_n(x)\| + \|R_n(x,h)\|/\|h\|^n\right]\|h\| \to 0 \]
as $h \to 0$, the inductive hypothesis implies that $a_0 = f, \dots, a_{n-1} = D^{n-1}f$. Now fix $x \in U$ and choose $t > 0$ so small that the ball with centre $x$ and radius $2t$ lies in $U$. Take $y, z \in E$ so that $\|y\|, \|z\| < t$ and write $f(x+y+z)$ in two different ways:
\[ f(x+y+z) = f(x+y) + Df(x+y)z + \dots + (1/(n-1)!)\,D^{n-1}f(x+y)(z)^{n-1} + (1/n!)\,a_n(x+y)(z)^n + R_n(x+y, z) \]
and
\[ f(x+y+z) = f(x) + Df(x)(y+z) + \dots + (1/(n-1)!)\,D^{n-1}f(x)(y+z)^{n-1} + (1/n!)\,a_n(x)(y+z)^n + R_n(x, y+z). \]
By subtraction it follows that
\[ g_0(y) + g_1(y)z + \dots + g_{n-1}(y)(z)^{n-1} + \left[(1/n!)\,a_n(x+y)(z)^n - (1/n!)\,a_n(x)(z)^n + R_n(x+y,z) - R_n(x,y+z)\right] = 0 \tag{4.11} \]
where, using the symmetry of the $n$-linear form $a_n(x)$:
\[ g_{n-1}(y)(z)^{n-1} = (1/(n-1)!)\left[D^{n-1}f(x+y) - D^{n-1}f(x) - a_n(x)y\right](z)^{n-1}. \tag{4.12} \]
Now suppose that $\|z\| < \|y\|$. The quantity in square brackets in equation (4.11) satisfies
\[ \|[\ ]\|/\|y\|^n \le (1/n!)\,\|a_n(x+y) - a_n(x)\| + \|R_n(x+y,z)\|/\|z\|^n + 2^n\,\|R_n(x,y+z)\|/\|y+z\|^n. \]
Since $a_n$ is continuous, the first term on the right-hand side tends to zero as $y \to 0$. Since $(x+y, z) \to (x, 0)$ if $y \to 0$, hypothesis (b) shows that the second term tends to zero as $y \to 0$. This is also so for the third term. Thus from equation (4.11) we see that, if $y \to 0$:
\[ \|g_0(y) + g_1(y)z + \dots + g_{n-1}(y)(z)^{n-1}\|/\|y\|^n \to 0. \]
Taking $z = 0$ we obtain $g_0(y) = o(\|y\|^n)$, and thus:
\[ g_1(y)z + \dots + g_{n-1}(y)(z)^{n-1} = o(\|y\|^n). \]
Replace $z$ by $z/2$ in this, then multiply both sides by 2 and subtract the relation obtained from the original relation; finally multiply both sides by 2. We obtain
\[ g_2(y)(z)^2 + \dots + \left(2 - (1/2^{n-3})\right) g_{n-1}(y)(z)^{n-1} = o(\|y\|^n). \]
Repeat this procedure so as to eliminate $g_2(y)(z)^2$, etc. We finally obtain:
\[ g_{n-1}(y)(z)^{n-1} = o(\|y\|^n). \]
By equation (4.12) this implies that $\|D^{n-1}f(x+y) - D^{n-1}f(x) - a_n(x)y\|/\|y\| \to 0$ as $y \to 0$. By the definition of a derivative, this shows that $D^n f(x) = a_n(x)$ and, since $a_n$ is continuous, $f$ is of class $C^n$. □
4.3.5
An application
The above result is a convenient criterion for deciding whether a map is of class $C^n$. Consider, for example, the inversion $J: u \mapsto u^{-1}$ introduced in §4.2.6. If $x \in GL(E; E)$, $h \in \mathcal{L}(E; E)$ and $\|x^{-1} \circ h\| < 1$, we may write
\[ J(x+h) = \sum_{k=0}^{n} (-x^{-1} \circ h)^k \circ x^{-1} + R_n(x,h), \quad \text{where } R_n(x,h) = \sum_{k=n+1}^{\infty} (-x^{-1} \circ h)^k \circ x^{-1}. \]
Set:
\[ A_k(x)(h_1, \dots, h_k) = (-1)^k \sum_{s} x^{-1} \circ h_{s(1)} \circ x^{-1} \circ \dots \circ x^{-1} \circ h_{s(k)} \circ x^{-1} \]
where $h_1, \dots, h_k \in \mathcal{L}(E; E)$ and the summation is over all permutations $s$ of $(1, 2, \dots, k)$, so that $(1/k!)\,A_k(x)(h)^k = (-x^{-1} \circ h)^k \circ x^{-1}$. Clearly $A_k(x) \in \mathcal{L}^k(E; F)$ is symmetric and $x \mapsto A_k(x)$ is continuous, since $x \mapsto x^{-1}$ is continuous. Also $R_n(x, h)$ is of the form $\rho_n(x, h)(h)^n$, where $\rho_n(x, h) \in \mathcal{L}^n(E; F)$ depends continuously on $(x, h)$ and satisfies $\rho_n(x, 0) = 0$. Since
\[ J(x+h) = \sum_{k=0}^{n} (1/k!)\,A_k(x)(h)^k + R_n(x, h), \]
Theorem 4.11 shows that $J$ is of class $C^n$ and that $D^k J = A_k$, $k = 0, \dots, n$, for all $n$.
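The expansion of the inversion can be checked numerically. The following is a minimal sketch, assuming $E = \mathbf{R}^2$ so that the maps are 2×2 real matrices; the helper names (`neumann_partial_sum`, `remainder_ratio`) and the max-entry norm are illustrative choices, not taken from the text. It computes the partial sum $\sum_{k=0}^{n} (-x^{-1} h)^k x^{-1}$ and the ratio $\|R_n(x,h)\|/\|h\|^n$, which should shrink as $h \to 0$.

```python
# Sketch: partial sums of J(x + h) = sum_k (-x^{-1} h)^k x^{-1} for 2x2 matrices.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def norm(a):
    # Max-abs-entry norm; all norms are equivalent in finite dimension.
    return max(abs(a[i][j]) for i in range(2) for j in range(2))

def neumann_partial_sum(x, h, n):
    """Return sum_{k=0}^{n} (-x^{-1} h)^k x^{-1}."""
    x_inv = inverse(x)
    step = scale(-1.0, matmul(x_inv, h))      # -x^{-1} h
    power = [[1.0, 0.0], [0.0, 1.0]]          # (-x^{-1} h)^0
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n + 1):
        total = add(total, matmul(power, x_inv))
        power = matmul(power, step)
    return total

def remainder_ratio(x, h, n):
    """Return ||R_n(x, h)|| / ||h||^n, with R_n = (x + h)^{-1} - partial sum."""
    exact = inverse(add(x, h))
    rem = add(exact, scale(-1.0, neumann_partial_sum(x, h, n)))
    return norm(rem) / norm(h) ** n
```

Calling `remainder_ratio(x, scale(eps, h0), 3)` for decreasing `eps` gives ratios that decrease roughly in proportion to `eps`, in line with hypothesis (b) of Theorem 4.11.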
4.4
TAYLOR SERIES AND ANALYTICITY
Definitions 4.3
Suppose $f$ is of class $C^\infty$. We may thus form the series $\sum_{k=0}^{\infty} (1/k!)\,D^k f(a)(h)^k$.
We call this the Taylor series of $f$ at $a$ (if $a = 0$ we sometimes call it the Maclaurin series of $f$). If, given a number $r > 0$, the Taylor series converges for all $h \in E$ with $\|h\| < r$, and if its sum is equal to $f(a+h)$, we say that $f$ is analytic at $a$. Since
\[ f(a+h) = \sum_{k=0}^{n} (1/k!)\,D^k f(a)(h)^k + R_n(a, h) \]
for every integer $n > 0$, $f$ is analytic at $a$ if, and only if, the remainder $R_n(a, h)$ tends to zero with $1/n$. Without dwelling on a study of analytic maps, which itself could occupy a whole book, we can nevertheless observe that the Taylor series may converge without having a sum equal to $f(a+h)$. Here are some examples which are interesting in their own right.
4.4.1
Non-analytic functions of class $C^\infty$
Consider the function $f: \mathbf{R} \to \mathbf{R}$ defined by:
\[ f(x) = \begin{cases} e \cdot e^{-1/x^2} & \text{if } x > 0 \\ 0 & \text{otherwise.} \end{cases} \]
We shall show that it is $C^\infty$. Clearly there is no problem for $x < 0$, as then $D^n f(x) = 0$ for all $n$. If $x > 0$, we see by induction on $n$ that $D^n f(x) = x^{-3n} Q_n(x) f(x)$, where $Q_0(x) = 1$, $Q_1(x) = 2$, $Q_2(x) = 4 - 6x^2$ and $Q_n(x)$, for $n \ge 1$, is a polynomial of degree $2n - 2$ which satisfies the recurrence relation:
\[ Q_{n+1}(x) = (2 - 3n x^2)\,Q_n(x) + x^3 Q_n'(x). \]
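The recurrence above is easy to run by machine. The following is a minimal sketch, assuming polynomials are stored as coefficient lists in increasing degree; the function names are illustrative. It generates $Q_0, Q_1, \dots$ from the recurrence and lets one confirm the stated values of $Q_1$, $Q_2$ and the degree $2n - 2$.

```python
# Sketch: generate Q_n from Q_{n+1}(x) = (2 - 3 n x^2) Q_n(x) + x^3 Q_n'(x).
# A polynomial c0 + c1 x + c2 x^2 + ... is stored as the list [c0, c1, c2, ...].

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_deriv(p):
    # Derivative of a constant is returned as [0].
    return [k * p[k] for k in range(1, len(p))] or [0]

def trim(p):
    # Drop trailing zero coefficients so len(p) - 1 is the degree.
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def next_q(q, n):
    first = poly_mul([2, 0, -3 * n], q)              # (2 - 3 n x^2) Q_n
    second = poly_mul([0, 0, 0, 1], poly_deriv(q))   # x^3 Q_n'
    return trim(poly_add(first, second))

def q_polys(n_max):
    qs = [[1]]                                       # Q_0 = 1
    for n in range(n_max):
        qs.append(next_q(qs[-1], n))
    return qs
```

For instance, `q_polys(3)` ends with the coefficient list of $Q_3(x) = 8 - 36x^2 + 24x^4$, which agrees with differentiating $D^2 f(x) = x^{-6}(4 - 6x^2) f(x)$ by hand.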
The function $f$ is thus $C^\infty$ for $x > 0$. We shall show, by induction on $n$, that $D^n f(0) = 0$. This is clear if $n = 0$. Suppose that this result has been established for $n$. Then:
\[ \lim_{x \to 0} D^n f(x)/x = \lim_{x \to 0} x^{-3n-1} Q_n(x)\, e \cdot e^{-1/x^2} = 0 \]
and thus $D^{n+1} f(0) = 0$. In particular, the $n$th derivative of $f$ is everywhere continuous, even at $x = 0$. The function $f$ is thus $C^\infty$. The Maclaurin series of $f$ plainly converges to zero since $D^n f(0) = 0$. The sum is therefore not $f(h)$; $f$ is not analytic at zero.
Numerous analogous functions may be obtained from $f$. Since $f$ is $C^\infty$ and increasing for $x > 0$, the function $g(x) = f[1 - f(x)]$ is of class $C^\infty$, decreasing on $\mathbf{R}$ and such that $g(x) = 1$ for $x \le 0$, $g(x) = 0$ for $x \ge 1$. Let a
(A) is invertible. The relation $[\exp(L)]^{-1} = \exp(-L)$ shows this directly for a step function:
\[ [P_a^b(A)]^{-1} = [\exp(\Delta t_n \cdot A_n) \dots \exp(\Delta t_1 \cdot A_1)]^{-1} = \exp(-\Delta t_1 \cdot A_1) \dots \exp(-\Delta t_n \cdot A_n) \]
(Note the order of the factors.) This property extends by continuity to any regulated function.
(d) Chasles' relation. Plainly $P_a^a(A) = 1$ ($= \mathrm{id}_E$). If $a > b$, the last property enables us to define $P_a^b(A)$ by $[P_b^a(A)]^{-1}$. With this convention, if $A: I \to \mathrm{End}(E)$ is a regulated function defined on an interval $I$, we have the analogue of Chasles' relation:
\[ P_a^c(A) = P_b^c(A)\,P_a^b(A) \quad \text{for } a, b, c \in I. \]
(Note the order of the factors.) This may easily be verified for a step function, and the general case follows by continuity.
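The role of the order of the factors can be seen on a small numerical case. The following is a minimal sketch, assuming $E = \mathbf{R}^2$ and a step function given as a list of pairs (step length, constant 2×2 matrix) in increasing time order; the names `step_product` and `step_product_inverse` and the truncated power series used for the exponential are illustrative assumptions.

```python
# Sketch: P = exp(dt_n A_n) ... exp(dt_1 A_1) for a step function, and its inverse
# exp(-dt_1 A_1) ... exp(-dt_n A_n), using 2x2 matrices as nested lists.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def identity():
    return [[1.0, 0.0], [0.0, 1.0]]

def expm(a, terms=30):
    # Truncated series exp(a) = sum_k a^k / k!; adequate for small matrices.
    result, term = identity(), identity()
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def step_product(steps):
    # Later time steps are multiplied on the left.
    p = identity()
    for dt, a in steps:
        p = matmul(expm(scale(dt, a)), p)
    return p

def step_product_inverse(steps):
    # Factors appear in the reverse order, with opposite signs.
    p = identity()
    for dt, a in steps:
        p = matmul(p, expm(scale(-dt, a)))
    return p
```

With two non-commuting matrices $A_1$, $A_2$, `matmul(step_product(steps), step_product_inverse(steps))` is the identity up to rounding, whereas reversing the order of the factors in the inverse does not give the identity.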
(e) The reader who is curious about the notation $\prod_a^b [1 + A(t)\,dt]$ may prove, in the same way as in Definition 5.2, that:
\[ P_a^b(A) = \lim \prod_{i=0}^{n} [1 + \Delta t_i \cdot A(t_i)] \]
when the greatest of the steps $\Delta t_i = t_{i+1} - t_i$ of the partition $a = t_0 < \dots < t_n = b$ tends to zero.
Lastly, here is the analogue of Theorem 2.9 which says that a continuous function is the derivative of any of its primitives.
Theorem 6.1
If a O, uniform continuity ensures the existence of a partition $t_0 < t_1 < \dots$
$\|\varphi(t, x) - \varphi(t, y)\| \le e^{K|t - t_0|}\,\|x - y\|$ for all $x, y \in B_{r/2}(x_0)$ and $|t - t_0| \le a/2$. In particular, $x \mapsto \varphi(t, x)$ is lipschitzian on $B_{r/2}(x_0)$ uniformly in $t$:
\[ \|\varphi(t, x) - \varphi(t, y)\| \le e^{Ka/2}\,\|x - y\|. \]
Proof. The first part is simply a reformulation of Theorem 7.1. If $x \in B_{r/2}(x_0)$ we replace the point $x_0$ by the point $x$ and $B_r(x_0)$ by $B_{r/2}(x)$ (which still belongs to the open set $U$). We also replace the number $a$ by $a/2$ so that $\varphi(t, x)$ lies in $B_{r/2}(x)$. If $x, y \in B_{r/2}(x_0)$, $\varphi(t, x)$ and $\varphi(t, y)$ satisfy equation (7.2), with suitable change of notation. Put $u(t) = \|\varphi(t, x) - \varphi(t, y)\|$. Since $X$ is $K$-lipschitzian, we have:
\[ u(t) = \left\| x - y + \int_{t_0}^{t} [X(\varphi(s, x)) - X(\varphi(s, y))]\,ds \right\| \le \|x - y\| + K \left| \int_{t_0}^{t} \|\varphi(s, x) - \varphi(s, y)\|\,ds \right| = \|x - y\| + K \left| \int_{t_0}^{t} u(s)\,ds \right| \]
and the theorem follows from Gronwall's lemma if $t \ge t_0$.
If $t_0 > t$, we may reduce the matter to the preceding case by changing $t$ to $-t$ and $X$ to $-X$.
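The estimate $\|\varphi(t, x) - \varphi(t, y)\| \le e^{K|t - t_0|}\|x - y\|$ can be observed on a scalar example. The following is a minimal sketch under stated assumptions: the vector field is $X(x) = \sin x$ on $\mathbf{R}$ (which is 1-lipschitzian), $t_0 = 0$, and the flow is approximated by the explicit Euler scheme; none of these choices comes from the text.

```python
import math

def flow(x0, t_end, steps=2000):
    """Approximate phi(t_end, x0) for x' = sin(x), x(0) = x0, by explicit Euler."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += dt * math.sin(x)
    return x

def lipschitz_gap(x0, y0, t_end, K=1.0):
    """Return (|phi(t, x0) - phi(t, y0)|, exp(K |t|) |x0 - y0|)."""
    actual = abs(flow(x0, t_end) - flow(y0, t_end))
    bound = math.exp(K * abs(t_end)) * abs(x0 - y0)
    return actual, bound
```

For example, `lipschitz_gap(0.5, 0.6, 1.0)` returns a pair whose first entry does not exceed the second; the same holds for negative `t_end`, which corresponds to the case $t_0 > t$ treated at the end of the proof.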
Theorem 7.5 (class $C^1$)
Suppose the hypotheses of Theorem 7.1 hold, save that we assume that $X$ is of class $C^1$. We then know (see §2.1) that $X$ is locally lipschitzian. Taking a smaller open set $U$, we may also assume that $X$ is $K$-lipschitzian for some $K$. Then the conclusions of Theorem 7.4 hold. But we shall show, in addition, that $\varphi$ is of class $C^1$.
Proof. If we knew that $\varphi$ was sufficiently differentiable, Schwarz' theorem (§4.1) and the equation $d\varphi/dt = X \circ \varphi$ would imply that:
\[ \frac{d}{dt} D_2\varphi(t, x) = DX[\varphi(t, x)] \circ D_2\varphi(t, x) \]
with $D_2\varphi(t_0, x) = 1$, since $\varphi(t_0, x) = x$. This leads us to study the solution $u$ of:
\[ \frac{d}{dt} u(t) = DX[\varphi(t, x)]\,u(t) \tag{7.4} \]
such that $u(t_0) = 1$ ($= \mathrm{id}_E$). This is a linear differential equation in $u$ whose coefficient $DX[\varphi(\ )]$ is continuous since $X$ is of class $C^1$. The desired solution thus exists, it is unique and is defined for all $t$ (Theorem 6.2). We denote it by $\psi(t, x)$.
(a) To show that $\psi$ is continuous. By Theorem 6.2, the map $t \mapsto \psi(t, x)$ is continuous and even of class $C^1$ for all $x$. If we can show that $x \mapsto \psi(t, x)$ is uniformly continuous in $t$, the result will follow. Since $DX$ and $\varphi$ are continuous, we may choose the radius $r$ of the ball $B_r(x_0)$ and the number $a > 0$ so small that $\|DX[\varphi(t, x)]\|$ is bounded above by a number $m$ for all $t \in [t_0 - a/2, t_0 + a/2]$ and all $x \in B_{r/2}(x_0)$. With this choice of $r$, since $\varphi$ is continuous, the set $\{\varphi(t, x): |t - t_0| \le a/2,\ x \in B_{r/2}(x_0)\}$ is compact. The continuous function $\varphi(t, x) \mapsto DX[\varphi(t, x)]$ is thus uniformly continuous on this compact set: given any $\varepsilon > 0$ there exists $\delta > 0$ such that:
\[ \sup_{|t - t_0| \le a/2} \|\varphi(t, x) - \varphi(t, y)\| \le \delta \tag{7.5} \]
implies that:
\[ \sup_{|t - t_0| \le a/2} \|DX[\varphi(t, x)] - DX[\varphi(t, y)]\| \le \varepsilon. \tag{7.6} \]
By Theorem 7.4, $\varphi$ is lipschitzian in $x$ uniformly in $t$; thus there exists $\delta' > 0$ such that $\|x - y\| \le \delta'$ implies equation (7.5) and hence (7.6).
Now consider the functions $\psi(t, x)$ and $\psi(t, y)$ defined by
dt/; dt (t, x) = DX[ (x), (x))
□
Remark
The reader will find in R. Abraham and J. Marsden (pp. 175-176) an even shorter proof which, however, requires ideas not covered in this book.
Corollary 8.1
The non-degenerate critical points of $f$ are isolated.
Corollary 8.2
Let $J: \mathbf{R}^n \to \mathbf{R}$ be a function of class $C^{k+2}$, $k \ge 1$. Suppose that $J(0) = 0$ and that 0 is a non-degenerate critical point of $J$. Then on some neighbourhood of 0 there are local coordinates $(x_i)$ in which $J$ may be expressed as
\[ x_1^2 + \dots + x_r^2 - x_{r+1}^2 - \dots - x_n^2. \]
In particular, the local $C^k$-conjugacy class at 0 of $J$ is characterized by the number of squares with a + sign which occur in the decomposition of the quadratic form $D^2 J(0)(h, h)$ as a sum of squares. The reader will find other applications of the Morse-Palais lemma in Chapter 10.
8.4
LINEARIZATION OF VECTOR FIELDS
8.4.1
Conjugacy in the group of diffeomorphisms
We return to the $C^k$-conjugacy defined in §8.1.1 and to the notation used already: $J: E \to E'$ is $C^k$-conjugate to $J_1: E_1 \to E_1'$ if there are $C^k$ diffeomorphisms
O for all $h \in E$. Then $f(m) < f(x)$ for all $x \in E$ distinct from $m$. Here is an idea of the proof. The integral curves of the vector field $\operatorname{grad} f$ all emanate from the point $m$. They cover $E$, and the restriction of $f$ to any one of them is an increasing function, starting from a. In what follows we are going to specialize the space $E$ and the function $f$.
10.3
SPACES OF CURVES, AND THE EULER-LAGRANGE EQUATIONS
10.3.1
The space of $C^1$ curves
Let $V$ be a normed vector space on $\mathbf{R}$, and let $I = [a, b]$, $a < b$. A $C^1$ map $\gamma: I \to V$ is called a $C^1$ curve, parametrized by $t \in I$, in the space $V$. The set $C^1(I; V)$ of all such curves is clearly a v.s. on $\mathbf{R}$: if $k \in \mathbf{R}$ and if $\gamma_1, \gamma_2 \in C^1(I; V)$, we define $C^1$ curves $k\gamma_1$ and $\gamma_1 + \gamma_2$ by setting $(k\gamma_1)(t) = k\gamma_1(t)$, $(\gamma_1 + \gamma_2)(t) = \gamma_1(t) + \gamma_2(t)$.
The real vector space $C^1(I; V)$ has a natural norm. For since $I$ is compact and $\gamma$ and $\gamma'$ are continuous, $\|\gamma(t)\|$ and $\|\gamma'(t)\|$ are bounded on $I$. We may thus set $\|\gamma\|_{C^1} = \sup_{t \in I} \|\gamma(t)\| + \sup_{t \in I} \|\gamma'(t)\|$. It is easy to check that $\|\gamma\|_{C^1}$ is a norm; we call it the $C^1$ norm.
Suppose that $V$ is a Banach space. An obvious adaptation of the proof given in §A.1.3 (Example (c)) shows that $C^1(I; V)$ is complete with respect to the $C^1$ norm; it is therefore a Banach space.
10.3.2
Functions of curves
There are many ways in which we may make a real number correspond to a $C^1$ curve $\gamma$. Suppose that $V$ is a euclidean space; we can, for example, attach to $\gamma$ its euclidean length $\int_a^b \|\gamma'(t)\|\,dt$. We can also interpret $\gamma(t)$ as the position in $V$ at time $t$ of a particle of mass $+1$ and attach to it the action $\tfrac12 \int_a^b \|\gamma'(t)\|^2\,dt$. These examples are included in the following scheme, to which we shall confine ourselves.
Let $L: \mathbf{R} \oplus V \oplus V \to \mathbf{R}$ be a $C^1$ function (called the lagrangian). Given $\gamma \in C^1(I; V)$, $t \mapsto L(t, \gamma(t), \gamma'(t))$ is a continuous function. It is thus integrable, and we may define a function $\hat{L}: C^1(I; V) \to \mathbf{R}$ by setting:
\[ \hat{L}(\gamma) = \int_a^b L(t, \gamma(t), \gamma'(t))\,dt. \tag{10.2} \]
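The length and the action are both instances of the functional (10.2), and can be approximated by quadrature. The following is a minimal sketch, assuming $V = \mathbf{R}^2$, a curve supplied together with its derivative, and a composite midpoint rule; the names `functional`, `length_lagrangian` and `action_lagrangian` are illustrative.

```python
import math

def integrate(fn, a, b, n=2000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / n
    return h * sum(fn(a + (i + 0.5) * h) for i in range(n))

def functional(lagrangian, gamma, dgamma, a, b):
    """Approximate the integral of L(t, gamma(t), gamma'(t)) over [a, b]."""
    return integrate(lambda t: lagrangian(t, gamma(t), dgamma(t)), a, b)

def length_lagrangian(t, q, v):
    return math.hypot(v[0], v[1])            # ||v||

def action_lagrangian(t, q, v):
    return 0.5 * (v[0] ** 2 + v[1] ** 2)     # (1/2) ||v||^2

# Quarter of the unit circle, parametrized on [0, pi/2].
def quarter_circle(t):
    return (math.cos(t), math.sin(t))

def quarter_circle_velocity(t):
    return (-math.sin(t), math.cos(t))
```

For the quarter circle, `functional(length_lagrangian, quarter_circle, quarter_circle_velocity, 0.0, math.pi / 2)` is close to $\pi/2$, and the same call with `action_lagrangian` is close to $\pi/4$, since $\|\gamma'(t)\| = 1$ throughout.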
We shall prove that $\hat{L}$ is of class $C^1$ and shall calculate its derivative. We begin with a lemma which is interesting in its own right.
10.3.3
Differentiation under the integral
Lemma 10.2
Let $U$ be an open subset of a normed v.s. $E$, let $I = [a, b]$, $a < b$, and let $f$ be a continuous map of $I \times U$ to a normed v.s. $F$. Suppose that $D_2 f$ exists and is continuous; put $g(u) = \int_a^b f(t, u)\,dt$. Then $g$ is of class $C^1$ on $U$ and $Dg(u) = \int_a^b D_2 f(t, u)\,dt$.
Proof. Let $u \in U$. Choosing, if necessary, a small enough open neighbourhood of $u$, again denoted by $U$, we may assume that $D_2 f$ is bounded on $I \times U$. Let $l: E \to F$ be the continuous linear map $h \in E \mapsto \int_a^b D_2 f(t, u)h\,dt$. By the fundamental theorem of integral calculus:
\[ g(u + h) - g(u) - l(h) = \int_a^b [f(t, u + h) - f(t, u) - D_2 f(t, u)h]\,dt \]
\[ = \int_a^b \left[ \int_0^1 D_2 f(t, u + sh)h\,ds - D_2 f(t, u)h \right] dt = \int_a^b \left[ \int_0^1 [D_2 f(t, u + sh) - D_2 f(t, u)]h\,ds \right] dt. \]
It follows that:
\[ \|g(u + h) - g(u) - l(h)\| \le (b - a)\,\|h\| \sup \|D_2 f(t, u + sh) - D_2 f(t, u)\| \]
where the supremum is taken over all $s, t$ with $0 \le s \le 1$ and $a \le t \le b$. However $D_2 f$ is uniformly continuous on the compact set $I \times \{u\}$. Hence, given any $\varepsilon > 0$, there exists $\delta > 0$ such that $\|h\| < \delta$ implies that $\sup \|\ \| < \varepsilon$. This shows that $l = Dg(u)$. As $u \mapsto \int_a^b D_2 f(t, u)\,dt$ is continuous, $g$ is of class $C^1$. □
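Lemma 10.2 can be tested on a scalar case. The following is a minimal sketch, assuming $E = F = \mathbf{R}$, $I = [0, 1]$ and the illustrative integrand $f(t, u) = \sin(tu)$, for which $D_2 f(t, u) = t\cos(tu)$; it compares $\int_a^b D_2 f(t, u)\,dt$ with a central finite difference of $g$.

```python
import math

def integrate(fn, a, b, n=2000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / n
    return h * sum(fn(a + (i + 0.5) * h) for i in range(n))

def g(u, a=0.0, b=1.0):
    """g(u) = integral of f(t, u) = sin(t u) over [a, b]."""
    return integrate(lambda t: math.sin(t * u), a, b)

def dg_formula(u, a=0.0, b=1.0):
    """Integral of D_2 f(t, u) = t cos(t u) over [a, b], as in Lemma 10.2."""
    return integrate(lambda t: t * math.cos(t * u), a, b)

def dg_finite_difference(u, eps=1e-5):
    """Central difference approximation of g'(u)."""
    return (g(u + eps) - g(u - eps)) / (2 * eps)
```

At `u = 1.3` the two values agree to many digits, and both match the closed form $g'(u) = (u \sin u - 1 + \cos u)/u^2$ obtained from $g(u) = (1 - \cos u)/u$.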
Theorem 10.7
If the lagrangian $L$ is of class $C^1$, then the map $\hat{L}: C^1(I; V) \to \mathbf{R}$ defined by equation (10.2) is of class $C^1$, and for all $h \in C^1(I; V)$ we have $D\hat{L}(\gamma)h = \int_a^b [D_2 L(\ ) \cdot h(t) + D_3 L(\ ) \cdot h'(t)]\,dt$, where we have abbreviated $(t, \gamma(t), \gamma'(t))$ by $(\ )$.
Proof. We shall apply Lemma 10.2 with $U = E = C^1(I; V)$, $F = \mathbf{R}$ and $f(t, \gamma) = L(t, \gamma(t), \gamma'(t))$. The map $f$ is continuous as it is the composition of obviously continuous maps:
\[ I \times C^1(I; V) \xrightarrow{\ M\ } I \oplus V \oplus V \xrightarrow{\ L\ } \mathbf{R}, \qquad (t, \gamma) \mapsto (t, \gamma(t), \gamma'(t)) \mapsto L(t, \gamma(t), \gamma'(t)). \]
We shall prove that $D_2 f$ exists and is continuous. Since $L$ is of class $C^1$, it is enough to show that $D_2 M$ exists and is continuous. Let $\gamma, h \in C^1(I; V)$. Then $M(t, \gamma + h) - M(t, \gamma) = (0, h(t), h'(t))$. But the map $h \in C^1(I; V) \mapsto (0, h(t), h'(t)) \in I \oplus V \oplus V$, which is plainly linear, is continuous since $\|(0, h(t), h'(t))\| = \|h(t)\| + \|h'(t)\| \le \|h\|_{C^1}$. Thus $D_2 M$ exists and $D_2 M(t, \gamma)h = (0, h(t), h'(t))$, which does not depend on $\gamma$ and which is continuous in $t$. Applying the chain rule to $f = L \circ M$, with $h \in C^1(I; V)$, we obtain:
\[ D_2 f(t, \gamma)h = DL(M(t, \gamma)) \cdot D_2 M(t, \gamma)h = DL(t, \gamma(t), \gamma'(t)) \cdot (0, h(t), h'(t)) \]
\[ D_2 f(t, \gamma)h = D_2 L(\ )h(t) + D_3 L(\ )h'(t) \tag{10.3} \]
where $(\ ) = (t, \gamma(t), \gamma'(t))$. Since $L$ is of class $C^1$, we see a posteriori that $D_2 f(t, \gamma)$ depends continuously on $(t, \gamma)$. With our choice of $f$, the hypotheses of Lemma 10.2 are satisfied and we obtain $D\hat{L}(\gamma)h = \int_a^b D_2 f(t, \gamma)h\,dt$ which, taking account of equation (10.3), proves Theorem 10.7. □
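The formula for the derivative of the functional can be compared with a finite difference. The following is a minimal sketch under stated assumptions: $V = \mathbf{R}$, $I = [0, 1]$, and the illustrative lagrangian $L(t, q, v) = \tfrac12 v^2 - \tfrac12 q^2$, so that $D_2 L = -q$ and $D_3 L = v$; curves and variations are passed together with their derivatives.

```python
import math

def integrate(fn, a, b, n=2000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / n
    return h * sum(fn(a + (i + 0.5) * h) for i in range(n))

def lagrangian(t, q, v):
    return 0.5 * v * v - 0.5 * q * q

def d2_lagrangian(t, q, v):
    return -q          # partial derivative with respect to q

def d3_lagrangian(t, q, v):
    return v           # partial derivative with respect to v

def functional(gamma, dgamma, a, b):
    return integrate(lambda t: lagrangian(t, gamma(t), dgamma(t)), a, b)

def first_variation(gamma, dgamma, h, dh, a, b):
    """Integral of D_2 L( ) h(t) + D_3 L( ) h'(t), as in Theorem 10.7."""
    return integrate(
        lambda t: d2_lagrangian(t, gamma(t), dgamma(t)) * h(t)
        + d3_lagrangian(t, gamma(t), dgamma(t)) * dh(t),
        a, b)

def finite_difference(gamma, dgamma, h, dh, a, b, eps=1e-4):
    """Central difference of the functional in the direction h."""
    plus = functional(lambda t: gamma(t) + eps * h(t),
                      lambda t: dgamma(t) + eps * dh(t), a, b)
    minus = functional(lambda t: gamma(t) - eps * h(t),
                       lambda t: dgamma(t) - eps * dh(t), a, b)
    return (plus - minus) / (2 * eps)
```

With $\gamma(t) = t^2$ and $h(t) = \sin(\pi t)$ on $[0, 1]$, the two functions return nearly identical values; because this lagrangian is quadratic, the central difference is exact up to rounding.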
10.3.4
A restricted extremum problem
Let us retain the above notation. We intend to find the extrema of the function $\hat{L}: C^1(I; V) \to \mathbf{R}$, i.e. to find the curves $\gamma$ which make $\hat{L}(\gamma) = \int_a^b L(t, \gamma(t), \gamma'(t))\,dt$ a minimum (or maximum).
In practice we meet several problems of the following type. Find the shortest path from $A$ to $B$ in the euclidean space $V$. This amounts to finding the $C^1$ curves $\gamma$ with origin $A$ and end $B$ which make $\int_a^b \|\gamma'(t)\|\,dt$ a minimum. More generally, given two points $A$, $B$ in the space $V$ in which the curves are traced, find a curve $\gamma \in C^1(I; V)$ with origin $\gamma(a) = A$ and end $\gamma(b) = B$ which makes $\hat{L}(\gamma)$ an extremum.
We are led to look for the extrema of the restriction of $\hat{L}$ to the subset $\Gamma(A, B) = \{\gamma \in C^1(I; V): \gamma(a) = A,\ \gamma(b) = B\}$. It is clear that this subset is a closed affine subspace of $C^1(I; V)$: if $0 \le u \le 1$ and if $\gamma_1, \gamma_2 \in \Gamma(A, B)$, then $u\gamma_1 + (1 - u)\gamma_2 \in \Gamma(A, B)$; and if $\gamma_n \in \Gamma(A, B)$ converges to $\gamma \in C^1(I; V)$ in the sense of the $C^1$ norm, then $\gamma(a) = \lim \gamma_n(a) = A$ and $\gamma(b) = \lim \gamma_n(b) = B$. It can also be seen that $\Gamma(0, 0)$ is a vector subspace and that, if $\gamma_0$ is a fixed element of $\Gamma(A, B)$ (for example the segment $t \mapsto [(A - B)t + (aB - bA)]/(a - b)$), then $\Gamma(A, B)$ is obtainable from $\Gamma(0, 0)$ by the translation $\gamma \mapsto \gamma + \gamma_0$.
By the trace theorem (§1.2.4), the restriction of $\hat{L}$ to $\Gamma(0, 0)$ (thus to $\Gamma(A, B)$) is of class $C^1$ and the derivative of this restriction is the restriction to $\Gamma(0, 0)$ of $D\hat{L}$. It is important to understand that the derivative of the restriction of $\hat{L}$ to $\Gamma(A, B)$ is a continuous linear map which operates on $\Gamma(0, 0)$: if $\gamma, \gamma + h \in \Gamma(A, B)$ their difference $h \in \Gamma(0, 0)$, by the observation made above. It follows that:
\[ \hat{L}|_{\Gamma(A,B)}(\gamma + h) - \hat{L}|_{\Gamma(A,B)}(\gamma) = D\hat{L}(\gamma)h + o(h). \]
With the help of Theorems 10.1 and 10.7, we have thus proved the following result.
Theorem 10.8
The map $\hat{L}: \Gamma(A, B) \to \mathbf{R}$ has an extremum at $\gamma$ only if:
\[ \int_a^b [D_2 L(t, \gamma(t), \gamma'(t))h(t) + D_3 L(\ )h'(t)]\,dt = 0 \]
for all $h \in \Gamma(0, 0)$, that is, for all $h \in C^1(I; V)$ such that $h(a) = h(b) = 0$. □
Such a curve $\gamma$ is usually referred to as an extremal of the lagrangian $L$ with fixed extremities $A$ and $B$. This terminology should not give rise to errors: the condition $D\hat{L}(\gamma) = 0$ is necessary for the extremum, but is not sufficient, in general.
To determine the extremals, we shall give a more manageable form of the last theorem. To do this, we limit ourselves to curves $\gamma$ traced on a Hilbert space $V$. This enables us to identify $V$ with its dual $V^*$ by means of the map $x \in V \mapsto \langle x, \cdot \rangle \in V^*$.
10.3.5
Du Bois Reymond's lemma
Lemma 10.3
Let $B: I = [a, b] \to V^*$ be a continuous function with values in the dual of a Hilbert space $V$. In order that $\int_a^b B(t)h'(t)\,dt = 0$ for all $h \in C^1(I; V)$ satisfying $h(a) = h(b) = 0$, it is necessary and sufficient that $B$ should be constant.
Proof. The condition is plainly sufficient. To show that it is necessary, note that, if $C: I \to V^*$ is a constant, we have:
\[ \int_a^b [B(t) - C]\,h'(t)\,dt = 0. \]
Choose $C = (1/(b - a)) \int_a^b B(t)\,dt$, identify $V$ with its dual, and take:
\[ h(t) = \int_a^t [B(s) - C]\,ds. \]
Then $h(a) = h(b) = 0$; $h$ is differentiable and $h'(t) = B(t) - C$ is continuous. We must therefore have:
\[ 0 = \int_a^b [B(t) - C]\,h'(t)\,dt = \int_a^b \|B(t) - C\|^2\,dt. \]
This implies that $B(t) = C$ for all $t \in I$. □
Corollary 10.1
Let $A: I = [a, b] \to V^*$ and $B: I \to V^*$ be continuous. In order that $\int_a^b [A(t)h(t) + B(t)h'(t)]\,dt = 0$ for all $h \in C^1(I; V)$ such that $h(a) = h(b) = 0$, it is necessary and sufficient that $B$ should be differentiable and that $B' = A$.
Proof. The condition is sufficient, for $B'h + Bh' = \langle B, h \rangle'$, and thus:
\[ \int_a^b [Ah + Bh']\,dt = \int_a^b \langle B, h \rangle'\,dt = \langle B(b), h(b) \rangle - \langle B(a), h(a) \rangle = 0. \]
To show that it is necessary, set $C(t) = \int_a^t A(s)\,ds$. Then $C'(t) = A(t)$, and so $Ch' + Ah = \langle C, h' \rangle + \langle C', h \rangle = \langle C, h \rangle'$. Integrate and use the facts that $h(a) = h(b) = 0$:
\[ 0 = \langle C(b), h(b) \rangle - \langle C(a), h(a) \rangle = \int_a^b \langle C, h \rangle'\,dt = \int_a^b [C(t)h'(t) + A(t)h(t)]\,dt. \]
By hypothesis, $\int_a^b [B(t) - C(t)]\,h'(t)\,dt = 0$ and, by Lemma 10.3, $B(t) = C(t) + \text{constant}$. Hence $B'$ exists and equals $C' = A$. □
Theorem 10.9
Let $V$ be a Hilbert space. In order that $\gamma \in C^1(I; V)$ should be an extremal of the lagrangian $L$ with fixed extremities $\gamma(a) = A$ and $\gamma(b) = B$, it is necessary and sufficient that $t \mapsto D_3 L(t, \gamma(t), \gamma'(t))$ be differentiable and that:
\[ \frac{d}{dt} D_3 L(t, \gamma(t), \gamma'(t)) = D_2 L(t, \gamma(t), \gamma'(t)) \quad \text{for all } t \in I. \tag{10.4} \]
Proof. This is an immediate consequence of Theorem 10.8 and Corollary 10.1, in which we take $A = D_2 L$ and $B = D_3 L$.
Equation (10.4) is known as the Euler-Lagrange equation. It determines the extremals. Note that the values of $\gamma$ at $a$ and $b$ do not occur in it.
Remark
In the book by H. Cartan (pp. 294-296) there is a proof of Du Bois Reymond's lemma which is valid not only in Hilbert spaces but also in real normed linear spaces.
10.3.6
Free extrema problems
Let us look for the extrema of $\hat{L}: C^1(I; V) \to \mathbf{R}$ without requiring that $\gamma(a) = A$ and $\gamma(b) = B$. If $\hat{L}$ has an extremum at $\gamma$, then $D\hat{L}(\gamma) = 0$. The restriction of $D\hat{L}(\gamma)$ to every affine subspace of $C^1(I; V)$ passing through $\gamma$ is thus zero. In particular, let us take the affine subspace $\Gamma(A, B)$ and apply Theorem 10.9. We see that $\gamma$ must also satisfy the Euler-Lagrange equations. This is, moreover, clear by Theorem 10.7: if $D\hat{L}(\gamma)h = 0$ for all $h \in C^1(I; V)$, it must certainly be so for all $h$ satisfying $h(a) = h(b) = 0$. Conversely, if $\gamma$ satisfies the Euler-Lagrange equations, we may ask whether $D\hat{L}(\gamma) = 0$. The reader should prove that for all $h \in C^1$
O, be situated in euclidean space $\mathbf{R}^3$. Denote by $q_i(t)$ and $v_i(t)$ their positions and velocities respectively at time $t$. Suppose they are subjected to forces derivable from a $C^1$ potential $U(q_1, \dots, q_N)$. Newton's equations of motion may be written as $m_i\,(dv_i/dt) = -\partial U/\partial q_i$, where $\partial U/\partial q_i = D_i U \cdot 1$, $D_i U$ being the partial derivative of $U$ with respect to $q_i \in \mathbf{R}^3$.
We may represent these $N$ points by a single point $q = (q_1, \dots, q_N)$ of $\mathbf{R}^{3N}$, the velocity of which is $v = (v_1, \dots, v_N) \in \mathbf{R}^{3N}$. Introduce the kinetic energy $T = \tfrac12 \sum m_i \|v_i\|^2$ and the lagrangian $L: \mathbf{R} \oplus \mathbf{R}^{3N} \oplus \mathbf{R}^{3N} \to \mathbf{R}$ defined by $L(q, v) = T - U$. Then Newton's equations coincide with the Euler-Lagrange equations; for, with obvious notation:
\[ \frac{\partial L}{\partial v_i} = \frac{\partial T}{\partial v_i} = m_i v_i \quad \text{and} \quad \frac{\partial L}{\partial q_i} = -\frac{\partial U}{\partial q_i}; \]
thus
\[ 0 = m_i \frac{dv_i}{dt} + \frac{\partial U}{\partial q_i} = \frac{d}{dt}\left(\frac{\partial L}{\partial v_i}\right) - \frac{\partial L}{\partial q_i}. \]
This is Hamilton's principle of least action: the motion of the system is given by the extremals of the action integral $\int (T - U)\,dt$.
The lagrangian is regular, for $D_{33} L(\ )$ has matrix with components:
\[ \frac{\partial^2 L}{\partial v_i\,\partial v_j} = \begin{cases} 0 & \text{if } i \ne j \\ m_i & \text{if } i = j. \end{cases} \]
If we put $D_3 L = p$, we find that the components $p_1, \dots, p_N$ of $p$ are $p_i = m_i v_i$, which is simply the momentum of the $i$th particle. The function $G$ of Theorem 10.10 is very simple: $v_i = p_i/m_i$. The first-order system written in Theorem 10.10 is here:
\[ \frac{dq_i}{dt} = \frac{p_i}{m_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial U}{\partial q_i}. \]
With $H(q, p) = \tfrac12 \sum \|p_i\|^2/m_i + U(q)$, this becomes
\[ \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \]
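The first-order system above lends itself to direct numerical integration. The following is a minimal sketch under stated assumptions: a single particle on the real line with the illustrative potential $U(q) = \tfrac12 k q^2$, integrated by the leapfrog scheme (a choice made here, not one from the text); it lets one observe that $H$ stays nearly constant along the computed motion.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """H(q, p) = p^2 / (2 m) + U(q) with the assumed potential U(q) = k q^2 / 2."""
    return 0.5 * p * p / m + 0.5 * k * q * q

def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate dq/dt = dH/dp = p/m, dp/dt = -dH/dq = -k q."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q     # half step in momentum
        q += dt * p / m           # full step in position
        p -= 0.5 * dt * k * q     # second half step in momentum
    return q, p

def exact_position(t, q0=1.0, m=1.0, k=1.0):
    """Exact solution for zero initial momentum: q(t) = q0 cos(sqrt(k/m) t)."""
    return q0 * math.cos(math.sqrt(k / m) * t)
```

Starting from `q = 1.0`, `p = 0.0` with `dt = 1e-3` and `10000` steps, the final position is close to `exact_position(10.0)` and the value of `hamiltonian` differs from its initial value only by a small amount that depends on the step size.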
This is the hamiltonian form of the equations of motion (see §7.1.5).
Suppose that the system reduces to one particle of mass $+1$ and that it is not subjected to any external force. The Euler-Lagrange equation (or Newton's equation) shows that the velocity of the particle remains constant. The extremals are straight lines described at constant speed. Here the lines are not simply extremals of the length $\int_a^b \|\gamma'(t)\|\,dt$, but they are also extremals of the action $\int_a^b \|\gamma'(t)\|^2\,dt$. The following example is a generalization of this result.
Example
Let $L: \mathbf{R} \oplus V \oplus V \to \mathbf{R}$ be a lagrangian independent of $t \in \mathbf{R}$ and such that, for each $q \in V$, $v \in V \mapsto L(q, v)$ is a positive non-degenerate quadratic form. In other words, there is a bilinear, symmetric, non-degenerate form $b(q)$ such that $L(q, v) = b(q)(v, v) > 0$ if $v \ne 0$, the function $q \mapsto b(q)$ being of class $C^1$.
We propose to show that every extremal $\gamma$ of $\int_a^b L\,dt$ is an extremal of $\int_a^b \sqrt{L}\,dt$. Let us begin by proving that if $\gamma$ is an extremal of $L$, $L[\gamma(t), \gamma'(t)]$ does not depend on $t$. For $dL/dt = D_2 L \cdot \gamma' + D_3 L \cdot \gamma'' = [D_2 L - (d/dt) D_3 L]\,\gamma' + (d/dt)(D_3 L \cdot \gamma')$. But the first term on the right-hand side is zero, by the Euler-Lagrange equation; and Leibniz' rule shows that $D_3 L \cdot v = D_3 [b(q)(v, v)]\,v = 2 b(q)(v, v) = 2L$. Hence $0 = (d/dt) L$. The last relation and the Euler-Lagrange equation for $L$ show that:
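\[ \frac{d}{dt} D_3 \sqrt{L} - D_2 \sqrt{L} = -\tfrac14 L^{-3/2}\,\frac{dL}{dt}\,D_3 L + \tfrac12 L^{-1/2} \left[ \frac{d}{dt} D_3 L - D_2 L \right] = 0. \]
This is the Euler-Lagrange equation for the lagrangian $\sqrt{L}$; $\gamma$ is thus an extremal of $\sqrt{L}$ (see H. Cartan, pp. 303-306).
10.5
EFFECT OF A DIFFERENTIABLE MAP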
We return to the space $C^1(I; V)$ of $C^1$ curves $\gamma: I \to V$. If $\varphi: V \to V$ is a map of class $C^k$, $k \ge 1$, the image $\varphi \circ \gamma$ of a $C^1$ curve is a $C^1$ curve; thus $\varphi$ induces a map $\hat{\varphi}: C^1(I; V) \to C^1(I; V)$ defined by $\hat{\varphi}(\gamma)(t) = \varphi[\gamma(t)]$. Since $C^1(I; V)$ is a space normed by the $C^1$ norm, we may ask whether $\hat{\varphi}$ is differentiable.
Proposition 10.1
If O, situated in a plane $\mathbf{R}^2$, is subjected to a potential which depends only on the distance $\rho$ of the particle from the origin. Take polar coordinates $(\rho, \theta)$ with origin $O$. The lagrangian $T - U$ introduced in §10.4.2 may be written as:
\[ L(\rho, \theta, \rho', \theta') = \tfrac12 m (\rho'^2 + \rho^2 \theta'^2) - U(\rho). \]
The Euler-Lagrange equation decomposes into two scalar equations:
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \rho'}\right) = \frac{\partial L}{\partial \rho}, \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \theta'}\right) = \frac{\partial L}{\partial \theta}. \]
The first of these gives $m\rho'' = m\rho\theta'^2 - dU/d\rho$. As $\theta$ does not appear in $L$, the second gives $\partial L/\partial \theta' = \text{constant}$; that is, $m\rho^2\theta' = \text{constant}$. This is the law of conservation of angular momentum about $O$, also called the 'law of areas'. We shall see in §7 how such conservation laws may be included in a more general scheme.
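The conservation of $m\rho^2\theta'$ can be watched along a numerically computed motion. The following is a minimal sketch under stated assumptions: the potential is taken to be $U(\rho) = \tfrac12 \rho^2$ (an illustrative choice), the two scalar Euler-Lagrange equations are rewritten as $\rho'' = \rho\theta'^2 - (1/m)\,dU/d\rho$ and $\theta'' = -2\rho'\theta'/\rho$, and the system is integrated by the classical fourth-order Runge-Kutta scheme.

```python
def rhs(state, m=1.0):
    """Right-hand side for the state (rho, theta, rho', theta')."""
    rho, theta, drho, dtheta = state
    ddrho = rho * dtheta ** 2 - rho / m        # dU/drho = rho for U = rho^2 / 2
    ddtheta = -2.0 * drho * dtheta / rho       # from d/dt (m rho^2 theta') = 0
    return (drho, dtheta, ddrho, ddtheta)

def rk4_step(state, dt):
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(state, dt=1e-3, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

def angular_momentum(state, m=1.0):
    rho, _, _, dtheta = state
    return m * rho ** 2 * dtheta
```

Starting from `(1.0, 0.0, 0.2, 0.8)`, the value of `angular_momentum` at the end of `simulate` agrees with its initial value to within the accuracy of the integrator, while $\rho$ and $\theta'$ each vary along the motion.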
10.6
INVARIANCE OF A LAGRANGIAN
Definition 10.2
Suppose that the lagrangian $L: I \oplus V \oplus V \to \mathbf{R}$ does not depend on $t \in I$. It is then a $C^1$ function defined on $V \oplus V$. In this case it is convenient to interpret $V \oplus V$ as the set $TV = \{q \in V, v \in V\}$ of pairs formed by a point $q \in V$ and by a vector $v$ tangent to $V$ at that point (we regard $V$ as a submanifold of $V$).
The lagrangian $L$ is then a $C^1$ function on $TV$. If
s(q) $|_{s=0} = \omega \wedge q$. Noether's theorem thus can be written as $D_3 L[\gamma(t), \gamma'(t)] \cdot (\omega \wedge q) = \text{constant}$.
Identifying a linear functional with a vector by means of the isomorphism $x \in V \mapsto \langle x, \cdot \rangle \in V^*$, this can be written as $\langle D_3 L, \omega \wedge q \rangle = \text{constant}$. Taking account of the properties of the mixed product, $\langle q \wedge D_3 L[\ ], \omega \rangle = \text{constant}$.
Again consider the example in §10.4.2. Invariance under rotation is represented by $\langle \sum m_i (q_i \wedge v_i), \omega \rangle = \text{constant}$. If, in particular, the potential $U$ depends only on the distance from the origin, $L = T - U$ is invariant under all rotations about $O$. We may thus choose $\omega$ arbitrarily, from which it turns out that the angular momentum $\sum m_i (q_i \wedge v_i)$ about $O$ is constant.
10.7.3
Generalization of Noether's theorem
Suppose that L: / O such that IIX[x+f(t)] -X(y)II