127 34 3MB
English Pages [382]
Colecci´on Yachay Tech
M01
A course of Functional Analysis with Calculus of Variations Juan Mayorga-Zambrano
Cover image: xxxx xxxx
Colecci´ on Yachay Tech, M01 A course of Functional Analysis Juan Mayorga-Zambrano
© por Juan Ricardo Mayorga Zambrano Editorial XXXXX XXXXXX Primera edici´ on
ISBN XXX-XXX-XXXX
El objetivo de la Colecci´ on Yachay Tech es poner a disposici´ on de los estudiantes libros de texto y material de estudio de excelente calidad acad´ emica. Los libros son preparados tomando en cuenta los perfiles y caracter´ısticas de las carreras de Yachay Tech, donde se tiene un tronco com´ un de materias de formaci´ on b´ asica de tres semestres que da paso a estudios espec´ıficos de carrera (en ingl´ es) desde cuarto semestre. Para m´ as informaci´ on, visite www.yachaytech.edu.ec
Contents Preface
I.
vii
Functional Analysis
3
1. Preliminaries 1.1. Sets . . . . . . . . . . . . . . . . . . . . 1.2. Relations and functions . . . . . . . . . . 1.3. Families of sets, partitions and equivalence 1.4. Order relations . . . . . . . . . . . . . . . 1.5. Cardinality . . . . . . . . . . . . . . . . . 1.6. Linear spaces and algebras . . . . . . . . 1.7. Linear operators . . . . . . . . . . . . . . 1.8. Problems . . . . . . . . . . . . . . . . . .
. . . . . . . . . . relations . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
5 5 7 10 12 13 15 26 31
2. An introduction to topological spaces 2.1. Definition of topology . . . . . . . . . . . . . . . . 2.2. Topological basis and fundamental systems . . . . 2.3. Interior, adherence and boundary of a set. Density. 2.4. Separation conditions and sequences . . . . . . . . 2.5. Accumulation points. Limit superior and inferior. . 2.6. Subspaces and continuity . . . . . . . . . . . . . . 2.7. Initial topology. Product spaces. . . . . . . . . . . 2.8. Compact sets . . . . . . . . . . . . . . . . . . . . 2.9. Problems . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
35 35 36 39 42 47 49 54 59 64
3. An introduction to metric spaces 3.1. Definition of metric . . . . . . . . . . . . 3.2. Definitions of norm and interior product . 3.3. Basic properties of metric spaces . . . . . 3.4. Complete metric spaces. Baire’s theorem. 3.5. Isometries. Completion of a metric space. 3.6. The spaces L1 ([a, b]) and L2 ([a, b]) . . . . 3.7. Banach fixed point theorem . . . . . . . . 3.8. Compact sets in metric spaces . . . . . . 3.9. Problems . . . . . . . . . . . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
69 69 72 79 82 89 93 95 102 109
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
4. Banach and Hilbert spaces 117 4.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 4.2. Equivalence of norms . . . . . . . . . . . . . . . . . . . . . . . . 119 4.3. Finite-dimensional normed spaces. Weierstrass Approximation Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 i
ii
Contents 4.4. Additional examples of Banach and Hilbert spaces . . . . . . . . 4.4.1. The space (Rn , ∥ · ∥p ) . . . . . . . . . . . . . . . . . . . 4.4.2. The space lp (R) . . . . . . . . . . . . . . . . . . . . . . 4.4.3. The Lebesgue space Lp (I), I ⊆ R . . . . . . . . . . . . 4.4.4. The space Cn ([a, b]) . . . . . . . . . . . . . . . . . . . 4.4.5. The Sobolev spaces H1 (I) and W1,p (I) . . . . . . . . . 4.5. Schauder and Hilbert basis . . . . . . . . . . . . . . . . . . . . 4.6. Direct sums and orthogonal of a set . . . . . . . . . . . . . . . 4.7. Hilbert basis of L2 (I) . . . . . . . . . . . . . . . . . . . . . . . 4.7.1. Fourier-Legendre series . . . . . . . . . . . . . . . . . . 4.7.2. Trigonometric Fourier series. Stone-Weierstrass theorem. 4.7.3. Fourier-Hermite series . . . . . . . . . . . . . . . . . . . 4.8. Convex sets, functions and hyperplanes . . . . . . . . . . . . . . 4.9. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . .
126 126 130 131 135 135 138 146 148 150 155 161 163 166
5. Fundamental theorems of Functional Analysis 5.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5.2. Some properties of L(V, W ) . . . . . . . . . . . . . . . . 5.3. Continuous embeddings . . . . . . . . . . . . . . . . . . . 5.4. Riesz-Fr´echet representation theorem . . . . . . . . . . . . 5.5. Hahn-Banach theorem . . . . . . . . . . . . . . . . . . . . 5.6. The dual of C([a, b]). The Riemann-Stieljes integral. . . . 5.7. Geometric forms of Hahn-Banach theorem . . . . . . . . . 5.8. Adjoint operator . . . . . . . . . . . . . . . . . . . . . . . 5.9. Reflexivity and separability (I) . . . . . . . . . . . . . . . . 5.10. Uniform boundedness principle . . . . . . . . . . . . . . . 5.11. The open mapping theorem and the closed graph theorem 5.12. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
173 173 181 185 189 192 196 198 204 206 209 212 216
6. Weak topologies, reflexivity and separability 6.1. Weak convergence and topology σ(E, E ′ ) . 6.2. Weak ∗ convergence and topology σ(E ′ , E) 6.3. Reflexivity and separability (II) . . . . . . . 6.4. Uniformly convex spaces . . . . . . . . . . 6.5. Problems . . . . . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
229 229 236 245 249 252
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
II. An introduction to the Calculus of Variations 7. Calculus on normed spaces 7.1. What Calculus of Variations is about . . . . . . . . . 7.1.1. A problem from classical Calculus . . . . . . 7.1.2. A situation to deal with Calculus of Variations 7.1.3. Euler’s finite-difference method . . . . . . . . 7.2. A couple of important things . . . . . . . . . . . . . 7.2.1. Normed and Banach algebras . . . . . . . . . 7.2.2. Exponential mapping . . . . . . . . . . . . . 7.2.3. Small o . . . . . . . . . . . . . . . . . . . . 7.2.4. Uniform continuity. Canonical basis of Rn . .
. . . . . . . . .
257 . . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
259 259 259 261 263 264 264 266 266 267
Contents 7.2.5. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3. The differential . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3.1. Directional derivatives and Gateaux differential . . . . . . 7.3.2. Fr´echet differential . . . . . . . . . . . . . . . . . . . . . 7.3.3. Gateaux or not-Gateaux? That’s the question . . . . . . . 7.3.4. Differentiability of a generalized vector field . . . . . . . . 7.3.5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3.6. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4. Chain rule. Class C1 . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.1. Chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.2. Mappings of class C1 . . . . . . . . . . . . . . . . . . . . 7.4.3. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5. Critical points and extremum . . . . . . . . . . . . . . . . . . . . 7.5.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5.2. Necessary condition for an extremum . . . . . . . . . . . . 7.5.3. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.6. Mean Value Theorem. Connectedness. . . . . . . . . . . . . . . . 7.6.1. Classical and generalized Mean Value Theorem . . . . . . 7.6.2. Derivative of a curve . . . . . . . . . . . . . . . . . . . . 7.6.3. Main tool . . . . . . . . . . . . . . . . . . . . . . . . . . 7.6.4. Proof of the Mean Value Theorem . . . . . . . . . . . . . 7.6.5. An important result that depends on connectedness . . . . 7.6.6. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.7. Some useful lemmas . . . . . . . . . . . . . . . . . . . . . . . . . 7.8. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.9. Partial differentials . . . . . . . . . . . . . . . . . . . . . . . . . . 7.9.1. Differential in terms of partial differentials . . . . . . . . . 7.9.2. Partial differentials and the class C1 . . . . . . . . . . . . 7.9.3. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.10. Riemann integral . . . . . . . . . . . . . . . . . . . . . . . . . . 7.10.1. Step and regulated mappings . . . . . . . . . . . . . . . . 7.10.2. Integral of step and regulated mappings . . . . . . . . . . 7.10.3. Fundamental Theorem of Calculus and derivation under the integral sign . . . . . . . . . . . . . . . . . . . . . . . . . 7.10.4. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.11. Taylor expansion. Second order conditions for extremum. . . . . . 7.11.1. Higher-order differentials . . . . . . . . . . . . . . . . . . 7.11.2. Multilinear mappings . . . . . . . . . . . . . . . . . . . . 7.11.3. Higher-order differentials and multilinear mappings . . . . 7.11.4. Taylor expansions . . . . . . . . . . . . . . . . . . . . . . 7.11.5. Second-order conditions for extrema . . . . . . . . . . . . 7.11.6. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 7.11.7. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . .
iii 267 267 268 269 273 275 277 283 286 286 288 290 291 291 292 294 294 294 295 296 297 297 299 300 302 303 303 304 306 309 309 310 312 314 314 314 315 318 318 321 323 332
8. The Euler-Lagrange equation and other classical topics 335 8.1. The elementary problem of the Calculus of Variations . . . . . . . 335 8.1.1. The functional of the elementary problem. Admissible elements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
iv
Contents 8.1.2. Weak and strong extremum . . . . . . . . . . . . . . . . . 8.1.3. Derivation of Euler-Lagrange equation . . . . . . . . . . . 8.1.4. Bernstein’s existence and unicity theorem . . . . . . . . . 8.1.5. Differentiability of solutions of the Euler-Lagrange equation 8.1.6. Convexity and minimizers . . . . . . . . . . . . . . . . . . 8.1.7. Second form of the Euler-Lagrange equation . . . . . . . . 8.1.8. Special cases . . . . . . . . . . . . . . . . . . . . . . . . . 8.1.9. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 8.1.10. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2. Some generalizations of the elementary problem of the Calculus of Variations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2.1. The fixed end-point problem for n unknown functions . . . 8.2.2. The isoperimetric problem . . . . . . . . . . . . . . . . . 8.2.3. A generalization of the isoperimetric problem . . . . . . . 8.2.4. Finite subsidiary conditions . . . . . . . . . . . . . . . . . 8.2.5. Problems . . . . . . . . . . . . . . . . . . . . . . . . . . .
336 337 340 340 342 343 344 347 349 354 354 356 358 358 359
References
363
Index
365
!M השKברו
To my wife, Carmen, and my children, Daniel, Keren, Jaya and Jaim,
Preface This work was born to attend the needs of my students at Yachay Tech University1 . Some years ago, when I was assigned the subject of Functional Analysis, I realized that the students needed to acquire a big bunch of tools in a very short lapse and that, because of this, good books like [17] could not satisfy our needs. So, I prepared some notes for the course which, thanks to God, evolved to become Part I of this book. As departure point we recognize the need of solid foundations and, therefore, some preliminaries are introduced (Chapter 1), including topological and metrics spaces (Chapters 2 and 3). The last allows to deal with Banach and Hilbert spaces (Chapter 4). Part I also dedicates a good share to the study of the fundamental theorems of Functional Analysis (Chapter 5): the uniform boundedness principle, the open mapping theorem, the closed-graph theorem, and the theorems of RieszFr´echet and Hahn-Banach. With these tools at hand a-not-so-short study of weak topologies (Chapter 6) is presented to finish Part I. Just after the mentioned course finished, our university lost, as faculty member, professor Anvarbek Meirmanov, a very gifted Russian mathematician who had the course of Calculus of Variations among his duties. His course became mine. Anvarbek used to teach it in a classical way, like in [9], but now this clearly became not the best option because most of the students were the same that studied previously Functional Analysis with me. So, again, I wrote notes for the course; such notes developed into Part II of this book. Part II is dedicated to topics of Calculus of Variations taking advantage of what was done in Part I. A good part of Calculus in normed spaces (Chapter 7) is studied before getting into the classical topics of the Calculus of Variations (Chapter 8) like the properties of the Euler-Lagrange equation. In Part I, a good number of proposed exercises is presented at the end of each chapter. Since the topics in Part II are more “operative”, a list of proposed exercises is presented at the end of each section of the corresponding chapters. Before trying an exercise, remember to study the concepts which are involved and to study (even redo) examples provided in the document and in other sources. A few number of exercises are marked with an ∗; these are harder than the others. I already mentioned that this project started as course notes for my students at YT, the first university in Ecuador founded on a research-objectives basis. This also justifies the fact that it’s prepared in English, the language of science, instead of Spanish, the most-spoken official language of Ecuador. At YT the students have a 3-4 semesters of basic multidisciplinary formation concerning subjects of mathematics, physics, chemistry, biology, etc. which are taught in Spanish. After this common-core of subjects, each student chooses a career where he o she studies in English. The system has worked very well as it can be seen from the big number of alumni who have obtained full scholarships to study postgrades in prestigious 1 Yachay
means knowledge in Quechua language.
Contents
1
universities around the world. It’s also notable the number of published scientific papers where at least one of the co-authors is a student or alumni declaring affiliation to YT (see e.g. [1] and [20]). We assume that the reader has some acquaintance with Maxima commands. You will find Maxima code along the document, specially in examples. It’s the style that was used to write [18]. In fact, in that book - designed for physics and engineering students - we present tips to immediately apply Maxima commands to solve problems. Of course, there are many other internet sources where you can find guides to learn to work with Maxima, a nice Computer Algebraic System that doesn’t require a paid licence: https://maxima.sourceforge.io/. With God’s help, several books will come in the near future (both in Spanish and English) as a sequel of [18] and this document. All these have as main public the students of YT, but we hope that they will be atractive and useful for students of other universities. This project was finished with help of students of YT. Thank you Christian, Guido, Iv´an and Antonio for your time and support! I hope this book could help you and other students to learn the ways of the Force and to use it in the right way. Juan Mayorga-Zambrano, PhD Departament of Mathematics Yachay Tech University Urcuqu´ı, Ecuador [email protected] [email protected]
Part I.
Functional Analysis
3
1. Preliminaries This first chapter serves as an introduction to a number of basic concepts which are necessary to start our course of Functional Analysis. For the reader who already knows these topics, it can serve as a reminder and will introduce him to the notation that shall be used along the document. This text is prepared having in mind the characteristics and needs of the career of Mathematics at Yachay Tech University. Our main references are [4], [14], [15], [17], [8], [22] and [19].
1.1. Sets The ZFC system, formed by the Zermelo-Fraenkel set theory and the Axiom of Choice, is the axiomatic foundation of the modern Set Theory. It’s an umbrella that protects us from a rain of paradoxes1 since the beginning of the twentieth century. We’ll not use its full formalism; instead, in this brief summary, we shall apply an intuitive approach (see e.g. [10] and [11]) to advance the rudiments which are necessary for our course of Functional Analysis. Let’s recall that it’s not possible to define the concept of set and yet the idea is clear, it’s a collection of elements. We shall usually denote a set by an uppercase letter (A, B, C, etc.) and an element by a lowercase letter (a, b, c, etc.). By x ∈ A (or A ∋ x) we mean that the element x belongs to the set A; the opposite is denoted by x ∈ / A. We say that the set A is contained in the set B (or that B contains A or that A is a subset of B), denoted by A ⊆ B or B ⊇ A, iff ∀x ∈ A :
x ∈ B.
The equality of sets is given by A=B
⇐⇒
(A ⊆ B ∧ B ⊆ A).
Remark 1.1 In practice, to prove that A ⊆ B, one takes a generic element a ∈ A and works to prove that a ∈ B. We say that A is a proper subset of B, denoted A ⊊ B or B ⊋ A, whenever A is a subset of B but A ̸= B. All the sets have the empty set or void set, denoted ∅, as a subset. In a particular study, there will always be a universe set, X, to which belong all the elements under consideration. The set of parts of X, denoted by P(X), is the set whose elements are all the subsets of X. Observe that {a} ∈ P(X) ⇐⇒ a ∈ X. 1 From
1893 to 1901, Gottlob Frege’s axioms worked well to avoid paradoxes coming from the beautiful but unpolished Cantor’s set theory. But then Russel showed that one of Frege’s axiom was inconsistent.
5
6
Chapter 1. Preliminaries
A set {x} which has only one element is called a singleton. Given two subsets A and B of the universe X we write A ∩ B = {x ∈ X/ x ∈ A ∧ x ∈ B}, A ∪ B = {x ∈ X/ x ∈ A ∨ x ∈ B}, A \ B = {x ∈ X/ x ∈ A ∧ x ∈ / B}, for the intersection, union and diference of the sets A and B, respectively. In particular, we denote Ac = X \ A = {x ∈ X/ x ∈ / A}, the complement of A. To operate with sets is helpful the following link between intersection and complements: A \ B = A ∩ Bc. From the definition it immediately follows that both intersection and union are commutative operations on P(X). Moreover, it’s easy to show that two distributive properties hold, i.e., for all A, B ∈ P(X): (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C), (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C). Theorem 1.1 (De Morgan’s Laws) Let A, B, C ∈ P(X). Then A \ (B ∩ C) = (A \ B) ∪ (A \ C);
(1.1)
A \ (B ∪ C) = (A \ B) ∩ (A \ C).
(1.2)
(B ∩ C)c = B c ∪ C c ,
(1.3)
In particular,
c
c
c
(B ∪ C) = B ∩ C .
(1.4)
Proof. Let’s prove (1.3). Points (1.1), (1.2) and (1.4) are left as exercise to the reader. 1. Let’s prove that (B ∩ C)c ⊆ B c ∪ C c .
(1.5)
c
Let x ∈ (B ∩ C) , generic. Then x ∈ / B ∩ C or, equivalently, ¬(x ∈ B ∧ x ∈ C)
⇐⇒ ⇐⇒
x∈ /B ∨ x∈ /C c x∈B ∨ x ∈ C c,
so that x ∈ B c ∪ C c . Since x was chosen arbitrarily, it follows (1.5). 2. As we have biconditionals in all the previous steps, it follows that (B ∩ C)c ⊇ B c ∪ C c .
(1.6)
1.2. Relations and functions
7
From (1.5) and (1.6) we obtain (1.3).
■
Remark 1.2 Along the book, we shall denote N∗ = N ∪ {0}; In = {1, 2, ..., n},
n ∈ N.
1.2. Relations and functions A set is determined by its elements and so, for example, {a, b} = {b, a}. Therefore to define sets whose elements are ordered a trick is necessary: Definition 1.1 (Ordered pair. Cartesian product) Let A and B be nonvoid sets. The cartesian product of A and B is given by A × B = {(a, b) : a ∈ A ∧ b ∈ B}, where, (a, b) ≡ {a, {a, b}} ∈ A × B is said to be the ordered pair with first component a ∈ A and second component b ∈ B. From this definition, it follows that the equality of ordered pairs is given by (a, b) = (c, d)
⇐⇒ (a = c) ∧ (b = d).
In a similar way it’s defined the ordered product of the non-void sets A1 , A2 ,..., An : A1 × A2 × ... × An = (A1 × A2 × ... × An−1 ) × An . In particular, we denote An = A × A × ... × A (n times). In this book we shall deal with three kinds of relations: order relations, equivalence relations and functions. In this section we present the concept of function. In Sections 1.3 and 1.4 we shall define the concepts of equivalence relation and order relation, respectively. Let’s make precise the concept of relation: Definition 1.2 (Relation) Given two non-void sets X and Y , we call relation from X into Y to every r ⊆ X × Y . In the context of the Definition 1.2, the domain, codomain and image of the relation r are given, respectively, by Dom(r) = {x ∈ X / ∃y ∈ Y : (x, y) ∈ r}, Cod(r) = Y, Im(r) = {y ∈ Y / ∃x ∈ X : (x, y) ∈ r}. It’s usual to write x r y to mean that (x, y) ∈ r. Next, we present the abstract concept of function.
8
Chapter 1. Preliminaries Definition 1.3 (Function) Let X and Y be non-void sets. We say that the relation f ⊆ X × Y is a function from X into Y iff the following condition holds ∀x ∈ X, ∃!y ∈ Y : (x, y) ∈ f.
In this case we denote f :X −→ Y x 7−→ y = f (x).
(1.7)
It’s clear that Dom(f ) = X. Remark 1.3 Instead of the term function sometimes are used the terms mapping, map and application. Depending on the context, for some kinds of functions are also used the terms functional, operator, transform, etc. Sometimes, instead of the notation (1.7), it can be used the shorter form X ∋ x 7−→ f (x) ∈ Y.
(1.8)
Remark 1.4 (Sequence) Given a non-void set A, we call sequence in A to any function N ∋ n 7−→ xn ∈ A, i.e., a function whose domain2 and codomain are N and A, respectively. In this case the notation (1.7) is changed to (xn )n∈N ⊆ A,
(yn )n∈N ⊆ A,
(zn )n∈N ⊆ A,
etc.
(1.9)
In our book we will need the concept of extension and restriction of a function. Definition 1.4 (Extension and restriction of a function) We say that a function g : V → Y is an extension of the function f : U → X iff U ⊆ V , X ⊆ Y and ∀u ∈ U : g(u) = f (u). In this case we also say that f is a restriction of g. Example 1.1 (Gamma function, factorial) The gamma function, Γ :]0, +∞[−→ R, given by Z +∞ Γ(x) = tx−1 e−t dt, (1.10) 0
verifies Γ(n) = (n − 1)!, for every n ∈ N. Therefore the function ] − 1, +∞[∋ β 7→ Γ(β + 1) ∈ R, is an extension of the factorial sequence, ( 1, n = 0, n! = n · (n − 1)!, n ∈ N. The graph of the gamma function can be obtained in Maxima with (% i1)
plot2d(gamma(x),[x,0,6],[y,0,120]);
In Figure 1.1 we observe that lim Γ(x) = lim Γ(x) = +∞.
x→0+
x→+∞
1.2. Relations and functions
9
Figure 1.1.: The gamma function.
With the concept of function in mind we can state the Axiom of Choice that was mentioned in Section 1.1. Axiom of Choice. Let X be a set. There exists a choice function C : P(X) → X such that ∀A ∈ P(X) \ {∅} :
C (A) ∈ A.
Remark 1.5 Recall that an axiom is a statement that is not proved. It’s accepted as a departure element to build a theory, as long as the set of axioms does not provoke contradictions (or paradoxes). To finish this section let’s recall the concepts of direct and inverse image. Definition 1.5 (Inverse and direct image) Let f : X → Y , A ⊆ X and B ⊆ Y . The direct image of A is given by f (A) = {f (x) / x ∈ A}. The inverse image of B is given by f −1 (B) = {x ∈ X / f (x) ∈ B}. Therefore f (A) is the set formed by all the images of the elements of A. On the other hand f −1 (B) is the set of all the preimages of elements of B. In the context of Definition 1.5, if A1 , A2 ∈ P(X) and B1 , B2 ∈ P(Y ), f (A1 ∪ A2 ) = f (A1 ) ∪ f (A2 ); f (A1 ∩ A2 ) ⊆ f (A1 ) ∩ f (A2 ); f
−1
f
−1
(B1 ∪ B2 ) = f
−1
(B1 ∩ B2 ) = f
−1
(B1 ) ∪ f
(B2 );
(1.12)
(B1 ) ∩ f
−1
(B2 );
(1.13)
f (A) ⊆ B ⇐⇒ A ⊆ f 2 In
(1.11)
−1
−1
(B);
(1.14)
more general terms the domain can be any set with cardinality ℵ0 , e.g. N∗ = N ∪ {0} and Z.
10
Chapter 1. Preliminaries f (f −1 (B)) ⊆ B; f −1 (f (A)) ⊇ A; A1 ⊆ A2 ⇒ f (A1 ) ⊆ f (A2 ); B1 ⊆ B2 ⇒ f −1 (B1 ) ⊆ f −1 (B2 ); f −1 (B c ) = (f −1 (B))c .
Example 1.2 (Gaussian function) Let’s consider the Gaussian function f : R → 2 R given by f (x) = e−x . In Figure 1.2 we can observe that f ([0, +∞[) =]0, 1] and f ([0, 1]) = [e−1 , 1]. On the other hand, f −1 (]0, 1]) = R and f −1 ([e−1 , 1]) = [−1, 1].
2
Figure 1.2.: The function R ∋ x 7−→ f (x) = e−x ∈ R.
1.3. Families of sets, partitions and equivalence relations Let Ω be a non-void set. We call family of subsets of Ω to any function Λ ∋ λ 7−→ Aλ ∈ P(Ω). In this case, we say that Λ is the index set of the family and denote (Aλ )λ∈Λ ⊆ P(Ω). In particular, when Λ = N we obtain a sequence of subsets of Ω: (An )n∈N ⊆ P(Ω). Given a family (Aλ )λ∈Λ ⊆ P(Ω) its union and intersection are defined by [
Aλ = {x ∈ Ω /
∃λ0 ∈ Λ : x ∈ Aλ0 } ,
Aλ = {x ∈ Ω /
∀λ0 ∈ Λ : x ∈ Aλ0 } .
λ∈Λ
\ λ∈Λ
By convention, for Λ = ∅ we put [ Aλ = ∅ λ∈∅
and
\
Aλ = Ω.
λ∈∅
An important family of sets is the partition of a universe set. A family (Aλ )λ∈Λ ⊆ P(X), is said to be a partition of Ω if it’s disjoint and exhaustive, which, respec-
1.3. Families of sets, partitions and equivalence relations
11
tively, means that ∀λ, β ∈ Λ :
λ ̸= β ⇒ Aλ ∩ Aβ = ∅; [ Aλ = Ω.
(1.15) (1.16)
λ∈Λ
The importance of partitions is that they perfectly classify the elements of a set. This is stated in Theorem 1.2. For this we need to introduce the concept of equivalence relation. In the context of Definition 1.2, if X = Y we say that r is a relation on X. In this case, we say that r is i) reflexive iff ∀x ∈ X :
x r x;
ii) symmetric iff ∀x, y ∈ X :
x r y =⇒ y r x;
iii) antisymmetric iff ∀x, y ∈ X :
(x r y ∧ y r x)
=⇒
x = y;
iv) transitive iff ∀x, y, z ∈ X :
(x r y ∧ y r z)
=⇒
x r z.
Let’s now state what an equivalence relation is. Definition 1.6 (Equivalence relation, equivalence class) We say that a relation ∼ on Ω ̸= ∅ is an equivalence relation if it’s reflexive, symmetric and transitive. The equivalence class of an element x ∈ Ω, denoted x (or [x]), is x = {y ∈ Ω / x ∼ y}.
(1.17)
The equivalence classes asociated with ∼ form a partition of Ω , i.e., x ̸= y
⇒ x ∩ y = ∅, [ x = Ω. x∈Ω
More precisely, we have: Theorem 1.2 (Partition associated to an equivalence relation) Let ∼ be an equivalence relation on Ω ̸= ∅. Then there exists Λ ⊆ Ω such that {x}x∈Λ is a partition of Ω. The proof of this result requires the use of Zorn’s lemma (see Section 1.4). Example 1.3 (Relation of congruence modulo m) Let m ∈ Z \ {0}. On Z we define the relation of congruence modulo m by a ∼ b mod(m)
⇐⇒
a − b is multiple of m.
12
Chapter 1. Preliminaries
In particular, when m = 2, we get the separation of Z in odd and even numbers: Z = 0 ∪ 1. When m = 3 we have Z = 0 ∪ 1 ∪ 2, where 0 = {3k / k ∈ Z} = {..., −6, −3, 0, 3, 6, ...}, 1 = {3k + 1 / k ∈ Z} = {..., −5, −2, 1, 4, 7, ...}, 2 = {3k + 2 / k ∈ Z} = {..., −4, −1, 2, 5, 8, ...}. To finish this section, let’s remark that points (1.12) and (1.13) have generalizations to families of sets. If (Bλ )λ∈Λ ⊆ P(Y ) is a family of sets, then ! f
−1
[
Bλ
=
λ∈Λ
[
f −1 (Bλ ) ,
(1.18)
f −1 (Bλ ) .
(1.19)
λ∈Λ
! f −1
\ λ∈Λ
Bλ
=
\ λ∈Λ
1.4. Order relations Zorn’s lemma is a powerful tool to prove existence statements. It’s equivalent (see e.g. [13]) to the Axiom of Choice that was presented at the end of Section 1.2. To formulate it we need a number of concepts. Definition 1.7 (Order relation, ordered set) We say that a relation ≤ on the set Ω is an order relation iff it’s reflexive, antisymmetric and transitive. In this case we say that (Ω, ≤) is an ordered set. In the context of Definition 1.7, the term order relation is sometimes replaced by partial order relation to emphasize that there could be elements a, b ∈ Ω that are incomparable, that is, neither a ≤ b nor b ≤ a hold, see Example 1.5. If Ω has no incomparable elements, i.e., for every a, b ∈ Ω, a ≤ b or b ≤ a, then we say that Ω is a chain or a totally ordered set. We say that A ⊆ Ω is bounded from above iff it has an upper bound, i.e., ∃u ∈ Ω, ∀x ∈ A :
x ≤ u.
We say that s ∈ Ω is the supremum of A, denoted s = sup(A), iff s is an upper bound of A and s ≤ u, for every u upper bound of A. Whenever sup(A) ∈ A we say that A has maximum s = max(A). We say that m ∈ Ω is a maximal element iff ∀x ∈ Ω : m ≤ x ⇒ m = x. In the same way are defined the concepts of boundedness from below, lower bound, infimum, minimum and minimal element.
1.5. Cardinality
13
Example 1.4 On Z the relation ≤ is defined by a≤b
⇐⇒
b − a ∈ N ∪ {0}.
It’s easy to prove that (Z, ≤) is an ordered set. The well-ordering principle states that any subset of N has a minimum. Then Z is a chain with no upper bound or lower bound. Example 1.5 (Set inclusion as an order relation) Let Ω be a set. It’s not difficult to check that (P(Ω), ⊆) is an ordered set. It’s clear that P(X) is not a chain. The only maximal element is Ω and also Ω = max(P(Ω)). The only minimal element is ∅ and also ∅ = min(P(Ω)). We say that an ordered set is inductive if every totally ordered subset of it is bounded from above. Now we are in condition to state Zorn’s lemma. Theorem 1.3 (Zorn’s lemma) Let Ω be an ordered set. If Ω is inductive then Ω has a maximal element. In Sections 5.5 and 1.6 we shall apply Zorn’s lemma to prove, respectively, HahnBanach theorem and that every linear space has a basis. To finish this section, let’s state a characterization of the infimum and supremum of a subset of R. Theorem 1.4 (Characterization of the supremum) Let A ⊆ R be bounded from above. Then α = sup(A) iff ∀ϵ > 0, ∃x0 ∈ A :
α < x0 + ϵ.
A proof of this result can be found in [3]. Corollary 1.1 (Characterization of the infimum) Let A ⊆ R be bounded from below. Then β = inf(A) iff ∀ϵ > 0, ∃x0 ∈ A :
β > x0 − ϵ.
(1.20)
This result is required as an exercise at the end of the chapter.
1.5. Cardinality Comparing the size of two sets is possible only when at least one of them is finite. To extend this idea to infinite sets we need the concept of bijectivity of functions. Definition 1.8 (Injectivity, surjectivity and bijectivity) Let f : A → B. We say that f is injective iff ∀x1 , x2 ∈ A :
x1 ̸= x2 ⇒ f (x1 ) ̸= f (x2 ).
(1.21)
We say that f is surjective or onto iff ∀y ∈ B, ∃x ∈ A :
f (x) = y.
We say that f is bijective iff it’s injective and surjective.
(1.22)
14
Chapter 1. Preliminaries
Remark 1.6 (How to prove injectivity and surjectivity) To prove the injectivity of a function f we usually check ∀x1 , x2 ∈ A :
f (x1 ) = f (x2 ) ⇒ x1 = x2 ,
which is clearly equivalent to (1.21). Proving that f is onto is equivalent to prove that ∀y ∈ B, ∃x ∈ A : f (x) = y. Therefore proving the bijectivity of f is equivalent to prove that ∀y ∈ B, ∃!x ∈ A :
f (x) = y.
It’s easy to verify that if f : A −→ B is bijective then there exists a function g : B −→ A such that ∀y ∈ B :
f (g(y)) = y;
(1.23)
∀x ∈ A :
g(f (x)) = x.
(1.24)
Points (1.23) and (1.24) tell us that f and g are mutually inverse processes. It’s said that f and g are inverse to each other, denoted by f −1 = g
and g −1 = f.
Definition 1.9 (Cardinality) We say that two sets A and B have the same cardinality (or power), denoted by #[A] = #[B], if there is a bijective function between them. If there is an injective function from A to B we write #[A] ≤ #[B] and if, additionaly, #[A] ̸= #[B], we put #[A] < #[B]. Remark 1.7 A set A is said to be i) countable iff #[A] = ℵ0 ; ii) discrete iff #[A] ≤ ℵ0 ; where, ℵ0 = #[N] is the least infinite cardinality. Example 1.6 Let P be the set of even natural numbers. It’s clear that P ⊊ N but, since the function η : N → P given by η(k) = 2k, is bijective, it follows that #[N] = #[P]. The last example shows a characteristic of the infinite sets: they can be put in bijective correspondence with a proper subset. On the other hand, if A and B are finite sets such that A ⊆ B, then A⊊B
⊻
#[A] = #[B].
1.6. Linear spaces and algebras
15
Proposition 1.1 (Strict increment of the power) Given a set Ω, it holds #[P(Ω)] > #[Ω]. It’s easy to verify that if the set Ω has n elements, n ∈ N∗ , then P(Ω) has 2n elements. By notation, even when Ω is not finite, it’s usually written #[P(Ω)] = 2#[Ω] . In particular, c = ℵ1 = #(R) = 2ℵ0 , is called power of the continuum. To end this section, we state a result that is useful e.g. to handle Hilbert basis. Theorem 1.5 (Countable union[of countable sets) Let (Bn )n∈N be a seBn is also countable. quence of countable sets. Then n∈N
1.6. Linear spaces and algebras In this section we introduce the algebraic structures of linear space and algebra which are very important in Functional Analysis. Let Ω be a non-void set. We call internal operation on Ω to every function ⊕ : Ω × Ω −→ Ω (u, v) 7−→ u ⊕ v.
(1.25)
Let K be another non-void set. We call external operation on Ω (with help of K) to every function ⊙ : K × Ω −→ Ω (λ, u) 7−→ λ ⊙ u.
(1.26)
Before stating the definition of linear space we need to introduce the concept of group. Definition 1.10 (Group, Abelian group) Let ⊕ be an internal operation on V . We say that (V, ⊕) is a group iff the following conditions hold a) Additive asociativity. ∀u, v, w ∈ V : (u ⊕ v) ⊕ w = u ⊕ (v ⊕ w). b) Additive neutral element. ∃0 ∈ V, ∀u ∈ V : u ⊕ 0 = 0 ⊕ u = u. c) Additive inverses. ∀u ∈ V, ∃v ∈ V : u ⊕ v = v ⊕ u = 0. We say that (V, ⊕) is an Abelian group if, additionaly, it holds d) Additive commutativity. ∀u, v ∈ V :
u ⊕ v = v ⊕ u.
It’s not difficult to show that the neutral element or zero element, 0, is unique. Given u ∈ V is also easy to show that its additive inverse is unique; it will be denoted ⊖u. Let’s state the concept of linear space.
16
Chapter 1. Preliminaries
Definition 1.11 (Linear space) Let (V, +) be an Abelian group and · : K × V → V an external operation, where K is a field. We say that (V, +, ·) is a linear space over K iff the following properties hold i) ii) iii) iv)
Harmlessness of 1. ∀u ∈ V : 1 · u = u. Mixed associativity. ∀u ∈ V, ∀α, β ∈ K : (αβ) · u = α · (β · u). Vector distributivity. ∀u ∈ V, ∀α, β ∈ K : (α + β) · u = α · u + β · u. Scalar distributivity. ∀α ∈ K, ∀u, v ∈ V : α · (u + v) = α · u + α · v.
Remark 1.8 If in the Definition 1.11, K = R we say that V is a real linear space; if K = C we say that V is a complex linear space. Along this book we shall consider only real linear spaces. Most of the results are also valid for complex linear spaces but we avoid unnecessary complications and focus on the objetives of the work. In this sense, if we do not specify, we shall assume that we are dealing with a real linear space. Example 1.7 (Matrices) The set Mmn (R) of all the real matrices with and n columns is a linear space when it’s equipped with the operations a11 + b11 a12 + b12 . . . a1n + b1n .. .. .. .. A + B = (aij + bij ) = . . . . am1 + bm1
am2 + bm2
...
m rows ,
amn + bmn
and
λa11 λ · A = λ · (aij ) = ... λam1
λa12 .. . λam2
... .. . ...
λa1n .. , . λamn
where λ ∈ R, and
a11 .. A = (aij ) = . am1
a12 .. . am2
... .. . ...
a1n .. ∈ M (R), mn . amn
and
b11 .. B = (bij ) = . bm1
b12 .. . bm2
. . . b1n .. ∈ M (R). .. mn . . . . . bamn
Definition 1.12 (Functional space) We shall say that a linear space V is a functional space if its elements are functions. Remark 1.9 Given the non-void sets A and B, we denote by F (A, B) the set of all the functions whose domain is A and whose codomain is B. For the particular case of B = R, we write F (A) instead of F (A, R).
1.6. Linear spaces and algebras
17
Example 1.8 (A very abstract functional space) Let A be a non-void set. On F (A), two operations are determined in the following way. Given f, g ∈ F (A) and λ ∈ R we define the functions f + g : A −→ R and λ · f : A −→ R by (f + g)(x) =f (x) + g(x), (λ · f )(x) =λ · f (x),
x ∈ A; x ∈ A.
(1.27) (1.28)
Then (F (A), +, ·) is a linear space. It’s usual to deal with a subset W of a linear space (V, +, ·) that is itself a linear space, i.e., when W is provided with the operations of V but restricted to W : +|W ×W and ·|R×W . In this case we say that W is a linear subspace (or simply a subspace) of V . The linear space {0} is referred to as the trivial space. The following result provides a shortcut to prove that a subset is actually a subspace. Theorem 1.6 (Linear subspace) Let V be a linear space and W ⊆ V . Then W is a linear subspace of V iff ∀u, v ∈ W, ∀λ ∈ R :
λ · u + v ∈ W.
The proof of this result is a nice exercise for the student. Example 1.9 Let I ⊆ R. By using Theorem 1.6, it’s not difficult to verify that P(I) = {p : I ⊆ R −→ R / p is a polynomial} is a subspace of the functional space F(I). Recall that p ∈ P(I) iff there are a0 , a1 , ..., an ∈ R such that p(x) = a0 + a1 x + a2 x2 + ... + an xn . Example 1.10 Let I ⊆ R and C(I) = {f ∈ F (I) / f is continuous}. By using Theorem 1.6 and a little bit from the course of Calculus is easy to show that C(I) is a linear subspace of F (I), so that C(I) is a functional space itself. Observe that P(I) is a subspace of C(I). Example 1.11 Let’s consider I = [−10, 10]. Since C([−10, 10]) is a functional space, the sum of the continuous functions f : I −→ R and g : I −→ R, given by (% i1) f(x):= sin(x); (% i2) g(x):=x; is also a continuous function: (% i3) define(h(x),f(x)+g(x)); (% i5) plot2d(h(x),[x,-10,10]); In Figure 1.3 we can see the efect of adding the functions f and g. The following result states that the structure of linear space is stable under intersections.
18
Chapter 1. Preliminaries
Figure 1.3.: The function [−10, 10] ∋ x 7−→ h(x) = sin(x) + x.
Theorem 1.7 (Intersection of linear spaces) Let (Wλ )λ∈Λ be a family of \ linear subspaces of the linear space V . Then W = Wλ is also a linear λ∈Λ
subspace of V . Proof. We have to prove that ∀α ∈ R, ∀u, v ∈ W :
αu + v ∈ W.
(1.29)
Let α ∈ R and u, v ∈ W , generic. Since, for every λ ∈ Λ, Wλ is a linear subspace of V and u, v ∈ Wλ , it follows, by Theorem 1.6, that ∀λ ∈ Λ :
αu + v ∈ Wλ .
Therefore, αu + v ∈ W =
\
Wλ .
λ∈Λ
Since α, u and v were chosen arbitrarily, we have proved (1.29).
■
Example 1.12 Let −∞ < a < b < +∞. Let’s define V1 = {f ∈ F ([a, b]) / ( Z V2 =
f ∈ F ([a, b]) /
b
f (a) = f (b) = 0}, ) |f (x)|dx < +∞ ,
a
where we are using the Riemann integral. It’s easy to check that both V1 and V2 are subspaces of F([a, b]). Therefore, by Theorem 1.7, ( ) Z b
V3 =
f ∈ F ([a, b]) /
|f (x)|dx < +∞ ∧ f (a) = f (b) = 0 a
is a subspace of F ([a, b]) as well. Remark 1.10 (Space of continuous functions) Let Ω ⊆ Rn . By Ck (Ω, Rm ) we denote the set of functions ψ : Ω ⊆ Rn → Rm which have continuous derivatives at least to the order k ∈ N∗ . When m = 1 we simply write Ck (Ω) instead of Ck (Ω, R).
1.6. Linear spaces and algebras
19
Example 1.13 Let −∞ < a < b < +∞ and V0 = C(]a, b[). For each k ∈ N, Vk = Ck (]a, b[) is a linear subspace of V0 . Actually we have V∞ ⊊ ... ⊊ Vk+1 ⊊ Vk ⊊ Vk−1 ⊊ ...V1 ⊊ V0 , where V∞ = C∞ (]a, b[) =
+∞ \
Ck (]a, b[)
k=0
is the space of functions having continuous derivatives of all orders. In a linear space V , we call linear combination to any vector of the form N X
αk vk ,
(1.30)
k=1
where N ∈ N and for every k we have that αk ∈ R and vk ∈ V . Remark 1.11 Sometimes (1.30) is called a finite linear combination because in Functional Analysis, we frequently allow N = +∞, in whose case we refer to (1.30) as an infinite linear combination. Since an infinite linear combination in V is actually a limit, N +∞ X X αk vk , αk vk = lim N →+∞
k=1
k=1
the space V should have at least a topology to handle the convergence of the limit process. This spaces are referred to as topological linear spaces; on them the internal and external operations of V are continuous. To this class belong the normed and inner-product spaces that we shall define later. Next we define the concept of span of a subset of a linear space. Definition 1.13 (Span of a set) Given a subset D of a linear space V , the span of D (also called the space generated by D), denoted ⟨D⟩, is the smallest linear subspace of V that contains D. From this definition, it follows that \
⟨D⟩ =
U,
U ∈W
where W is the set of all the linear subspaces of V that contain D. Moreover, ⟨D⟩ is formed by the (finite) linear combinations of elements of D: Theorem 1.8 (Characterization of the span) Let D be a subset of the linear space V . Then u ∈ ⟨D⟩ iff there exist v1 , v2 , ..., vn ∈ D and α1 , α2 , ..., αn ∈ R such that n X u= αk vk . (1.31) k=1
20
Chapter 1. Preliminaries
Example 1.14 Let V = P3 (R) = {p ∈ P(R) / deg(p) ≤ 3}, that is, the space of real polynomial with degree less or equal than 3. Let’s consider the set D = {p1 , p2 , p3 } ⊆ V , where, for x ∈ R, p1 (x) = 1 + x,
p2 (x) = 1 + x2 ,
p3 (x) = 1 + x3 .
By applying Theorem 1.8 we find that ⟨D⟩ = {p ∈ P(R) / p(x) = a + bx + cx2 + dx3 ∧ a − b − c − d = 0}. Given two non-zero vectors u, v ∈ R3 we say that they are collinear if they lie in the same line. This means that there exists some α ∈ R such that u = αv.
(1.32)
In the same way, three non-zero vectors u, v and w are said to be coplanar if they lie in the same plane, meaning that there exist some scalars α, β ∈ R such that u = αv + βw.
(1.33)
The concepts of collinearity and coplanarity are particular cases of linear dependence. So, in a linear space V , a finite set B = {v1 , ..., vn } is linearly dependent (l.d.) if at least one of its elements, say vk , can be written as a linear combination of the others, i.e., we can find scalars β1 , β2 , ..., βk−1 , βk+1 , ..., βn such that vk = β1 v1 + β2 v2 + ... + βk−1 vk−1 + βk+1 vk+1 + ... + βn vn . If B is not linearly dependent we say that it’s linearly independent (l.i.). Theorem 1.9 (Finite linear independence) Let B = {u1 , u2 , ..., un } be a subset of a linear space V . Then, B is linearly independent iff for all α1 , α2 , ..., αn ∈ R it holds n X
αk uk = 0
⇒
α1 = α2 = ... = αn = 0.
k=1
In general, for sets that are not necessarily finite, we have the following definition. Definition 1.14 (Linear independence) A subset M of a linear space V is said to be linearly independent iff all the finite non-void subsets of M are linearly independent; otherwise M is linearly dependent. Example 1.15 In the space F (R) we consider the set M = {Cn / n ∈ N∗ }, where, for x ∈ R, C0 (x) = 1,
Cn (x) = cos(nx), n ∈ N.
It’s not difficult to verify that no element of M can be written as a linear combination of the elements of M \ {Cn }; therefore M is linearly independent.
1.6. Linear spaces and algebras
21
Example 1.16 In the space F (R) we consider the set S = {C0 + C1 , C0 + C2 , C0 +C3 }, where we are using the notation of Example 1.15. Let’s find explicitly ⟨S⟩, the linear space formed by all the linear combinations of elements of S. Let u ∈ ⟨S⟩, generic. It’s easy to see that u should have the form u = a C0 + b C1 + c C2 + d C3 ,
(1.34)
so that u(x) = a + b cos(x) + c cos(2x) + d cos(3x), x ∈ R. Now, by Theorem 1.8, there are scalars α1 , α2 , α3 ∈ R such that u = α1 (C0 + C1 ) + α2 (C0 + C2 ) + α3 (C0 + C3 ),
(1.35)
so that u(x) = α1 (1 + cos(x)) + α2 (1 + cos(2x)) + α3 (1 + cos(3x)), x ∈ R. Points (1.35) and (1.34) imply that it should hold α1 + α2 + α3 = a, α = b, 1 α2 = c, α3 = d. This system is consistent only when −a+b+c+d = 0. As u was chosen arbitrarily, we have found that ⟨S⟩ = {u = a C0 + b C1 + c C2 + d C3 / − a + b + c + d = 0}. In Functional Analysis, the linear spaces having infinite dimension are harder to handle, and therefore more interesting than those of finite dimension. So we need a way to identify them. A linear space V is said to have infinite dimension iff for every n ∈ N, it has a linearly independent subset B such that #(B) = n; otherwise, we said that V has finite dimension. The student could remember from its course of Linear Algebra that in a finitedimensional linear space, its dimension corresponds to the number of coordinates that we need to describe a vector. For example, the dimension of P3 (R) is 4 and we need 4 coefficients to describe a generic element u of P3 (R): u : R −→ R x 7−→ u(x) = a0 + a1 x + a2 x2 + a3 x3 . In the general case, we still use the concept of cardinality to define the dimension of a linear space: Definition 1.15 (Dimension, Hamel basis) Let V be a linear space and B ⊆ V . We say that B is a Hamel basis (or simply a basis) of V iff the following conditions hold i) B is linearly independent; ii) ⟨B⟩ = V . In this case, the dimension of V is dim(V ) = #(B). Thanks to the following result, the previous definition makes full sense.
22
Chapter 1. Preliminaries Theorem 1.10 (Dimension of a linear space) Let V a linear space and B1 , B2 two Hamel basis of V . Then #(B1 ) = #(B2 ).
In Chapter 4 we shall introduce the concept of Schauder basis (and the particular and even more aplicable concept of Hilbert basis) which is more suitable for applications of Functional Analysis. The concepts of Schauder and Hamel basis coincide for finite-dimensional linear spaces. Example 1.17 (A Hamel basis related to a differential equation) Let n ∈ N. Let’s consider the differential equation an y (n) (t) + an−1 y (n−1) (t) + · · · + a1 y ′ (t) + a0 y(t) = 0,
t ∈ [c, d],
(1.36)
where ak ∈ R, k = 0, ..., n, and an ̸= 0. Here we look for solutions belonging to the functional space Cn ([c, d]). From the course of Ordinary Differential Equations, the student can remember that, for a linear equation of the form (1.36), the fundamental set of solutions is an n-dimensional linear subspace of Cn ([c, d]): W = ⟨{y1 , y2 , ..., yn }⟩. Here B = {y1 , y2 , ..., yn } is a Hamel basis for W . In particular, this means that if n X u is any solution of (1.36) then there exist β1 , ..., βn ∈ R such that u = αk yk , k=1
and u(t) =
n X
αk yk (t),
t ∈ [c, d].
(1.37)
k=1
(1.37) is usually called the formula of the general solution of (1.36). Example 1.18 (A Hamel basis related to a differential equation) Let λ ∈ R. The fundamental set of solutions of the differential equation y ′′ (t) − λy(t) = 0, is W = ⟨{y1 , y2 }⟩ ⊆ C2 (R), where, for t ∈ R, √ cos t −λ , y1 (t) = 1, −t√λ e , √ sin t −λ , y2 (t) = t, t√λ e ,
t ∈ R,
if λ < 0, if λ = 0, if λ > 0, if λ < 0, if λ = 0, if λ > 0.
Then the general solution of (1.38) is √ √ u(t) = α cos t −λ + β sin t −λ , u(t) = α + βt, √ −t λ
u(t) = αe
t ∈ R,
√ t λ
+ βe
t ∈ R,
if λ = 0; ,
t ∈ R,
(1.38)
if λ > 0.
if λ < 0;
1.6. Linear spaces and algebras
23
For the finite-dimensional linear spaces it’s quite clear that they always have a basis. However, for an infinite-dimensional linear space, this not the case. We shall see that Zorn’s lemma allows us to prove this fact for the general case. Theorem 1.11 (Existence of a Hamel basis) Let V be a non-trivial linear space. Then V has a Hamel basis. Proof. We have to show that there is a linearly independent set B ⊆ V such that ⟨B⟩ = V . To apply Zorn’s lemma we take several steps to create its conditions. 1. Let’s define M = {U ∈ P(V ) / U is linearly independent}. Since V ̸= {0}, there exist some v ∈ V \ {0} so that M ̸= ∅ because {v} ∈ M . 2. Since (P(V ), ⊆) is an ordered set, the same happens with M ⊆ P(V ). 3. Now let’s prove that M is inductive, i.e., that every chain contained in M is bounded from above. So, let Q ⊆ M be totally ordered, generic. Let’s consider [ W = U. (1.39) U ∈Q
It’s clear that ∀A ∈ Q :
A ⊆ W.
So, to prove that W is an upper bound of Q, we need to verify that W is linearly independent because this implies that W ∈ M . Let’s proceed by Reduction to Absurdity. Assume then that W is not linearly independent so that there exists some D ⊆ W such that D = {u1 , u2 , ..., un },
(1.40)
D is linearly dependent.
(1.41)
By (1.39) and (1.40) it follows that ∀k ∈ In , ∃Uk ∈ Q :
uk ∈ Uk .
(1.42)
Since Q is a chain, ∀i, j ∈ In :
Ui ⊆ Uj ∨ Uj ⊆ Ui ,
so that it’s possible to rearrange the indexes k in a way that U1 ⊆ U2 ⊆ ... ⊆ Un . The last together with (1.42) implies that [ D= {uk } ⊆ Un . k∈In
Therefore, D is linearly independent as a subset of Un which is linearly independent. This contradicts (1.41) and let us conclude that W is linearly independient. Since Q was chosen arbitrarily, we have proved that M is inductive.
24
Chapter 1. Preliminaries 4. By points 1, 2 and 3 we can apply Zorn’s lemma. Therefore M has a maximal element B. In particular, B is linearly independent.
(1.43)
5. Let’s prove that B is the Hamel basis we are looking for. Let’s work again by Reduction to Absurdity. Let’s assume that V ̸= ⟨B⟩. Then there exists some u ∈ V \ ⟨B⟩. But B ∪ {u} is linearly independent, so that it belongs to M , and B ⊊ B ∪ {u}, which contradicts the maximality of B. Therefore, ⟨B⟩ = V and, by (1.43), is a Hamel basis for V . ■ Example 1.19 (Space of polynomials) Let’s consider the functional spaces P(R) and Pn (R), for some n ∈ N. We define the n-th canonical polynomial, en : R −→ R, by e0 (x) = 1, en (x) = xn , x ∈ R, n ∈ N. The sets Cn = {e0 , e1 , e2 , ..., en } and C = {em / m ∈ N∗ } are the canonical basis of Pn (R) and P(R), respectively. Therefore dim(Pn (R)) = n + 1,
dim(P(R)) = ℵ0 .
Example 1.20 Let’s consider the set B = {p1 , p2 , p3 , p4 } ⊆ P3 (R) given by p2 (x) = x − 1,
p1 (x) = 1,
p3 (x) = (x − 1)2 ,
p4 (x) = (x − 1)3 .
Since B is linearly independent and has 4 = dim(P3 (R)) elements, Theorem 1.10 implies that B is a basis of P3 (R). To finish this section let’s introduce the concept of algebra because a number of important functional spaces are richer algebraic structures. Let E be a non-void set, + and ∗ two internal operations on E. We say that (E, +, ∗) is a ring iff 1. (E, +) is an Abelian group; 2. (E, ∗) is a semigroup, i.e., ∗ is associative: ∀x, y, z ∈ E :
x ∗ (y ∗ z) = (x ∗ y) ∗ z;
3. the distributive properties hold: ∀x, y, z ∈ E :
x ∗ (y + z) = x ∗ y + x ∗ z,
∀x, y, z ∈ E :
(y + z) ∗ x = y ∗ x + z ∗ x.
Definition 1.16 (Algebra) We say that (E, +, ·, ∗) is an algebra iff 1. (E, +, ·) is a (real) linear space; 2. (E, +, ∗) is a ring; 3. The multiplications are compatible: ∀λ ∈ R, ∀x, y ∈ E :
λ · (x ∗ y) = (λ · x) ∗ y = x ∗ (λ · y).
(1.44)
1.6. Linear spaces and algebras
25
Point (1.44) expresses that the operations · and ∗ are compatible. In the context of Definition 1.16, we say that E is an algebra with unity iff ∃1 ∈ E, ∀u ∈ E :
1 ∗ u = u ∗ 1 = u.
We say that E is a commutative algebra iff ∀u, v ∈ E :
u ∗ v = v ∗ u.
An element x ∈ E is said to be invertible iff ∃y ∈ E :
xy = yx = 1.
(1.45)
Since the vector y in (1.45) is unique we denote y = x−1 . If x ∈ E is not invertible, we say that it is singular. We denote E × = {x ∈ E / x is invertible}, and observe that (E × , ∗) is a group. We say that a linear subspace A of E is a subalgebra of E if A with the multiplication restricted to it is also an algebra. Remark 1.12 If there is no confusion, both multiplication symbols are omitted. So, (1.44) can be written as ∀λ ∈ R, ∀x, y ∈ E :
λ(xy) = (λx)y = x(λy).
Remark 1.13 The structure of algebra gives the framework for the important Stone-Weierstrass Theorem which is presented in Section 4.7.2. Example 1.21 (The algebras F (R) and C(R)) The linear space F (R) is a commutative algebra with unity whenever it’s equipped with the usual multiplication of functions. Here the unity is the one-function 1f : R → R, given by 1f (x) = 1. The spaces Ck (R) are commutative subalgebras of F (R) containing 1f . Theorem 1.12 (Stability of algebras by intersection) Let E \ be an algebra and (Aλ )λ∈Λ a family of subalgebras of E. Then A = Aλ is a λ∈Λ
subalgebra of E. The proof of this result is easy and required as an exercise at the end of the chapter. Theorem 1.13 (Characterization of subalgebras by a Hamel basis) Let E be an algebra, A a linear subspace of E, and B a Hamel basis of A. Then A is a subalgebra of E iff for every u, v ∈ B, uv ∈ A. The proof of this result is required as exercise at the end of the chapter.
26
Chapter 1. Preliminaries
Example 1.22 (Algebra of polynomials) We consider the spaces P(R) and Pn (R) which appeared in Example 1.19. We know that Pn (R) ⊆ P(R) ⊆ C(R). The corresponding canonical (Hamel) basis C = {em / m ∈ N∗ } and Cn = {e0 , e1 , e2 , ..., en } verify ∀u, v ∈ C :
uv ∈ P(R);
∀u, v ∈ Cn :
uv ∈ Pn (R).
Therefore, by Theorem 1.13, it follows that Pn (R) is a subalgebra of P(R) which, in its turn, is a subalgebra of C(R).
1.7. Linear operators One of the reasons for the linear operators to be so important is that, in applications, a number of non-linear phenomena can be asymptotically modeled with linear operators. The derivative, the integral and the Fourier and Laplace transforms, seen by the student in previous courses, are examples of linear operators; see also Examples 1.23 and 1.25). Definition 1.17 (Linear Operator, isomorphism) Let V and W be linear spaces. We say that A : V → W is a linear operator iff ∀α ∈ R, ∀u, v ∈ V :
A(αu + v) = αA(u) + A(v).
(1.46)
In A is bijective we say that A is an (algebraic) isomorphism and that the spaces V and W are isomorphic. In the context of Definition 1.17, if V = W , we say that A is a linear operator on V . Whenever W = R, we say that A is a linear functional on V . Remark 1.14 Sometimes we write Au instead of A(u), for the image of u ∈ V by A. When A is operator between functional spaces it can be used A[u]. Remark 1.15 (Injective or invertible) Sometimes an injective linear operator A : V → W is referred to as invertible because the linear operator A˜ : V → Im(A) does have an inverse. Given two linear spaces V and W , we denote by L(V, W ) the set of linear spaces from V to W , i.e., L(V, W ) = {T : V −→ W / T is linear}. Thanks to the algebraic structure of W it’s possible to define the addition of operators A, B ∈ L(V, W ) by (A + B)(u) = A(u) + B(u),
u ∈ V.
(1.47)
The multiplication of an scalar λ ∈ R by an operator A ∈ L(V, W ) is straightforward: (λ · A)(u) = λ · A(u), u ∈ V. (1.48)
1.7. Linear operators
27
Proposition 1.2 (Space of linear operators) Let V and W be linear spaces. Then (L(V, W ), +, ·) is a linear space. Example 1.23 (Differentiation) Let’s consider the formula A[u] = u′ . It’s clear that it can be used to define a linear operator A : C1 (R) → C(R) which is not an isomorphism. On the other hand, the same formula can be used to define a linear operator on C∞ (R). Example 1.24 Let’s fix x0 ∈ R. The map D : C1 (R) → R given by D(u) = u′ (x0 ), is a linear functional. In Maxima the functional D can be defined and handled in the following way: (% i1) B(u,x):= diff(depends(u,x),x)[1]; (% i2) D(u):= B(u,x0); Therefore: (% i3) D(v); d v (x0 ) dx0
(% o3)
1 x0
(% o4)
(% i4) D(log);
Example 1.25 (Variable upper limit function) It’s clear that the mapping B : C(R) → C1 (R), given by Z x B[u](x) = u(t)dt, x ∈ R, 0
is a linear operator. On the other hand, the last formula can be used to define a linear operator on C∞ (R). Example 1.26 Let x0 > 0 be fixed. The mapping I : C(R) → R given by Z x0 I(u) = u(t)dt, 0
is a linear functional. In Maxima it can be defined in the following way (% i1) A(u,x):= integrate(depends(u,t)[1],t,0,x); (% i2) assume(x0>0); (% i3) I(u):= A(u,x0); Therefore:´ (% i4) I(v); Z
x0
v(t)dt
(% o4)
1 − cos (x0 )
(% o5)
0
(% i5) I(sin);
28
Chapter 1. Preliminaries
Example 1.27 Let n ∈ N and U : Pn (R) → Rn+1 the map given by U (p) = (a0 , a1 , ..., an ), where p(x) = a0 + a1 x + a2 x2 + ... + an xn , x ∈ R. It’s easy to check that U is an isomorphism. Definition 1.18 (Kernel and image) Let A : V → W be a linear operator. We call kernel and image of A, respectively, to the sets Ker(A) = {u ∈ V / A(u) = 0}, Im(A) = {w ∈ W / ∃v ∈ V : A(v) = w} = {A(v) / v ∈ V }. The following result states, in particular, that a linear operator is injective iff its kernel is the trivial space. Theorem 1.14 (Kernel, image and injectivity) Let A : V → W be a linear operator. Then we have that 1. Ker(A) is a linear subspace of V ; 2. A is injective iff Ker(A) = {0}; 3. Im(A) is a linear subspace of W .
Proof. 1. We have to prove that ∀u, v ∈ Ker(A), ∀α ∈ R :
αu + v ∈ Ker(A).
(1.49)
Let u, v ∈ Ker(A) and α ∈ R, generic. By the linearity of A, we have that A(αu + v) = αA(u) + A(v) = 0, whence αu + v ∈ Ker(A). Since u, v and α were chosen arbitrarily, we have proved (1.49). 2. a) Let’s assume that Ker(A) = {0}. Let’s prove that A is injective, i.e., that ∀u, v ∈ V : A(u) = A(v) ⇒ u = v. (1.50) Let u, v ∈ V such that A(u) = A(v). By the linearity of A, we have that 0 = A(u) − A(v) = A(u − v), so that u − v = 0 and, consequently, u = v. Since u and v were chosen arbitrarily, we have proved (1.50). b) Let’s assume that A is injective. Let’s prove that Ker(A) = {0}. Let u ∈ Ker(A) ̸= {0}, generic. We have that A(u) = 0 and A(0) = 0 so that the injectivity of A implies that u = 0. Since u was arbitrary, we have proved that Ker(A) = {0}. 3. It’s proved as point 1. ■
1.7. Linear operators
29
Example 1.28 Let's consider A ∈ L(C1(R), C(R)) given by A[u] = u′. In this case,
Ker(A) = {f ∈ C1(R) / f is a constant function} ≠ {0},
so that A is not injective. Let's prove that Im(A) = C(R). Let v ∈ C(R), generic. Let's define u : R → R by
u(x) = ∫_0^x v(y) dy,   x ∈ R.
The student can remember from their Calculus course that the continuity of v implies the differentiability of u. It's also clear that A[u] = v. Since v was chosen arbitrarily, we have proved the surjectivity of A.
The following result states that if the inverse of a linear operator exists, it's also a linear operator.
Theorem 1.15 (The inverse of a linear operator is linear) Let V and W be linear spaces and A ∈ L(V, W). If A is bijective then A⁻¹ ∈ L(W, V).
The proof of this result is easy and it's required as an exercise at the end of the chapter.
Corollary 1.2 (Inverse of a product of linear operators) Let A ∈ L(V, W) and B ∈ L(W, U). If A and B are bijective then (BA)⁻¹ ∈ L(U, V) and (BA)⁻¹ = A⁻¹B⁻¹.
The proof of this result is easy and it's required as an exercise at the end of the chapter.
The concepts of eigenvalue and eigenvector of a linear operator appear in many applications of Functional Analysis. A linear operator defined on an infinite-dimensional linear space can have its spectrum formed by elements that are not necessarily eigenvalues; this phenomenon does not occur in finite dimension. A nice introduction to spectral theory can be found in [17, Ch.7-10].
Definition 1.19 (Eigenvalue and eigenvector) Let V be a linear space and T ∈ L(V). We say that λ ∈ R is an eigenvalue of T iff there exists u ∈ V \ {0} such that T(u) = λu. In this case, we say that u is an eigenvector of T associated with λ.
In the context of Definition 1.19, the eigenspace associated with λ is
E_{T,λ} = {u ∈ V / T(u) = λu}.
E_{T,λ} is a linear subspace of V that contains all the eigenvectors associated with λ and the zero vector.
Remark 1.16 (Eigenfunction) In the context of Definition 1.19, if V is a functional space we frequently say that u is an eigenfunction associated with the eigenvalue λ.
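In finite dimension, eigenvalues and eigenvectors in the sense of Definition 1.19 can be computed directly in Maxima (a small sketch; the matrix is an arbitrary example):
(% i1) T: matrix([2,1],[1,2])$
(% i2) eigenvalues(T);        /* should return [[1, 3], [1, 1]]: the eigenvalues 1 and 3, each of multiplicity 1 */
(% i3) eigenvectors(T);       /* also lists eigenvectors, e.g. [1, -1] for 1 and [1, 1] for 3 */
(% i4) T . matrix([1],[1]);   /* equals 3 times [1, 1]^T, so [1, 1]^T is an eigenvector associated with 3 */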
Example 1.29 Let's consider D ∈ L(C∞(R)) given by D[u] = u′. Observe that it's not the same operator as that of Example 1.28 even though they use the same formula. If λ ∈ R is an eigenvalue of D then there should be a function u ∈ C∞(R) \ {0} such that D[u] = λu, which is simply written as the ordinary differential equation
u′(t) = λ u(t),   t ∈ R.
It's clear that the exponential function exp : R → R,
exp(t) = e^t,   t ∈ R,
is an eigenfunction of D associated with the eigenvalue λ = 1.
Theorem 1.16 (Linear independence of eigenvectors) Let V be a linear space and T ∈ L(V). Assume that λ1, ..., λn are eigenvalues of T, pairwise different. If for every i ∈ In, ui is an eigenvector associated with λi, then the set {u1, u2, ..., un} is linearly independent.
Proof. For k ∈ In we write Bk = {uj / j ∈ Ik}. We have to prove that
∀k ∈ In : Bk is linearly independent.   (1.51)
The set B1 = {u1 } is linearly independent because u1 ̸= 0 (it’s an eigenvector). Now for k ∈ In−1 let’s assume that Bk is linearly independent and let’s prove that Bk+1 is also linearly independent, i.e., we have to prove that α1 u1 + α2 u2 + ... + αk+1 uk+1 = 0 ⇒ α1 = α2 = ... = αk+1 = 0.
(1.52)
So let’s assume that α1 , ..., αk , αk+1 ∈ R are such that α1 u1 + α2 u2 + ... + αk+1 uk+1 = 0.
(1.53)
By multiplying (1.53) by λk+1 we have that α1 λk+1 u1 + α2 λk+1 u2 + ... + αk+1 λk+1 uk+1 = 0.
(1.54)
By applying T to (1.53) we obtain α1 λ1 u1 + α2 λ2 u2 + ... + αk+1 λk+1 uk+1 = 0.
(1.55)
By using (1.54) and (1.55) it follows that α1 (λk+1 − λ1 )u1 + α2 (λk+1 − λ2 )u2 + ... + αk (λk+1 − λk )uk = 0, which, by the linear independence of Bk , implies that α1 = α2 = ... = αk = 0. Therefore, by (1.53), we get that αk+1 uk+1 = 0, which in its turn implies that αk+1 = 0 because uk+1 ̸= 0 (it’s an eigenvector). Since k was chosen arbitrarily, we have proved that every subset of Bn is linearly independent so that Bn is also linearly independent. ■
Example 1.30 (Eigenvalues and eigenfunctions of a differential operator) Let T ∈ L(C∞(R)) be given by
T[u] = d²u/dx².
It's not difficult to see that each λ ∈ R is an eigenvalue.
1. If λ = 0, the corresponding eigenspace is E_{T,0} = ⟨{u1, u2}⟩, where the eigenfunctions u1 : R → R and u2 : R → R are given by
u1(x) = 1,  u2(x) = x,   x ∈ R.
2. If λ > 0, the eigenspace is E_{T,λ} = ⟨{v1, v2}⟩, where the eigenfunctions v1 and v2 are given by
v1(x) = e^{x√λ},  v2(x) = e^{−x√λ},   x ∈ R.
3. If λ < 0 we have E_{T,λ} = ⟨{w1, w2}⟩, where
w1(x) = cos(x√(−λ)),  w2(x) = sin(x√(−λ)),   x ∈ R.
With help of Theorem 1.16 we immediately see that the set {u1, u2, v1, v2, w1, w2} is linearly independent in the space C∞(R).
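The three cases above can be recovered with Maxima's ODE solver ode2; the following minimal sketch uses the sample values λ = 0, λ = 4 and λ = −4:
(% i1) ode2('diff(u,x,2) = 0, u, x);      /* u = %k2*x + %k1 : the span of u1 and u2 */
(% i2) ode2('diff(u,x,2) = 4*u, u, x);    /* a combination of %e^(2*x) and %e^(-2*x), case lambda > 0 */
(% i3) ode2('diff(u,x,2) = -4*u, u, x);   /* a combination of sin(2*x) and cos(2*x), case lambda < 0 */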
1.8. Problems
Problem 1.1 Find explicitly the sets
A = ⋃_{n=1}^{∞} [ 1/n , π + cos(nπ/2)/n ],   B = ⋂_{n=1}^{∞} [ (n² − n + 1)/(n² + 1) · cos(nπ) ; 1 + 2^{1/n} + n ].
Problem 1.2 Let f : X → Y, A, A1, A2 ⊆ X and B, B1, B2 ⊆ Y. Prove that
f(A1 ∪ A2) = f(A1) ∪ f(A2),
f(A1 ∩ A2) ⊆ f(A1) ∩ f(A2),
f⁻¹(B1 ∪ B2) = f⁻¹(B1) ∪ f⁻¹(B2),
f⁻¹(B1 ∩ B2) = f⁻¹(B1) ∩ f⁻¹(B2),
f(A) ⊆ B ⇐⇒ A ⊆ f⁻¹(B),
f(f⁻¹(B)) ⊆ B,
f⁻¹(f(A)) ⊇ A,
A1 ⊆ A2 ⇒ f(A1) ⊆ f(A2),
B1 ⊆ B2 ⇒ f⁻¹(B1) ⊆ f⁻¹(B2),
f⁻¹(Bᶜ) = (f⁻¹(B))ᶜ.
Problem 1.3 Let A and B be two non-void sets. Let's write M = {(C, f) / C ⊆ A ∧ f : C → B}. We define a relation on M by
(X, f) ≤ (Y, g) ⇐⇒ X ⊆ Y ∧ g|X = f.
Prove that (M, ≤) is an ordered set. Does it have maximal elements?
Problem 1.4 On N we define the divisibility relation by
a | b ⇐⇒ ∃m ∈ N : b = am.
Prove that divisibility is an order relation on N. Problem 1.5 Let A ⊆ R. 1. Prove that if A is bounded from above, then α = sup(A) iff ∀ϵ > 0, ∃x0 ∈ A :
α < x0 + ϵ.
2. Prove that if A is bounded from below, then β = inf(A) iff ∀ϵ > 0, ∃x0 ∈ A :
β > x0 − ϵ.
Problem 1.6 Prove that #(Q) = ℵ0 . Problem 1.7 (*) Try to prove that #(C(R)) = ℵ1 . Problem 1.8 Let n ∈ N, θ = e2πi/n ∈ C and U = {rk = θk / k = 0, 1, 2, ..., n − 1}. We define ⊕ : U × U → U by rk ⊕ rj = rk+j . Prove that (U, ⊕) is an Abelian group. Problem 1.9 Let A be a non-void set. We consider F (A), the set of real functions with domain A. Given f, g ∈ F (A) and λ ∈ R we define the functions f + g : A → R and λ · f : A → R by (f + g)(x) = f (x) + g(x),
(λ · f )(x) = λ · f (x),
x ∈ A.
Prove that (F (A), +, ·) is a linear space. Problem 1.10 Let V be a linear space and W ⊆ V . Prove that W is a linear subspace of V iff ∀u, v ∈ W, ∀λ ∈ R : λ · u + v ∈ W. Problem 1.11 Let −∞ < a < b < +∞. We define xn : [a, b] → R by x0 (t) = 1,
xn (t) = tn ,
t ∈ [a, b], n ∈ N.
Prove that B = {xn / n ∈ N∗} is linearly independent in the space C([a, b]).
Problem 1.12 Let f1, f2, f3 ∈ C2(R). Prove that {f1, f2, f3} is linearly independent iff
∃x0 ∈ R :  det
| f1(x0)    f2(x0)    f3(x0)   |
| f1′(x0)   f2′(x0)   f3′(x0)  |  ≠ 0.
| f1′′(x0)  f2′′(x0)  f3′′(x0) |
Problem 1.13 For each n ∈ N∗, we define the function en : R → R by
e0(x) = 1,  en(x) = xⁿ,   x ∈ R, n ∈ N.
Prove that the set B = {en / n ∈ N∗} is a Hamel basis of P(R), and that dim(P(R)) = ℵ0.
Problem 1.14 Let n ∈ N and p ∈ P(R) such that deg(p) = n. Prove that {p, p′, p′′, ..., p^{(n−1)}, p^{(n)}} is a basis for the space Pn(R).
Problem 1.15 Let E be an algebra and (Aλ)λ∈Λ a family of subalgebras of E. Prove that A = ⋂_{λ∈Λ} Aλ is a subalgebra of E.
Problem 1.16 Let E be an algebra, A a linear subspace of E, and B a Hamel basis of A. Prove that A is a subalgebra of E iff ∀u, v ∈ B :
uv ∈ A.
Problem 1.17 Let V and W be linear spaces. Prove that (L(V, W ), +, ·) is a linear space. Here the operations + and · are those of (1.47) and (1.48). Problem 1.18 Let A : V → W be a linear operator. Prove that Im(A) is a linear subspace of W . Problem 1.19 Let V, W be linear spaces and A ∈ L(V, W ). Prove that if A is bijective, then A−1 ∈ L(W, V ). Problem 1.20 Let V, W and U be linear spaces, A ∈ L(V, W ) and B ∈ L(W, U ). Assume that A and B are bijective. Prove that (BA)−1 ∈ L(U, V ) and that (BA)−1 = A−1 B −1 . Problem 1.21 Consider D ∈ L(C∞ (R)) given by D[u] = u′ . 1. Prove that D is onto. 2. Prove that D is not bijective. 3. Find all the eigenvalues and eigenspaces of D. Problem 1.22 Let U be a linear subspace of V . On V we define the relation modulo U by v1 ≡ v2 (mod U ) ⇐⇒ v1 − v2 ∈ U. The coset of an element v ∈ V with respect to U is the set v+U = {v+u /u ∈ U }. 1. Prove that the relation modulo U is an equivalence relation on V . 2. Prove that v1 + U = v2 + U
⇐⇒
v1 ≡ v2 (mod U ).
3. The partition defined by the relation modulo U , denoted V /U is called the quotient space of V by U . Prove that V /U is formed by cosets.
4. (*) Prove that the quotient space V/U becomes a linear space if it's equipped with the operations
(v1 + U) + (v2 + U) = (v1 + v2) + U,   λ · (v + U) = (λ · v) + U.
Problem 1.23 Let a1, a2, a3 ∈ ]0, +∞[ \ {1} and A = (aij) ∈ M3(R) such that aij = log_{aj}(ai), for i, j ∈ {1, 2, 3}. Find the eigenvalues of the linear operator T : R³ → R³ given by T(u) = A u.
Problem 1.24 Let T ∈ L(C∞(R)) be given by T[u] = u^{(iv)} + u′′′ + 2u′′ + 4u′ − 8u.
1. Find explicitly Ker(T).
2. Let f ∈ C∞(R). Show that the set of solutions of the differential equation
u^{(iv)} + u′′′ + 2u′′ + 4u′ − 8u = f,   (1.56)
is the coset up + Ker(T), where up is any particular solution of (1.56). Take into consideration Problem 1.22.
3. Find the general solution of (1.56) when f(x) = sinh(3x).
2. An introduction to topological spaces
In this chapter we start our course of Functional Analysis. We shall present a significant number of topological concepts which will be useful throughout this course. The student should not be afraid of this, as it's an investment with a sure return.
2.1. Definition of topology
We start from a very abstract setting and quickly move to the more concrete realms of Functional Analysis. A very complete and good course of Topology is provided in [8].
Definition 2.1 (Topology, topological space) Let X be a non-void set and T ⊆ P(X). We say that T is a topology on X iff the following conditions hold:
1. ∅ ∈ T and X ∈ T;
2. if A, B ∈ T, then A ∩ B ∈ T;
3. if (Aλ)λ∈Λ is a family of elements of T, then ⋃_{λ∈Λ} Aλ ∈ T.
In this case, the pair (X, T) is called a topological space.
Example 2.1 (A very simple topological space) Let X = {a, b, c} and T = {∅, {a, b, c}, {b}, {a, b}, {c, b}}. Then (X, T) is a topological space.
In the context of Definition 2.1, the elements of T are referred to as open sets. A set D ⊆ X is closed if Dᶜ is open. If (Ai)i∈I is any finite family of open sets then ⋂_{i∈I} Ai ∈ T, i.e., a finite intersection of open sets is also an open set. On the other hand, the third point in the definition states that an arbitrary union of open sets is also an open set. In the same way, the union of any finite family of closed sets is closed. Also, every intersection of closed sets is closed.
Remark 2.1 In Functional Analysis, it's quite common to deal with a pivot set X which can be endowed with more than one topology. For example, if (X, T1) and (X, T2) are different topological spaces then the elements of T1 and T2 are referred to as T1-open and T2-open, respectively.
Theorem 2.1 (Intersection of topologies) Let (Ti)i∈I be a family of topologies on a non-void set X. Then T = ⋂_{i∈I} Ti is a topology on X.
The proof of this result is easy and is required as an exercise at the end of the chapter.
Definition 2.2 (Weaker - stronger topology) Let X be a non-void set, and Tα and Tβ two topologies on X. We say that Tβ is weaker than Tα iff Tβ ⊆ Tα. Equivalently, we say that Tα is stronger than Tβ.
It's clear that any topology T on X verifies {∅, X} ⊆ T ⊆ P(X), where {∅, X} and P(X) are the trivial topology and the discrete topology on X, respectively. It's important to remark that in the discrete topological space (X, P(X)) any subset of X is open. Therefore, the trivial and discrete topologies are the weakest and the strongest of all topologies on X, respectively.
Remark 2.2 Roughly speaking, the reason to deal with more than one topology on a pivot set X is that proving the convergence of a sequence of functions, in order to solve some specific problem (e.g. a minimization problem), is easier in a topology with a smaller number of open sets (or a greater number of compact sets).
Let's denote T̃(X) = {T ⊆ P(X) / T is a topology on X}. It's clear that (T̃(X), ⊆) is an ordered set. Then all the concepts defined in Section 1.4 apply to this case. It's important to remark that any Λ ⊆ T̃(X) is bounded; in fact, the trivial topology is a lower bound and the discrete topology is an upper bound of Λ. Actually, Theorem 2.1 implies that inf(Λ) = ⋂_{T∈Λ} T.
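Theorem 2.1 can be checked by brute force on a small set with Maxima's set operations (an illustrative sketch only; the two topologies on X = {a, b, c} are chosen arbitrarily, and for a finite set it's enough to test binary unions and intersections):
(% i1) X: {a, b, c}$
(% i2) T1: { {}, {a}, {a,b}, X }$
(% i3) T2: { {}, {b}, {a,b}, X }$
(% i4) T: intersection(T1, T2);    /* {{}, {a,b}, {a,b,c}}, again a topology on X */
(% i5) pairs: listify(cartesian_product(T, T))$
(% i6) every(lambda([p], member(union(p[1], p[2]), T) and
                         member(intersection(p[1], p[2]), T)), pairs);   /* true */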
2.2. Topological basis and fundamental systems Let’s introduce the concepts of topological basis and subbasis. Definition 2.3 (Topological basis and subbasis) Let (X, T ) be a topological space. We say that B ⊆ T is a (topological) basis of T iff every open set is the union of elements of B. S is a subbasis of T iff S ′ , the set of finite intersections of elements of S, is a basis of T . Given a set E ⊆ P(X), the following result establishes a way to build TE , the smallest topology in T˜ (X) that contains E . The idea is to prove that E is actually a subbasis for TE .
Theorem 2.2 (Generated topology) Let X be a non-void set and E ⊆ P(X). Let's denote by TE the set of unions of elements of E′, the set of finite intersections of elements of E. Then TE is the smallest topology on X that contains E. TE is referred to as the topology generated by E.
Proof. Let's observe that M ∈ E′ iff M = ⋂_{k∈K} Mk, where (Mk)k∈K is a finite family of elements of E. In the same way, N ∈ TE iff N = ⋃_{λ∈Λ} Nλ, where (Nλ)λ∈Λ is any family of elements of E′.
1. Since ∅ is finite and the union of the void family is ∅, it follows that TE ∋ ∅.
2. Since the intersection of the void family is X, it follows that TE ∋ X.
3. Let's prove that
∀A, B ∈ TE : A ∩ B ∈ TE.   (2.1)
Let A, B ∈ TE, generic. We have that
A = ⋃_{λ∈Λ} Aλ  and  B = ⋃_{ω∈Ω} Bω,
where (Aλ)λ∈Λ and (Bω)ω∈Ω are families of elements of E′. Then
A ∩ B = ( ⋃_{λ∈Λ} Aλ ) ∩ ( ⋃_{ω∈Ω} Bω ) = ⋃_{(λ,ω)∈Λ×Ω} (Aλ ∩ Bω).   (2.2)
Since by the definition of E ′ , Aλ ∩ Bω ∈ E ′ , for every (λ, ω) ∈ Λ × Ω, point (2.2) implies that A ∩ B ∈ TE . As A and B were chosen arbitrarily, we have proved (2.1). 4. It’s clear that the union of elements of TE is still in TE as each of their components belong to E ′ . 5. By the construction, it’s immediate that E ⊆ TE . 6. Now, let’s prove that for every topology T on X containing E we have that TE ⊆ T . Let T a topology on X that contains E, generic. T has all the finite intersections of its elements so that T ⊇ E ′ . T has all the unions of its elements; in particular, it contains all the unions of elements of E ′ , so that T ⊇ TE . We are done as T was arbitrary. ■ Remark 2.3 In the context of Theorem 2.2, E ′ is a basis of TE . Then any A ∈ TE can be built as the union of elements of E ′ . Example 2.2 (The typical topology on R) We denote by E the class of bounded open intervals, i.e., E = {]a, b[ / a ∈ R, b ∈ R, a < b}. Since a finite intersection of bounded open intervals is also a bounded open interval, we have that E = E ′ . This means that E is a basis for U = TE ,
the typical topology of R. Therefore, any open set of R can be built as the union of bounded open intervals. For example, the unbounded open interval ]a, +∞[ is open as we can write it as the union of open sets:
]a, +∞[ = ⋃_{n∈Z[a]} ]a, n[,
where Z[a] = {n ∈ Z / n > a}. By default we consider that R is equipped with the topology U.
Given a topological space (X, T), we say that V ⊆ X is a neighborhood of the point x0 ∈ X iff there exists an open set contained in V which contains x0, i.e.,
∃U ∈ T : x0 ∈ U ⊆ V.
We denote N(x0) = {V ⊆ X / V is a neighborhood of x0}.
Theorem 2.3 (Properties of N(x)) Let (X, T) be a topological space and x ∈ X. The following properties hold:
1. if V ∈ N(x), then V ∋ x;
2. if V ∈ N(x) and V ⊆ W, then W ∈ N(x);
3. if V, W ∈ N(x), then V ∩ W ∈ N(x);
4. if V ∈ N(x), then
∃W ∈ N(x) : y ∈ W ⇒ V ∈ N(y).   (2.3)
Proof. Points 1, 2 and 3 are immediate. To prove 4, let’s assume that V ∈ N (x) so that there is some U ∈ T such that x ∈ U ⊆ V . Now, by taking W = U we have (2.3). ■ Definition 2.4 (Fundamental system of neighborhoods) Let (X, T ) be a topological space and t ∈ X. We say that F ⊆ N (t) is a fundamental system (of neighborhoods) of t iff ∀ V ∈ N (t), ∃W ∈ F :
W ⊆ V.
In particular, we say that a fundamental system is open iff all its elements are open sets. Remark 2.4 (Basis and open fundamental systems) If B is a topological basis of the topological space (X, T ), then for every x ∈ X we have that B ∩ N (x) is an open fundamental system for x. In particular, if T = TE then for every x ∈ X, E ′ ∩ N (x) is an open fundamental system for x.
Example 2.3 Let’s consider the space (R, U ) introduced in Example 2.2. Given t ∈ R, the following are fundamental systems of t ∈ R: F1 = E ∩ N (t) = {]a, b[ / t ∈]a, b[},
F2 = {]t − ϵ, t + ϵ[ / ϵ > 0},
F3 = {[t − ϵ, t + ϵ] / ϵ > 0}.
2.3. Interior, adherence and boundary of a set. Density.
Now, let's introduce the concept of interior of a set.
Definition 2.5 (Interior point, interior of a set) Let (X, T) be a topological space and A ⊆ X. We say that x ∈ X is an interior point of A iff A ∈ N(x). The set of interior points of A is denoted by int(A) or Å.
Let's remark that x is an interior point of A iff there is an open set V such that x ∈ V ⊆ A. From this definition, it immediately follows that
∀A ⊆ X : int(A) ⊆ A.   (2.4)
Example 2.4 In the topological space (R, U ), we have int([a, b]) =]a, b[,
int([a, +∞[) =]a, +∞[,
int(] − ∞, b]) =] − ∞, b[.
The following result states that a set is open iff it coincides with its interior. Its proof shows that int(A) is the largest open set contained in A.
Theorem 2.4 (Characterization of an open set by its interior) Let (X, T) be a topological space and A ⊆ X. Then
A ∈ T ⇐⇒ int(A) = A.
Proof.
i) Let's assume that A is open. Because of (2.4), we just have to prove that A ⊆ int(A), i.e., that all the points of A are interior, which is equivalent to
∀x ∈ A, ∃U ∈ T : x ∈ U ⊆ A.
Since A is open, the last holds immediately by taking U = A.
ii) Let's assume that A = int(A). We have to prove that A is open. We have that all the elements of A are interior points:
∀x ∈ A, ∃Ux ∈ T : x ∈ Ux ⊆ A,
which can be written as
∀x ∈ A, ∃Ux ∈ T : {x} ⊆ Ux ⊆ A.   (2.5)
Now, from (2.5) we get
A = ⋃_{x∈A} {x} ⊆ ⋃_{x∈A} Ux ⊆ A,
so that A = ⋃_{x∈A} Ux is open, as the union of open sets. ■
As a consequence of Theorem 2.4, we have the following useful result which will let us characterize an open subset of a metric space by using open balls.
Corollary 2.1 (Open sets and basis) Let (X, T) be a topological space, A ⊆ X and B a topological basis. Then A is open iff
∀x ∈ A, ∃V ∈ B : x ∈ V ⊆ A.
The proof of this result is required as an exercise at the end of the chapter.
Before advancing to our next concept, let's state some properties that the interior operation verifies in a topological space (X, T):
A ⊆ B ⇒ int(A) ⊆ int(B),
(A ⊆ B ∧ A ∈ T) ⇒ A ⊆ int(B),
int(A ∩ B) = int(A) ∩ int(B),
int(A) ∪ int(B) ⊆ int(A ∪ B).
Let’s define the concept of adherence or closure of a set. Definition 2.6 (Adherent point, adherence of a set) Let (X, T ) be a topological space and A ⊆ X. We say that x ∈ X is an adherent point (or closure point) of A iff ∀B ∈ N (x) :
B ∩ A ̸= ∅.
(2.6)
The set of adherent points of A is denoted by A and is referred to as the adherence or closure of A. From this definition, it immediately follows that ∀A ⊆ X :
A ⊆ A.
(2.7)
Remark 2.5 It's quite clear that in (2.6), N(x) can be replaced by any fundamental system of x.
Remark 2.6 (Alternative notation for the adherence of a set) When a pivot set X is endowed with two topologies T and G one can write clT(A) and clG(A) to denote the adherence of A with respect to T and G, respectively.
Example 2.5 In the topological space (R, U), we have that t ∈ Ā iff for every ϵ > 0, [t − ϵ, t + ϵ] ∩ A ≠ ∅ or ]t − ϵ, t + ϵ[ ∩ A ≠ ∅. By using this, we can prove that if A is bounded then inf(A) ∈ Ā and sup(A) ∈ Ā.
The following result states that the adherence of a set is closed.
Lemma 2.1 (The adherence is closed) Let (X, T) be a topological space and A ⊆ X. Then Ā is closed.
Proof. Let's prove that
(Ā)ᶜ = int(Aᶜ),   (2.8)
which implies that (Ā)ᶜ is open and so Ā is closed.
1. Let's prove that (Ā)ᶜ ⊆ int(Aᶜ), i.e. that
∀x ∈ (Ā)ᶜ : x ∈ int(Aᶜ).   (2.9)
Let x ∈ (Ā)ᶜ, generic. Then x ∉ Ā and, therefore, there exists V ∈ N(x) such that V ∩ A = ∅. The last implies that V ⊆ Aᶜ and x ∈ U ⊆ V ⊆ Aᶜ, for some U ∈ T. Then we have proved that x ∈ int(Aᶜ). Since x was chosen arbitrarily, we have proved (2.9).
2. Let's prove that int(Aᶜ) ⊆ (Ā)ᶜ, i.e. that
∀x ∈ int(Aᶜ) : x ∈ (Ā)ᶜ.   (2.10)
Let x ∈ int(Aᶜ), generic. Then there exists V ∈ N(x) such that V ⊆ Aᶜ. Consequently, V ∩ A = ∅ so that x ∉ Ā, that is x ∈ (Ā)ᶜ. Since x was chosen arbitrarily, we have proved (2.10). ■
The last proof shows that Ā is the smallest closed set that contains A. As a consequence, we have the following result.
Theorem 2.5 (Characterization of a closed set by its adherence) Let (X, T) be a topological space and A ⊆ X. Then A is closed iff A = Ā.
The proof of this result uses Lemma 2.1 and it is required as an exercise at the end of the chapter. An immediate consequence of Theorem 2.5 is that cl(Ā) = Ā, writing cl(·) for the adherence.
Example 2.6 Let's consider the topological space (R, U). By combining Theorem 2.5 and Example 2.5, we see that if A ⊆ R is closed and bounded, then inf(A) ∈ A and sup(A) ∈ A, i.e., A has minimum and maximum.
Before advancing to our next concept, let's state some properties that the adherence operation verifies in a topological space (X, T):
A ⊆ B ⇒ Ā ⊆ B̄,
(A ⊆ B ∧ Bᶜ ∈ T) ⇒ Ā ⊆ B,
cl(A ∪ B) = Ā ∪ B̄,
cl(A ∩ B) ⊆ Ā ∩ B̄.
We now define the concept of boundary or frontier of a set.
Definition 2.7 (Boundary point, boundary of a set) Let (X, T) be a topological space and A ⊆ X. We say that x ∈ X is a boundary point (or frontier point) of A iff
∀B ∈ N(x) : B ∩ A ≠ ∅ ∧ B ∩ Aᶜ ≠ ∅.
The set of boundary points of A is denoted by ∂A or Fr(A). From this definition, it immediately follows that ∂A = Ā ∩ cl(Aᶜ).
Theorem 2.6 (Properties of the boundary) Let (X, T) be a topological space and A ⊆ X. Then the following properties hold:
1. ∂A = Ā \ int(A);
2. A is closed iff A ⊇ ∂A;
3. A is both open and closed iff ∂A = ∅;
4. Ā = A ∪ ∂A;
5. ∂A is closed.
The proof of this result is required as an exercise at the end of the chapter.
Definition 2.8 (Density) Let (X, T) be a topological space and A, B ⊆ X. We say that
1. A is dense in B iff Ā ⊇ B;
2. A is dense iff Ā = X;
3. A is nowhere dense iff int(Ā) = ∅.
Example 2.7 In the space (R, U), the set of rational numbers, Q, is dense.
Example 2.8 Let (X, T) be a topological space and F ⊆ X closed. Then
1. F is nowhere dense iff Fᶜ is dense in X;
2. ∂F is nowhere dense.
Let's finish this section with an important and useful result that states that the density relation is transitive.
Theorem 2.7 (Density is transitive) Let (X, T) be a topological space and A, B, C ⊆ X. If A is dense in B and B is dense in C, then A is dense in C.
The proof of this result is easy and it's required as an exercise at the end of the chapter.
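The density of Q in R (Example 2.7) has a very concrete flavour: every real number can be approximated by a rational within any prescribed ϵ = 10⁻ᵏ, e.g. by truncating its decimal expansion. A quick Maxima sketch of this idea (the function name rat_approx is ours, purely illustrative):
(% i1) rat_approx(x, k) := floor(10^k * x) / 10^k$
(% i2) rat_approx(%pi, 6);                   /* a rational number: 392699/125000 */
(% i3) float(%pi - rat_approx(%pi, 6));      /* a positive number smaller than 10^(-6) */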
2.4. Separation conditions and sequences
In Functional Analysis practically all the spaces of interest are Hausdorff spaces, that is, they verify the T2 separation condition. Actually, most of them verify stronger
conditions. The T2 condition implies uniqueness for the limit of a converging sequence. The idea of a separation condition is to use neighborhoods to separate pairs of points in a topological space.
Definition 2.9 (Some separation conditions) Let (X, T) be a topological space. We say that (X, T) is
1. a T1-space iff it satisfies the Fréchet condition:
∀x, y ∈ X, x ≠ y, ∃U ∈ N(x) : y ∉ U;
2. a Hausdorff space (or T2-space or separated) iff it satisfies the Hausdorff condition:
∀x, y ∈ X, x ≠ y, ∃(U, V) ∈ N(x) × N(y) : U ∩ V = ∅;
3. regular iff given any closed set F and any point t ∉ F, there are open sets U and V such that t ∈ U, F ⊆ V and U ∩ V = ∅.
Remark 2.7 (Regular implies Hausdorff, which implies T1) From Definition 2.9, it follows that any regular space is Hausdorff and that any Hausdorff space is T1.
Figure 2.1.: The separation conditions T1 and T2. Source: https://mathstrek.blog
Example 2.9 The space (R, U) is Hausdorff. In fact, if x < y we can take any 0 < ϵ < (y − x)/2 so that U = ]x − ϵ, x + ϵ[ ∈ N(x) and V = ]y − ϵ, y + ϵ[ ∈ N(y) verify U ∩ V = ∅. The space (R, U) is actually regular.
The following result states that a topological space is T1 iff all its singletons are closed.
Theorem 2.8 (Characterization of T1-spaces) Let (X, T) be a topological space. Then X is T1 iff
∀x ∈ X : cl({x}) = {x}.   (2.11)
Proof. i) Let’s assume that X is T1 . Let x ∈ X, generic. Let’s prove that {x}c is open, i.e., that all its elements are interior points. Let y ∈ {x}c , generic. Since X is T1 , there is V ∈ N (y) such that x ∈ / V. This implies that there exists U ∈ T such that y ∈ U ⊆ V ⊆ {x}c , so that y ∈ int({x}c ). Since y was arbitrary, we have proved that {x}c is open. Since x was chosen arbitrarily, we have proved that (2.11). ii) Let’s assume (2.11). Let’s prove that ∀x, y ∈ X, x ̸= y, ∃U ∈ N (x) :
y∈ / U.
(2.12)
Let x, y ∈ X, x ≠ y, generic. We have that y ∈ {x}ᶜ. Since {x}ᶜ is open, there is U ∈ T such that y ∈ U ⊆ {x}ᶜ, so that x ∉ U. Since x and y were chosen arbitrarily, we have proved (2.12). ■
Theorem 2.9 (Hausdorff and fundamental systems) Let (X, T) be a Hausdorff space and x ∈ X. If F is a fundamental system of x then ⋂_{V∈F} V = {x}.
The proof of this result is required as an exercise at the end of the chapter. Let’s introduce the concept of convergence of a sequence. Definition 2.10 (Convergence of a sequence) Let (X, T ) be a topological space and (xn )n∈N ⊆ X. We say that the sequence (xn )n∈N converges to x ∈ X iff ∀ V ∈ N (x), ∃N ∈ N : n > N ⇒ xn ∈ V. (2.13) In this case, we say that x is limit of the sequence (xn )n∈N as n tends to infinity. In the context of Definition 2.10, we write x = lim xn , n→+∞
or, alternatively, xn −→ x,
as n → +∞.
We said at the beginning of this section that all the topological spaces which are useful in Functional Analysis and its applications verify the Hausdorff separation condition. The reason is that in a Hausdorff space a convergent sequence has only one limit.
Theorem 2.10 (Uniqueness of the limit) Let (X, T) be a Hausdorff space and (xn)n∈N ⊆ X. If we have that
lim_{n→+∞} xn = a1  and  lim_{n→+∞} xn = a2,
then a1 = a2 . Proof. Let’s proceed by Reduction to Absurdity. So, let’s assume that a1 ̸= a2 . Since X is a Hausdorff space, there are V1 ∈ N (a1 ) and V2 ∈ N (a2 ) such that V1 ∩ V2 = ∅.
(2.14)
As lim xn = a1 , there exists N1 ∈ N such that n→+∞
xn ∈ V 1 ,
∀n > N1 .
(2.15)
Analogously, as lim xn = a2 , there exists N2 ∈ N such that n→+∞
xn ∈ V 2 ,
∀n > N2 .
(2.16)
Now, by taking some p ∈ N such that p > max{N1 , N2 } we have, by (2.15) and (2.16), that xp ∈ V1 ∩ V2 , which contradicts (2.14). Then, we have proved that a1 = a2 . ■ Let’s show now that in (2.13) we can replace N (x) by any fundamental system of neighborhoods of x. Theorem 2.11 (Convergence via a fundamental system) Let (X, T ) be a Hausdorff space and F a fundamental system of neighborhoods of x ∈ X. Then lim xn = x iff n→+∞
∀ V ∈ F, ∃N ∈ N :
n > N ⇒ xn ∈ V.
(2.17)
Proof. i) If lim xn = x then (2.13) holds, and since F ⊆ N (x), so does (2.17). n→+∞
ii) Let’s assume (2.17). We have to prove (2.13). Let V ∈ N (x), generic. Since F is a fundamental system of x, there is U ∈ F such that x ∈ U ⊆ V . By (2.17), there exists NU ∈ N such that n > NU ⇒ xn ∈ U ⊆ V. Since V was chosen arbitrarily, we have proved (2.13). ■ Example 2.10 (Converging real sequences) In the space (R, U ), for every x ∈ R, a fundamental system of x is Fx = {]x − ϵ, x + ϵ[ / ϵ > 0}. By Theorem 2.11, we have that lim xn = x
n→+∞
⇐⇒
∀ϵ > 0, ∃N ∈ N : n > N ⇒ xn ∈]x − ϵ, x + ϵ[
⇐⇒
∀ϵ > 0, ∃N ∈ N : n > N ⇒ |xn − x| < ϵ.
(2.18)
The student can remember, from their course of Calculus, that (2.18) was the way to define the convergence of the real sequence (xn)n∈N toward x. So now we know that this definition corresponds to the convergence in the topological space (R, U).
Remark 2.8 (Archimedean property of the real numbers) To apply (2.18), it's very useful to remember the Archimedean property of the real numbers:
∀α ∈ ]0, +∞[, ∀λ ∈ R, ∃N ∈ N : αN > λ.
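In practice the Archimedean property is applied by exhibiting a concrete N; for instance, N = ⌊λ/α⌋ + 1 always works. A one-line Maxima check of this choice, with illustrative values (the name N_arch is ours):
(% i1) N_arch(alpha, lam) := floor(lam/alpha) + 1$
(% i2) is( (1/100) * N_arch(1/100, 45) > 45 );    /* true */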
Example 2.11 In the space (R, U), let's consider the sequences (xn)n∈N and (zn)n∈N, given by
xn = 15n/(2n + 3)  and  zn = 1 + Σ_{k=1}^{n} (−1)^k w^{2k}/(2k)!,
where w ∈ R. From the Precalculus course, the student can remember that it's quite easy to compute
lim_{n→+∞} xn = 15/2  and  lim_{n→+∞} zn = cos(w).
On the other hand, it's not so easy to prove that
lim_{n→+∞} xn = 15/2.
Let's do it. By (2.18), we have to prove that
∀ϵ > 0, ∃N ∈ N : n > N ⇒ |xn − 15/2| < ϵ,
which can be rewritten in the easier form
∀ϵ > 0, ∃N ∈ N : n > N ⇒ 45/(4n + 6) < ϵ.   (2.19)
Let ϵ > 0, generic. By the Archimedean property, there is N ∈ N such that 45 < Nϵ so that
45 < Nϵ < (4N + 6)ϵ.   (2.20)
On the other hand, if n > N, then 4N + 6 < 4n + 6, whence, by using (2.20), we get 45 < (4n + 6)ϵ, i.e.,
45/(4n + 6) < ϵ.
Since ϵ was chosen arbitrarily, we have proved (2.19).
In a non-void set Y, given a sequence (xn)n∈N ⊆ Y, we call subsequence of (xn)n∈N any family (xφ(n))n∈N, provided φ : N → N is strictly increasing.
Theorem 2.12 (Convergence of subsequences) Let (X, T ) be a Hausdorff space. If (xn )n∈N ⊆ X is convergent to L ∈ X, then any subsequence of (xn )n∈N also converges to L.
In particular, Theorem 2.12 states that if there is some subsequence which does not converge to L, then the sequence (xn )n∈N is not convergent.
Example 2.12 In the space (R, U) we consider the sequence (yn)n∈N given by yn = cos(nπ). Since
yn = −1 if n is odd,  yn = 1 if n is even,
by Theorem 2.12, it's clear that (yn)n∈N is not convergent.
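The limits discussed in Examples 2.11 and 2.12 can be double-checked with Maxima's limit command (a quick sketch):
(% i1) limit(15*n/(2*n+3), n, inf);    /* 15/2 */
(% i2) limit(cos(n*%pi), n, inf);      /* Maxima answers ind ("indefinite"): the sequence is bounded but has no limit */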
2.5. Accumulation points. Limit superior and inferior. Let’s see which points can be limits of sequences of elements taken from a particular set.
Definition 2.11 (Accumulation point) Let (X, T ) be a topological space and A ⊆ X. We say that x ∈ X is a limit point or accumulation point of A iff ∀B ∈ N (x) : (B \ {x}) ∩ A ̸= ∅. The set of limit points of A is denoted by acc(A).
It’s not difficult to show that A = A ∪ acc(A). Theorem 2.13 (Characterization of an accumulation point) Let (X, T ) be a Hausdorff space which is a first countable space, i.e., every point has a countable fundamental system of neighborhoods. Let A ⊆ X. Then x ∈ acc(A) iff there exists (xn )n∈N ⊆ A \ {x} such that lim xn = x.
n→+∞
The proof of this result is required as a guided exercise at the end of the chapter. In Functional Analysis it’s usual to find estimations where the concepts of limit superior and limit inferior play a role.
Definition 2.12 (Limit superior and inferior) In the topological space (R, U), the limit superior and the limit inferior of a set B ⊆ R are defined respectively by
lim sup(B) = sup(acc(B)),  lim inf(B) = inf(acc(B)).
The limit superior and limit inferior of a sequence (xn)n∈N ⊆ R are given, respectively, by
lim sup_{n→+∞} xn = lim sup({xn / n ∈ N}),  lim inf_{n→+∞} xn = lim inf({xn / n ∈ N}).
Therefore, lim sup(B) and lim inf(B) are the greatest and the smallest accumulation points of B, respectively. If acc(B) = ∅, then lim sup(B) = +∞ and lim inf(B) = −∞.
Remark 2.9 (Limit superior and inferior of a sequence) In (R, U), for a sequence (xn)n∈N ⊆ R we have
lim sup_{n→+∞} xn = lim_{n→+∞} sup_{m≥n} xm = inf_{n∈N} sup_{m≥n} xm,
lim inf_{n→+∞} xn = lim_{n→+∞} inf_{m≥n} xm = sup_{n∈N} inf_{m≥n} xm.
It's easy to prove that
lim inf_{n→+∞} xn ≤ lim sup_{n→+∞} xn,
lim sup_{n→+∞} (−xn) = − lim inf_{n→+∞} xn.
Remark 2.10 (Limit superior and inferior for a sequence of sets) Let X be a set. For a sequence (Xn)n∈N ⊆ P(X), we define
inf{Xn / n ∈ N} = ⋂_{n∈N} Xn,  sup{Xn / n ∈ N} = ⋃_{n∈N} Xn.
Then the limit superior and inferior of the sequence of sets (Xn)n∈N are defined by
lim sup_{n→+∞} Xn = inf_{n∈N} sup_{m≥n} Xm = ⋂_{n∈N} ⋃_{m≥n} Xm,
lim inf_{n→+∞} Xn = sup_{n∈N} inf_{m≥n} Xm = ⋃_{n∈N} ⋂_{m≥n} Xm.
Example 2.13 (Computing lim sup and lim inf) Let's consider (xn)n∈N ⊆ R given by
xn = (−1)^n / n + sin(nπ/4).
Since
sin(nπ/4) = 0, if n = 0 mod(8) ∨ n = 4 mod(8),
sin(nπ/4) = √2/2, if n = 1 mod(8) ∨ n = 3 mod(8),
sin(nπ/4) = 1, if n = 2 mod(8),
sin(nπ/4) = −√2/2, if n = 5 mod(8) ∨ n = 7 mod(8),
sin(nπ/4) = −1, if n = 6 mod(8),
we have that acc({xn / n ∈ N}) = {−1, −√2/2, 0, √2/2, 1}, and, consequently,
lim inf_{n→+∞} xn = −1  and  lim sup_{n→+∞} xn = 1.
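One can get a feeling for this example by listing a few terms of the sequence numerically in Maxima (a small sketch):
(% i1) x(n) := (-1)^n/n + sin(n*%pi/4)$
(% i2) float(makelist(x(n), n, 1, 16));
       /* as n grows, the values cluster near -1, -sqrt(2)/2, 0, sqrt(2)/2 and 1,
          in agreement with the accumulation points computed above */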
Remark 2.11 Let B ⊆ R. Let’s recall that α = sup(B) ⇐⇒ ∀ϵ > 0, ∃x ∈ B :
α − ϵ < x < α.
(2.21)
2.6. Subspaces and continuity
Let (X, T) be a topological space and S ⊆ X. It's not difficult to verify that
TS = {V ∩ S / V ∈ T}   (2.22)
is a topology on S; it's called the topology induced on S by T. Then the topological space (S, TS) is a (topological) subspace of (X, T). From (2.22), it immediately follows that:
a) if E is a subbasis of T, then ES = {V ∩ S / V ∈ E} is a subbasis of TS;
b) if B is a basis of T, then BS = {V ∩ S / V ∈ B} is a basis of TS;
c) for t ∈ S, NS(t) = {V ∩ S / V ∈ NX(t)};
d) if F is a fundamental system of t ∈ S in the space (X, T), then FS = {V ∩ S / V ∈ F} is a fundamental system of t in the space (S, TS);
e) a set L ⊆ S is closed in TS iff L = M ∩ S, where M is closed in T;
f) for A ⊆ S, clS(A) = clX(A) ∩ S.
Remark 2.12 If there is no confusion, we shall simply say that S is a subspace of X.
Remark 2.13 The notation of the induced topology, (2.22), and that of the generated topology, Theorem 2.2, are similar but have different meanings and should not provoke confusion.
Theorem 2.14 (Separation of a subspace) Let (X, T) be a topological space and S ⊆ X. If X is Hausdorff then so is S.
Proof. Let’s assume that X is Hausdorff. We have to prove that (S, TS ) is Hausdorff, i.e., that ∀x, y ∈ S, x ̸= y, ∃(U, V ) ∈ NS (x) × NS (y) :
U ∩ V = ∅.
(2.23)
ˆ ∈ NX (x) and Let x, y ∈ S, x ̸= y, generic. Since X is Hausdorff there are U ˆ V ∈ NX (y) such that ˆ ∩ Vˆ = ∅. U (2.24) ˆ ∩ S ∈ NS (x) and V = Vˆ ∩ S ∈ NS (x), whence, by (2.24), Now we define U = U U ∩ V = ∅. Since x and y were chosen arbitrarily, we have proved (2.23). ■ Example 2.14 Let’s consider the topological space (R, U ) and the subspace S = [0, 1]. Observe that the set (1/2, 1] is open in S but it’s not in R. It’s easy to check that FS = {[0, ϵ[ / 0 < ϵ ≤ 1} is a fundamental system of neighborhoods of t = 0 in the space S. Since (R, U ) is Hausdorff the same happens with ([0, 1], U[0,1] ). Remark 2.14 Let (X, T ) be a topological space, S open in X and M ⊆ S. Then 1. M is open in S iff it’s open in X; 2. M is closed in S iff it’s closed in X. Let’s define the concept of continuity of functions. Definition 2.13 (Continuous function) Let (X, T ) and (Y, G) be topological spaces, f : X −→ Y and x0 ∈ X. We say that f is continuous at the point x0 iff ∀ V ∈ NY (f (x0 )), ∃U ∈ NX (x0 ) :
x ∈ U ⇒ f (x) ∈ V.
(2.25)
We say that f is continuous in a region M ⊆ X iff f is continuous at all the points of M . Remark 2.15 (Homeomorphism) If f is a continuous bijection from the topological space (X, τ ) into the topological space (Y, υ) such that f −1 is also continuous, then we say that f is a homeomorphism. A characterization of a function that is continuous at a point is given in the following result. Theorem 2.15 (Continuity via inverse images) Let (X, T ) and (Y, G) be topological spaces, f : X → Y and x0 ∈ X. Then f is continuous at the point x0 iff ∀ V ∈ NY (f (x0 )) : f −1 (V ) ∈ NX (x0 ). (2.26)
Proof.
i) Let’s assume that f is continuous in x0 . We have to prove (2.26). Let V ∈ NY (f (x0 )), generic. Then there is U ∈ NX (x0 ) such that ∀x ∈ U :
f (x) ∈ V,
so that x0 ∈ U ⊆ f −1 (V ) = {x ∈ X / f (x) ∈ V }. Therefore, f −1 (V ) ∈ NX (x0 ). Since V was chosen arbitrarily, we have proved (2.26). ii) Let’s assume (2.26). We have to prove that f is continuous in x0 , i.e., that ∀ V ∈ NY (f (x0 )), ∃U ∈ NX (x0 ) :
x ∈ U ⇒ f (x) ∈ V.
(2.27)
Let V ∈ NY(f(x0)), generic. We have that f⁻¹(V) ∈ NX(x0). Let's choose U = f⁻¹(V) = {x ∈ X / f(x) ∈ V}, so that x ∈ U ⇒ f(x) ∈ V. Since V was chosen arbitrarily, we have proved (2.27). ■
Remark 2.16 As a consequence of Theorem 2.15 we have that the composition of continuous functions is also a continuous function. Specifically, if f : (X, T) → (Y, G) is continuous at x0 and g : (Y, G) → (Z, W) is continuous at f(x0), then g ◦ f : (X, T) → (Z, W) is also continuous at x0.
Let (X, T) and (Y, G) be topological spaces, f : S ⊆ X → Y and x0 ∈ S. We say that f is continuous at the point x0 iff f : (S, TS) → (Y, G) is continuous at x0; this is equivalent to
∀ V ∈ NY(f(x0)), ∃U ∈ NX(x0) :
x ∈ U ∩ S ⇒ f (x) ∈ V.
The following result states that to prove the continuity at a point we just have to look through fundamental systems. Theorem 2.16 (Continuity via fundamental systems) Let (X, T ) and (Y, G) be topological spaces, f : S ⊆ X → Y and x0 ∈ S. Assume that F and C are fundamental systems of x0 and f (x0 ), respectively. Then f is continuous in x0 iff ∀ V ∈ C, ∃U ∈ F : x ∈ U ∩ S ⇒ f (x) ∈ V. (2.28) The proof of this result is required as exercise at the end of the chapter. Example 2.15 (Continuity of a functional / real function) Let’s consider a function f : S → R and x0 ∈ S, where S is a subset of the topological space (X, T ). Assume that F is a fundamental systems of x0 . Let’s recall that C = {]f (x0 ) − ϵ, f (x0 ) + ϵ[ / ϵ > 0} is a fundamental system for f (x0 ). Then f is continuous at x0 iff ∀ ϵ > 0, ∃U ∈ F :
x ∈ U ∩ S ⇒ |f (x0 ) − f (x)| < ϵ.
If X = R, equipped with the topology U , then f is continuous at x0 iff ∀ ϵ > 0, ∃δ > 0 :
(|x − x0 | < δ ∧ x ∈ S) ⇒ |f (x0 ) − f (x)| < ϵ.
(2.29)
The student can remember, from their course of Calculus, that (2.29) was the way to define the continuity of f at the point x0. So now we know that this definition corresponds to the continuity whenever R is equipped with the topology U.
Remark 2.17 (Set of continuous functions) The set of all continuous functions from a topological space (X, T) into the topological space (Y, G) shall be denoted by C(X, Y).
Remark 2.18 (Continuity of the absolute value function) Recall that the absolute value verifies the following property:
∀u, v ∈ R : ||u| − |v|| ≤ |u − v|.   (2.30)
Example 2.16 Let η : R \ {0} → R, given by η(x) = 1/x. Let’s prove that η ∈ C(R \ {0}, R), i.e., that ∀c ∈ R \ {0}, ∀ϵ > 0, ∃δ > 0 : (|x − c| < δ ∧ x ∈ R \ {0}) ⇒ |η(x) − η(c)| < ϵ.
(2.31)
Idea of the proof. Fixed c ≠ 0 and ϵ > 0, we want to achieve
|η(x) − η(c)| = |x − c| / (|x||c|) < ϵ,   (2.32)
by controlling the value of δ > 0. Let's first observe that if δ is very small then the inequality |x − c| < δ implies that x ≈ c. Then, we would have
|η(x) − η(c)| = |x − c| / (|x||c|) ≈ |x − c| / |c|² < δ / |c|².   (2.33)
Therefore, if we had δ < |c|²ϵ, by replacing this in (2.33), we would get (2.32). Then, the task now is to obtain something like (2.33). For this, we need to control from above the number 1/|x| by something involving c. By applying (2.30) we see that ||x| − |c|| ≤ |x − c| < δ, so that −δ < |x| − |c| < δ and
1/(|c| + δ) < 1/|x| < 1/(|c| − δ),   (2.34)
where we would need δ to satisfy δ < |c|. Then, we have
|η(x) − η(c)| = |x − c| / (|x||c|) < δ / (|c|(|c| − δ)).   (2.35)
Now, again, let's observe that if δ is small enough, then 1/(|c| − δ) is quite close to 1/|c|. Then we conjecture that for some α > 1:
1/(|c| − δ) < α/|c|,   (2.36)
which is actually possible if δ verifies the condition
δ < |c| (α − 1)/α.   (2.37)
Then, by (2.36), (2.35) becomes
|η(x) − η(c)| = |x − c| / (|x||c|) < δ / (|c|(|c| − δ)) < α δ / |c|²,   (2.38)
and we would be done by choosing δ verifying
δ < ϵ|c|² / α.   (2.39)
Proof. Let c ∈ R \ {0}, generic. We have to prove that
∀ϵ > 0, ∃δ > 0 :
(|x − c| < δ ∧ x ∈ R \ {0}) ⇒ |η(x) − η(c)| < ϵ.
(2.40)
Let ϵ > 0, generic. Let's pick some α > 1. Then let's choose δ > 0 such that
δ < min{ |c|, |c|(α − 1)/α, ϵ|c|²/α }.   (2.41)
Let's assume that x ∈ R \ {0} and that |x − c| < δ. Then, we have ||x| − |c|| ≤ |x − c| < δ and, consequently,
1/(|c| + δ) < 1/|x| < 1/(|c| − δ).   (2.42)
Now, by using (2.42) and (2.41) we get
|η(x) − η(c)| = |x − c| / (|x||c|) < δ / (|c|(|c| − δ)) < α δ / |c|² < ϵ.
Since ϵ was chosen arbitrarily, we have proved (2.40) so that η is continuous at c. Since c was chosen arbitrarily, we have proved that η is continuous. ■
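The choice (2.41) can be tested numerically in Maxima; the sketch below takes the arbitrary sample values c = 1/2, ϵ = 1/10 and α = 2, and evaluates |η(x) − η(c)| at a point x with |x − c| < δ:
(% i1) c: 1/2$  eps: 1/10$  alpha: 2$
(% i2) delta: 9/10 * min(abs(c), abs(c)*(alpha-1)/alpha, eps*c^2/alpha);   /* = 9/800, strictly below the bound in (2.41) */
(% i3) x: c + 4/5*delta$                 /* a sample point with |x - c| < delta */
(% i4) float(abs(1/x - 1/c));            /* about 0.035, indeed smaller than eps = 0.1 */
Now we state a couple of results which are useful.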
Theorem 2.17 (Characterization of a continuous function) Let (X, T) and (Y, G) be topological spaces, and f : X −→ Y. The following assertions are equivalent:
1. f is continuous;
2. if E ⊆ Y is open, then f⁻¹(E) is open;
3. if E ⊆ Y is closed, then f⁻¹(E) is closed;
4. f(Ā) ⊆ cl(f(A)), for all A ⊆ X.
The proof is required as an exercise at the end of the chapter.
Theorem 2.18 (Characterization of a continuous function by sequences) Let (X, T) and (Y, G) be Hausdorff spaces, and f : X −→ Y. Then, f is continuous at x ∈ X iff
∀(xn)n∈N ⊆ X : lim_{n→+∞} xn = x ⇒ lim_{n→+∞} f(xn) = f(x).
The proof is required as an exercise at the end of the chapter. To finish this section, let’s state that a functional defined on a topological space is continuous iff it’s both lower and upper semicontinuous. Definition 2.14 (l.s.c. and u.s.c) Let (X, T ) be a topological space, S ⊆ X, f : S −→ R and x0 ∈ S. We say that 1. f is lower semicontinuous (l.s.c.) iff ∀ ϵ > 0, ∃V ∈ T :
x ∈ V ∩ S ⇒ f (x0 ) − ϵ < f (x);
(2.43)
2. f is upper semicontinuous (u.s.c.) iff ∀ ϵ > 0, ∃V ∈ T :
x ∈ V ∩ S ⇒ f (x) < f (x0 ) + ϵ.
(2.44)
It’s clear that in (2.43) and (2.44) we can replace T by any fundamental system of x0 . Having in consideration Example 2.15, we have the following important result. Proposition 2.1 (Continuity by l.s.c. and u.s.c.) Let f : S −→ R and x0 ∈ S, where S is a subset of the topological space (X, T ). Then, f is continuous iff it’s l.s.c. and u.s.c., at the same time.
2.7. Initial topology. Product spaces. Let’s assume that f : X −→ Y where (Y, G) is a topological space and X is a non-void set. It’s not difficult to verify that T = {f −1 (V ) / V ∈ G}
is a topology on X. T is referred to as the initial topology determined by the function f. Since Theorem 2.17 implies that T is the smallest topology on X for which f is continuous, it's also called the weakest (or coarsest) topology determined by the function f.
Let's assume now that for each λ ∈ Λ, (Yλ, Gλ) is a topological space and that we have a function fλ : X −→ Yλ. We put
E = { fλ⁻¹(U) / λ ∈ Λ, U ∈ Gλ }.   (2.45)
The topology TE, generated by E, on X is referred to as the initial topology determined by the family (Gλ, fλ)λ∈Λ. Since Theorem 2.17 implies that TE is the smallest topology on X for which all the functions fλ, λ ∈ Λ, are continuous, sometimes it's also called the weakest (or coarsest) topology associated to the family of functions (fλ)λ∈Λ.
Remark 2.19 (Basis and fundamental system for the initial topology) By Remark 2.3, it follows that a topological basis of TE is
E′ = { ⋂_{k=1}^{n} f_{λk}⁻¹(Uk) / Uk ∈ G_{λk} },   (2.46)
i.e., the set of finite intersections of inverse images of open sets. Moreover, by Remark 2.4, for each x ∈ X, an open fundamental system of neighborhoods of x is E′ ∩ N(x).
fλ (x) ̸= fλ (y).
(2.47)
Then (X, TE ) is Hausdorff, where TE is the initial topology associated to the family (fλ )λ∈Λ . Proof. Let’s denote by N (x) the set of neighborhoods of x ∈ X, and by Nλ (y) the set of neighborhoods of y ∈ Yλ . We have to prove that ∀x, y ∈ X, x ̸= y, ∃(U, V ) ∈ N (x) × N (y) :
U ∩ V = ∅.
(2.48)
Let x, y ∈ X, x ̸= y, generic. By (2.47), there is some λ0 ∈ Λ such that fλ0 (x) ̸= fλ0 (y). Since (Yλ0 , Gλ0 ) is Hausdorff, there are S ∈ Nλ0 (fλ0 (x)) and T ∈ Nλ0 (fλ0 (y)) such that S ∩ T = ∅. Since fλ0 is continuous, we have that U = fλ−1 (S) ∈ N (x) and V = fλ−1 (T ) ∈ N (y). Moreover, 0 0 U ∩ V = fλ−1 (S) ∩ fλ−1 (T ) = fλ−1 (S ∩ T ) = fλ−1 (∅) = ∅. 0 0 0 0 Since x and y were chosen arbitrarily, we have proved (2.48). ■ In the next result, we present a characterization of convergence in the initial topology.
Theorem 2.20 (Convergence of a sequence in the initial topology) Let x ∈ X and (xn)n∈N ⊆ X, where X is endowed with the initial topology associated to the family of functions fλ : X → (Yλ, Gλ). Then, xn −→ x, as n −→ +∞, iff
∀λ ∈ Λ : lim_{n→+∞} fλ(xn) = fλ(x).   (2.49)
Proof.
i) Let's assume that xn → x, as n → +∞. Let λ ∈ Λ, generic. Since fλ is continuous, it follows that fλ(xn) → fλ(x), as n → +∞. Since λ was chosen arbitrarily, we have proved (2.49).
ii) Let's assume (2.49). By Theorem 2.11 and Remark 2.19, we have to prove that
∀U ∈ E′ ∩ N(x), ∃N ∈ N : n > N ⇒ xn ∈ U.   (2.50)
Let U ∈ E′ ∩ N(x), generic. Then we have that
x ∈ U = ⋂_{k=1}^{m} f_{λk}⁻¹(Uk),
for some Uk ∈ G_{λk}, λk ∈ Λ. In particular,
∀k ∈ Im : f_{λk}(x) ∈ Uk.
By (2.49), for each k ∈ Im there is Nk ∈ N such that n > Nk ⇒ f_{λk}(xn) ∈ Uk. Then for N = max{N1, ..., Nm} we have that
n > N ⇒ ∀k ∈ Im : f_{λk}(xn) ∈ Uk,
so that
n > N ⇒ ∀k ∈ Im : xn ∈ f_{λk}⁻¹(Uk),
whence xn ∈ U, whenever n > N. Since U was chosen arbitrarily, we have proved (2.50). ■
Theorem 2.21 (Continuity and the initial topology) Let (Z, H) be a topological space and ψ : Z → X, where X is endowed with the initial topology associated to the family of functions fλ : X → (Yλ, Gλ). Then ψ is continuous iff
∀λ ∈ Λ : fλ ◦ ψ is continuous.
The proof of this result is required as an exercise at the end of the chapter.
Let's consider now the product set
X = ∏_{λ∈Λ} Xλ,
where (Xλ, Tλ)λ∈Λ is a family of topological spaces. Let's equip X with T, the initial topology determined by the projection functions pr_{λ∗} : X −→ X_{λ∗} given by pr_{λ∗}(x) = x_{λ∗}. Then we call (X, T) the product of the family (Xλ, Tλ)λ∈Λ; T is referred to as the product topology. By the construction (2.45), it follows that a basis of the product topology is built by the elementary sets, i.e., the sets of the form ∏_{λ∈Λ} Aλ, where
∀λ ∈ Λ : Aλ ∈ Tλ;  #{λ ∈ Λ / Aλ ≠ Xλ} < +∞.
It's not difficult to verify that if all the spaces (Xλ, Tλ) are Hausdorff, then so is X.
Example 2.17 (The standard topology on R^N) Let N ∈ N. The topological space (R^N, U^N) is the result of doing the product of N copies of the space (R, U). In particular, this means that a basis for U^N is built by N-dimensional cubes, i.e., sets of the form ∏_{k=1}^{N} ]ak, bk[. An open fundamental system of neighborhoods for a point x = (x1, x2, ..., xN) ∈ R^N is
F = { ∏_{k=1}^{N} ]xk − ϵ, xk + ϵ[ / ϵ > 0 }.   (2.51)
Example 2.18 (Continuity of the addition in R) Let's prove that the addition operation on R is continuous, i.e., that the function
σ : R × R −→ R,  (x, y) ⟼ σ(x, y) = x + y,   (2.52)
is continuous whenever R × R and R are provided with the topologies U 2 and U , respectively. Let (x0 , y0 ) ∈ R × R, generic. By considering (2.51) and Theorem 2.16, we have to prove that ∀ϵ > 0, ∃δ > 0 : (|x − x0 | < δ ∧ |y − y0 | < δ) ⇒ |(x + y) − (x0 + y0 )| < ϵ.
(2.53)
Idea of the proof. Fixed ϵ > 0, we want to achieve |x + y − x0 − y0| < ϵ, by controlling the value of δ > 0. Let's observe that by using the triangle inequality for the absolute value we have |x + y − x0 − y0| = |(x − x0) + (y − y0)| ≤ |x − x0| + |y − y0| ≤ 2δ, and we would be done by choosing δ verifying δ < ϵ/2.
Proof.
Let’s prove that σ is continuous at (x0 , y0 ) ∈ R × R, i.e., that
∀ϵ > 0, ∃δ > 0 :
(|x−x0 | < δ ∧ |y−y0 | < δ) ⇒ |(x+y)−(x0 +y0 )| < ϵ. (2.54)
Let ϵ > 0, generic. Let's choose δ > 0 such that δ < ϵ/2. Now, by using the triangle inequality for the absolute value, we get
|x + y − x0 − y0| = |(x − x0) + (y − y0)| ≤ |x − x0| + |y − y0| ≤ 2δ < ϵ.
Since ϵ was chosen arbitrarily, we have proved (2.54) so that σ is continuous at (x0, y0). Since (x0, y0) was chosen arbitrarily, we have proved that the addition on R is continuous. ■
Example 2.19 (Continuity of the multiplication in R) Let's prove that the multiplication operation on R is continuous, i.e., that the function
θ : R × R −→ R,  (x, y) ⟼ θ(x, y) = x · y,   (2.55)
is continuous whenever R × R and R are provided with the topologies U 2 and U , respectively. Let (x0 , y0 ) ∈ R × R, generic. By considering (2.51) and Theorem 2.16, we have to prove that ∀ϵ > 0, ∃δ > 0 :
(|x − x0 | < δ ∧ |y − y0 | < δ) ⇒ |(x · y) − (x0 · y0 )| < ϵ. (2.56)
Idea of the proof. Fixed ϵ > 0, we want to achieve
|xy − x0y0| < ϵ,   (2.57)
by controlling the value of δ > 0. Let's first observe that if δ is very small then the inequalities |x − x0| < δ and |y − y0| < δ imply that x ≈ x0 and y ≈ y0. Then, using the triangle inequality for the absolute value we have
|xy − x0y0| = |xy − xy0 + xy0 − x0y0| = |x(y − y0) + (x − x0)y0| ≤ |x| |y − y0| + |x − x0| |y0| < (δ + |x0|)δ + |y0|δ = δ² + (|x0| + |y0|)δ,
and we would be done by choosing δ > 0 so that δ² < ϵ/2 and (|x0| + |y0|)δ < ϵ/2.
Proof. Let's prove that θ is continuous at (x0, y0) ∈ R × R, i.e., that
∀ϵ > 0, ∃δ > 0 : (|x − x0| < δ ∧ |y − y0| < δ) ⇒ |(x · y) − (x0 · y0)| < ϵ.   (2.61)
Let ϵ > 0, generic. Let's choose δ > 0 such that
δ < min{ ϵ / (2(|x0| + |y0|)), √(ϵ/2) }.
(2.62)
Now, by using the triangle inequality for the absolute value, we get
|xy − x0y0| = |xy − xy0 + xy0 − x0y0| = |x(y − y0) + (x − x0)y0| ≤ |x| |y − y0| + |x − x0| |y0| < |x| δ + |y0|δ < (δ + |x0|)δ + |y0|δ = δ² + (|x0| + |y0|) δ < ϵ.
Since ϵ was chosen arbitrarily, we have proved (2.61) so that θ is continuous at (x0, y0). Since (x0, y0) was chosen arbitrarily, we have proved that the multiplication on R is continuous. ■
We finish this section with a characterization of Hausdorff spaces which uses the concept of product space.
Theorem 2.22 (Characterization of a Hausdorff space via product space) Let (X, T) be a topological space and let ∆ = {(x, x) / x ∈ X}. Then X is a Hausdorff space iff ∆ is closed in X × X.
The proof is required as an exercise at the end of the chapter.
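Before leaving this section, the δ chosen in (2.62) for Example 2.19 can also be tested numerically (a Maxima sketch with the arbitrary sample point (x0, y0) = (3, −2) and ϵ = 1/10):
(% i1) x0: 3$  y0: -2$  eps: 1/10$
(% i2) delta: 9/10 * min( float(eps/(2*(abs(x0)+abs(y0)))), float(sqrt(eps/2)) );   /* = 0.009 */
(% i3) float( abs( (x0 + delta)*(y0 + delta) - x0*y0 ) );    /* about 0.009, smaller than eps = 0.1 */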
2.8. Compact sets
Given B, a subset of the topological space (X, T), we say that a family (Aλ)λ∈Λ is a covering of B iff
B ⊆ ⋃_{λ∈Λ} Aλ.
If #(Λ) < +∞, we say that the covering is finite. If #(Λ) = ℵ0, we say that the covering is countable. We say that the covering is open iff for every λ ∈ Λ, Aλ ∈ T.
Example 2.20 Given B, a subset of the topological space (X, T), we pick, for every x ∈ B, some Ax ∈ N(x). Therefore, for every x ∈ B, x ∈ Ux ⊆ Ax, where Ux ∈ T. Then (Ax)x∈B is a covering of B and (Ux)x∈B is an open covering of B.
Example 2.21 Clearly {] − n, n[}n∈N is a countable open covering of R, when it's equipped with the topology U.
Definition 2.15 (Compact set and sequentially compact set) Let (X, T) be a topological space and B ⊆ X. We say that B is
1. compact iff for every open covering of B, (Aλ)λ∈Λ, there is I ⊆ Λ finite such that (Aλ)λ∈I is also a covering of B;
2. sequentially compact iff every sequence in B has a subsequence which converges to a point of B.
In other words, a set B is compact iff from every open covering it's possible to extract a finite subcovering of B. We say that a set A is relatively compact iff Ā is compact.
Remark 2.20 (Bolzano-Weierstrass property) Condition 2 in Definition 2.15 is usually referred to as the Bolzano-Weierstrass property (or B-W property). Later we will see that in a metric space a set is compact iff it's sequentially compact.
Remark 2.21 (Locally compact space) A topological space is said to be locally compact iff every point has a compact neighborhood.
Example 2.22 In the space (R, U) the set B = ]0, 1] is not compact. Observe, for example, that {]1/n, 1 + 1/n[}n∈N is an open covering of B from which it's not possible to extract a finite subcovering of B.
The following is a basic and important theorem. At least one way to prove it should be known by a math student.
Theorem 2.23 (A finite closed interval is compact) Let a, b ∈ R such that a < b. Then [a, b] is compact in (R, U).
Proof. We have to prove that from every open covering of [a, b], it's possible to extract a finite subcovering. Let (Aλ)λ∈Λ ⊆ U be an open covering of [a, b], generic. Then
∀y ∈ [a, b] : y ∈ ⋃_{λ∈Λ} Aλ.   (2.63)
Let’s define ( M=
) x ∈ [a, b] / ∃F ⊆ Λ : #(F ) < +∞ ∧ [a, x] ⊆
[
Aλ
.
(2.64)
λ∈F
We shall achieve our goal by showing that β = sup(M ) ∈ M and that β = b.
1. Let’s prove that M ̸= ∅. Since [a, a] = {a}, we take F = {λ∗ } where, by (2.63), λ∗ ∈ Λ is chosen in a way that a ∈ Aλ∗ . Then, we have that [ [a, a] ⊆ Aλ , λ∈F
which means that a ∈ M so that M ̸= ∅. 2. M is bounded because M ⊆ [a, b], so that there exists β = sup(M ) ∈ R.
(2.65)
3. Let’s prove that β ∈ M . The idea is to use (2.21), that is, we can find elements of M that are as close as we want to β. Since ∀x ∈ M :
a ≤ x ≤ b,
it follows that a ≤ β ≤ b, i.e., β ∈ [a, b]. Therefore, by (2.63), there is some λ0 ∈ Λ such that β ∈ Aλ0 . Since the open intervals constitute a topological basis for U and Aλ0 ∈ U , we can find c, d ∈ R such that β ∈]c, d[⊆ Aλ0 .
(2.66)
Now, thanks to (2.65) and (2.21), there is x ∈ M such that
c < x < β < d.   (2.67)
Since x ∈ M, there is G ⊆ Λ finite and such that
[a, x] ⊆ ⋃_{λ∈G} Aλ.   (2.68)
Then, by putting F = G ∪ {λ0}, we obtain, by (2.66), (2.67) and (2.68), that
[a, β] ⊆ [a, x] ∪ A_{λ0} ⊆ ⋃_{λ∈F} Aλ,
whence β ∈ M.
4. Let's prove that β = b. Let's proceed by Reduction to Absurdity. So let's assume that
β < b.   (2.69)
Since β ∈ M, there is F ⊆ Λ finite such that
[a, β] ⊆ ⋃_{λ∈F} Aλ.
We choose λ0 ∈ F such that β ∈ A_{λ0}. Since A_{λ0} is open, there exist c, d ∈ R such that
β ∈ ]c, d[ ⊆ A_{λ0}.   (2.70)
By (2.69) and (2.70), we choose z such that
β < z < min{b, d}.   (2.71)
Then we have that
[a, z] ⊆ ⋃_{λ∈F} Aλ,
so that z ∈ M. Then, the condition of supremum of β is contradicted by (2.71), which lets us conclude that β = b.
■
It's not difficult to prove that a finite union of compact sets is compact. Also, in a Hausdorff space, any intersection of compact sets is compact.
Theorem 2.24 (Compact implies closed) Let (X, T) be a Hausdorff topological space. If A ⊆ X is compact, then A is closed.
The proof of this result is required as an exercise at the end of the chapter.
Example 2.23 By using Theorem 2.24, it can be proved that in (R, U) a subset is compact iff it's closed and bounded. Therefore, a subset of R is relatively compact iff it's bounded.
The following result states a very important property of continuous functions: they transfer compact sets into compact sets.
Theorem 2.25 (Continuity and compactness) Let (X, T) and (Y, G) be Hausdorff spaces, and f : X −→ Y continuous. If A ⊆ X is compact, then f(A) is compact.
Proof. Let (Vλ)λ∈Λ ⊆ G be an open covering of f(A), generic. Then
f(A) ⊆ ⋃_{λ∈Λ} Vλ
and, by (1.14) and (1.18), we get
A ⊆ f⁻¹( ⋃_{λ∈Λ} Vλ ) = ⋃_{λ∈Λ} Hλ,
where, by Theorem 2.17,
∀λ ∈ Λ : Hλ = f⁻¹(Vλ) ∈ T.
Therefore, (Hλ)λ∈Λ is an open covering of A. Since A is compact, there is F ⊆ Λ finite such that (Hλ)λ∈F is also an open covering of A. Then, again by (1.14) and (1.18),
A ⊆ ⋃_{λ∈F} Hλ = ⋃_{λ∈F} f⁻¹(Vλ) = f⁻¹( ⋃_{λ∈F} Vλ ),
f(A) ⊆ ⋃_{λ∈F} Vλ,
so that (Vλ)λ∈F is a finite open covering of f(A). Since (Vλ)λ∈Λ was chosen arbitrarily, we have proved that f(A) is compact. ■
Remark 2.22 (Compactness and existence of extremum) Let f : (X, T) −→ (R, U) be continuous. If A ⊆ X is compact, there are xm, xM ∈ A such that
f(xm) = min_{x∈A} f(x),  f(xM) = max_{x∈A} f(x).
This follows from the fact that f (A) is a compact subset of R so that it’s bounded and closed.
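For a concrete instance of Remark 2.22, take the continuous function f(x) = x³ − x on the compact set A = [−2, 2]: the extrema are attained either at interior critical points or at the endpoints. A minimal Maxima sketch (the function and the interval are illustrative choices of ours):
(% i1) f(x) := x^3 - x$
(% i2) solve(diff(f(x), x) = 0, x);                  /* the critical points x = -1/sqrt(3) and x = 1/sqrt(3) */
(% i3) candidates: [-2, -1/sqrt(3), 1/sqrt(3), 2]$
(% i4) float(map(f, candidates));                    /* the minimum -6 is attained at x = -2 and the maximum 6 at x = 2 */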
Given a set X, we say that a family (Fλ)λ∈Λ ⊆ P(X) has the finite intersection property iff
⋂_{λ∈J} Fλ ≠ ∅,
for every finite J ⊆ Λ.
Example 2.24 Let X be a set and (Fn)n∈N ⊆ P(X) \ {∅} decreasing, i.e., Fn ⊇ Fn+1, for every n ∈ N. Then (Fn)n∈N has the finite intersection property.
Theorem 2.26 (Compactness by the finite intersection property) Let (X, T) be a Hausdorff space. Then the following assertions are equivalent:
1. X is compact;
2. if (Fλ)λ∈Λ is a family of closed sets having the finite intersection property, then we have that
⋂_{λ∈Λ} Fλ ≠ ∅.
Proof. i) Let’s reason by contradiction. So let’s assume that X is compact and that (Fλ )λ∈Λ is\ a family of closed sets having the finite intersection property but for which Fλ = ∅. Then λ∈Λ
!c c
X=∅ =
\
Fλ
=
λ∈Λ
[
Fλc ,
λ∈Λ
so that (Fλc )λ∈Λ is an open covering of X. Since X is compact, there c is J ⊆ [ Λ finite and such that (Fλ )λ∈J is an open covering of X, i.e., c X= Fλ . Then λ∈J
!c c
∅=X =
[ λ∈J
Fλc
=
\
Fλ ,
λ∈J
which contradicts the finite intersection property of (Fλ )λ∈Λ . ii) Let’s reason by contradiction. So let’s assume that point 2 holds and that X is not compact. Then there is an open covering of X, say (Uλ )λ∈Λ , from which it’s not possible to extract a finite subcovering, i.e. [ ∀J ⊆ Λ : #(J) < +∞ ⇒ Uλ ̸= X. (2.72) λ∈J
Now, (Uλc )λ∈Λ is a family of closed sets and, by (2.72), \ ∀J ⊆ Λ : #(J) < +∞ ⇒ Uλc ̸= ∅, λ∈J
64
Chapter 2. An introduction to topological spaces c which means that property. Therefore, \ (Uλ )λ∈Λ has the finite intersection [ c we have that Uλ ̸= ∅ or, equivalently, Uλ ̸= X, which contradicts λ∈Λ
λ∈Λ
the condition of covering of (Uλ )λ∈Λ . ■
2.9. Problems Problem 2.1 Let X be a non-void set and a ∈ X. Prove that Λ = {A ⊆ X/ A = ∅ ∨ A ∋ a}. is a topology on X. Problem 2.2 Let X be a non-void set and a ∈ X. We define T = {A ⊆ X/ A = ∅ ∨ #(Ac ) < +∞}. 1. Prove that T is a topology on X. 2. Assume now that X = R and for each n\∈ N we put An = {n}c . Prove that, for every n ∈ N, An ∈ T and that An ∈ / T. n∈N
Problem 2.3 Let (Tλ )λ∈Λ a family of topologies on X. Prove that T =
\
Tλ
λ∈Λ
is also a topology on X. Problem 2.4 Let (Tω )ω∈Ω a family of topologies on X and E =
[
Tω . Prove
ω∈Ω
that TE = sup Tω . ω∈Ω
Problem 2.5 Let’s consider the space (R, U ), where U = TE with E = {]a, b)[ /a < b}. Prove that the following sets are fundamental systems of t ∈ R: F1 = {[t − ϵ, t + ϵ] /
F2 = {]t − ϵ, t + ϵ[ /
ϵ > 0},
ϵ > 0}.
Problem 2.6 Let (X, T ) be a topological space, A ⊆ X and B a topological basis. Prove that A is open iff ∀x ∈ A, ∃V ∈ B :
x ∈ V ⊆ A.
Problem 2.7 Prove that in a topological space (X, T ) the following properties hold for its subsets: A⊆B (A ⊆ B ∧ A ∈ T )
⇒
˚ ⊆ B, ˚ A ˚ A ⊆ B,
⇒ ˚ ˚ ∩ B, ˚ \ A ∩B =A ˚ ˚∪ B ˚⊆A \ A ∪ B.
2.9. Problems
65
Problem 2.8 Let (X, T ) be a Hausdorff space and t \ ∈ X. Prove \that if F1 and U. V = F2 are fundamental systems of t, then we have that V ∈F1
U ∈F2
Problem 2.9 Consider the topological space (R, U ). 1. Prove that t ∈ A iff ∀ϵ > 0 :
]t − ϵ, t + ϵ[∩A ̸= ∅.
2. Prove that if A is bounded, then inf(A) ∈ A and sup(A) ∈ A. Problem 2.10 Let (X, T ) be a topological space and A ⊆ X. Prove that A is closed iff A = A. Problem 2.11 Let (X, T ) be a topological space and suppose that U is open and A ⊆ X. Prove that U ∩ A = ∅ iff U ∩ A = ∅. Problem 2.12 Let (X, T ) be a topological space and A ⊆ X. Prove that the following properties hold: ˚ 1. ∂A = A \ A; 2. 3. 4. 5.
A is closed iff A ⊇ ∂A; A is both open and closed iff ∂A ̸= ∅; A = A ∪ ∂A; ∂A is closed.
Problem 2.13 Prove that Q is dense in (R, U ). Problem 2.14 Let (X, T ) be a topological space and F ⊆ X closed. Prove that 1. F is nowhere dense iff F c is dense in X; 2. ∂F is nowhere dense. Problem 2.15 Let (X, T ) be a topological space and A, B, C ⊆ X. Assume that A is dense in B and B is dense in C. Prove that A is dense in C. Problem 2.16 Let (X, T \ ) be a Hausdorff space, x ∈ X and F a fundamental system of x. Prove that V = {x}. V ∈F
Problem 2.17 Prove that the space (R, U ) is regular. Problem 2.18 Is it possible to prove Theorem 2.10 by replacing the condition T2 by T1 on the topological space? Problem 2.19 In the topological space (R, U ), prove that the sequence (xn )n∈N converges to L ∈ R, as n → +∞: xn = β n , L = 0, where β ∈] − 1, 1[; n2 + 1 , L = 1; n2 xn = a1/n , L = 1, where a ∈]0, +∞[. xn =
66
Chapter 2. An introduction to topological spaces
Problem 2.20 Let (X, T ) be a topological space and S ⊆ X. Prove that 1. if B is a basis of T , then BS = {V ∩ S / V ∈ B} is a basis of TS ; 2. if F is a fundamental system of t ∈ S in the space (X, T ), then FS = {V ∩ S / V ∈ F} is a fundamental system of t, in the space (S, TS ); 3. for A ⊆ S, clS (A) = clX (A) ∩ S. Problem 2.21 Let (X, T ) be a topological space and S ⊆ X. Prove that if X is regular then S is regular. Problem 2.22 Let’s consider the topological space (R, U ) and the subspace S = [a, b], a < b. Consider FS = {[b − ϵ, b] / 0 ≤ ϵ ≤ b − a}. Is FS a fundamental system of neighborhoods of t = b in the space ([a, b], U[a,b] )? Problem 2.23 Let (X, T ) and (Y, G) be topological spaces, f : S ⊆ X −→ Y and x0 ∈ S. Assume that F and C are fundamental systems of x0 and f (x0 ), respectively. Prove that f is continuous in x0 iff ∀ V ∈ C, ∃U ∈ F :
x ∈ U ∩ S ⇒ f (x) ∈ V.
Problem 2.24 Let f : (X, T ) −→ (Y, G) be continuous at x0 ∈ X and g : (Y, G) −→ (Z, W) be continuous at f (x0 ). Prove that g ◦ f : (X, T ) → (Z, W) is continuous at x0 . Problem 2.25 Assume that f : S ⊆ R −→ R is a Lipschitz function, i.e., ∃L > 0, ∀x, y ∈ S : 1. Prove that f is continuous. 2. Prove that ∀x, y ∈ R :
|f (x) − f (y)| < L|x − y|.
||x| − |y|| ≤ |x − y|,
and conclude that R ∋ x 7→ abs(x) = |x| ∈ R is continuous. Problem 2.26 Let (X, T ) and (Y, G) be topological spaces, and f : X −→ Y . Prove that the following assertions are equivalent: 1. 2. 3. 4.
f is continuous; if E ⊆ Y is open, then f −1 (E) is open; if E ⊆ Y is closed, then f −1 (E) is closed; f (A) ⊆ f (A), for all A ⊆ X.
Problem 2.27 Let (X, T ) and (Y, G) be Hausdorff spaces, and f : X −→ Y . Prove that f is continuous at x ∈ X iff ∀(xn )n∈N ⊆ X :
lim xn = x ⇒
n→+∞
lim f (xn ) = f (x).
n→+∞
2.9. Problems
67
Problem 2.28 For any topological space (W, H) and t ∈ W , let’s write A (t) = {A ⊆ W / t ∈ A}. Let (X, T ) and (Y, G) be topological spaces, and f : X −→ Y . Prove that f is continuous at a point c ∈ X iff A ∈ A (c) ⇒ f (A) ∈ A (f (c)). Problem 2.29 For each λ ∈ Λ, let fλ : X −→ Yλ , where (Yλ , Gλ ) is a topological space. We put E = fλ−1 (U ) / λ ∈ Λ, U ∈ Gλ . Prove that the generated topology TE on X is the smallest topology on X for which all the functions fλ , λ ∈ Λ, are continuous. Problem 2.30 Let (Z, H) and ψ : Z −→ X, where X is endowed with the initial topology associated with the family of functions fλ : X → (Y, Gλ ), λ ∈ Λ. Prove that ψ is continuous iff fλ ◦ ψ is continuous, for every λ ∈ Λ. Problem 2.31 Let (X, T ) be a topological space and let ∆ = {(x, x) / x ∈ X}. Prove that X is a Hausdorff space iff ∆ is closed in X × X. Problem 2.32 Prove that the division of real numbers is continuous when R × R and R are provided with the typical topologies. Problem 2.33 Let a, b ∈ R, a < b. Prove that ]a, b[ is not compact in (R, U ). Problem 2.34 Let (X, T ) be a topological space and A ⊆ X compact. Prove that A is closed. Problem 2.35 Prove that in (R, U ) a subset is compact iff it’s closed and bounded. Problem 2.36 Let f : (X, T ) −→ (R, U ) be continuous. Assume that A ⊆ X is compact. Prove that there are xm , xM ∈ X such that ∀x ∈ A :
f (xm ) ≤ f (x) ≤ f (xM ).
Problem 2.37 Let f : [0, 1] −→ R such that ∀n ∈ N, ∃xn ∈ [0, 1] :
f (xn ) > n.
Is f continuous? Problem 2.38 Let (X, T ) be a first countable space, i.e., every point has a countable fundamental system of neighborhoods. Let A ⊆ X. 1. Prove that for each x ∈ X, there is a fundamental system of neighborhoods of x, Cx = {Un / n ∈ N}, such that Un ⊇ Un+1 , n ∈ N. 2. Prove that if a sequence (xn )n∈N ⊆ X is such that xn ∈ Un , for every n ∈ N, then xn → x, as n → +∞. 3. Prove that x ∈ acc(A) iff there exists (xn )n∈N ⊆ A\{x} such that xn −→ x, as n −→ +∞.
3. An introduction to metric spaces 3.1. Definition of metric In our study of Functional Analysis, the next step is the concept of metric space which, as we shall see, is a particular case of a topological space. Definition 3.1 (Metric, metric space) Let X be a non-void set and d : X × X −→ R. We say that d is a metric (or distance function) on X iff the following conditions hold: 1. ∀x, y ∈ X : d(x, y) = 0 ⇐⇒ x = y; 2. (symmetry) ∀x, y ∈ X : d(x, y) = d(y, x); 3. (triangle inequality) ∀x, y, z ∈ X : d(x, y) ≤ d(x, z) + d(z, y). In this case, the pair (X, d) is called a metric space. The real number d(x, y) is referred to as the distance between x and y. An immediate consequence of the definition is that a metric is non-negative: ∀x, y ∈ X :
d(x, y) ≥ 0.
A very useful inequality is stated in the following result. Theorem 3.1 (A very useful inequality) Let (X, d) be a metric space. Then ∀a, b, c, d ∈ X :
|d(a, b) − d(c, d)| ≤ d(a, c) + d(b, d).
(3.1)
In a metric space (X, d), for a given center x0 ∈ X and radius r > 0, we call ball (or open ball) to the set B(x0 , r) = {y ∈ X / d(x0 , y) < r}, closed ball to the set B(x0 , r) = {y ∈ X / d(x0 , y) ≤ r}, sphere to the set S(x0 , r) = {y ∈ X / d(x0 , y) = r}. Definition 3.2 (d-open sets) Let (X, d) be a metric space. We say that a set M ⊆ X is d-open iff ∀x ∈ M, ∃r > 0 :
B(x, r) ⊆ M.
(3.2)
We shall denote by Td the collection of d-open sets of X. It’s clear that any ball is a d-open set. In the following result we prove that any metric space is a topological space. Theorem 3.2 (Topology induced by a metric) Let (X, d) be a metric space. Then Td is a topology on X. Td is referred to as the topology induced by the
69
70
Chapter 3. An introduction to metric spaces metric d.
Proof. 1. It’s clear that ∅ ∈ T and X ∈ T . 2. Now let’s prove that ∀A, B ∈ Td :
A ∩ B ∈ Td .
(3.3)
Let A, B ∈ Td , generic. Let x ∈ A ∩ B, generic. Since A is d-open, there is r1 > 0 such that x ∈ B(x, r1 ) ⊆ A. Since B is d-open, there is r2 > 0 such that x ∈ B(x, r2 ) ⊆ B. Therefore, by taking r = min{r1 , r2 }, we get x ∈ B(x, r) ⊆ A ∩ B. As x was chosen arbitrarily, we have proved that A ∩ B ∈ Td . Since A, B were chosen arbitrarily, we have proved (3.3). 3. Let’s prove that the union of d-open sets [ is also a d-open set. Let (Aλ )λ∈Λ ⊆ Td , generic. Let’s prove that C = Aλ is d-open. Let x ∈ C, generic. λ∈Λ
Then there exists λ0 ∈ Λ such that x ∈ Aλ0 . Since Aλ0 is d-open, there is r > 0 such that x ∈ B(x, r) ⊆ Aλ0 ⊆ C. Since x was chosen arbitrarily, the last proves that C ∈ Td . We conclude by the arbitrariness of (Aλ )λ∈Λ . ■ Remark 3.1 (The balls constitute a topological basis) Thanks to Theorem 3.2, all the topological concepts are relevant for a metric space. In particular, by (3.2), it’s clear that Bd = {B(x, r) / x ∈ X ∧ r > 0} is a basis for Td , so that any d-open set can be built as the union of balls. In the same way, given x ∈ X, the set Bd (x) = {B(x, r) / r > 0} is an open fundamental system of neighborhoods of x. Remark 3.2 (Metrics and topologies) Two different metrics on the same set, can induce the same topology. However, a metric defines only one topology; as a consequence, from now on, we will speak of open sets instead of d-open sets. Example 3.1 (The typical metric on R) The student can easily verify that the function d : R × R −→ R given by d(x, y) = |x − y|, verifies the requirements of Definition 3.1. Let’s observe that any open finite interval is a ball: ]a, b[= B (x0 , r) ,
x0 =
a+b b−a ,r= . 2 2
As a consequence, U = Td , the typical topology of R (see Example 2.2). Example 3.2 (The typical metric on RN ) The student can remember from the basic course of Linear Algebra that the function d : RN × RN −→ R defined by v uN uX d(x, y) = t (xk − yk )2 , k=1
3.1. Definition of metric
71
where x = (x1 , ..., xN ) and y = (y1 , ..., yN ), verifies the requirements of Definition 3.1 and we also have that, U N = Td . Example 3.3 (A topology induced by two different metrics) Let (X, d) be a metric space. The function dˆ : X × X −→ R defined by ˆ y) = min{1, d(x, y)}, d(x, is a new metric on X. It can be proved that Td = Tdˆ. Example 3.4 Let X be a non-void set and f : X −→ R injective. The function d : X × X −→ R given by d(x, y) = |f (x) − f (y)|, is a metric on X. By using this, it can be proved that ρ : R × R −→ R given by x y − , ρ(x, y) = 1 + |x| 1 + |y| is a metric on R. Example 3.5 Let’s recall that RN denotes the set of all real sequences. Let’s consider the function ρ : RN × RN −→ R given by ρ(x, y) =
+∞ X n=1
an
|xn − yn | , 1 + |xn − yn |
where x = (xn )n∈N , y = (yn )n∈N ∈ RN . Here the sequence (an )n∈N ⊆]0, +∞[ is +∞ X chosen verifying an < +∞. Then (RN , ρ) is a metric space. n=1
Example 3.6 (The metric space l1 (R)) Let’s consider the set of real sequences whose series are absolutely convergent: ( ) +∞ X 1 N l (R) = x = (xn )n∈N ∈ R / |xn | < +∞ . n=1
We shall prove that the function d : l1 (R) × l1 (R) −→ R defined by d(x, y) =
+∞ X
|xn − yn |,
n=1
where x = (xn )n∈N and y = (yn )n∈N , is a metric on l1 (R), referred to as the l1 -metric. By default, we consider that the set l1 (R) is equiped with the l1 -metric and we speak of the space l1 (R) or simply l1 . However, sometimes we equip the set l1 (R) with a different metric (e.g. that of the Example 3.5); in this case we are not dealing with the space l1 (R). Let’s prove that d is a metric.
72
Chapter 3. An introduction to metric spaces 1. First, let’s show that ∀x, y ∈ l1 (R) :
d(x, y) = 0 ⇐⇒ x = y.
(3.4)
Let x = (xn )n∈N , y = (yn )n∈N ∈ l1 (R), generic. a) If x = y, we have that xn = yn , for all n ∈ N. Then d(x, y) = +∞ X |xn − yn | = 0. n=1 +∞ X
b) If d(x, y) =
|xn − yn | = 0, then necessarily |xn − yn | = 0, for every
n=1
n ∈ N, so that x = y. Since x, y were chosen arbitrarily, we have proved (3.4). 2. The fact that d(x, y) = d(y, x), for all x, y ∈ l1 (R) follows immediately from the following well-known property: |a − b| = |b − a|, for every a, b ∈ R. 3. Let’s prove that ∀x, y, z ∈ l1 (R) :
d(x, y) ≤ d(x, z) + d(z, y).
(3.5)
1
Let x = (xn )n∈N , y = (yn )n∈N , z = (zn )n∈N ∈ l (R), generic. Then we have that d(x, y) =
=
+∞ X n=1 +∞ X
|xn − yn | ≤ |xn − zn | +
n=1
+∞ X n=1 +∞ X
(|xn − zn | + |zn − yn |) |zn − yn | = d(x, z) + d(z, y).
n=1
Since x, y, z were chosen arbitrarily, we have proved (3.5).
3.2. Definitions of norm and interior product The concept of norm, that we immediately introduce, generalizes the idea of the size of a vector, and it’s the most usual way to provide a metric to a linear space. Definition 3.3 (Norm, normed space) Let V be a vector space and ∥ · ∥ : V −→ R. We say that ∥ · ∥ is a norm on V iff the following conditions hold: 1. ∀u ∈ V : ∥u∥ = 0 ⇐⇒ u = 0; 2. ∀λ ∈ R, ∀u ∈ V : ∥λ · u∥ = |λ| · ∥u∥; 3. (triangle inequality) ∀u, v ∈ V : ∥u + v∥ ≤ ∥u∥ + ∥v∥. In this case, the pair (V, ∥ · ∥) is called a normed space. The real number ∥u∥ is referred to as the norm (or the size) of the vector u. An immediate consequence of the definition is that the norm is non-negative: ∀u ∈ V :
∥u∥ ≥ 0.
The metric induced by the norm ∥ · ∥ is the function d : V × V −→ R given by d(u, v) = ∥u − v∥, that is, the distance between u and v is the size of the vector u − v. Therefore, every normed space becomes automatically a metric space.
3.2. Definitions of norm and interior product
73
Remark 3.3 (Seminorm) Let V be a linear space. We say that ||| · ||| : V −→ R is a seminorm iff we have that 1. ∀v ∈ V : |||v||| ≥ 0; 2. ∀λ ∈ R, ∀v ∈ V : |||λ v||| = |λ| · |||v|||; 3. ∀v, w ∈ V : |||v + w||| ≤ |||v||| + |||w|||. It’s clear that every norm is a seminorm. Example 3.7 (A seminorm which is not a norm) Let r > 0. The functional ||| · ||| : C([−r, r]) −→ R, given by |||f ||| = |f (0)|, is a seminorm which is not a norm. The concept of inner-product, that we immediately introduce, generalizes the idea of scalar product of three-dimensional vectors. Definition 3.4 (Inner-product space) Let V be a vector space and (·, ·) : V × V −→ R. We say that (·, ·) is an inner-product on V iff the following conditions hold. 1. (·, ·) is linear in the first variable: ∀u, v, w ∈ V :
(u + v, w) = (u, w) + (v, w);
∀λ ∈ R, ∀u, v ∈ V :
(λ u, v) = λ (u, v);
2. (symmetry) ∀u, v ∈ V : (u, v) = (v, u); 3. (positivity) ∀u ∈ V : (u, u) ≥ 0; 4. ∀u ∈ V : (u, u) = 0 ⇐⇒ u = 0.. In this case, (V, (·, ·)) is called an inner-product space. The real number (u, v) is referred to as the inner-product of the vectors u and v. By the symmetry, it follows that (·, ·) is also linear in the second variable. An inner-product space becomes a normed space. To prove this, we first state a very important inequality: Lemma 3.1 (Cauchy-Bunyakovsky-Schwarz inequality) Let V be an innerproduct space. Then p p (3.6) ∀u, v ∈ V : |(u, v)| ≤ (u, u) · (v, v).
Proof. 1. Let’s first prove that ∀u, v ∈ V :
(u, u) = (v, v) = 1 ⇒ |(u, v)| ≤ 1.
(3.7)
Let u, v ∈ V with (u, u) = (v, v) = 1, generic. Then 0 ≤ (u − v, u − v) = (u, u) + (v, v) − 2(u, v) = 2(1 − (u, v)), 0 ≤ (u + v, u + v) = (u, u) + (v, v) + 2(u, v) = 2(1 + (u, v)),
(3.8)
74
Chapter 3. An introduction to metric spaces whence −1 ≤ (u, v) ≤ 1. Since u, v were chosen arbitrarily, we have proved (3.7). 2. Now let’s prove (3.6). Let p u, v ∈ V p , generic. If u = 0 or v = 0, then we immediatly have |(u, v)| ≤ (u, u) · (v, v). Then let’s assume that u ̸= 0 and v ̸= 0 so that the vectors 1 u, u ˆ= p (u, u)
vˆ = p
1 (v, v)
v,
are such that (ˆ u, u ˆ) = (ˆ v , vˆ) = 1. Therefore, (3.7) implies that |(ˆ u, vˆ)| ≤ 1, i.e., ! 1 1 u, p v ≤ 1, p (u, u) (v, v) p p which implies that |(u, v)| ≤ (u, u) · (v, v). Since u and v were chosen arbitrarily, we have proved (3.6). ■ Remark 3.4 The inequality (3.6) is frequently referred to simply as CauchySchwarz inequality. For short, we shall call it CBS-inequality or CBS along this document. Theorem 3.3 (Euclidean norm) Let V be p an inner-product space. Then the function ∥ · ∥ : V → R given by ∥u∥ = (u, u), is a norm on V , referred to as the Euclidean norm induced by the inner-product. Proof. Conditions 1 and 2 in Definition 3.3 are immediate. So let’s prove that ∀u, v ∈ V :
∥u + v∥ ≤ ∥u∥ + ∥v∥.
(3.9)
Let u, v ∈ V , generic. By using CBS inequality we have that ∥u + v∥2 = (u + v, u + v) = (u, u)2 + (v, v)2 + 2(u, v) ≤ ∥u∥2 + ∥v∥2 + 2∥u∥ ∥v∥ = (∥u∥ + ∥v∥)2 . Since u and v were chosen arbitrarily, we have proved (3.9).
■
Remark 3.5 With the notation of Theorem 3.3, CBS inequality can be written as ∀u, v ∈ V :
|(u, v)| ≤ ∥u∥ ∥v∥.
Example 3.8 (R has inner-product) In the set R, the usual multiplication is an inner-product that induces the norm given by the absolute value function: √ |x| = x2 , x ∈ R. As expected, the distance between two real numbers, x, y, is given by |x − y|.
3.2. Definitions of norm and interior product
75
Example 3.9 (Euclidean norm on RN ) The student can remember from from the basic course of Linear Algebra that the function (·, ·) : RN × RN −→ R defined by N X (x, y) = xk yk , (3.10) k=1
where x = (x1 , ..., xN ) and y = (y1 , ..., yN ), is an inner-product on RN . This inner-product induces the classical norm: v uN uX x2k , ∥x∥2 = t k=1
and also provides the metric presented in Example 3.2. In this case, the CBS inequality corresponds to the Schwarz inequality !2 ! N ! N N X X X 2 2 xk yk ≤ xk yk . k=1
k=1
k=1
Example 3.10 (The space l2 (R)) Let’s consider the set of real sequences which are square-summable: ) ( +∞ X 2 2 N |xn | < +∞ . l (R) = x = (xn )n∈N ∈ R / n=1
By using the information of Example 3.9, it’s not difficult to show that the map (·, ·) : l2 (R) × l2 (R) −→ R, given by (x, y) =
+∞ X
xn yn ,
n=1
where x = (xn )n∈N and y = (yn )n∈N , is an inner-product on l2 (R). By default, we consider that l2 (R) is equipped with this inner-product. The following result presents a way to identify if a norm is an Euclidean norm, that is, if it’s induced by some inner-product. Theorem 3.4 (Parallelogram equality) Let (V, ∥ · ∥) be a normed space. Then ∥ · ∥ is induced by an inner-product iff the parallelogram equality holds: ∀u, v ∈ V : ∥u + v∥2 + ∥u − v∥2 = 2 ∥u∥2 + ∥v 2 ∥ . The proof is straightforward and is required as an exercise at the end of the chapter. Example 3.11 (Non-Euclidean norms on RN ) On RN the following formulas N X define non-Euclidean norms: ∥x∥0 = max |xk | and ∥x∥1 = |xk |, where k=1,...,n
x = (x1 , ..., xN ).
k=1
76
Chapter 3. An introduction to metric spaces
Example 3.12 (l1 (R)-norm) In Example 3.6 we introduced the l1 -metric on the space of real sequences whose series are absolutely convergent, l1 (R). The l1 metric is induced by the l1 -norm, ∥ · ∥ : l1 (R) −→ R, given by ∥x∥ =
+∞ X
|xn |,
n=1
where x = (xn )n∈N . Since this norm does not verify the parallelogram equality, it is not induced by an inner-product. Remark 3.6 (Subspaces) Any linear subspace of a normed space (or inner-product space) is a subspace whenever it’s equipped with the corresponding restriction of the norm (or the inner-product). Example 3.13 (Bounded real functions) Let A be a non-void set. We consider the set B(A) = {u ∈ F (A) / u is bounded}. Here, F (A) is the functional space which was introduced in Example 1.8. Then v ∈ B(A) iff ∃M > 0, ∀x ∈ A :
|f (x)| < M.
It’s easy to check that B(A) is a linear subspace of F (A). Let’s prove that the map ∥ · ∥∞ : B(A) → R, given by ∥u∥∞ = sup |u(x)|, x∈A
is a norm on B(A). Proof. 1. Let’s prove that ∀u ∈ B(A) :
∥u∥∞ = 0 ⇐⇒ u = 0.
(3.11)
Let u ∈ B(A), generic. a) Let’s assume that u = 0. Then u(x) = 0, for every x ∈ A, so that ∥u∥∞ = 0. b) Let’s assume ∥u∥∞ = 0. Then ∀y ∈ A :
0 ≤ |u(y)| ≤ sup |u(x)| = 0, x∈A
whence, u(y) = 0, for every y ∈ A, i.e., u = 0. Since u was chosen arbitrarily, we have proved (3.11). 2. Let’s prove that ∀λ ∈ R, ∀u ∈ B(A) :
∥λ · u∥∞ = |λ| · ∥u∥∞ .
(3.12)
Let λ ∈ R and u ∈ B(A), generic. Then, for x ∈ A, we have |λ · u(x)| = |λ| |u(x)| so that ∥λ u∥∞ = sup |λ · u(x)| = |λ| sup |u(x)| = |λ| ∥u∥∞ . x∈A
x∈A
Since u was chosen arbitrarily, we have proved (3.12).
3.2. Definitions of norm and interior product
77
3. Let’s prove that ∀u, v ∈ B(A) :
∥u + v∥∞ ≤ ∥u∥∞ + ∥v∥∞ .
(3.13)
Let u, v ∈ B(A), generic. Then, for x ∈ A, we have |(u + v)(x)| = |u(x) + v(x)| ≤ |u(x)| + |v(x)| so that ∥u + v∥∞ = sup |(u + v)(x)| ≤ sup (|u(x)| + |v(x)|) x∈A
x∈A
≤ sup |u(x)| + sup |v(x)| = ∥u∥∞ + ∥v∥∞ . x∈A
x∈A
Since u and v were chosen arbitrarily, we have proved (3.13). ■ Example 3.14 (The space C([a, b])) Let a, b ∈ R such that a < b. The student can remember from the Calculus course that given a function u ∈ C([a, b]), there are points t1 , t2 ∈ [a, b] such that ∀t ∈ [a, b] :
u(t1 ) ≤ u(t) ≤ u(t2 ).
Therefore, by taking M = max{|u(t1 )|, |u(t2 )|}, it holds ∀t ∈ [a, b] :
|u(t)| < M,
i.e., u is bounded. Since u was arbitrary, we have shown that C([a, b]) ⊆ B([a, b], R), where we are using the notation of Example 3.13. Actually, by Example 1.10, we have that C([a, b]) is a linear subspace of B([a, b], R). By default, it’s assumed that C([a, b]) = (C([a, b]), ∥ · ∥∞ ), and we speak of the space C([a, b]). Then, by default, C([a, b]) will be equipped with the infinity norm or L∞ -norm, which is that inherited from B([a, b], R). It’s easy to check that the infinity norm is not induced by an inner-product. We shall speak of the set or linear space C([a, b]) whenever it’s not equipped with the ∥ · ∥∞ -norm. e 2 ([a, b])) Let a, b ∈ R such that a < b. The stuExample 3.15 (The space L dent can remember from the Calculus course that any function v ∈ C([a, b]) is Riemann integrable in [a, b]. Since the composition of continuous functions is also a continuous function, v 2 also belongs to C([a, b]). It’s easy to check that (·, ·)2 : C([a, b]) × C([a, b]) −→ R, given by Z b (u, v)2 = u(t) v(t) dt, a
is an inner-product, usually referred as the L2 -product. The corresponding norm is usually referred to as the L2 -norm or the quadratic norm: !1/2 Z b
|u(t)|2 dt
∥u∥2 =
.
a
e 2 ([a, b]) = (C([a, b]), ∥ · ∥2 ). We shall use the notation L
78
Chapter 3. An introduction to metric spaces
Example 3.16 Let’s consider the functions f : [0, π] −→ R and g : [0, π] −→ R, given by (% i1) f(x):= sin(2*x); (% i2) g(x):=x; As we know, the functions f and g are continuous so that they belong to the linear space C([0, π]). We want to compute the distance between f and g in the norms ∥ · ∥∞ and ∥ · ∥2 . In the following picture, let’s see the value ∥f − g∥∞ as the y-highest value: (% i3) plot2d(abs(f(x)-g(x)),[x,0,%pi]);
Figure 3.1.: The function [0, π] ∋ x 7−→ |f (x) − g(x)| ∈ R.
This means that ∥f − g∥∞ = sup |f (x) − g(x)| ≈ 3.5 x∈[0,π]
In this particular case, it’s possible to analitically find the exact value of the L∞ distance between f and g. For this, by seeing the Figure 3.1, we have ∥f − g∥∞ = |(f − g)(x0 )|, where x0 is the critical value of the function f − g, which belongs to the interval ]π/2, π[. We find that the critical points of f − g are (% i4) x3: %pi/6; x4: (5*%pi)/6; π 6
(x3)
5π 6
(x4)
Then the exact value of ∥f − g∥∞ is (% i5) D: max(abs(f(x3)-g(x3)),abs(f(x4)-g(x4))); √ 5π 3 + 6 2
(D)
3.3. Basic properties of metric spaces
79
Figure 3.2.: The function [0, π] ∋ x 7−→ |f (x) − g(x)|2 ∈ R.
Refer now to Figure 3.3. The L2 -distance of the functions f and g is the square root of the area below the curve with formula |f (x) − g(x)|: (% i6) plot2d((abs(f(x)-g(x)))ˆ2,[x,0,%pi]); The exact value of ∥f − g∥2 is (% i7) D2: sqrt(integrate((f(x)-g(x))ˆ2,x,0,%pi)); r 2π 3 + 9π 6
(D2)
Finally, let’s observe that the values of ∥f − g∥∞ and ∥f − g∥2 are approximately (% i8) D, numer; D2, numer; 3.484019281775933
(% o8)
3.87915126548123
(% o9)
so that ∥f − g∥∞ < ∥f − g∥2 .
3.3. Basic properties of metric spaces Let A be a subset of the metric space (X, d). The distance from a point x0 ∈ X to the set A is defined by d(x0 , A) = inf d(x0 , y). y∈A
The distance between two subsets of X, A and B, is defined by d(A, B) = inf{d(x, y) / x ∈ A ∧ y ∈ B}. The diameter of the set A is given by δ(A) = sup d(x, y). x,y∈A
A set is said to be bounded iff its diameter is finite, i.e., if it can be included in a ball.
80
Chapter 3. An introduction to metric spaces
Remark 3.7 Let (X, d) be a metric space. It’s not difficult to show that ∀x, y ∈ X : |d(x, A) − d(y, A)| ≤ d(x, y), ∀A, B ∈ X : A ⊆ B ⇒ δ(A) ≤ δ(B). Since the balls form a topological basis for Td , we have the following characterizations: x ∈ int(A) ⇐⇒ ∃r > 0 : B(x, r) ⊆ A ⇐⇒ d(x, Ac ) > 0; x ∈ A ⇐⇒ ∀r > 0 : B(x, r) ∩ A ̸= ⇐⇒ d(x, A) = 0; x ∈ acc(A) ⇐⇒ ∀r > 0 : (B(x, r) \ {x}) ∩ A ̸= ∅; x ∈ ∂A ⇐⇒ (d(x, A) = 0 ∧ d(x, Ac ) = 0).
Theorem 3.5 (Metric implies Hausdorff) Let (X, d) be a metric space. Then (X, Td ) is Hausdorff. Idea of the proof. Metric spaces are less abstract than the topological ones; in particular, some geometric intuition can be used. To prove the theorem we just have to see Figure 2.1 to get a clue of what has to be done. Proof. Let x, y ∈ X, x ̸= y, generic. Let’s choose some r ∈]0, d(x, y)/2[. Then B(x, r) ∈ N (x) and B(y, r) ∈ N (y), and B(x, r) ∩ B(y, r) = ∅. In fact, if there were some z ∈ B(x, r) ∩ B(y, r), we would have r < d(x, y) ≤ d(x, z) + d(z, y)
0, ∃N ∈ N :
n > N ⇒ d(x, xn ) < ϵ.
(3.14)
3.3. Basic properties of metric spaces
81
Proof. The result is an immediate consequence of Theorem 2.11 having in mind that F = {B(x, ϵ) / ϵ > 0} is a fundamental system of neighborhoods of x. ■ Remark 3.9 Observe that (3.14) is quite similar to what was shown in Example 2.10 for the space of real numbers. By using (3.14) is quite easy to show that a convergent sequence is bounded. Corollary 3.2 (Continuity in terms of a metric) Let (X, d) and (Y, ρ) be metric spaces, x0 ∈ X and f : X −→ Y . Then f is continuous at x0 iff ∀ϵ > 0, ∃δ > 0 :
d(x, x0 ) < δ ⇒ ρ(f (x), f (x0 )) < ϵ.
(3.15)
Proof. The result is an immediate consequence of Theorem 2.16 noticing that F1 = {B(x, δ) / δ > 0} and F2 = {B(f (x), ϵ) / ϵ > 0} are fundamental systems of neighborhoods of x and f (x), respectively. ■ Remark 3.10 Let (X, d) and (Y, ρ) be metric spaces, x0 ∈ S ⊆ X and f : S −→ Y . Then f is continuous at x0 iff ∀ϵ > 0, ∃δ > 0 :
(x ∈ S ∧ d(x, x0 ) < δ) ⇒ ρ(f (x), f (x0 )) < ϵ.
Given two metric spaces (X, d) and (Y, ρ) there are many ways to define a metric η on X ×Y in such a way that the induced topology, Tη , coincides with the product topology (see Section 2.7). The easiest choice is η : (X × Y ) × (X × Y ) → R given by η ((x1 , y1 ), (x2 , y2 )) = d(x1 , x2 ) + ρ(y1 , y2 ). In this way, it’s possible to speak of the metric product space (X n , d(n) ) where for x = (x1 , x2 , ..., xn ), y = (y1 , y2 , ..., yn ) ∈ X n : d(n) (x, y) =
n X
d(xk , yk ).
k=1
Theorem 3.6 (Continuity of a metric) Let (X, d) be a metric space. Then d is a continuous function on (X 2 , d(2) ). This result is required as an exercise at the end of the chapter. Remark 3.11 (Density in a metric space) Let (X, d) be a metric space and A, B ⊆ X. It’s not difficult to show that A is dense in B iff ∀x ∈ B, ∀ϵ > 0, ∃x0 ∈ A :
d(x, x0 ) < ϵ.
Definition 3.5 (Separable metric space) We say that a metric space (X, d) is separable iff there exists A ⊆ X, countable and dense. Example 3.17 The spaces R and RN with the typical metrics (see Examples 3.1 and 3.2) are separable.
82
Chapter 3. An introduction to metric spaces Proposition 3.1 Let (X, d) be a metric space and A ⊆ X. If X is separable then so is A.
Proof. Since X is separable, there is B = {xn / n ∈ N} dense in X. Then C = B ∩ A is dense in A. ■
3.4. Complete metric spaces. Baire’s theorem. Let (X, d) be a metric space. We say that (xn )n∈N ⊆ X is a Cauchy sequence (or fundamental sequence) iff ∀ϵ > 0, ∃N ∈ N :
n, m > N ⇒ d(xn , xm ) < ϵ.
Theorem 3.7 (Convergent implies Cauchy) Let (X, d) be a metric space. If (xn )n∈N ⊆ X is convergent in X then it’s a Cauchy sequence. Proof. Let’s assume that (xn )n∈N ⊆ X is convergent to some x ∈ X, i.e., ∀α > 0, ∃P ∈ N :
p > P ⇒ d(xp , x) < α.
(3.16)
n, m > N ⇒ d(xn , xm ) < ϵ.
(3.17)
We have to prove that ∀ϵ > 0, ∃N ∈ N :
Let ϵ > 0, generic. By (3.16), for α ∈]0, ϵ/2[ there is P ∈ N such that p > P implies that d(xn , x) < α. Then, by choosing N = P , we have for n, m > N that d(xn , xm ) ≤ d(xn , x) + d(x, xm ) < 2α < ϵ. Since ϵ was chosen arbitrarily, we have proved (3.17). ■ In Theorem 3.7 is essential that the limit of the sequence is an element of the space. As it will be shown in Example 3.18, there are Cauchy sequences that do not converge. Definition 3.6 (Complete metric space) We say that a metric space (X, d) is complete iff every Cauchy sequence is convergent in X. A subset A of X is complete if (A, d|A×A ) is complete. In this case, every Cauchy sequence of elements of A converges to an element of A. Example 3.18 (Q is incomplete, R is complete) Let’s consider the metric space (Q, d) where d(r, s) = |r − s|, r, s ∈ Q. Let x = 0.a1 a2 a3 ... be a irrational number, where ak ∈ {0, 1, ..., 9}, k ∈ N. Let’s consider the rational numbers x1 = 0.a1 , x2 = 0.a1 a2 , x3 = 0.a1 a2 a3 , x4 = 0.a1 a2 a3 a4 , and so on. It’s easy to verify that (xn )n∈N is a Cauchy sequence, but it’s not convergent since x ∈ / Q. Therefore, (Q, d) is not complete. On the other hand, as the student can remember from his Calculus course, the set R, equipped with the metric induced by the absolute value function, is a complete metric space.
3.4. Complete metric spaces. Baire’s theorem.
83
Theorem 3.8 (Closed set in a complete space) Let (X, d) be a complete metric space and A ⊆ X. Then A is closed iff A is complete. Proof. i) Let’s assume that A is closed. We have to prove that every sequence of elements of A converges to an element of A. Let (xn )n∈N ⊆ A be a Cauchy sequence, generic. Since X is complete, there is x ∈ X such that xn −→ x, as n −→ +∞. Now, by Theorem 2.13, we know that x ∈ A. Since A is closed, we actually have that x ∈ A. Since (xn )n∈N was chosen arbitrarily, we have proved that A is complete. ii) Let’s assume that A is complete. We have to prove that A is closed, i.e., that A ⊆ A. Let x ∈ A, generic. By Theorem 2.13, there is (xn )n∈N ⊆ A such that xn −→ x, as n −→ +∞. Now, Theorem 3.7 implies that (xn )n∈N ⊆ A is a Cauchy sequence and, since A is complete, x ∈ A. Since x was chosen arbitrarily, we have proved that A ⊆ A so that A is closed. ■ The next example shows that the completeness of a space is a property which depends on the metric. On the other hand, the convergence depends on the topology. Example 3.19 (Completeness depends on the metric) Consider the function ρ : R × R → R given by x y ρ(x, y) = − . 1 + |x| 1 + |y| We saw in Example 3.4 that ρ is a metric on R. We have that U = Td (where d is the typical metric) coincides with Tρ . For this, it’s proved that a ball in one of the metrics is open in the other metric and conversely. On the other hand, observe that the sequence (xn )n∈N ⊆ R given by xn = n, is ρ-Cauchy, but it’s not ρ-convergent. Remark 3.12 (Strategy to prove completeness) To prove that a particular metric space (X, d) is complete, we have to show that a generic Cauchy sequence, (xn )n∈N ⊆ X is convergent in X. For this, a usual path is as follows: 1. 2. 3. 4.
fix a good notation to avoid any confusion; construct an element x, the candidate to be the limit of X; prove that x ∈ X; prove that xn −→ x, as n −→ +∞.
Sometimes the steps 3 and 4 are interchanged. Definition 3.7 (Banach and Hilbert spaces) A normed space is a Banach space iff it’s complete whenever it’s equipped with the metric induced by a norm. An inner-product space is a Hilbert space iff it’s a Banach space whenever it’s equipped with the norm induced by an inner-product.
84
Chapter 3. An introduction to metric spaces
Example 3.20 (RN is a Hilbert space) In Example 3.9, we saw that RN is an inner-product space when it’s equipped with the usual inner-product: (x, y) =
N X
ψk ηk ,
k=1
where x = (ψ1 , ..., ψN ) and y = (η1 , ..., ηN ). Let’s apply the strategy mentioned in Remark 3.12 to prove that RN is complete so that it’s a Hilbert space. Proof. Let (xn )n∈N ⊆ RN be a Cauchy sequence, generic. Then, ∀ϵ > 0, ∃M ∈ N : v uN 2 uX (n) (m) ψk − ψk < ϵ, m, n > M ⇒ ∥xn − xm ∥ = t
(3.18)
k=1 (n)
(n)
(n)
where, for each n ∈ N, we have written xn = (ψ1 , ψ2 , ..., ψN ). We have to prove that (xn )n∈N is convergent. (n)
1. Let’s prove that, for each k0 ∈ IN , the sequence (ψk0 )n∈N ⊆ R is conver(n)
gent. Since R is complete, it’s enough to prove that (ψk0 )n∈N is a Cauchy sequence in R. Let k0 ∈ IN , generic. We have to show that (n) (m) ∀ϵ > 0, ∃M ∈ N : m, n > M ⇒ ψk0 − ψk0 < ϵ.
(3.19)
Let ϵ > 0, generic. By (3.18), there is M ∈ N such that n, m > M implies that v N r 2 2 u uX (n) (m) (n) (m) (n) (m) ψk − ψk ψk0 − ψk0 ≤t ψk0 − ψk0 = k=1
= ∥xn − xm ∥ < ϵ. Since ϵ was chosen arbitrarily, we have proved (3.19). Since R is complete and k0 was chosen arbitrarily, we have proved that ∀k0 ∈ IN , ∃ψk0 ∈ R :
(n)
lim ψk0 = ψk0 .
n→+∞
(3.20)
2. Let’s prove that xn −→ x, as n −→ +∞, where x = (ψ1 , ψ2 , ..., ψN ) ∈ RN , i.e., let’s prove that ∀ϵ > 0, ∃M > 0 :
n > M ⇒ ∥xn − x∥ < ϵ.
(3.21)
Let ϵ > 0, generic. Let’s choose ϵ 0 < ϵ∗ < √ . N
(3.22)
3.4. Complete metric spaces. Baire’s theorem.
85
By (3.20), for each k ∈ IN , there is Mk ∈ N such that n > Mk implies (n) (3.23) ψk − ψk < ϵ∗ . Now we take M = max{M1 , M2 , ..., MN }. Then by (3.22) and (3.23), for n > M we have that v v v uN uN N 2 u uX u u X ϵ2 X (n) t t 2 ∥x − xn ∥ = ψk − ψk < ϵ∗ < t = ϵ. N k=1
k=1
k=1
Since ϵ was chosen arbitrarily, we have proved (3.21). ■ Before introducing a new important example, let’s recall an important result from the course of Calculus. Definition 3.8 (Point convergence, uniform convergence) Let S ⊆ R be an interval and f ∈ F (S). We say that a sequence (fn )n∈N ⊆ F (S) is 1. point convergent to f iff ∀x ∈ S :
lim fn (x) = f (x);
n→+∞
(3.24)
2. uniformly convergent to f iff ∀ϵ > 0, ∃N ∈ N, ∀x ∈ S :
n > N ⇒ |fn (x) − f (x)| < ϵ.
(3.25)
Remark 3.13 (Uniform implies poit convergence) Let’s recall that (3.24) means that ∀x ∈ S, ∀ϵ > 0, ∃N ∈ N :
n > N ⇒ |fn (x) − f (x)| < ϵ.
(3.26)
The difference between (3.26) and (3.25) is that in (3.26) N depends on both x and ϵ, but in (3.25) N depends only on ϵ. In other words, in (3.25) the choice of N is uniform. As a consequence, uniform convergence implies point convergence. Theorem 3.9 (Uniform convergence and continuity) Let S ⊆ R be an interval. If (fn )n∈N ⊆ C(S) is a sequence that converges uniformly to a function f : S → R, then f ∈ C(S). The student can find a proof of this result in [3]. Example 3.21 (C([a, b]) is a Banach space) In Example 3.14 we saw that C([a, b]) is a normed space whenever it’s equipped with the L∞ -norm: ∥u∥∞ = sup |u(t)|. t∈[a,b]
Let’s apply the strategy mentioned in Remark 3.12 to prove that X = (C([a, b]), ∥· ∥∞ ) is complete so that it’s a Banach space.
86
Chapter 3. An introduction to metric spaces
Proof.
Let (un )n∈N ⊆ X be a Cauchy sequence, generic. Then,
∀ϵ > 0, ∃M ∈ N :
m, n > M ⇒ ∥un − um ∥∞ = sup |un (t) − um (t)| < ϵ.
(3.27)
t∈[a,b]
We have to prove that (un )n∈N is convergent. 1. Let’s prove that, for each t0 ∈ [a, b], the sequence (un (t0 ))n∈N ⊆ R is convergent. Since R is complete, it’s enough to prove that (un (t0 ))n∈N is a Cauchy sequence in R. Let t0 ∈ [a, b], generic. We have to show that ∀ϵ > 0, ∃M ∈ N :
m, n > M ⇒ |un (t0 ) − um (t0 )| < ϵ.
(3.28)
Let ϵ > 0, generic. By (3.27), there is M ∈ N such that n, m > M implies that |un (t0 ) − um (t0 )| ≤ sup |un (t) − um (t)| = ∥un − um ∥∞ < ϵ. t∈[a,b]
As ϵ was chosen arbitrarily, we have proved (3.28). Since R is complete and t0 was chosen arbitrarily, we have proved that ∀t0 ∈ [a, b], ∃u(t0 ) ∈ R :
lim un (t0 ) = u(t0 ).
n→+∞
(3.29)
2. By (3.29), we can define a function u : [a, b] −→ R by u(t) = lim un (t). n→+∞
(3.30)
Let’s prove that u ∈ C([a, b]). By (3.30) we can let n → +∞ in (3.27) and obtain ∀ϵ > 0, ∃M ∈ N :
m > M ⇒ sup |u(t) − um (t)| ≤ ϵ, t∈[a,b]
whence ∀ϵ > 0, ∃M ∈ N, ∀t0 ∈ [a, b] :
m>M ⇒
|u(t0 ) − um (t0 )| ≤ sup |u(t) − um (t)| ≤ ϵ, t∈[a,b]
i.e., (un )n∈N converges uniformly to u. Then, by Theorem 3.9, the function u belongs to C([a, b]). 3. Let’s prove that lim un = u. (3.31) n→+∞
Again, by letting n −→ +∞ in (3.27), we get ∀ϵ > 0, ∃M ∈ N : which is equivalent to 3.31.
m > M ⇒ ∥u − um ∥∞ ≤ ϵ,
3.4. Complete metric spaces. Baire’s theorem.
87 ■
e 1 ([a, b]) and L e 2 ([a, b])) Let a, b ∈ Example 3.22 (The incomplete spaces L R such that a < b. It’s easy to check that ∥ · ∥1 : C([a, b]) → R, given by Z
b
|u(t)| dt,
∥u∥1 = a
is a norm, usually referred as the L1 -norm. The spaces e 1 ([a, b]) = (C([a, b]), ∥ · ∥1 ), L e 2 ([a, b]) = (C([a, b]), ∥ · ∥2 ), L are not complete. e 1 ([a, b]) is not comExample 3.23 From Example 3.22, we know that the space L plete. Let’s show graphically this fact by considering the sequence (xm )m∈N ⊆ C([0, 1]) given, for each m ∈ N, by (% i1) a(m):= 1/2 + 1/m; (% i2) x[m](t):= if ta(m) then 1 else m*t-m/2; (% i4) plot2d([x[1](t),x[5](t),x[10](t),x[20](t)],[t,0,1],[y,-0.1,1.4]); See Figure 3.3.
Figure 3.3.: The functions x1 (blue), x5 (red), x10 (green) end x20 (violet).
The sequence (xm )m∈N ⊆ C([0, 1]) is a Cauchy sequence in the norm ∥ · ∥1 and, as m −→ +∞ it comes closer and closer to the function ( 0, if t ∈ [0, 1/2], x(t) = 1, if t ∈]1/2, 1], which is not continuous and, therefore, does not belong to C([0, 1]). To finish this section, let’s state and prove a very useful result. Theorem 3.10 (Baire’s theorem) Let (X, d) be a complete metric space +∞ [ and (Xn )n∈N a sequence of closed nowhere dense subsets of X. Then Xn n=1
88
Chapter 3. An introduction to metric spaces is also nowhere dense, i.e., int
+∞ [
! = ∅.
Xn
(3.32)
n=1
Proof. Let’s write, for n ∈ N, Ωn = Xnc , which is open and dense in X. We +∞ \ shall prove that G = Ωn is dense in X. In fact, in this case we would have n=1
X=G=
+∞ \
+∞ [
Ωn =
i.e.,
!c Xn
Xn
,
n=1
n=1 +∞ [
!c
is dense in X and therefore
n=1
+∞ [
Xn is nowhere dense, implying
n=1
(3.32). Let’s prove that G is dense by showing that ∀U ∈ Td \ {∅} :
U ∩ G ̸= ∅.
(3.33)
Let U ⊆ X be a non-void and open set. We choose x0 ∈ U and r0 > 0 such that B(x0 , r0 ) ⊆ U. Since Ω1 is dense, we can choose x1 ∈ B(x0 , r0 ) ∩ Ω1 and r1 ∈]0, r0 /2[ such that B(x1 , r1 ) ⊆ B(x0 , r0 ) ∩ Ω1 . In this way, for n ∈ N we can find (xn )n∈N ⊆ X and (rn )n∈N ⊆]0, +∞[ such that rn ∈]0, rn−1 /2[ and B(xn+1 , rn+1 ) ⊆ B(xn , rn ) ∩ Ωn+1 . Therefore, (xn )n∈N is a Cauchy sequence. Since X is complete, there is y ∈ X such that xn −→ y, as n −→ +∞, and y ∈ B(xn , rn ),
∀n ∈ N ∪ {0}.
In particular, this shows that y ∈ U ∩ G. Since U was chosen arbitrarily, we have proved (3.33). ■ Remark 3.14 (How to apply Baire’s theorem) Baire’s theorem is usually applied in the following way. Let X ̸= ∅ be a complete [ metric space and (Xn )n∈N a sequence of closed parts of X such that X = Xn . Then there exists some n∈N
n0 ∈ N such that Xn0 is not nowhere dense, i.e., int(Xn0 ) ̸= ∅.
3.5. Isometries. Completion of a metric space.
89
3.5. Isometries. Completion of a metric space. In this section we shall prove that every metric space can be completed. For this, let’s first introduce the concept of isometry. Definition 3.9 (Isometry, isometric spaces) Let (X, d) and (Y, ρ) be metric spaces and T : X −→ Y . We say that T is an isometry iff ∀x, y ∈ X :
ρ (T (x), T (y)) = d(x, y).
If, in addition, T is bijective, we say that X and Y are isometric spaces. Isometric spaces have the same properties as long as they depend on the metric. Theorem 3.11 (Completion of a metric space) Let (X, d) be a metric space. ˆ which has a subspace W such that ˆ d) Then there exist a metric space (X, 1. X and W are isometric; ˆ 2. W is dense in X. ˆ is unique except for isometries. Moreover, X ˜ is any complete metric space having a dense subspace W ˜ isometric with So, if X ˆ and X ˜ are isometric. X, then X Scheme of the proof. We shall repeat, as much as possible, the procedure to complete Q with the usual metric to obtain R. The path considers five steps: 1. 2. 3. 4. 5. 6.
ˆ construct the set X; ˆ construct the metric dˆ on X; ˆ construct an isometry T : X −→ W ⊆ X; ˆ prove that W = X; ˆ is complete; ˆ d) prove that (X, prove the uniqueness except for isometries.
Proof. ˆ Let’s consider the set 1. Construction of the set X. C = {(xn )n∈N ∈ X N / (xn )n∈N is a Cauchy sequence}.
(3.34)
On C we define a relation ∼ by (xn )n∈N ∼ (yn )n∈N
⇐⇒
lim d(xn , yn ) = 0.
n→+∞
(3.35)
It’s clear that ∼ is reflexive, symmetric and transitive so that ∼ is an equivˆ = C / ∼, the set of equivalence alence relation on C . Let’s denote by X ˆ by x classes defined by ∼. Let’s denote the elements of X ˆ, yˆ, etc., so that x ˆ = {(yn )n∈N ∈ C / (xn )n∈N ∼ (yn )n∈N }.
90
Chapter 3. An introduction to metric spaces ˆ ×X ˆ → R by ˆ on X. ˆ We will define dˆ : X 2. Construction of the metric d ˆ x, yˆ) = lim d(xn , yn ). d(ˆ
(3.36)
n→+∞
ˆ generic. We have to check that: Let x ˆ, yˆ ∈ X, a) the limit (3.36) does exist for given representants (xn )n∈N and (yn )n∈N of the classes x ˆ and yˆ, respectively; b) the limit (3.36) does not depend on the representants of the classes x ˆ and yˆ; ˆ c) dˆ is a metric on X. Let’s do it. a) Let’s take (xn )n∈N ∈ x ˆ and (yn )n∈N ∈ yˆ, generic. Since R (with the typical metric) is complete, it’s enough to prove that the real sequence (d(xn , yn ))n∈N is a Cauchy sequence, i.e., that ∀ϵ > 0, ∃N ∈ N :
m, n > N ⇒ |d(xn , yn )−d(xm , ym )| < ϵ. (3.37)
Let ϵ > 0, generic. Since (xn )n∈N ∈ C , there is some N1 ∈ N such that ϵ m, n > N1 ⇒ d(xn , xm ) < . 2 Since (yn )n∈N ∈ C , there is some N2 ∈ N such that m, n > N2 ⇒ d(yn , ym )
0, generic. Let’s take (xn )n∈N ∈ x Let x ˆ∈X ˆ. Since (xn )n∈N is a Cauchy sequence, there is N ∈ N such that m, n ≥ N ⇒ d(xn , xm )
1/2 + 1/m,
3.7. Banach fixed point theorem
95
converges in the space L1 ([0, 1]) to the non-continuous function ( 0, if t ∈ [0, 1/2], x(t) = 1, if t ∈]1/2, 1].
3.7. Banach fixed point theorem The Banach fixed point theorem, also know as the principle of contraction mappings, is an existence and uniqueness theorem that appears in a big number of applications of Functional Analysis. Let’s work on it. Definition 3.11 (Lipschitz continuity, contraction) Let (X, d) and (Y, ρ) be metric spaces and T : X −→ Y . We say that T is Lipschitz continuous iff there is α > 0 such that ∀x, y ∈ X :
ρ(T (x), T (y)) ≤ α d(x, y).
(3.50)
In particular, if α < 1, we say that T is a contraction. In the context of the previous definition, we call Lipschitz factor to the number αT = inf{α > 0 / (3.50) holds}. In particular, if αT < 1, we call it contraction factor. It’s clear that any Lipschitz function is continuous. Let’s recall that given a function A : H −→ H, we say that x ∈ H is a fixed point of A iff A(x) = x. Theorem 3.14 (Banach fixed point theorem) Let (X, d) be a complete metric space and T : X −→ X a contraction. Then T has one and only one fixed point, i.e. ∃!x ∈ X : T (x) = x. Proof. Since T is a contraction, there is some α ∈ (0, 1) such that ∀x, y ∈ X :
d(T (x), T (y)) ≤ α d(x, y).
Let’s take some x0 ∈ X. For n ∈ N let’s put xn = T n (x0 ). Let’s prove that (xn )n∈N is a Cauchy sequence so that it’s convergent to some x ∈ X, as X is complete. This x will be the fixed point we are looking for. Let’s first observe that for m ∈ N, d(xm+1 , xm ) = d(T m (x1 ), T m (x0 )) ≤ αm d(x1 , x0 ).
(3.51)
Then, for m, n ∈ N, n > m: d(xm , xn ) ≤ d(xm , xm+1 ) + d(xm+1 , xm+2 ) + ... + d(xn−1 , xn ) ≤ (αm + αm+1 + ... + αn−1 ) d(x0 , x1 ) = αm
1 − αn−m d(x0 , x1 ) 1−α
≤
αm d(x0 , x1 ), 1−α
96
Chapter 3. An introduction to metric spaces
which clearly shows that
lim
m,n→+∞
d(xm , xn ) = 0, i.e., (xn )n∈N is a Cauchy se-
quence. Let’s denote by x its limit, i.e., lim d(xn , x) = 0. Then, n→+∞
0 ≤ d(x, T (x)) ≤ d(x, xn ) + d(xn , T (x)) ≤ d(x, xn ) + α d(xn−1 , x) −→ 0, as n −→ +∞, whence d(x, T (x)) = 0 so that x = T (x). Now, let’s assume that y ∈ X is such that y = T (y). Then d(x, y) = d(T (x), T (y)) ≤ αd(x, y), so that d(x, y) = 0 and so x = y. ■ Corollary 3.3 (Fixed point via a contractive power) Let (X, d) be a complete metric space and T : X → X such that T m is a contraction for some m ∈ N. Then T has one and only one fixed point. The proof of this corollary is not difficult. It’s required as exercise at the end of the chapter. In applications, it’s not unusual to deal with a mapping which is contractive only on a subset Y of the domain X, a complete metric space. If Y is closed, then it’s also complete so that to apply Theorem 3.14 some additional property is needed to ensure that elements appearing at each iteration remain in Y . Corollary 3.4 (Contraction on a ball) Let (X, d) be a complete metric space and T : X −→ X. Suppose there are x0 ∈ X and r > 0 such that ∀x, y ∈ B(x0 , r) :
d(T (x), T (y)) ≤ α d(x, y),
for some α ∈]0, 1[. Moreover, assume that d(x0 , T (x0 )) < (1 − α)r.
(3.52)
Then the sequence (xn )n∈N ⊆ X given by xn = T n (x0 ), converges to an element x ∈ B(x0 , r), which is the unique fixed point of T in B(x0 , r). The proof is required as an exercise at the end of the chapter. The idea is to use (3.52) to show that xn ∈ B(x0 , r), for all n ∈ N. Remark 3.17 In the statement of Corollary 3.4, it’s possible to replace condition (3.52) by T (B(x0 , r)) ⊆ B(x0 , r). x 1 + . It’s not difficult 2 x to show that T is a contraction such√that T ([1, +∞[) ⊆ [1, +∞[ so that it has a unique fixed point x. Actually, x = 2, see Figure 3.4.
Example 3.25 Let T : [1, +∞[−→ R given by T (x) =
Example 3.26 A function g ∈ C1 (R) such that ∥g ′ ∥∞ = sup |g ′ (x)| < α < 1, x∈R
has, by Theorem 3.14, a unique fixed point. In fact, by using the Mean Value Theorem for derivatives, it can be proved that g is contractive.
3.7. Banach fixed point theorem
97
Figure 3.4.: The intersection between the functions T and [1, +∞[∋ x 7→ Id(x) = x ∈ R determines a fixed point of T .
Example 3.27 (Fredholm equation of the second kind) Let’s consider the problem of finding a function x : [a, b] → R verifying the Fredholm equation of the second kind: Z b
x(t) − µ
K(t, s) x(s)ds = v(t),
t ∈ [a, b],
(3.53)
a
where µ ∈ R and the functions v ∈ C([a, b]) and K ∈ C([a, b] × [a, b]) are prescribed. To take advantage of the completeness property, we shall look for a solution in the Banach space C([a, b]) = (C([a, b]), ∥ · ∥∞ ). Solving (3.53) is equivalent to find a fixed point problem x = T [x], where T : C([a, b]) −→ C([a, b]) is the operator given by b
Z T [x](t) = v(t) + µ
K(t, s) x(s)ds,
t ∈ [a, b].
a
To apply the Banach fixed point theorem, we need T to be contractive. Let’s first observe that since K is continuous and [a, b] × [a, b] is compact, it’s bounded, i.e., there is γ > 0 such that ∀t, s ∈ [a, b] :
|K(t, s)| < γ.
Therefore, for generic elements x, y ∈ C([a, b]): ∥T [x] − T [y]∥∞ = sup |T [x](t) − T [y](t)| t∈[a,b]
Z b = |µ| · sup K(t, s) (x(s) − y(s)) ds t∈[a,b] a Z b ≤ |µ| · sup |K(t, s)| · |x(s) − y(s)| ds t∈[a,b]
a
Z ≤ |µ| γ · sup |x(s) − y(s)| · s∈[a,b]
b
ds a
= |µ| γ (b − a) ∥x − y∥∞ , whence it follows that T is a contraction if |µ| < 1/(γ(b − a)).
98
Chapter 3. An introduction to metric spaces
Example 3.28 (Solving a Fredholm equation of the second kind) Let’s consider the problem of finding a function x ∈ C([0, 1]) verifying the Fredholm equation of the second kind: Z 1 x(t) − µ K(t, s) x(s)ds = v(t), t ∈ [0, 1], (3.54) 0
where (% i1) %mu: 1/2; and the functions v ∈ C([0, 1]) and K ∈ C([0, 1] × [0, 1]) are given by (% i2) v(t):= exp(t); (% i3) K(t,s):= sin(t)*sin(2*s); Then the conditions of Example 3.27 hold. Therefore, the only solution of (3.54) is the fixed point of the operator T : C([0, 1]) −→ C([0, 1]) given by Z 1 T [x](t) = v(t) + µ K(t, s) x(s)ds, t ∈ [0, 1], 0
which in Maxima can be handled as (% i4) T(x,t):= v(t) + %mu * integrate(K(t,s)*depends(x,s),s,0,1); Z 1 T (x, t) := v(t) + µ K (t, s) depends (x, s) ds
(% o4)
0
To start the iteration process, let’s consider x0 ∈ C([0, 1]) given by (% i5) x0(t):=1; Then the first iterations provide (% i6) define(x1(t),float(T(x0,t)[1])); x1(t) := 0.3540367091367856 sin (t) + 2.718281828459045t
(% o6)
(% i7) define(x2(t),float(T(x1,t)[1])); x2(t) := 0.7437279765884749 sin (t) + 2.718281828459045t
(% o7)
(% i8) define(x3(t),float(T(x2,t)[1])); x3(t) := 0.8211236806699073 sin (t) + 2.718281828459045t
(% o8)
By using the Cauchy criterion, we can see the value of ∥x3 − x2 ∥∞ as the highest point in the graphic (% i9) plot2d(abs(x3(t)-x2(t)),[t,0,1]); See Figure 3.5. Let’s get some additional iterations:
3.7. Banach fixed point theorem
99
Figure 3.5.: The distance ∥x3 − x2 ∥∞ is equal to |x3 (1) − x2 (1)|.
(% (% (% (% (%
i10) define(x4(t),float(T(x3,t)[1])); i11) define(x5(t),float(T(x4,t)[1])); i12) define(x6(t),float(T(x5,t)[1])); i13) define(x7(t),float(T(x6,t)[1])); i14) define(x8(t),float(T(x7,t)[1])); x8(t) := 0.8402986055733744 sin (t) + 2.718281828459045t
(% o14)
Again, by using the Cauchy criterion, we can see the value of ∥x8 − x7 ∥∞ as the highest point in the graphic (% i15) plot2d(abs(x8(t)-x7(t)),[t,0,1]); See Figure 3.6.
Figure 3.6.: The distance ∥x8 − x7 ∥∞ is equal to |x8 (1) − x7 (1)|.
Then by observing that the function x8 − x7 , given by (% i16) x8(t)-x7(t); 2.3916507665866510−5 sin (t) is strictly increasing, we find that D = ∥x8 − x7 ∥∞ = x8 (1) − x7 (1):
(% o16)
100
Chapter 3. An introduction to metric spaces
(% i17) D: x8(1)-x7(1), numer; 2.01250472584568710−5
(D)
Then we can consider that x(t) ≈ 0.8402986055733744 sin (t) + 2.718281828459045t ,
t ∈ [0, 1].
See Figure 3.7.
Figure 3.7.: An aproximation to the solution of the Fredholm equation of the second kind (3.54).
Example 3.29 (A nonlinear integral equation) Let’s consider the problem of finding a function x ∈ C([a, b]) verifying the nonlinear integral equation: Z b x(t) − µ K(t, s, x(s)) ds = v(t), t ∈ [a, b], (3.55) a
where µ ∈ R and the functions v ∈ C([a, b]) and K ∈ C([a, b] × [a, b] × R) are prescribed. Let’s assume that there is β > 0 such that ∀t, s ∈ [a, b], ∀w1 , w2 ∈ R :
|K(t, s, w1 ) − K(t, s, w2 )| ≤ β|w1 − w2 |.
Then if |µ| < β/(b − a), by applying Banach fixed point theorem it can be proved that the equation (3.55) has a unique solution x ∈ C([a, b]). To finish this section, let’s consider the initial value problem (IVP) ( x′ (t) = f (t, x(t)), t ∈ [t0 − a, t0 + a], x(t0 ) = x0 .
(3.56)
Picard’s theorem, learned by the student in its introductory course of Ordinary Differential Equations, says, grossly speaking, that if we restrict ourselves to a small neighborhood of the point (t0 , x0 ) ∈ R2 , then there is a unique solution for the IVP. Let’s make explicit this statement and prove it by using the Banach fixed point theorem.
3.7. Banach fixed point theorem
101
Theorem 3.15 (Picard’s theorem for IVP) Let f ∈ C([t0 − a, t0 + a] × [x0 − b, x0 + b]) such that for some k > 0 and all t ∈ [t0 − a, t0 + a] and all s1 , s2 ∈ [x0 − b, x0 + b]: |f (t, s1 ) − f (t, s2 )| ≤ k|s1 − s2 |. Let c > 0 be an upper bound of |f | and b 1 0 < β < min a, , . c k
(3.57)
Then there is one and only one x ∈ C1 ([t0 − β, t0 + β]), which verifies (3.56).
Idea of the proof. We shall use condition (3.57) to show that the IVP has a solution as a fixed point of an operator, defined in a ball of C([t0 − β, t0 + β]), whose formula is Z t T [x](t) = x0 + f (τ, x(τ )) dτ, t0
which comes from the integration in [t0 , t] of (3.56). Observe that x = T [x]
⇒
x ∈ C1 ([t0 − β, t0 + β]).
Concretely this will be an application of Corollary 3.4 with the condition of Remark 3.17.
Proof. 1. To take advantage of the completeness property, we work on the space C([t0 − β, t0 + β]) = (C([t0 − β, t0 + β]), ∥ · ∥∞ ). Then let’s consider the operator T : C([t0 −β, t0 +β]) −→ C([t0 −β, t0 +β]) given by Z t T [x](t) = x0 + f (τ, x(τ )) dτ. t0
2. Let’s consider the constant function x ˆ0 : [t0 − β, t0 + β] → R such that x0 , cβ) is a closed set in C([t0 − β, t0 + β]), x ˆ0 (t) = x0 . Since C˜ = B(ˆ Theorem 3.8 implies that C˜ is complete. ˜ i.e., that there is α ∈]0, 1[ such 3. Let’s prove that T is a contraction on C, ˜ that for all x, y ∈ C: ∥T [x] − T [y]∥∞ ≤ α∥x − y∥∞ .
102
Chapter 3. An introduction to metric spaces ˜ generic. Then, for t ∈ [t0 − β, t0 + β]: Let x, y ∈ C, Z t [f (τ, x(τ )) − f (τ, y(τ ))] dτ |T [x](t) − T [y](t)| = t0 Z t ≤ |f (τ, x(τ )) − f (τ, y(τ ))| dτ t0
Z
t
|x(τ ) − y(τ )| dτ ≤ k · |t − t0 | · ∥x − y∥∞
≤k· t0
≤ α · ∥x − y∥∞ , where 0 < α = kβ < 1. Therefore, ∥T [x] − T [y]∥∞ =
|T [x](t) − T [y](t)| ≤ α · ∥x − y∥∞ ,
sup t∈[t0 −β, t0 +β]
and we conclude by the arbitrariness of x and y. 4. Now, let’s prove the condition mentioned in Remark 3.17 which, in our context, is ˜ ⊆ C. ˜ T (C) (3.58) ˜ Let x ∈ C, generic. For t ∈ [x0 − β, x0 + β] we have that Z t f (τ, x(τ )) dτ ≤ c · |t − t0 | ≤ cβ, |T [x](t) − x ˆ0 (t)| = t0
whence ∥T [x] − x ˆ0 ∥ =
sup
|T [x](t) − x ˆ0 (t)| ≤ cβ,
t∈[t0 −β, t0 +β]
˜ Since x was chosen arbitrarily, we have proved (3.58). so that T [x] ∈ C. 5. By Corollary 3.4, with the condition of Remark 3.17, we have proved that there is one and only one x ∈ C˜ such that x = T [x], i.e. Z t x(t) = x0 + f (τ, x(τ )) dτ, t ∈ [t0 − β, t0 + β], t0
which, in particular, shows that x ∈ C1 ([t0 − β, t0 + β]) because x is the primitive of a continuous function. ■ Remark 3.18 Banach fixed point theorem shows that the Picard iteration Z t xn+1 (t) = x0 + f (τ, xn (τ )) dτ t0
defines a sequence (xn )n∈N ⊆ C([t0 − β, t0 + β]) that ∥ · ∥∞ -converges to x, the solution of the problem (3.56).
3.8. Compact sets in metric spaces The following result is fundamental in Functional Analysis.
3.8. Compact sets in metric spaces
103
Theorem 3.16 (Compactness by B-W property) Let (X, d) be a metric space and A ⊆ X. Then A is compact iff it is sequentially compact (it has the B-W property). To prove Theorem 3.16, we shall need a couple of lemmas, whose demonstrations are required as exercises at the end of the chapter. Lemma 3.2 (Consequence of B-W property) Let (X, d) be a metric space and A ⊆ X having the Bolzano-Weierstrass property. If (Uλ )λ∈Λ is an open covering of A, then there exists ϵ > 0 such that ∀a ∈ A, ∃λ ∈ Λ :
B(a, ϵ) ⊆ Uλ .
In a metric space (X, d) a set A ⊆ X is totally bounded iff ∀ϵ > 0, ∃a1 , ..., an ∈ X :
A⊆
n [
B(ak , ϵ).
(3.59)
k=1
So, A is totally bounded iff given any radius ϵ > 0, we can find a finite number of points a1, a2, ..., an ∈ X such that {B(ak, ϵ) / k = 1, ..., n} is a covering of A.

Remark 3.19 It's clear that a totally bounded set is bounded. However, the opposite is not necessarily true.

Example 3.30 (A bounded set which is not totally bounded) In the Hilbert space l2(R) (see Example 3.10), the closed unit ball B(0, 1) is a bounded set which is not totally bounded. To show this, let's consider the sequence of points (xn)n∈N ⊆ B(0, 1) given by x1 = (1, 0, 0, 0, ...), x2 = (0, 1, 0, 0, ...), x3 = (0, 0, 1, 0, ...), etc. It's clear that for n, m ∈ N, n ≠ m, we have that ∥xn − xm∥ = √2. Then, for 0 < ϵ < √2/2, each ball B(a, ϵ) contains at most one term of the sequence, so no finite family of ϵ-balls can cover B(0, 1).

Lemma 3.3 (B-W property implies total boundedness) Let (X, d) be a metric space and A ⊆ X having the Bolzano-Weierstrass property. Then A is totally bounded.

Proof. [of Theorem 3.16]
i) Let's assume that A is compact. Let (xn)n∈N ⊆ A, generic, and, for p ∈ N, let's write Cp = {xn / n ≥ p}. By the compactness of A, there is a point x ∈ A such that

∀p ∈ N, ∀r > 0 :   B(x, r) ∩ Cp ≠ ∅.    (3.60)

Then we can choose m1 ∈ N such that xm1 ∈ B(x, 1) ∩ C1. In the same way, we can choose (mn)n∈N ⊆ N strictly increasing and a subsequence (xmn)n∈N of (xn)n∈N such that

xm_{n+1} ∈ B(x, 1/(n + 1)) ∩ C_{mn + 1},   n ∈ N.

Then, for every n ∈ N, we have that d(x, xmn) < 1/n, whence it follows, having in mind (3.60), that xmn −→ x ∈ A, as n −→ +∞. Since (xn)n∈N was chosen arbitrarily, we have proved that A is sequentially compact.
ii) Let's assume that A has the Bolzano-Weierstrass property. We have to prove that A is compact, i.e., from every open covering of A we can extract a finite subcovering. Let (Uλ)λ∈Λ be an open covering of A. By Lemma 3.2 there is ϵ > 0 such that for every a ∈ A, B(a, ϵ) ⊆ Uλ for some λ ∈ Λ. By Lemma 3.3, A is totally bounded, i.e., there are points a1, a2, ..., an ∈ X such that

A ⊆ ⋃_{k=1}^{n} B(ak, ϵ).    (3.61)
Now, for each k ∈ {1, ..., n}, we choose λk ∈ Λ such that

B(ak, ϵ) ⊆ Uλk.    (3.62)

From (3.61) and (3.62), it follows that A ⊆ ⋃_{k=1}^{n} Uλk, so that (Uλk)k∈In is
a subcovering of (Uλ )λ∈Λ . Since (Uλ )λ∈Λ was chosen arbitrarily, we have proved that A is compact. ■ The following result is also useful. Theorem 3.17 (Compact implies complete) Let (X, d) be a metric space and A ⊆ X. If A is compact then A is complete. The proof is easy and is required as an exercise at the end of the chapter. Let’s recall that in (R, U ) a set is compact iff it’s closed and bounded. The same happens in (RN , U N ). However, this is not the case in an arbitrary metric space. The conditions “bounded” and “closed” have to be strengthened as the following result states. Theorem 3.18 (Characterization of compact sets) Let (X, d) be a metric space and A ⊆ X. Then A is compact iff A is complete and totally bounded.
Proof.
i) This was already stated in Theorem 3.17 and Lemma 3.3.
ii) Let's assume that A is complete and totally bounded. We have to prove that A is compact. Let's do this by proving that A has the Bolzano-Weierstrass property. Since A is complete, it's enough to show that every sequence of elements of A has a Cauchy subsequence. Let's first remark that A is totally bounded, so that

∀ϵ > 0, ∃a1^(ϵ), a2^(ϵ), ..., a_{nϵ}^(ϵ) ∈ X :   A ⊆ ⋃_{k=1}^{nϵ} B(ak^(ϵ), ϵ).    (3.63)

Let (xn)n∈N ⊆ A, generic. Let's take a sequence (ϵn)n∈N ⊆ ]0, +∞[ such that

lim_{n→+∞} ϵn = 0.    (3.64)

By (3.63), for ϵ1, there is P1 = {a1^(1), a2^(1), ..., a_{n1}^(1)} ⊆ X such that C1 = {B(ak^(1), ϵ1) / k = 1, ..., n1} is a covering of A. Therefore, there is b1 ∈ P1 such that S1 = B(b1, ϵ1) contains infinitely many elements of the sequence (xn)n∈N. Hence, we can extract from (xn)n∈N a subsequence (xn^(1))n∈N ⊆ S1.
For ϵ2, there is P2 = {a1^(2), a2^(2), ..., a_{n2}^(2)} ⊆ X such that C2 = {B(ak^(2), ϵ2) / k = 1, ..., n2} is a covering of A. Therefore, there is b2 ∈ P2 such that S2 = B(b2, ϵ2) contains infinitely many elements of the sequence (xn^(1))n∈N. Hence, we can extract from (xn^(1))n∈N a subsequence (xn^(2))n∈N ⊆ S2.
By continuing the process, for q ∈ N, there is Pq = {a1^(q), a2^(q), ..., a_{nq}^(q)} ⊆ X such that Cq = {B(ak^(q), ϵq) / k = 1, ..., nq} is a covering of A. Therefore, there is bq ∈ Pq such that Sq = B(bq, ϵq) contains infinitely many elements of the sequence (xn^(q−1))n∈N. Hence, we can extract from (xn^(q−1))n∈N a subsequence (xn^(q))n∈N ⊆ Sq.
We take a subsequence of (xn)n∈N given by the "diagonal" (xn^(n))n∈N of the sequences (xn^(k))n∈N, k = 1, 2, .... Then we have that

∀n ∈ N, ∀m ≥ n :   xm^(m) ∈ B(bn, ϵn),

so that

∀n ∈ N, ∀m ≥ n :   d(xn^(n), xm^(m)) ≤ d(xn^(n), bn) + d(xm^(m), bn) ≤ 2ϵn.
The last inequality, together with (3.64), implies that (xn^(n))n∈N is a Cauchy sequence. Since (xn)n∈N was chosen arbitrarily, we have proved that A is compact. ■

The following corollary is an immediate consequence of Theorem 3.18.

Corollary 3.5 (Relative compactness and completeness) Let (X, d) be a complete metric space and A ⊆ X. Then A is relatively compact iff it's totally bounded.

Let's finish this section with a very important result which helps to identify sets that are relatively compact in the space C([a, b]) = (C([a, b]), ∥·∥∞). For this, we need the concept of equicontinuity.

Theorem 3.19 (Ascoli-Arzelà) Let Φ ⊆ (C([a, b]), ∥·∥∞). Then Φ is relatively compact iff it's bounded and equicontinuous:

∀ϵ > 0, ∃δ > 0, ∀x, z ∈ [a, b], ∀ϕ ∈ Φ :   |x − z| < δ ⇒ |ϕ(x) − ϕ(z)| < ϵ.
(3.65)
Remark 3.20 (Continuity on a compact is uniform) Let’s recall that any ψ ∈ C([a, b]) is uniformly continuous: ∀ϵ > 0, ∃δ > 0, ∀x, z ∈ [a, b] :
|x − z| < δ ⇒ |ψ(x) − ψ(z)| < ϵ.
(3.66)
Then, the equicontinuity property, mentioned in Theorem 3.19, means that the choice of δ in (3.66) should be independent of ψ.

Proof.
i) Let's assume that Φ is relatively compact. By Theorem 3.16, Φ has the B-W property. Therefore, by Lemma 3.3, Φ is totally bounded. This implies that Φ is also bounded.
Let's prove the equicontinuity of Φ. Let ϵ > 0, generic. Since Φ is totally bounded, there are ϕ1, ..., ϕn ∈ C([a, b]) such that Φ ⊆ ⋃_{k=1}^{n} B(ϕk, ϵ/3), so that

∀ϕ ∈ Φ, ∃k = kϕ ∈ In, ∀x ∈ [a, b] :   |ϕ(x) − ϕk(x)| ≤ ∥ϕ − ϕk∥∞ < ϵ/3.    (3.67)

Since each ϕk is uniformly continuous (Remark 3.20), for every k ∈ In there is δk > 0 such that

∀x, z ∈ [a, b] :   |x − z| < δk ⇒ |ϕk(x) − ϕk(z)| < ϵ/3.    (3.68)

Let's pick δ = min_{k∈In} δk. Let x, z ∈ [a, b] and ϕ ∈ Φ, generic. Then, whenever |x − z| < δ, we have, by (3.67) and (3.68), that

|ϕ(x) − ϕ(z)| = |[ϕ(x) − ϕkϕ(x)] + [ϕkϕ(x) − ϕkϕ(z)] + [ϕkϕ(z) − ϕ(z)]|
             ≤ |ϕ(x) − ϕkϕ(x)| + |ϕkϕ(x) − ϕkϕ(z)| + |ϕkϕ(z) − ϕ(z)|
             < ϵ/3 + ϵ/3 + ϵ/3 = ϵ.

Since x, z and ϕ were chosen arbitrarily, we have shown that

∀x, z ∈ [a, b], ∀ϕ ∈ Φ :   |x − z| < δ ⇒ |ϕ(x) − ϕ(z)| < ϵ.

Since ϵ was chosen arbitrarily, we have proved (3.65).
ii) Let's assume that Φ is bounded and equicontinuous. By Corollary 3.5, to prove that Φ is relatively compact we need to show that it is totally bounded, i.e., that

∀ϵ > 0, ∃ϕ1, ..., ϕn ∈ C([a, b]) :   Φ ⊆ ⋃_{k=1}^{n} B(ϕk, ϵ),
which can be written as ∀ϵ > 0, ∃ϕ1 , ..., ϕn ∈ C([a, b]), ∀ϕ ∈ Φ, ∃i ∈ In :
∥ϕ − ϕi ∥∞ < ϵ. (3.69)
The idea is to define the functions ϕ1, ..., ϕn as simple as possible by using the properties of boundedness and equicontinuity of Φ. Let's first observe that since Φ is bounded, there is M > 0 such that

∀ϕ ∈ Φ, ∀x ∈ [a, b] :   |ϕ(x)| ≤ ∥ϕ∥∞ ≤ M.
Let ϵ > 0, generic. Since Φ is equicontinuous, there is δ > 0 such that ∀x, z ∈ [a, b], ∀ϕ ∈ Φ :
|x − z| < δ ⇒ |ϕ(x) − ϕ(z)|
0, x ∈ A ⇐⇒ ∀r > 0 : B(x, r) ∩ A ̸= ∅; ⇐⇒ d(x, A) = 0, x ∈ acc(A) ⇐⇒ ∀r > 0 : (B(x, r) \ {x}) ∩ A ̸= ∅, x ∈ ∂A ⇐⇒ (d(x, A) = 0 ∧ d(x, Ac ) = 0). Problem 3.14 Let A a subset of the metric space (X, d) and x ∈ acc(A). Prove that any neighborhood of x contains infinitely many points. Problem 3.15 Let (X, d) be a metric space and A, B ⊆ X. Prove that A is topologically dense in B iff ∀x ∈ B, ∀ϵ > 0, ∃y ∈ A :
d(x, y) < ϵ.
Problem 3.16 Find a metric space (X, d) where B(x, r) does not coincide with B(x, r) for some x ∈ X and r > 0. Problem 3.17 Let (X, d) be a metric space. Consider the function dˆ : X ×X −→ ˆ y) = min{1, d(x, y)}. R, defined by d(x, 1. Prove that dˆ is a new metric on X. 2. Prove that Td = Tdˆ. ˆ complete? 3. Assume that (X, d) is complete. Is (X, d) Problem 3.18 Let X be a non-void set and f : X −→ R injective. Prove that the function d : X × X → R given by d(x, y) = |f (x) − f (y)|, is a metric on X. Problem 3.19 Consider the function ρ : R × R −→ R given by ρ(x, y) = |f (x) − f (y)| where f (t) = t/(1 + |t|). 1. Prove that ρ is a metric on R. 2. Prove that U coincide with Tρ . 3. By using the sequence (xn )n∈N ⊆ R given by xn = n, prove that (R, ρ) is not complete. Problem 3.20 Let A be a subset of the metric space (X, d). Prove that ∀x, y ∈ X :
|d(x, A) − d(y, A)| ≤ d(x, y).
Problem 3.21 Let (X, d) be a metric space, x ∈ X and (xn )n∈N ⊆ X such that ∀ϵ > 0, ∃N ∈ N :
n > N ⇒ d(x, xn ) < ϵ.
Prove that lim xn = x in the topological sense. n→+∞
Problem 3.22 Let (an )n∈N ⊆]0, +∞[ be such that
+∞ X
an < +∞. Let’s consider
n=1
the function ρ : RN × RN → R given by ρ(x, y) =
+∞ X
an
n=1
|xn − yn | , 1 + |xn − yn |
where x = (xn )n∈N , y = (yn )n∈N ∈ RN . 1. Prove that (RN , ρ) is a metric space. 2. Prove that (RN , ρ) is bounded. Problem 3.23 Let (X, d) be a metric space. 1. Prove that ∀a, b, x, y ∈ X :
|d(a, b) − d(x, y)| ≤ d(a, x) + d(b, y).
2. Prove that d is a continuous function on (X 2 , d(2) ). Problem 3.24 Let X be a non-void set and dn a metric on X, for every n ∈ N. +∞ X an < +∞, we define d : X N × X N → R Given (an )n∈N ⊆]0, +∞[ such that n=1
by d(x, y) =
+∞ X
an · min{1, dn (xn , yn )},
n=1
where x = (xn )n∈N , y = (yn )n∈N ∈ X N . Prove that d is a metric on X N . Problem 3.25 Prove that the spaces R and RN with the typical metrics are separable. Problem 3.26 Let (X, d) and (Y, ρ) be metric spaces, f : X −→ Y and x ∈ X. Prove that f is continuous at x iff ∀(xn )n∈N ⊆ X :
lim xn = x ⇒
n→+∞
lim f (xn ) = f (x).
n→+∞
Problem 3.27 Let’s consider the metric space (Q, d) where d(r, s) = |r−s|, r, s ∈ Q. Let x = 0.a1 a2 a3 ...an an+1 ... be an irrational number, where ak ∈ {0, 1, ..., 9}, k ∈ N. For each n ∈ N, consider the rational number xn = 0.a1 a2 a3 a4 ...an . 1. Prove that that (xn )n∈N is a Cauchy sequence but not convergent. 2. Prove that (xn )n∈N converges to x in (R, U ). Problem 3.28 Let (xn )n∈N be a Cauchy sequence in the metric space (X, d). Prove that (xn )n∈N is convergent iff there exists a convergent subsequence. Problem 3.29 Let (X, d) be a metric space and (xn )n∈N ⊆ X such that xn −→ x, as n −→ +∞. Prove that B = {xn / n ∈ N} ∪ {x} is complete.
Problem 3.30 Prove that a metric space (X, d) is complete iff every decreasing sequence (Fn)n∈N of closed subsets such that lim_{n→+∞} δ(Fn) = 0 has a non-void intersection.
Problem 3.31 Let (X, d) be a metric space and g : X −→ X a Lipschitz function.
1. Prove that g is continuous.
2. Prove that if g is a contraction, so is g^n, n ∈ N.
Problem 3.32 Consider the proof of Theorem 3.14.
1. Prove the prior estimate:

∀n ∈ N :   d(xn, x) ≤ (α^n / (1 − α)) · d(x0, x1).    (3.75)

2. Prove the posterior estimate:

∀n ∈ N :   d(xn, x) ≤ (α / (1 − α)) · d(xn−1, xn).
3. In the iteration process, how many steps should be done to achieve an error less than ϵ > 0? Problem 3.33 Let (X, d) be a complete metric space and T : X −→ X such that T m is a contraction for some m ∈ N. Prove that T has one and only one fixed point. Problem 3.34 Let (X, d) be a complete metric space and T : X −→ X. Suppose that there are x0 ∈ X and r > 0 such that ∀x, y ∈ B(x0 , r) :
d(T (x), T (y)) ≤ α d(x, y),
for some α ∈]0, 1[. Moreover, assume that d(x0 , T (x0 )) < (1 − α)r.
(3.76)
1. Show that for all n ∈ N, xn = T^n(x0) ∈ B(x0, r). Hint. Reproduce the proof of Theorem 3.14 and use (3.76) to show that

d(x0, xn) ≤ (1 / (1 − α)) · d(x0, x1) < r.
2. Prove that in B(x0, r), T has one and only one fixed point.
Problem 3.35 Let T : [1, +∞[ −→ R be given by

T(x) = x/2 + 1/x.
1. Prove that T ([1, +∞[) ⊆ [1, +∞[. 2. Prove that T is a contraction and find the contraction factor. 3. Without solving the equation T (x) = x,
x ∈ [1, +∞[,
prove that T has a unique fixed point x.
4. Consider the iteration process defined by xn = T n (x0 ),
(3.77)
for some x0 ∈ [1, +∞[. By using (3.75), estimate how many steps should be done to achieve an error less than some ϵ > 0. 5. By using Maxima, apply (3.77). How many steps are necessary to approximate x with an error less than ϵ = 1 ∗ 10−6 . Problem 3.36 Let g ∈ C1 (R) such that ∥g ′ ∥∞ < α < 1.
(3.78)
1. By using the Mean Value Theorem for derivatives, prove that g is contractive.
2. Prove that g has a unique fixed point.
3. Write, in a precise way, a version of Corollary 3.4 for a function g : I ⊆ R → R which has a bound like (3.78).
Problem 3.37 Let's consider the function f : R −→ R given by f(x) = x³ + x − 1. This function has a unique real root x ≈ 0.6823, which can be written in closed form via Cardano's formula. Let's consider a numerical scheme to approximate the value of x with the form

xn = g(xn−1),
n ∈ N.
(3.79)
1. Is the scheme supported by Banach fixed point theorem? Consider

g1(x) = 1/(1 + x²),   g2(x) = 1 − x³,   g3(x) = √(x/(1 + x²)).

2. By using Maxima, depart from x0 = 1 and apply (3.79) to g1, g2 and g3. What happens with the obtained approximations of the real value x? Compare the results of the three schemes.
Problem 3.38 (Newton's method) Let f ∈ C2([a, b]) and let x̂ ∈ ]a, b[ be a simple root of f. Prove that Corollary 3.4 works well for the function

g(x) = x − f(x)/f′(x),
in some neighborhood of x ˆ. This provides a way to numerically approximate x ˆ. Problem 3.39 (Volterra integral equation) Let’s consider the problem of finding a function x ∈ C([a, b]) verifying the Volterra integral equation: Z t x(t) − µ K(t, s) x(s)ds = v(t), t ∈ [a, b], (3.80) a
where µ ∈ R and the functions v ∈ C([a, b]) and K ∈ C([a, b] × [a, t]) are prescribed. Consider the operator T : C([a, b]) −→ C([a, b]) given by Z t T [x](t) = v(t) + µ K(t, s) x(s)ds. a
1. Prove that for all x, y ∈ C([a, b]) and all t ∈ [a, b]:

|T[x](t) − T[y](t)| ≤ |µ| c (t − a) · ∥x − y∥∞,

where c > 0 is an upper bound of the function |K| in the triangular region [a, b] × [a, t].
2. Prove that for all n ∈ N, all x, y ∈ C([a, b]) and all t ∈ [a, b]:

|T^n[x](t) − T^n[y](t)| ≤ |µ|^n c^n (t − a)^n / n! · ∥x − y∥∞.

3. Prove that for all n ∈ N and all x, y ∈ C([a, b]):

∥T^n[x] − T^n[y]∥∞ ≤ αn ∥x − y∥∞,   where αn = |µ|^n c^n (b − a)^n / n!.

4. Prove that the Volterra integral equation has one and only one solution x ∈ C([a, b]).
5. By using Maxima, find an approximation of x whenever a = 0, b = 1,

v(t) = e^t,   K(t, s) = sin(t) sin(2s),   t, s ∈ [a, b].
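For item 5, a possible Maxima sketch is the following; the value µ = 1 and the use of three fixed-point iterations starting from x0 = v are assumptions made here for illustration, not data given in the problem.

v(t) := exp(t)$
K(t, s) := sin(t)*sin(2*s)$
mu : 1$                                      /* assumed value of mu */
xcur : v(t)$                                 /* starting guess x_0 = v */
for n : 1 thru 3 do
    xcur : v(t) + mu*integrate(K(t, s)*subst(t = s, xcur), s, 0, t)$
expand(xcur);                                /* third iterate of T */

Since some power T^n of the operator is a contraction (items 2-3), these iterates converge in ∥·∥∞ to the unique solution of (3.80).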
Problem 3.40 Let’s consider the problem of finding a function x ∈ C([a, b]) verifying the nonlinear integral equation: Z b x(t) − µ K(t, s, x(s)) ds = v(t), t ∈ [a, b], (3.81) a
where µ ∈ R and the functions v ∈ C([a, b]) and K ∈ C([a, b] × [a, b] × R) are prescribed. Let’s assume that there is β > 0 such that ∀t, s ∈ [a, b], ∀w1 , w2 ∈ R :
|K(t, s, w1 ) − K(t, s, w2 )| ≤ β|w1 − w2 |.
Prove that if |µ| < β −1 /(b − a), then (3.81) has a unique solution. Problem 3.41 Let f : R2 −→ R given by f (t, x) = | sin(x)| + t. 1. Prove that for all t ∈ R, the function f (t, ·) is a Lipschitz function. ∂f 2. Prove that the function is not defined for x = 0. ∂x Problem 3.42 Consider the initial value problem ( x′ (t) = 1 + x2 (t), t ∈ R x(0) = 0.
(3.82)
1. Find the exact solution x of (3.82).
2. By using the appropriate Maxima command, find p17, the Taylor-Maclaurin polynomial of x of degree n = 17.
3. By considering the function x0 : R → R given by x0(t) = 0, apply (with help of Maxima) Picard's iteration to obtain x1, ..., x5.
4. Compute the distance between x and p17 in the space C([−1/2, 1/2]).
5. Compute the distance between xk and p17 in the space C([−1/2, 1/2]), for k = 1, 3, 5.
6. Compute the distance between x and xk in the space C([−1/2, 1/2]), for k = 1, 3, 5.
Problem 3.43 Prove that in the Hilbert sequence space l2(R), the fundamental parallelepiped, Π = {x = (xn)n∈N ∈ l2(R) / ∀n ∈ N : |xn| ≤ 1/2^n}, is a totally bounded set.
Problem 3.44 Let (X, d) be a metric space and A ⊆ X having the B-W property. Let (Uλ)λ∈Λ be an open covering of A. Prove that

∃ϵ > 0, ∀a ∈ A, ∃λ ∈ Λ :
B(a, ϵ) ⊆ Uλ .
Problem 3.45 Let (X, d) be a metric space and A ⊆ X having the B-W property. Prove that for every ϵ > 0, there exist a1 , a2 , ..., an ∈ A such that {B(ak , ϵ) / k = 1, .., n} is a covering of A. Problem 3.46 Let (X, d) be a metric space and A ⊆ X. Prove that if A is compact, then A is complete. Problem 3.47 Let (X, d) be a metric space such that every bounded closed subset is compact. Prove that (X, d) is complete. Problem 3.48 Let (X, d) be a metric space and A ⊆ X. 1. Prove that if A is totally bounded set then it’s bounded. 2. Find an example of a bounded set which is not necessarily totally bounded. Problem 3.49 Let (X, d) be a metric space and A ⊆ X complete and totally bounded. Without using the Bolzano-Weierstrass property, try to prove that A is compact. Problem 3.50 (Generalized Ascoli-Arzel` a theorem) Let (X, d) and (Y, ρ) be two compact metric spaces. Let’s denote by C(X, Y ) the set of continuous functions from X into Y . 1. Prove that the mapping D : C(X, Y ) × C(X, Y ) −→ R, given by D(f, g) = supx∈X ρ(f (x), g(x)), is a metric on C(X, Y ). 2. Let Φ ⊆ C(X, Y ). Prove that Φ is relatively compact iff it’s bounded and equicontinuous: ∀ϵ > 0, ∃δ > 0, ∀x, z ∈ X, ∀ϕ ∈ Φ :
d(x, z) < δ ⇒ ρ(ϕ(x), ϕ(z)) < ϵ.
Hint. Adapt the proof of Theorem 3.19.
4. Banach and Hilbert spaces

4.1. Introduction

In Section 3.2, we already introduced the concepts of norm and inner-product. Let's recall a couple of concepts. Let V be a real linear space. (V, ∥·∥) is a normed space iff the functional ∥·∥ : V → R verifies
1. ∀u ∈ V : ∥u∥ ≥ 0;
2. ∀u ∈ V : ∥u∥ = 0 ⇐⇒ u = 0;
3. ∀λ ∈ R, ∀u ∈ V : ∥λ · u∥ = |λ| · ∥u∥;
4. ∀u, v ∈ V : ∥u + v∥ ≤ ∥u∥ + ∥v∥.
An immediate consequence of this definition is that ∀u, v ∈ V :
| ∥u∥ − ∥v∥ | ≤ ∥u − v∥,
(4.1)
which shows that the norm is Lipschitz continuous. The metric induced by the norm ∥ · ∥ is given by d(u, v) = ∥u − v∥, that is, the distance between u and v is the size of the vector u − v. The following inequality, which was already stated in (3.1) for metric spaces, is very useful: ∀u, v, w, q ∈ V :
| ∥u − v∥ − ∥w − q∥ | ≤ ∥u − w∥ + ∥v − q∥.
Remark 4.1 (Norms on a pivot space) On a linear space V, it's not rare to find several norms ∥·∥1, ..., ∥·∥n. Keep in mind that we are dealing with different normed spaces (V, ∥·∥1), ..., (V, ∥·∥n). In this case, it's usual to call V the pivot space. It can happen that a sequence (un)n∈N ⊆ V converges in the space (V, ∥·∥1) but diverges in the space (V, ∥·∥2). For example, the spaces C([a, b]) = (C([a, b]), ∥·∥∞) and L̃1(a, b) = (C([a, b]), ∥·∥1) are different; e.g., the first is complete and the second is not.
We say that (V, (·, ·)) is an inner-product space iff the functional (·, ·) : V × V → R verifies
1. ∀u, v, w ∈ V : (u + v, w) = (u, w) + (v, w);
2. ∀λ ∈ R, ∀u, v ∈ V : (λ u, v) = λ (u, v);
3. ∀u, v ∈ V : (u, v) = (v, u);
4. ∀u ∈ V : (u, u) ≥ 0.
The norm induced by the inner-product (·, ·) is given by ∥u∥ = √(u, u), that is, the square root of the inner-product of u with itself. The Cauchy-Bunyakovsky-Schwarz inequality, CBS-inequality for short, is given by

∀u, v ∈ V :   |(u, v)| ≤ ∥u∥ · ∥v∥.
Remark 4.2 (Choosing a norm or inner-product) On a functional space, the choice of a norm or an inner-product depends on the particular application we are dealing with. Therefore, there is no such thing as a good or a bad norm.
In an inner-product space V, the angle θ ∈ [0, π] between two vectors u, v ∈ V \ {0} is given by

cos(θ) = (u, v) / (∥u∥ ∥v∥).    (4.2)

Observe that the CBS inequality ensures that −1 ≤ cos(θ) ≤ 1. Two vectors u, v ∈ V are said to be orthogonal, denoted u ⊥ v, iff (u, v) = 0. A set B ⊆ V is said to be an orthogonal set if

∀u, v ∈ B :   u ≠ v ⇒ (u, v) = 0.

B is said to be orthonormal iff

∀u, v ∈ B :   (u, v) = δuv,

where δαβ denotes Kronecker's delta: δαβ = 1 if α = β, and δαβ = 0 if α ≠ β.
Example 4.1 (Angle between vectors) In the space L2([−1, 1]), the inner-product and the norm are given, respectively, by
(% i1) Prod(u,v):= integrate(u(t)*v(t),t,-1,1);
(% i2) N(u):= sqrt(Prod(u,u));
The set B = {e0, e1, e2} ⊆ L2([−1, 1]), where
(% i5) e0(t):= 1/sqrt(2); e1(t):= t*sqrt(3/2); e2(t):= (t^2*3*sqrt(5)-sqrt(5))/(2*sqrt(2));
i.e., e0(t) = 1/√2, e1(t) = t √(3/2), e2(t) = (3√5 t² − √5)/(2√2), is orthonormal. In fact,
(% i8) Prod(e0,e1); Prod(e0,e2); Prod(e1,e2);
gives 0 as result in the three cases. Computing
(% i11) Prod(e0,e0); Prod(e1,e1); Prod(e2,e2);
gives 1 as result in the three cases. Now, let's use (4.2) to compute the L2([−1, 1])-angle between the functions sin and exponential:
(% i12) p: Prod(exp,sin);
(% i13) N1: N(exp);
(% i14) N2: N(sin);
(% i15) %theta: acos(p/(N1*N2)), numer;
(% o15) 1.07949565242777
4.2. Equivalence of norms Let V be a linear space and ∥ · ∥α and ∥ · ∥β two norms on V . We say that a) ∥ · ∥α dominates ∥ · ∥β (or that ∥ · ∥β is dominated by ∥ · ∥α ) iff ∃c > 0, ∀u ∈ V :
∥u∥β ≤ c∥u∥α ;
(4.3)
b) ∥ · ∥α and ∥ · ∥β are comparable iff one of the norms dominates the other; c) ∥ · ∥α and ∥ · ∥β are equivalent iff the norms dominate each other. Therefore, the norms ∥ · ∥α and ∥ · ∥β are equivalent iff ∃c1 , c2 > 0, ∀u ∈ V :
c1 ∥u∥α ≤ ∥u∥β ≤ c2 ∥u∥α .
The following theorem is very useful for dealing with a number of Functional Spaces which have the same pivote linear space. Theorem 4.1 (Topologies of comparable norms) Let V be a linear space. Assume that, on V , ∥ · ∥α is a norm that dominates the norm ∥ · ∥β , i.e., there is c > 0 such that ∀u ∈ V : ∥u∥β ≤ c∥u∥α . (4.4) Then, 1. it holds the relation (V, ∥ · ∥α ) ⊆ (V, ∥ · ∥β );
(4.5)
2. the inclusion mapping Id : (V, ∥ · ∥α ) → (V, ∥ · ∥β ), given by Id(u) = u, is Lipschitz continuous; 3. Tβ , the topology induced by ∥ · ∥β , is weaker than Tα , the topology induced by ∥ · ∥α , i.e., Tβ ⊆ Tα . (4.6)
Proof. 1.&2. From (4.4) it immediatly follows (4.5) as well as ∀u, v ∈ V :
∥Id(u) − Id(v)∥β = ∥u − v∥β ≤ c∥u − v∥α .
(4.7)
3. We will prove this point in two ways. The first is very short because it uses the full power of the topological concepts developed in Chapter 2. The second, which is longer, uses a metric-normed approach, i.e., the tools of Chapter 3. a) Since the identity function, Id : (V, ∥ · ∥α ) → (V, ∥ · ∥β ), is continuous, by Theorem 2.17, we have that ∀A ∈ Tβ :
Id−1 (A) = A ∈ Tα ,
whence, Tβ ⊆ Tα . b) Second way. Since the balls constitute a topological basis of the topology induced by the corresponding metric, to prove (4.7), we need to show that every β-ball is an α-open set. Let u ∈ V and r > 0, generic. The ball Bβ (u, r) is β-open. We have to prove that Bβ (u, r) is α-open, i.e., ∀v ∈ Bβ (u, r), ∃δ > 0 : Bα (v, δ) ⊆ Bβ (u, r). (4.8) Let v ∈ Bβ (u, r), generic. Let’s denote k = ∥v − u∥β < r. Now we take
(4.9)
r−k . c
(4.10)
∥w − v∥α < δ.
(4.11)
0 0, ∀α1 , ..., αn ∈ R : αk uk ≥ c |αk |. (4.12)
k=1
k=1
Remark 4.5 In the context of Lemma 4.1, it’s clear that on the space W = ⟨B⟩, a second norm ∥ · ∥∗ is defined by ∥u∥∗ =
n X
|tk |,
k=1
where u = t1 u1 + ... + tn un . Therefore, Lemma 4.1 states that any norm on ⟨B⟩ inherited from V dominates ∥ · ∥∗ : ∃c > 0, ∀u ∈ W :
∥u∥ ≥ c∥u∥∗ .
(4.13)
Proof. 1. Let’s prove that (4.12) and (4.13) are equivalent to ∃c > 0, ∀u ∈ S∗ : ( where S∗ = S∥·∥∗ =
t1 u1 + ... + tn un /
∥u∥ ≥ c, n X
(4.14) )
|tk | = 1 .
k=1
a) Let’s assume that (4.12) holds. Let u ∈ S∗ , generic. Then u has the n X form u = γ1 u1 + ... + γn un , with |γk | = 1. Then, by (4.12), it k=1
follows that ∥u∥ = ∥γ1 u1 + ... + γn un ∥ ≥ c
n X
|γk | = c.
k=1
Since u was chosen arbitrarily, we have proved (4.14).
Chapter 4. Banach and Hilbert spaces b) Let’s assume that (4.14) holds. Let u ∈ W , generic. Then u has the form u = h1 u1 + ... + hn un . If u = 0, then (4.12) and (4.13) immediately hold. Let’s assume that u ̸= 0 so that there is some j0 ∈ In such that hj0 ̸= 0. Let’s put, for each k ∈ In , tk = hk /
k X
|hj |.
(4.15)
j=1
Then it’s clear that
n X
|tk | = 1. Therefore, by (4.15) and (4.14), we
k=1
have that ∥u∥ = ∥h1 u1 + ... + hn un ∥ =
n X j=1
n n
X
X
|hj | · tk uk ≥ c |hj |.
j=1
k=1
Since u was chosen arbitrarily, we have proved (4.12). 2. Let’s prove (4.14) by Reduction to Absurdity. So let’s assume that ∀c > 0, ∃u ∈ S∗ :
∥u∥ < c.
(4.16)
From (4.16), it follows that there exists a sequence (vm )m∈N ⊆ S ∗ such that lim ∥vm ∥ = 0.
(4.17)
m→+∞
Let’s denote (1)
(1)
(2)
v1 = β1 u1 + β2 u2 + ... + βn(1) un , (m)
v m = β1
(2)
v2 = β1 u1 + β2 u2 + ... + βn(2) un , (m)
u1 + β2
u2 + ... + βn(m) un , etc.
For each m ∈ N, we have that ∥vm ∥∗ =
n X
(m)
|βk
| = 1, so that
k=1
∀m ∈ N, ∀k0 ∈ In :
(m)
|βk0 | ≤ ∥vm ∥∗ = 1. (4.18) (m) Point (4.18) implies that for each k0 ∈ In , βk0 ⊆ [−1, 1]. m∈N (m) Since [0, 1] is compact in R, from β1 we extract a subsequence m∈N (m ) β 1 r1 convergent to some β1 ∈ R. Then vmr1 r ∈N is a r1 ∈N
1
subsequence of(vm )m∈N . (mr1 ) (m ) From β2 we extract a subsequence β2 r2 converr1 ∈N r2 ∈N gent to some β2 ∈ R. Then vmr2 r ∈N is a subsequence of vmr1 r ∈N . 2 1 After n extractions, we obtain vmrn r ∈N , subsequence of (vm )m∈N n such that n X (m ) v m rn = β k rn u k , rn ∈ N, k=1
4.3. Finite-dimensional normed spaces. Weierstrass Approximation Theorem. 123 and ∀k ∈ In :
m rn
lim
rn →+∞
βk
= βk .
(4.19)
Points (4.17) and (4.19) imply that βk = 0, for every k ∈ In , so that n X
|βk | = 0.
(4.20)
k=1
On the other hand, vmrn n X k=1
rn ∈N
|βk | =
is a sequence in S∗ , so that lim
rn →+∞
n X
(mrn )
|βk
| = 1,
k=1
which contradicts (4.20). Then (4.14) is true. ■ As a first consequence of Lemma 4.1, we have that every finite-dimensional normed space is complete: Theorem 4.2 (Finite dimension implies completeness and closedness) Let (V, ∥·∥) be a normed space and W a subspace of V with dim(W ) < +∞. Then W is complete. In particular, W is closed. The proof of this result is required as an exercise at the end of the chapter. In Theorem 4.2, the finite dimension of W is decisive as there are many examples where an infinite-dimensional linear subspace of a normed space is neither closed nor complete. To exemplify this, let’s restate the Weierstrass Approximation Theorem, which the student learned in his basic Numerical Methods course, in the context of the space C([a, b]). Let’s recall that given I ⊆ R, we denote by P(I) the space of real polynomials with domain I, which is an infinite-dimensional linear subspace of C(I). Theorem 4.3 (Weierstrass Approximation Theorem) We have that P([a, b]) = C([a, b]).
(4.21)
Point 4.21 means that ∀f ∈ C([a, b]), ∃(pn )n∈N ⊆ P([a, b]) :
lim ∥f − pn ∥∞ = 0.
n→+∞
In particular, Theorem 4.3 shows that P([a, b]) is a linear subspace of C([a, b]), which is neither closed nor complete. Another consequence of Lemma 4.1 is that in a finite-dimensional normed space, all the norms are equivalent: Theorem 4.4 (Finite dimension implies equivalence of norms) Let V be a linear space with dim(V ) = n, and ∥ · ∥α and ∥ · ∥β two norms on V . Then ∥ · ∥α and ∥ · ∥β are equivalent.
Proof. Let’s take B = {u1 , u2 , ..., un }, a basis of V and consider, as in Theorem 4.1 and Remark 4.5, the norm ∥ · ∥∗ : V → R given by
n n
X
X
∥u∥∗ = αk uk = |αk |.
k=1
∗
k=1
Let’s prove that any norm ∥ · ∥ on V is equivalent to ∥ · ∥∗ . By (4.13), we already know that ∥ · ∥ dominates ∥ · ∥∗ . So it remains to prove that ∥ · ∥∗ dominates ∥ · ∥. For this, let’s consider γ = max ∥uk ∥. Let u ∈ V , generic. We have that k∈In
∥u∥
=
n n
X
X
αk uk ≤ ∥αk uk ∥
k=1
=
n X
k=1
|αk | · ∥uk ∥ ≤ γ
k=1
n X
|αk | = γ∥u∥∗ ,
k=1
which proves that ∥ · ∥∗ dominates ∥ · ∥ because u was chosen arbitrarily. By what we have just proved, ∥ · ∥α ∼ ∥ · ∥∗ and ∥ · ∥β ∼ ∥ · ∥∗ . Therefore, by Remark 4.3, ∥ · ∥ α ∼ ∥ · ∥β . ■ Example 4.3 In Numerical Analysis, there are many ways to define a norm on the space of matrices of m rows and n columns, Mmn (R). By Theorem 4.4, it follows that all these norms are equivalent. Concerning compactness, we have the following characterization. Theorem 4.5 (Compactness in finite dimension) Let (V, ∥·∥) be a normed space with dim(V ) < +∞, and A ⊆ V . Then A is compact iff it’s closed and bounded. The proof of this result is easy and it’s required as an exercise at the end of the chapter. The last theorem of this section is as remarkable as useful. It states that the closed unit ball is compact if and only if the normed space has finite dimension. To prove it, we will use the following result. Lemma 4.2 (Riesz’s lemma) Let (V, ∥ · ∥) be a normed space. Suppose that Y and Z are linear subspaces of V such that Y = Y ⊊ Z. Then ∀θ ∈]0, 1[, ∃z ∈ S(0, 1) ∩ Z, ∀y ∈ Y :
∥z − y∥ ≥ θ.
Proof. Let’s consider some point v ∈ Z \ Y and denote γ = inf ∥v − y∥. y∈Y
Since Y is closed, it follows that γ > 0.
(4.22)
4.3. Finite-dimensional normed spaces. Weierstrass Approximation Theorem. 125 Let θ ∈]0, 1[, generic. Since γ is an infimum and γ < γ/θ, there is y0 ∈ Y such that γ γ ≤ ∥v − y0 ∥ ≤ . (4.23) θ Now we take z = λ (v − y0 ) ∈ Z, where λ = 1/∥v − y0 ∥, so that ∥z∥ = 1 and, from (4.23), 1 θ ≤λ≤ . (4.24) γ γ So it remains to prove that ∀y ∈ Y :
∥z − y∥ ≥ θ.
(4.25)
Let y ∈ Y , generic. Then, by using (4.22) and (4.24):
∥z − y∥ = ∥λ (v − y0 ) − y∥ = λ v − (y0 − λ−1 y) ≥ λγ ≥ θ. We conclude by the arbitrariness of y and θ.
■
Theorem 4.6 (Compactness of the closed unit ball) Let (Z, ∥ · ∥) be a normed space. Then B(0, 1) is compact iff dim(Z) < +∞.
Proof. i) Let’s assume that dim(Z) < +∞. Then, by Theorem 4.5, it follows that B(0, 1) is compact as it’s closed and bounded. ii) Let’s reason by Reduction to Absurdity. Then let’s assume that B(0, 1) is compact and that dim(Z) = +∞. Let’s take θ = 1/2. Let’s pick some z1 ∈ S(0, 1) and define Y1 = ⟨{z1 }⟩ By Lemma 4.2, there is z2 ∈ S(0, 1) such that 1 ∀y ∈ Y1 : ∥z2 − y∥ ≥ , 2 so, in particular, ∥z2 − z1 ∥ ≥ 1/2. Now we define Y2 = ⟨{z1 , z2 }⟩. By Lemma 4.2, there is z3 ∈ S(0, 1) such that ∀y ∈ Y2 :
∥z2 − y∥ ≥
1 , 2
so, in particular, ∥z3 −z1 ∥ ≥ 1/2 and ∥z3 −z2 ∥ ≥ 1/2. Since dim(Z) = +∞, by working in this way, we build a sequence (zn )n∈N ⊆ S(0, 1) ⊆ B(0, 1) such that 1 ∀m, n ∈ N : m ̸= n ⇒ ∥zn − zm ∥ ≥ , 2 so that (zn )n∈N ⊆ B(0, 1) can not have a Cauchy subsequence, much less a convergent subsequence. This is a contradiction because we assumed B(0, 1) to be compact. ■
4.4. Additional examples of Banach and Hilbert spaces In Chapter 3, we introduced a number of examples of normed spaces and innerproduct spaces. Some of them are Banach and Hilbert spaces, i.e., they are complete as metric spaces. Let’s present now some additional examples of spaces which are useful in practice.
4.4.1. The space (Rn , ∥ · ∥p ) Let p ≥ 1. In the linear space Rn , we consider the functional ∥ · ∥p : Rn −→ R given by !1/p n X p ∥x∥p = xk , k=1
where x = (x1 , ..., xn ). It’s immediate that ∀x ∈ Rn :
∥x∥p = 0 ⇐⇒ x = 0; n
∀λ ∈ R, ∀x ∈ R :
∥λ · x∥p = |λ| · ∥x∥p .
So, to prove that ∥ · ∥p is a norm on Rn , we need to show that the triangle inequality holds: ∀x, y ∈ Rn :
∥x + y∥p ≤ ∥x∥p + ∥y∥p .
(4.26)
The cases p = 1 and p = 2 were already considered in Examples 3.11 and 3.9, respectively. Let’s observe that if x = (x1 , ..., xn ) and y = (y1 , ..., yn ), then (4.26) can be written as the Minkowski inequality. Theorem 4.7 (Minkowski inequality for finite sums) It holds n X
!1/p |xk + yk |
p
≤
k=1
n X
!1/p p
|xk |
+
k=1
n X
!1/p |yk |
p
,
(4.27)
k=1
for every x1 , y1 , ..., xn , yn ∈ R. To prove Theorem 4.7, we need a couple of lemmas that, together with the machinery used in their proofs, are important by themselves. Remark 4.6 For 1 < p < +∞, we denote by p′ the conjugate exponent of p: 1 1 + ′ = 1, p p
p′ =
p . p−1
(4.28)
For p = 1, we use the symbol p′ = +∞. Remark 4.7 Let’s recall that a function f : R −→ R is convex iff ∀x1 , x2 ∈ R, ∀t ∈ [0, 1] :
f (tx1 + (1 − t)x2 ) ≤ tf (x1 ) + (1 − t)f (x2 ). (4.29)
4.4. Additional examples of Banach and Hilbert spaces
127
Whenever f ∈ C2 (R), (4.29) is equivalent to ∀x ∈ R :
f ′′ (x) ≥ 0.
In case of f being a strictly increasing convex function, by denoting f −1 the inverse of f : R → Im(f ), we have that f −1 is concave, i.e., ∀x1 , x2 ∈ Im(f ), ∀t ∈ [0, 1] : f −1 (tx1 + (1 − t)x2 ) ≥ tf −1 (x1 ) + (1 − t)f −1 (x2 ). Example 4.4 The exponential function R ∋ x 7→ exp(x) = ex ∈ R is strictly increasing and convex, so that the logarithmic function ]0, +∞[∋ x 7→ ln(x) ∈ R is concave. See Figure 4.1.
Figure 4.1.: The functions exponential (blue) and logarithmic (red).
Lemma 4.3 (Young’s inequality) Let a, b ≥ 0 and p > 1. Then ′
ap bp ab ≤ + ′. p p
(4.30)
Proof. In case a = 0 or b = 0, (4.30) immediately holds. Therefore, let’s assume that a > 0 and b > 0. Since ln :]0, +∞[→ R is concave, we have that ∀x1 , x2 ∈]0, +∞[, ∀t ∈ [0, 1] : ln(tx1 + (1 − t)x2 ) ≥ t ln(x1 ) + (1 − t) ln(x2 ).
(4.31)
Therefore, by taking, t = 1/p and 1 − t = 1/p′ , we have, from (4.31), that x2 x1 1/p 1/p′ ∀x1 , x2 ∈]0, +∞[: ln x1 x2 ≤ ln + ′ , p p whence, ∀x1 , x2 ∈]0, +∞[: ′
1/p
x1
1/p′
x2
so that for x1 = ap and x2 = bp we get (4.30).
≤
x1 x2 + ′, p p ■
Lemma 4.4 (H¨ older inequality for finite sums) For all x1 , y1 , ..., xn , yn ∈ R, it holds n X
n X
|xk yk | ≤
k=1
!1/p |xk |
n X
p
k=1
!1/p′ p′
|yk |
.
(4.32)
k=1
Proof. Let x1 , y1 , ..., xn , yn ∈ R, generic. If x1 = x2 = ... = xn = 0 ∨ y1 = y2 = ... = yn = 0, then immediately holds (4.32). Therefore, let’s assume that there are i0 , j0 ∈ In such that xi0 ̸= 0 and yj0 ̸= 0. By Young’s inequality, we have, for each k ∈ In , that ′
|xk |p |yk |p |xk yk | ≤ + , p p′ so that
n X
|xk yk | ≤
k=1
n n ′ 1X 1 X |xk |p + ′ |yk |p . p p k=1
(4.33)
k=1
Then for each k ∈ In , we can replace xk by λxk with λ > 0, so that (4.33) becomes n n n X ′ λ−1 X λp−1 X p |xk yk | ≤ |xk | + ′ |yk |p . (4.34) p p k=1
k=1
k=1
Now, let’s consider the function f :]0, +∞[−→ R given by f (λ) =
n n ′ λp−1 X λ−1 X |xk |p + ′ |yk |p . p p k=1
(4.35)
k=1
By using (4.28) and the standard optimization technique learned in the Calculus course, we find that f has a global minimum at !1/p n !−1/p n X X p′ p λ0 = |yk | |xk | . k=1
k=1
By using (4.28), we find that the global minimum of f is !(p−1)/p !1/p n n X X p′ p |yk | |xk | n n X X ′ 1 1 k=1 k=1 p f (λ0 ) = · |xk | + ′ · |yk |p !(p−1)/p · !1/p · n n p p X X k=1 k=1 ′ |xk |p |yk |p k=1
=
=
1 p
n X
k=1
!1/p |xk |p
k=1 n X k=1
|yk |p
′
k=1
!1/p |xk |p
n X
!1/p′
n X k=1
1 + ′ p
n X k=1
!1/p |xk |p
n X
!1/p′ |yk |p
′
k=1
!1/p′ |yk |p
′
.
(4.36)
Then by putting λ = λ0 in (4.34), and using (4.36), we get (4.32) and conclude by the arbitrariness of the numbers xk and yk . ■ Proof. [of Theorem 4.7] Let x1 , y1 , ..., xn , yn ∈ R, generic. If x1 = x2 = ... = xn = 0 ∨ y1 = y2 = ... = yn = 0, then immediately holds (4.27). Therefore, let’s assume that there are i0 , j0 ∈ In such that xi0 ̸= 0 and yj0 ̸= 0. By the triangle inequality, we have that n X
!1/p p
≤
|xk + yk |
n X
!1/p p
(|xk | + |yk |)
.
k=1
k=1
Therefore, we can prove (4.27) by showing that n X
!1/p (|xk | + |yk |)p
≤
n X
!1/p |xk |p
n X
+
k=1
k=1
!1/p |yk |p
.
(4.37)
k=1
It’s clear that (|a| + |b|)p = (|a| + |b|)p−1 |a| + (|a| + |b|)p−1 |b|.
∀a, b ∈ R :
Therefore, by putting a = xk and b = yk , and summing in k, we get n X
n X
(|xk | + |yk |)p =
k=1
(|xk | + |yk |)p−1 |xk | +
k=1
n X
(|xk | + |yk |)p−1 |yk |
(4.38)
k=1
By using H¨ older inequality and (4.28), we have that n X
(|xk | + |yk |)p−1 |xk | ≤
k=1
n X p′ (|xk | + |yk |)p−1
!1/p′
n X
k=1
=
n X
!1/p |xk |p
k=1
!1/p′
n X
p
(|xk | + |yk |)
k=1
!1/p p
|xk |
,
(4.39)
.
(4.40)
k=1
and n X
p−1
(|xk | + |yk |)
|yk | ≤
k=1
n X
!1/p′
n X
p
(|xk | + |yk |)
k=1
!1/p |yk |
p
k=1
Then (4.38), (4.39) and (4.40) imply that n X
(|xk |+|yk |)p ≤
k=1
n X
!1/p′ (|xk | + |yk |)p
k=1
n X
!1/p |xk |p
+
k=1
whence we obtain (4.37) by dividing the last inequality by We conclude by the arbitrariness of the numbers xk and
n X
!1/p |yk |p
,
k=1 n X
!1/p′ (|xk | + |yk |)p
k=1 yk .
. ■
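The following Maxima lines (an illustration only; the sample vectors and the exponent are arbitrary choices) check Hölder's and Minkowski's inequalities for finite sums on a concrete example:

p : 3$  pc : p/(p - 1)$
x : [1.0, -2.0, 0.5]$   y : [0.3, 4.0, -1.0]$
np(z, r) := sum(abs(z[k])^r, k, 1, length(z))^(1/r)$      /* the p-norm on R^n */
is(sum(abs(x[k]*y[k]), k, 1, 3) <= np(x, p)*np(y, pc));   /* Hölder (4.32): true */
is(np(x + y, p) <= np(x, p) + np(y, p));                  /* Minkowski (4.27): true */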
4.4.2. The space lp(R)

Let p ≥ 1. Let's consider the set of real sequences which are p-summable:

lp(R) = { x = (xn)n∈N ∈ RN / ∑_{n=1}^{+∞} |xn|^p < +∞ }.
It’s clear that ∀λ ∈ R, ∀x ∈ lp (R) :
λx ∈ lp (R).
However, for lp (R) to be a linear space it should also hold ∀x, y ∈ lp (R) :
x + y ∈ lp (R),
(4.41)
which is not immediate and then has to be proved. In the set lp (R), we consider the mapping ∥ · ∥p : lp (R) −→ R, given by !1/p
+∞ X
∥x∥p =
|xn |p
,
n=1
where x = (xn )n∈N . It’s immediate that ∀x ∈ lp (R) :
∥x∥p = 0 ⇐⇒ x = 0;
p
∀λ ∈ R, ∀x ∈ l (R) :
∥λ · x∥p = |λ| · ∥x∥p .
(4.42)
So to prove that ∥ · ∥p verifies the conditions of a norm, we need to show that ∀x, y ∈ lp (R) :
∥x + y∥p ≤ ∥x∥p + ∥y∥p ,
(4.43)
which, at the same time, shows (4.41). The cases p = 1 and p = 2, were already considered in Examples 3.12 and 3.10, respectively. Observe that (4.43) can be written as the Minkowski inequality for infinite sums. Theorem 4.8 (Minkowski inequality for infinite sums) For every x = (xn )n∈N , y = (yn )n∈N ∈ lp (R), it holds +∞ X n=1
!1/p p
|xn + yn |
≤
+∞ X n=1
!1/p p
|xn |
+
+∞ X
!1/p p
|yn |
.
(4.44)
n=1
Proof. By Minkowski inequality for finite sums, Theorem 4.7, we have for m ∈ N: !1/p !1/p !1/p m m m X X X p p p |xn + yn | ≤ |xn | + |yn | . (4.45) n=1
n=1
n=1
Since x, y ∈ lp (R), the sums on the right side of (4.45) have limit as m → +∞; so we obtain (4.44). ■ By using the trick of the previous proof, we obtain H¨older inequality for infinite sums:
Proposition 4.1 (H¨ older inequality for infinite sums) For every ′ x = (xn )n∈N ∈ lp (R) and every y = (yn )n∈N ∈ lp (R), it holds +∞ X n=1
|xn yn | ≤
+∞ X
!1/p |xn |p
n=1
+∞ X
!1/p′ |yn |p
′
.
(4.46)
n=1
In particular, (xn yn )n∈N ∈ l1 (R). We finish this section by stating the completeness of lp (R). Theorem 4.9 (lp (R) is complete) lp (R) is a Banach space for p ≥ 1. The proof is required as an exercise at the end of the chapter.
4.4.3. The Lebesgue space Lp (I), I ⊆ R Let p ≥ 1. Let’s consider the functional ∥ · ∥p : C([a, b]) −→ R given by !1/p Z b
|u(t)|p dt
∥u∥p =
.
a
It’s immediate that ∀u ∈ C([a, b]) : ∀λ ∈ R, ∀u ∈ C([a, b]) :
∥u∥p = 0 ⇐⇒ u = 0; ∥λ · u∥p = |λ| · ∥u∥p .
So, to prove that ∥ · ∥p is a norm on C([a, b]), we need to show that the triangle inequality holds: ∀u, v ∈ C([a, b]) :
∥u + v∥p ≤ ∥u∥p + ∥v∥p .
(4.47)
The cases p = 1 and p = 2 were already considered in Example 3.22. Let’s observe that (4.26) can be written as the Minkowski inequality for integrals. Theorem 4.10 (Minkowski inequality for integrals) For all u, v ∈ C([a, b]), it holds "Z #1/p "Z #1/p "Z #1/p b b b p p p |u(t) + v(t)| dt ≤ |u(t)| dt + |v(t)| dt . (4.48) a
a
a
To prove Theorem 4.10, we need H¨ older’s inequality for integrals. By using it and following the scheme of the proof of Theorem 4.7, it’s not difficult to achieve (4.48). This is required as an exercise at the end of the chapter. Lemma 4.5 (H¨ older inequality for integrals) Let 1 ≤ p ≤ ∞. For all u, v ∈ C([a, b]), it holds ∥uv∥1 ≤ ∥u∥p ∥v∥p′ ,
(4.49)
which for 1 < p < ∞ can be written as Z
b
|u(t)|p dt
|u(t)v(t)|dt ≤ a
!1/p
b
Z a
Z
!1/p′
b
′
|v(t)|p dt
.
(4.50)
a
Proof. Let u, v ∈ C([a, b]), generic. 1. Let’s assume that p = 1 so that p′ = +∞. Then Z b Z b ∥uv∥1 = |u(t)v(t)|dt ≤ sup |v(t)| · |u(t)|dt = ∥v∥∞ · ∥u∥1 . t∈[a,b]
a
a
Obviously, the case p = +∞ is exactly the same. 2. Let’s assume that 1 < p < +∞ so that 1 < p′ < +∞. If u = 0 or v = 0, then immediately holds (4.50). Therefore, let’s assume that u ̸= 0 and v ̸= 0. By Young’s inequality, we have for each t ∈ [a, b] that ′
so that Z b Z |u(t)v(t)|dt ≤ a
a
|u(t)v(t)| ≤
|v(t)|p |u(t)|p , + p p′
b
Z
|u(t)|p dt+ p
b
a
′
′ |v(t)|p 1 1 dt = ∥u∥pp + ∥v∥pp′ . (4.51) ′ p p p
Then in (4.51) we can replace u by λ u with λ > 0, so that (4.51) becomes Z b ′ λp−1 λ−1 |u(t)v(t)|dt ≤ ∥u∥pp + ′ ∥v∥pp′ . (4.52) p p a Let’s consider the function f :]0, +∞[−→ R given by f (λ) =
′ λ−1 λp−1 ∥u∥pp + ′ ∥v∥pp′ . p p
By using (4.28) and the standard optimization technique learned in the Calp′ /p culus course, we find that f has a global minimum at λ0 = ∥v∥p′ /∥u∥p . By using (4.28), we find that the global minimum of f is p′ (p−1)/p
1 ∥v∥p′ f (λ0 ) = · p ∥u∥p−1 p =
· ∥u∥pp +
′ 1 ∥u∥p · · ∥v∥pp′ p′ ∥v∥p′′ /p p
1 1 ∥v∥p′ ∥u∥p + ′ ∥u∥p ∥v∥p′ = ∥u∥p ∥v∥p′ . p p
(4.53)
Then by putting λ = λ0 in (4.52) we get, by (4.53), that (4.49) holds. We conclude by the arbitrariness of u and v.
■
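A small numerical illustration of (4.49)-(4.50) in Maxima; the functions u, v, the interval [0, 1] and the exponent are assumptions chosen only for this check:

p : 3$  pc : p/(p - 1)$
u(t) := sin(t)$   v(t) := exp(t)$
lhs : quad_qags(abs(u(t)*v(t)), t, 0, 1)[1]$
nu  : quad_qags(abs(u(t))^p,  t, 0, 1)[1]^(1/p)$
nv  : quad_qags(abs(v(t))^pc, t, 0, 1)[1]^(1/pc)$
[lhs, nu*nv, is(lhs <= nu*nv)];           /* the last entry is true */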
For 1 ≤ p < +∞, the normed space

L̃p([a, b]) = (C([a, b]), ∥·∥p)    (4.54)
is not complete. However, thanks to Theorem 3.11 and Remark 3.15, we know that e p ([a, b]), there is a Banach space, denoted Lp ([a, b]), which is the completion of L i.e., Lp (a, b) = C([a, b]). Then, we can think of an element of Lp ([a, b]) as a ”function” which can be aproximated as much as we want, in the metric ∥ · ∥p , by continuous functions. Remember that for u ∈ L1 ([a, b]), its Lebesgue integral is given by Z (L )
b
Z
a
n→+∞
b
un (t)dt,
u(t)dt = lim
(4.55)
a
where (un )n∈N ∈ C([a, b]) is any sequence that converges to u in the norm ∥ · ∥1 . The integral appearing in the right side of (4.55) is a Riemann integral. Also recall that for the elements of C([a, b]) the integrals of Riemann and Lebesgue coincide. Let’s remark that if 1 ≤ p ≤ q ≤ +∞, then there exist constants c1 , c2 , c3 > 0 such that ∀u ∈ C([a, b]) :
∥u∥1 ≤ c1 ∥u∥p ≤ c2 ∥u∥q ≤ c3 ∥u∥∞ ,
(4.56)
and, by using (4.55), it can be showed that (C([a, b]), ∥ · ∥∞ ) ⊆ Lq ([a, b]) ⊆ Lp ([a, b]) ⊆ L1 ([a, b]) and, for the corresponding topologies: T1 ⊆ Tp ⊆ Tq . We have mentioned that the elements of Lp ([a, b]) can be considered as “functions”. Actually, they are equivalence classes that appear by using the concept of Lebesgue measure in R. A little bit was presented in Section 3.6. Example 4.5 Let p ≥ 1. Once more, let’s consider the sequence (xm )m∈N ⊆ C([0, 1]) given, for each m ∈ N, by (% i1) a(m):= 1/2 + 1/m; (% i2) x[m](t):= if ta(m) then 1 else m*t-m/2; (% i4) plot2d([x[1](t),x[5](t),x[10](t),x[20](t)],[t,0,1],[y,-0.1,1.4]);
Figure 4.2.: The functions x1 (blue), x5 (red), x10 (green) end x20 (violet).
134
Chapter 4. Banach and Hilbert spaces
The sequence (xm )m∈N ⊆ C([0, 1]) is a Cauchy sequence in the norm ∥ · ∥p and, as m → +∞ it comes closer and closer to the function ( 0, if t ∈ [0, 1/2], x(t) = 1, if t ∈]1/2, 1], which is not continuous and, therefore, does not belong to C([0, 1]). However, since Lp ([0, 1]) is complete by construction, x ∈ Lp ([0, 1]). General case e p (I) and Lp (I) for the particular case Until now, we have introduced the spaces L of I = [a, b]. For this, key factors were: 1. the composition of continuous functions is a continuous function, and 2. a continuous function is always Riemann integrable in a compact interval. In the case when I is not compact - it’s either non-bounded or non-closed - we need some additional property to guarantee the convergence of integrals over I. In the following, we shall consider the “worst” case, i.e., when I = R, since it will clearly ilustrate the construction. For a function u ∈ C(R), we define its support as supp(u) = {x ∈ R / u(x) = 0}. We shall use the space of continuous functions with compact support C0 (R) = {u ∈ C(R) / supp(u) is compact}. For p ≥ 1, the functional ∥ · ∥p : C0 (R) −→ R given by Z ∥u∥p =
1/p |u(t)| dt , p
R
e p (R) = (C (R), ∥ · ∥p ), is not complete. However, is clearly a norm. The space L 0 thanks to Theorem 3.11 and Remark 3.15, we know that there is a Banach space, e p (R), i.e., Lp (R) = C (R). Then, denoted Lp (R), which is the completion of L 0 p we can think of an element of L (R) as a “function” which can be aproximated as much as we want, in the metric ∥ · ∥p , by continuous functions with compact support. For u ∈ L1 (R), its Lebesgue integral is given by Z (L )
Z u(t)dt = lim
R
n→+∞
un (t)dt,
(4.57)
R
where (un )n∈N ∈ C0 (R) is any sequence that converges to u in the norm ∥ · ∥1 . The integral appearing in the right side of (4.57) is a Riemann integral. Also recall that for the elements of C0 (R) the integrals of Riemann and Lebesgue coincide.
4.4.4. The space Cn ([a, b]) By default, the linear space C1 ([a, b]) is equipped with the norm ∥ · ∥C 1 , given by ∥u∥C 1 = ∥u∥∞ + ∥u′ ∥∞ . The functional space (C1 ([a, b]), ∥ · ∥C 1 ) is a Banach space. The proof of this fact is not difficult so that it’s required as an exercise at the end of the chapter. Let’s remark that whenever we speak of the space C1 ([a, b]), we are using the norm ∥·∥C 1 . Otherwise, we speak of the set or linear space C1 ([a, b]) when it’s equipped with a different norm. Let n ∈ N. In the same way, by default, the linear space Cn ([a, b]) is equipped with the norm ∥ · ∥C n , given by ∥u∥C n =
n X
∥u(k) ∥∞ .
(4.58)
k=0
The functional space (Cn ([a, b]), ∥ · ∥C n ) is a Banach space. The proof of this fact is required as exercise at the end of the chapter. From (4.58) it immediately follows that ∀m, n ∈ N, m > n, ∀u ∈ Cm ([a, b]) :
∥u∥∞ ≤ ∥u∥C n ≤ ∥u∥C m ,
(4.59)
so that, in particular, m > n implies that the norm ∥ · ∥C m dominates the norm ∥ · ∥C n . Remark 4.8 Assume n < m. By Theorem 4.1 and (4.59), on Cm ([a, b]) for the topologies corresponding to the norms ∥ · ∥∞ , ∥ · ∥C 1 , ∥ · ∥C 2 , ..., ∥ · ∥C n ,..., ∥ · ∥C m , it holds: T∞ ⊆ T (1) ⊆ T (2) ⊆ ... ⊆ T (n) ⊆ ... ⊆ T (m) .
4.4.5. The Sobolev spaces H1 (I) and W1,p (I) Let p ≥ 1. On the linear space C1 ([a, b]) the functional ∥ · ∥1,p : C1 ([a, b]) → R, given by ∥u∥1,p = ∥u∥p + ∥u′ ∥p , is a norm. The space f1,p ([a, b]) = (C1 ([a, b]), ∥ · ∥1,p ). W is not a Banach space; however, thanks to Theorem 3.11 and Remark 3.15, we know that there is a Banach space, denoted W1,p ([a, b]), which is the completion f1,p ([a, b]), i.e., of W W1,p ([a, b]) = C1 ([a, b]). Then, we can think of an element of the Sobolev space W1,p ([a, b]) as a “function” which can be aproximated as much as we want, in the metric induced by ∥ · ∥1,p , by functions which are continuously differentiable. Remark 4.9 It’s important to keep in mind that a norm equivalent to ∥ · ∥1,p is defined by the formula 1/p ∥u∥∗1,p = ∥u∥pp + ∥u′ ∥pp .
Whenever a sequence (un )n∈N ⊆ C1 ([a, b]) converges in the norm ∥ · ∥1,p to some u ∈ W1,p ([a, b]), we have that lim ∥u′n − g∥p = 0,
lim ∥un − u∥p = 0 and
n→+∞
n→+∞
for some g ∈ Lp ([a, b]). We shall say that this function g is a weak derivative of u and denote g = u′ . This terminology is justified by the following property: Z b Z b ∀φ ∈ C10 ([a, b]) : u(t)φ′ (t)dt = − u′ (t)φ(t)dt, (4.60) a
a
which is nothing more than the formula of integration by parts. The functions φ appearing in (4.60) are called test functions. Remark 4.10 (Weak and strong derivatives) For a function in C1 ([a, b]), its weak derivative coincides with the usual derivative, sometimes referred to as its strong derivative. For the particular case of p = 2, we denote e 1 ([a, b]) = W f1,2 ([a, b]) and H1 ([a, b]) = W1,2 ([a, b]), H equipped with the scalar product defined by the formula Z b (u, v)H1 = (u, v)2 + (u′ , v ′ )2 = [u(t)v(t) + u′ (t)v ′ (t)] dt, a 2
where (·, ·)2 is the L -inner-product introduced in Example 3.15. In this case, the norm is given by !1/2 Z b 2 ′ 2 1/2 2 ′ 2 ∥u∥H1 = ∥u∥2 + ∥u ∥2 = |u(t)| + |u (t)| dt . a
General case We shall introduce the Sobolev spaces W1,p (I) and H1 (I), for a general region I ⊆ R. Let’s consider the linear space A 1 (I) = u ∈ C1 (I) / ∃v ∈ C10 (R) : u = v ↾I , so that a function belongs to A 1 (I) iff it’s the restriction to I of a function in C10 (R). The functional ∥ · ∥1,p : A 1 (I) −→ R, given by Z 1/p Z 1/p ∥u∥1,p = ∥u∥p + ∥u′ ∥p = |u(t)|p + |u′ (t)|p , I
I
f1,p (I) = (A 1 (I), ∥·∥1,p ). is not a Banach space; however, is a norm. The space W thanks to Theorem 3.11 and Remark 3.15, we know that there is a Banach space, f1,p (I), i.e., denoted W1,p (I), which is the completion of W W1,p (I) = A 1 (I). Then, we can think of an element of the Sobolev space W1,p (I) as a “function” which can be approximated as much as we want, in the metric induced by ∥ · ∥1,p , by restrictions of elements of C10 (R).
Remark 4.11 Sometimes we write ∥u∥W1,p (I) = ∥u∥1,p to stress the region where the integration takes part. Whenever a sequence (un )n∈N ⊆ A 1 (I) converges in the norm ∥ · ∥1,p to some u ∈ W1,p (I), we have that lim ∥u′n − g∥p = 0,
lim ∥un − u∥p = 0 and
n→+∞
n→+∞
for some g ∈ Lp (I). Again we shall say that this g is a weak derivative of u and denote g = u′ , as it holds Z Z ∀φ ∈ C10 (I) : u(t)φ′ (t)dt = − u′ (t)φ(t)dt, I
I
as well as u ∈ C1 (I) ∩ Lp (I) ∧ u′ ∈ Lp (I) ⇒ u ∈ W1,p (I). For the particular case of p = 2, we denote e 1 (I) = W f1,2 (I) and H1 (I) = W1,2 (I), H equipped with the scalar product defined by the formula Z (u, v)H1 = (u, v)2 + (u′ , v ′ )2 = [u(t)v(t) + u′ (t)v ′ (t)] dt, I
where (·, ·)2 is the L2 -inner-product. The induced norm is then given by ∥u∥H1 = ∥u∥22 + ∥u′ ∥22
1/2
Z =
1/2 |u(t)|2 + |u′ (t)|2 dt .
I
To finish this section, we shall present an important theorem which requires a touch of Measure Theory. Given a region I ⊆ R, we define on F (I) an equivalence relation by u∼v ⇐⇒ u = v, a.e. in I. Therefore, for p ≥ 1, u∼v
⇒
[u ∈ Lp (I) ⇐⇒ v ∈ Lp (I)] .
Theorem 4.11 (Continuous representant of a Sobolev class) Let p ≥ 1, I ⊆ R and u ∈ W1,p (I). Then there exists u ˜ ∈ C(I) such that u = v a.e. in I, and Z x
∀x, x0 ∈ I :
u′ (t)dt.
u ˜(x) − u ˜(x0 ) = x0
A proof of this result can be found in [4, Sec.8.2]. In particular, this theorem says that a function can not be in a Sobolev space if it’s not equal (almost everywhere) to a continuous function.
Example 4.6 Let I = ]−1, 1[. The function u : I → R given by u(x) = |x| belongs to W1,p(I), for all p ≥ 1. The weak derivative of u is given by

u′(x) = −1, if −1 < x < 0;   u′(x) = 1, if 0 ≤ x < 1.

By Theorem 4.11, it follows that u′ does not belong to W1,p(I), for any p ≥ 1.
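The weak-derivative identity (4.60) can be checked numerically for this example. In the following Maxima sketch, the test function φ(x) = (1 − x²)²(x + 2) is an arbitrary choice made for illustration (any C1 function vanishing at ±1 would do):

u(x) := abs(x)$
g(x) := signum(x)$                          /* the candidate weak derivative u' */
phi(x) := (1 - x^2)^2*(x + 2)$              /* assumed test function, phi(±1) = 0 */
define(dphi(x), diff(phi(x), x))$
lhs :  quad_qags(u(x)*dphi(x), x, -1, 1)[1]$
rhs : -quad_qags(g(x)*phi(x),  x, -1, 1)[1]$
[lhs, rhs];                                  /* the two numbers agree, as (4.60) predicts */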
4.5. Schauder and Hilbert basis Let (V, ∥ · ∥) be a normed space. Associated to a sequence (xn )n∈N ⊆ V is the n X sequence of partial sums, (sn )n∈N ⊆ V , given by sn = xk . If (sn )n∈N is k=1
convergent, we write its limit as s=
+∞ X
xn .
n=1
We say that (sn )n∈N is absolutely convergent iff (∥sn ∥)n∈N ⊆ R is convergent. Theorem 4.12 (Absolute convergence in a Banach spaces) In a Banach space any absolutely convergent sequence is convergent. The proof of Theorem 4.12 is easy and it’s required as an exercise at the end of the chapter. Example 4.7 (A non-convergent absolutely convergent series) On the linear space l∞ (R) = x = (xn )n∈N ∈ RN / ∃γ > 0, ∀n ∈ N : |xn | < γ , the functional given by ∥x∥∞ = sup |xn |. is a norm which makes l∞ (R) a Banach n∈N
space. In l∞ (R), let Υ be the set of real sequences with only finitely many nonzero terms.Let’s consider the sequence (υm )m∈N ⊆ Υ, where for each m ∈ N, υm = x(m) n
is given by
n∈N
x(m) = n
δnm . n2
The sequence of partial sums associated with (υm )m∈N is absolutely convergent but it is not convergent. Therefore, (Υ, ∥ · ∥∞ ) is not a Banach space. In a normed space V , we say that S ⊆ V is a total set iff ⟨S⟩ = V.
(4.61)
Point (4.61) means that given any v ∈ V and any ϵ > 0, there are u1 , u2 , ..., um ∈ S and β1 , β2 , ..., βm ∈ R such that
m
X
βk uk < ϵ.
v −
k=1
Definition 4.1 (Schauder basis) Let (V, ∥·∥) be a normed space. We say that S ⊆ V is a Schauder basis of V iff S is a countable total linearly independent set. If V is an infinite-dimensional normed space, then S = {un ∈ V / n ∈ N} is a Schauder basis of V iff 1. S is linearly independent, and 2. ⟨S⟩ = V . In this case, ∀v ∈ V, ∃!(αn )n∈N ∈ RN :
v=
∞ X
αn un ,
(4.62)
n=1
i.e., any element of V can be uniquely written as an infinite linear combination of the elements of S. Point (4.62) is referred to as the Schauder series of the vector v in the basis S; the number αn is the n-th Schauder coefficient of the expansion. Example 4.8 (Schauder bases in a Banach space) In the space l∞ (R), let’s consider the subsets C = {em / m ∈ N} and B = {ηm / m ∈ N}, where, for each m ∈ N, em = e(m) n
e(m) n
and ηm = ηn(m) , are given by n∈N ( 1, if n ≤ m, = δmn , ηn(m) = 0, if n > m. n∈N
B and C are Schauder bases for l∞ (R). C is called the canonical basis of l∞ (R). Definition 4.2 (Orthogonal, orthonormal and Hilbert basis) Let V be an inner-product space. We say that B ⊆ V is 1. an orthogonal basis of V iff it’s a total orthogonal set and 0 ∈ / B; 2. an orthonormal basis of V iff it’s a total orthonormal set; 3. a Hilbert basis of V iff it’s a countable orthonormal basis. Remark 4.12 An orthogonal (orthonormal) subset of V \ {0} is also referred to as an orthogonal (orthonormal) system of V . In this case, ⟨B⟩ ⊆ V . In Theorem 4.15, we shall state that a separable Hilbert space has a Hilbert basis. For a non-separable Hilbert space we have Theorem 4.18, which states the existence of orthonormal basis. Keep in mind that if B is a Hilbert basis of V , then B is a Schauder basis of V . In this case, it holds the Fourier series, ∞ X ∀v ∈ V, ∃!(αn )n∈N ∈ RN : v = αn un . (4.63) n=1
Here, αn is said to be the n-th Fourier coefficient.
Theorem 4.13 (Parseval’s equality and Fourier coefficients.) Assume that V is an inner-product space and that B = {un / n ∈ N} is a Hilbert basis of V . For v ∈ V , the n-th Fourier coefficient in (4.63) is given by αn = (v, un ). Moreover, the Parseval equality holds: ∥v∥2 =
+∞ X
αn2 .
n=1
Proof. Let m ∈ N, generic. Then (v, um ) =
+∞ X
! αn un , um
=
+∞ X
αn (un , um ) = αm .
n=1
n=1
In the same way, 2
∥v∥ =
+∞ X
αn un ,
n=1
+∞ X
! αn un
n=1
=
+∞ X
αn2 .
n=1
■ Remark 4.13 Assume that B = {un / n ∈ N} is an orthogonal basis of the inner-product space V . Given v ∈ V , it holds the non-normalized Fourier series v=
+∞ X
cn un ,
n=1
where, for each n ∈ N, the n-th generalized Fourier coefficient is cn =
(v, un ) . ∥un ∥2
Moreover, Parseval’s equality now takes the form: 2
∥v∥ =
+∞ X
c2n ∥un ∥2 .
n=1
Theorem 4.14 (Bessel’s inequality) Let B = {un /n ∈ N} be an orthonormal subset of the Hilbert space H. Let’s write U = ⟨B⟩. Given v ∈ H, we define u ∈ U by +∞ X u= αn un , n=1
where, for each n ∈ N, αn = (v, un ). Then d(v, U ) = inf ∥v − w∥ = ∥v − u∥. w∈U
(4.64)
Moreover, Bessel’s inequality holds: ∥v∥2 ≥
+∞ X
αn2 .
(4.65)
n=1
Idea of the proof. For m ∈ N, let’s denote Bm = {un / n ∈ Im } so that Um = ⟨Bm ⟩ is finite-dimensional and, consequently, closed. It holds U1 ⊊ U2 ⊊ ... ⊊ Um ⊊ Um+1 ⊊ ... ⊊ U. Instead of solving the minimization problem (4.64), we shall subsequently solve the problems d(v, Um ) = inf ∥v − w∥, m ∈ N, w∈Um
by using the tools learned in the course of Calculus. As a consequence we shall get ∥v∥2 ≥
m X
αn2 .
n=1
Then we shall pass to the limit as m −→ +∞. Proof. For m ∈ N, let’s define Γm : Rm → R by Γm (β1 , β2 , ..., βm )
= =
2
m
X
βn un
v −
n=1 m X
∥v∥2 +
βn2 − 2
n=1
m X
βn (v, un ).
(4.66)
n=1
Now let’s prove by induction that ∀m ∈ N :
Pm ,
(4.67)
where, Pm :
inf
(β1 ,...,βm )∈Rm
Γm (β1 , ..., βm ) = Γm (α1 , ..., αm ),
(4.68)
and αn = (v, un ), n ∈ Im . 1. For m = 1, Γ1 (β1 ) = ∥v∥2 + β12 − 2β1 (v, u1 ). It’s easy to check that Γ1 has a global minimum at β1 = (v, u1 ), i.e., inf Γ1 (β1 ) = Γ1 (α1 ). Therefore, P1 is true.
β1 ∈R
2. Let’s prove that ∀m ∈ N :
Pm ⇒ Pm+1 .
(4.69)
Let m ∈ N, generic. Let’s assume that Pm is true, i.e., that inf
(β1 ,...,βm )∈Rm
Γm (β1 , ..., βm ) = Γm (α1 , ..., αm ),
where αn = (v, un ), n ∈ Im . By (4.66), it’s clear that Γm+1 (β1 , β2 , ..., βm , βm+1 ) = Γm (β1 , β2 , ..., βm ) + fm+1 (βm+1 ), (4.70) where f : R −→ R is given by 2 fm+1 (βm+1 ) = βm+1 − 2βm+1 (v, um+1 ).
From (4.70), it immediatly follows that inf
(β1 ,...,βm+1 )∈Rm+1
=
Γm+1 (β1 , ..., βm+1 ) =
inf
(β1 ,...,βm )∈Rm
Γm (β1 , ..., βm ) +
inf
βm+1 ∈R
fm+1 (βm+1 )
=
2 Γm (α1 , ..., αm ) + αm+1 − 2αm+1 (v, um+1 )
=
Γm+1 (α1 , ..., αm+1 ),
where αm+1 = (v, um+1 ). Since m was chosen arbitrarily, we have proved (4.69), and then (4.67) as well. Let’s observe that for m ∈ N: 0 ≤ Γm (α1 , ..., αm ) = ∥v∥2 −
m X
αn2 .
n=1
Therefore, ∥v∥2 ≥
∀m ∈ N :
m X
αn2 ,
n=1
whence, by letting m −→ ∞, we get (4.65). Let’s finally prove (4.64), i.e., ∀w ∈ U :
∥v − u∥ ≤ ∥v − w∥.
Let w ∈ U , generic. Then there exists (γn )n∈N ⊆ R such that w=
+∞ X
γn un .
n=1
From (4.67) and (4.68), we know that
m m
X X
∀m ∈ N : v − αn un ≤ v − γn un ,
n=1
n=1
(4.71)
4.5. Schauder and Hilbert basis
143
whence
m m
X X
γn un = ∥v − w∥. αn un ≤ lim v − ∥v − u∥ = lim v − m→+∞
m→+∞ n=1
n=1
Since w was chosen arbitrarily, we have proved (4.71).
■
Remark 4.14 (Best approximation in a closed linear subspace) The proof of Theorem 4.14 shows that +∞ X u= (v, un ) un n=1
is the best approximation of v in the closed space U = ⟨B⟩. It also shows that, m X αn un is the best approximation of v in the for m ∈ N, the element vm = n=1
m-dimensional space ⟨Bm ⟩, where Bm = {un / n ∈ Im }. This shall be used in Section 4.7 where several Hilbert bases are built for the space L2 (I). Remark 4.15 (Orthogonal projector) In the context of Theorem 4.14, if there is some u∗ ∈ U such that ∥v − u∗ ∥ = d(v, U ), then u = u∗ . Therefore, there is an operator PU : H −→ H such that u = PU (v)
⇐⇒
∥v − u∥ = d(v, U )
⇐⇒
u=
+∞ X
(v, un ) un .
n=1
PU is a linear operator, referred to as the projection operator of H onto U . Observe that if u = PU (v), then the vector w = v − u is orthogonal to U , i.e., ∀y ∈ U :
(w, y) = 0.
(4.72)
Because of (4.72), PU is also called the orthogonal projector of H onto U . Let’s recall that a normed space (V, ∥ · ∥) is separable iff there exists D ⊆ V countable and dense, i.e., if D = {un / n ∈ N} and ∀u ∈ V, ∀ϵ > 0, ∃n0 ∈ N :
∥u − un0 ∥ < ϵ.
Example 4.9 (Rn is separable) The set Qn is countable and dense in Rn so that Rn is separable. For example, let n = 1 and x ∈]0, 1[. We can write x = 0.a1 a2 a3 . . . an . . . , where, for each n ∈ N, an ∈ {0, 1, 2, ..., 9}. Now we define (qn )n∈N ⊆ Q by qn = 0.a1 a2 . . . an , so that |x − qn | < 10−n . Our next result states that in a Hilbert space, the existence of a Hilbert basis is equivalent to the separability of H.
Theorem 4.15 (Hilbert basis and separability) A Hilbert space is separable iff it has a Hilbert basis.
Proof. Let H be a Hilbert space.
i) Let's assume that H is separable. We have to prove that H has a Hilbert basis. Let's pick D = {vn / n ∈ N}, a dense subset of H. For each n ∈ N, we take Vn = ⟨{v1, v2, ..., vn}⟩, and choose Bn an orthonormal basis of Vn in a way that
B1 ⊆ B2 ⊆ ... ⊆ Bn ⊆ Bn+1 ⊆ ...
Since ∪_{n=1}^{+∞} Vn is dense in H, it follows that B = ∪_{n=1}^{+∞} Bn is a Hilbert basis of H.
ii) Let's assume that H has a Hilbert basis, say B = {un / n ∈ N}. We have to prove that H is separable, i.e., we have to find a countable dense subset of H. Since B is a Hilbert basis, A = ⟨B⟩ is dense in H. Let's consider AQ, the set of finite linear combinations of elements of B having rational coefficients. Since AQ is countable, by Theorem 2.7, we'll be done if we prove that AQ is dense in A. Then we have to prove that
∀u ∈ A, ∀ϵ > 0, ∃u∗ ∈ AQ : ∥u − u∗∥ < ϵ.   (4.73)
Let u ∈ A and ϵ > 0, generic. Then u has the form
u = Σ_{k=1}^{m} αk u_{nk},
for some αk ∈ R and u_{nk} ∈ B, k ∈ Im. Now, for each k ∈ Im, we choose rk ∈ Q such that
|αk − rk| < ϵ / (m ∥u_{nk}∥).
Therefore, u∗ = Σ_{k=1}^{m} rk u_{nk} ∈ AQ and
∥u − u∗∥ = ∥Σ_{k=1}^{m} αk u_{nk} − Σ_{k=1}^{m} rk u_{nk}∥ = ∥Σ_{k=1}^{m} (αk − rk) u_{nk}∥
≤ Σ_{k=1}^{m} |αk − rk| ∥u_{nk}∥ < Σ_{k=1}^{m} ϵ/m = ϵ.
Since u and ϵ were chosen arbitrarily, we have proved (4.73). ■
∀ϵ > 0, ∃v ∈ E : ∥u − v∥p < ϵ.   (4.78)
Let u ∈ C0(R) and ϵ > 0, generic. Let's choose a, b ∈ Q such that supp(u) ⊆ ]a, b[ = G. [...]
∀u ∈ C([a, b]), ∀ϵ > 0, ∃p ∈ P([a, b]) : ∥u − p∥2 < ϵ.   (4.82)
Let u ∈ C([a, b]) and ϵ > 0, generic. By (4.56), we know that there is c > 0 such that
∀v ∈ C([a, b]) : ∥v∥2 ≤ c ∥v∥∞.   (4.83)
By the Weierstrass Approximation theorem, Theorem 4.3, there is p ∈ P([a, b]) such that ∥u − p∥∞ < ϵ/c. Then, by (4.83), we have that
∥u − p∥2 ≤ c ∥u − p∥∞ < c · (ϵ/c) = ϵ.
Since u and ϵ were chosen arbitrarily, we have proved (4.82). ■
Observe that in the hypothesis of the following corollary, B is a Hamel basis of the linear space P([a, b]); this is a purely algebraic concept. On the other hand, in the thesis, B is a Schauder basis of L2([a, b]); this property depends on the norm.
Corollary 4.5 (L2([a, b])-Schauder basis by polynomials) Let B be a Hamel basis of P([a, b]). Then B is a Schauder basis of L2([a, b]).
Proof. Since P([a, b]) is dense in L2([a, b]), by Theorem 2.7, we just have to prove that ⟨B⟩ is dense in P([a, b]) in the ∥ · ∥2-norm. Since B is a Hamel basis of P([a, b]), we have that ⟨B⟩ = P([a, b]), so that, in particular, ⟨B⟩ is dense in P([a, b]). ■
Remark 4.20 Corollary 4.5 and the Gram-Schmidt scheme, (4.75), provide a way to construct a Hilbert basis for L2([a, b]) departing from any Hamel basis of P([a, b]), in particular from the canonical basis C = {en : n ∈ N∗}, where e0(t) = 1 and en(t) = t^n, n ∈ N.
Construction of the Fourier-Legendre basis
Let's apply Remark 4.20 to build a Hilbert basis for the space L2([−1, 1]). Let's recall that the inner-product and the norm are given, respectively, by
(% i1) Prod(u,v):=integrate(u(t)*v(t),t,-1,1);
Prod(u, v) := ∫_{−1}^{1} u(t) v(t) dt   (% o1)
(% i2) N(u):= sqrt( Prod(u,u) );
We shall depart from the canonical basis C = {en / n ∈ N∗}, where
(% i3) e[n](t):= t^n;
The Gram-Schmidt scheme can be written as
P0 = e0 / ∥e0∥,
w_{n+1} = e_{n+1} − Σ_{k=0}^{n} (Pk, e_{n+1}) Pk,
P_{n+1} = w_{n+1} / ∥w_{n+1}∥,   (4.84)
for n ∈ N∗. By using (4.84), we obtain the Fourier-Legendre basis of L2([−1, 1]), E = {Pn / n ∈ N∗}, whose elements are referred to as Legendre polynomials. Let's compute the first six Legendre polynomials.
(% i4) define(P[0](t), e[0](t)/N(e[0]));
P0(t) := 1/√2   (% o4)
(% i5) define( w[1](t), e[1](t) - Prod(P[0],e[1]) * P[0](t) );
w1(t) := t   (% o5)
(% i6) define(P[1](t), w[1](t)/N(w[1]));
P1(t) := (√3 t)/√2   (% o6)
(% i7) define( w[2](t), e[2](t) - Prod(P[0],e[2]) * P[0](t) - Prod(P[1],e[2]) * P[1](t));
w2(t) := t² − 1/3   (% o7)
(% i8) define(P[2](t), w[2](t)/N(w[2])), fullratsimp;
P2(t) := (3√5 t² − √5)/(2√2)   (% o8)
(% i9) define( w[3](t), e[3](t) - Prod(P[0],e[3]) * P[0](t) - Prod(P[1],e[3]) * P[1](t) - Prod(P[2],e[3]) * P[2](t));
w3(t) := t³ − 3t/5   (% o9)
(% i10) define(P[3](t), w[3](t)/N(w[3])), fullratsimp;
P3(t) := (5√7 t³ − 3√7 t)/(2√2)   (% o10)
(% i11) define( w[4](t), e[4](t) - Prod(P[0],e[4]) * P[0](t) - Prod(P[1],e[4]) * P[1](t) - Prod(P[2],e[4]) * P[2](t) - Prod(P[3],e[4]) * P[3](t)), expand;
w4(t) := t⁴ − 6t²/7 + 3/35   (% o11)
(% i12) define(P[4](t), w[4](t)/N(w[4])), fullratsimp;
P4(t) := (105t⁴ − 90t² + 9)/(8√2)   (% o12)
(% i13) define( w[5](t), e[5](t) - Prod(P[0],e[5]) * P[0](t) - Prod(P[1],e[5]) * P[1](t) - Prod(P[2],e[5]) * P[2](t) - Prod(P[3],e[5]) * P[3](t) - Prod(P[4],e[5]) * P[4](t)), expand;
w5(t) := t⁵ − 10t³/9 + 5t/21   (% o13)
(% i14) define(P[5](t), w[5](t)/N(w[5])), fullratsimp;
P5(t) := (63√11 t⁵ − 70√11 t³ + 15√11 t)/(8√2)   (% o14)
The first six Legendre polynomials appear in Figure 4.3.
Remark 4.21 It's usual to write Legendre's polynomials as
Pn(t) = ((2n + 1)/2)^{1/2} pn(t),   t ∈ [−1, 1], n ∈ N∗,   (4.85)
so that ∥pn∥2 = (2/(2n + 1))^{1/2}. For n ∈ N∗ it holds Rodrigues' formula:
pn(t) = (1/(2^n n!)) · dⁿ/dtⁿ [(t² − 1)^n] = Σ_{j=0}^{N} (−1)^j (2n − 2j)! / (2^n j! (n − j)! (n − 2j)!) · t^{n−2j},   (4.86)
where N = n/2 if n is even, and N = (n − 1)/2 if n is odd.
Figure 4.3.: Legendre polynomials P0 (blue), P1 (red), P2 (green), P3 (violet), P4 (black) and P5 (cyan).
Let’s use (4.86) to find the polynomials p1 , ..., p10 : (% i16) for n:0 thru 10 step 1 do define(p[n](t),expand(diff((tˆ2-1)ˆn,t,n)/(2ˆn*factorial(n)))); (% i17) for n:0 thru 10 step 1 do display(p[n](t)); p0 (t) = 1 p1 (t) = t 1 3t2 − p2 (t) = 2 2 3t 5t3 − p3 (t) = 2 2 35t4 15t2 3 p4 (t) = − + 8 4 8 63t5 35t3 15t p5 (t) = − + 8 4 8 231t6 315t4 105t2 5 p6 (t) = − + − 16 16 16 16 429t7 693t5 315t3 35t p7 (t) = − + − 16 16 16 16 6435t8 3003t6 3465t4 315t2 35 p8 (t) = − + − + 128 32 64 32 128 12155t9 6435t7 9009t5 1155t3 315t p9 (t) = − + − + 128 32 64 32 128 46189t10 109395t8 45045t6 15015t4 3465t2 63 p10 (t) = − + − + − 256 256 128 128 256 256 Remark 4.22 Legendre’s polynomials have many applications in Physics and Engineering; for example, in [12], they are used in a basic Quantum Mechanical analysis of the Hydrogen atom. One of the reasons for this, is that these polynomials have a lot of nice properties.
Remark 4.23 Both Pn and pn are solutions of the differential equation
−[(1 − t²) y′(t)]′ − n(n + 1) y(t) = 0,   t ∈ [−1, 1].
If we consider the linear differential operator L : C∞([−1, 1]) −→ C∞([−1, 1]), given by
L[u](t) = −[(1 − t²) u′(t)]′,
we find that both Pn and pn are eigenfunctions of L, associated to the eigenvalue λn = n(n + 1).
Since E = {Pn / n ∈ N∗} is a Hilbert basis of L2([−1, 1]), every u ∈ L2([−1, 1]) can be written as a Fourier-Legendre series
u = Σ_{n=0}^{∞} αn Pn,
where the Fourier-Legendre coefficients are
αn = (u, Pn) = ∫_{−1}^{1} u(t) Pn(t) dt,   n ∈ N∗.
By Remark 4.14, the best approximation of u in the (m + 1)-dimensional space ⟨{P0, P1, ..., Pm}⟩ is the m-truncated Fourier-Legendre series
um = Σ_{n=0}^{m} αn Pn.
Example 4.11 Let’s use the Legendre polynomials to approximate the function u : [−1, 1] → R, given by (% i18) u(t):= exp(-t)+sin(t); Let’s compute the first six Fourier-Legendre coefficients (% i19) for n:0 thru 5 step 1 do a[n]: Prod(u,P[n]); The m-truncated Fourier-Legendre series is given by (% i20) u[m](t):= sum(a[n]*P[n](t),n,0,m); um (t) :=
m X
an Pn (t)
(% o20)
n=0
Let’s define the metric of L2 ([−1, 1]) to be computed numerically: (% i21) D(u,v):= sqrt( quad qag ((u(t)-v(t))ˆ2, t, -1, 1, 3) ); Then the distance from u to the approximations u2 and u3 are (% i22) D(u,u[2])[1]; 0.07151413490484855
(% o22)
(% i23) D(u,u[3])[1];
0.004697899801161737   (% o23)
Figure 4.4 presents u together with the approximations u2 and u3. Observe that, visually, u3 is very close to u.
Figure 4.4.: The function [−1, 1] ∋ t 7→ u(t) = e−t + sin(t) ∈ R (blue) together with the truncated Legendre-Fourier series u2 (red) and u3 (green).
Finally, let’s see in a graphic how decreases the distance between u and um as m increases. (% i25) d(n):= [n, D(u,u[n])[1] ]; (% i27) plot2d([discrete, [ d(0), d(1), d(2), d(3), d(4), d(5) ] ]); See Figure 4.5.
Figure 4.5.: The value of ∥u − um ∥2 decreases as m increases.
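A complementary check for Example 4.11, in the spirit of (4.65): the sum of the squared coefficients a0², ..., a5² computed above cannot exceed ∥u∥2², and it approaches it as more terms are used. The snippet continues the session of Example 4.11; the prompt numbers are only indicative.
(% i28) float(sum(a[n]^2, n, 0, 5));   /* sum of squared Fourier-Legendre coefficients */
(% i29) float(Prod(u,u));              /* ||u||_2^2, an upper bound by Bessel's inequality */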
4.7.2. Trigonometric Fourier series. Stone-Weierstrass theorem.
In Section 4.7.1, the Weierstrass Approximation theorem, Theorem 4.3, was the key to prove that a Hamel basis of P([a, b]) is a Schauder basis for L2([a, b]). Then the Gram-Schmidt scheme helped us to show that the Fourier-Legendre polynomials constitute a Hilbert basis for L2([−1, 1]). In this section we shall use a generalization of Theorem
4.3, the Stone-Weierstrass theorem, to prove that the trigonometric Fourier system is a Hilbert basis for L2([−π, π]).
In Definition 1.16 we introduced the concept of algebra. Now we shall present the concept of Banach algebra. In Section 7.2 we shall return to this concept.
Definition 4.4 (Banach algebra) Let E be an algebra. We say that E is a Banach algebra iff (E, ∥ · ∥) is a Banach space and
∀u, v ∈ E : ∥u · v∥ ≤ ∥u∥ ∥v∥.
It's immediate that the multiplication of vectors is continuous. Observe that if A is a subalgebra of the Banach algebra E, then the closure of A is a Banach subalgebra of E.
Let X be a non-void set. F(X), the space of all the real functions whose domain is X, introduced in Example 1.8, is actually a commutative algebra with unity whenever it's equipped with the usual multiplication of functions. Here the unity is the function 1f : X −→ R, given by 1f(t) = 1. The linear subspace of F(X),
B(X) = {u ∈ F(X) / u is bounded},
introduced in Example 3.13, is a subalgebra of F(X). In Example 3.13, we proved that B(X) has a norm, given by
∥u∥∞ = sup_{t∈X} |u(t)|.
If (X, T) is a topological space then C(X) is also a subalgebra of F(X). Therefore, by Theorem 1.12, Cb(X) = C(X) ∩ B(X) is a subalgebra of F(X). It's immediate that Cb(X) is actually a commutative algebra with unity 1f.
Remark 4.24 If (X, T) is a compact topological space then Cb(X) = C(X).
Theorem 4.21 (The Banach algebra (Cb(X), ∥ · ∥∞)) Let (X, T) be a topological space. Then (Cb(X), ∥ · ∥∞) is a Banach algebra.
The proof is required as an exercise at the end of the chapter. The idea is to apply the strategy mentioned in Remark 3.12 and adapt the demonstration that (C([a, b]), ∥ · ∥∞) is complete, provided in Example 3.21. As an immediate consequence of Theorem 4.21 and Remark 4.24, we have the following result.
Corollary 4.6 (C([a, b]) is a Banach algebra) The space (C([a, b]), ∥ · ∥∞) is a commutative Banach algebra with unity.
Now we can state the Stone-Weierstrass theorem.
Theorem 4.22 (Stone-Weierstrass) Let (X, T) be a compact Hausdorff space and A a subalgebra of C(X). Assume that
∀t1, t2 ∈ X, t1 ̸= t2, ∃ρ ∈ A : ρ(t1) ̸= ρ(t2),   (4.87)
∀t ∈ X, ∃η ∈ A : η(t) ̸= 0.   (4.88)
Then A is dense in (C(X), ∥ · ∥∞).
For a proof of Theorem 4.22, we refer the student to [8].
Remark 4.25 When condition (4.87) holds, we say that A separates points of X. Condition (4.88) holds if A contains the unity 1f.
Remark 4.26 Observe that the Weierstrass approximation theorem is a corollary of the Stone-Weierstrass Theorem because P([a, b]) is a subalgebra of C([a, b]) that separates points of [a, b] and contains the polynomial 1f.
Let's consider the trigonometric Fourier system (or simply Fourier system):
F = {1f} ∪ {Cn / n ∈ N} ∪ {Sn / n ∈ N} ⊆ C([−π, π]),
where, for n ∈ N, Cn(t) = cos(nt) and Sn(t) = sin(nt).
The trigonometric Fourier system F appeared in a natural way in the works of D. Bernoulli (1753) and J. Fourier (1822) on vibrating strings and heat conduction, respectively. This is because F helps to represent periodic systems in terms of simple functions. Given n ∈ N, both Cn and Sn are eigenfunctions of the one-dimensional Laplace operator,
∆ : C∞([−π, π]) −→ C∞([−π, π]), ∆u = u″,
associated to the eigenvalue λn = −n². See Figure 4.6.
Figure 4.6.: The functions C1 (blue), C2 (red) and C3 (green). As k increases, so does the frequency of Ck.
We shall prove that F is an orthogonal basis of L2([−π, π]). Let's recall that the inner-product and norm of L2([−π, π]) are given, respectively, by
(% i1) Prod(u,v):= integrate(u(t)*v(t),t,-%pi,%pi);
Prod(u, v) := ∫_{−π}^{π} u(t) v(t) dt   (% o1)
(% i2) N(u):=sqrt(Prod(u,u));
Let's check that F is an orthogonal set:
(% i4) declare(n, integer);
(% i5) declare(m, integer);
(% i6) One(t):= 1;
(% i7) C[n](t):= cos(n*t); S[n](t):= sin(n*t);
(% i9) Prod(One,C[n]); Prod(One, S[n]);
which gives zero in both cases.
(% i11) Prod(C[n],C[m]); Prod(C[n],S[m]);
which gives zero in both cases.
(% i12) Prod(S[n],S[m]);
0   (% o12)
Since the norms of the elements of F are
(% i13) N(One);
√2 √π   (% o13)
(% i14) N(C[n]);
√π   (% o14)
(% i15) N(S[n]);
√π   (% o15)
it follows that the normalized Fourier system is orthonormal:
F1 = {(1/√(2π)) 1f} ∪ {(1/√π) Cn / n ∈ N} ∪ {(1/√π) Sn / n ∈ N}.
Theorem 4.23 (Trigonometric Fourier system as a Schauder basis) A = ⟨F⟩ is dense in the space (C([−π, π]), ∥ · ∥∞). Therefore, F is a Schauder basis for C([−π, π]) in the ∥ · ∥∞-norm.
Remark 4.27 Let's recall that for all x, y ∈ R:
sin(x) cos(y) = ½ [sin(x + y) + sin(x − y)],
cos(x) cos(y) = ½ [cos(x + y) + cos(x − y)],
sin(x) sin(y) = ½ [cos(x − y) − cos(x + y)].
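The three product-to-sum identities of Remark 4.27 can be recovered in Maxima with trigreduce; this stand-alone check is just a convenience, not part of the proof below.
(% i16) trigreduce(sin(x)*cos(y));
(% i17) trigreduce(cos(x)*cos(y));
(% i18) trigreduce(sin(x)*sin(y));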
Proof. We shall apply the Stone-Weierstrass Theorem, Theorem 4.22.
1. Let's prove that A is an algebra. Since F is a Hamel basis of A, by Theorem 1.13, we have to prove that u · v ∈ A, for every u, v ∈ F, which is equivalent to prove that
∀m, n ∈ N : Sn · Cm, Cn · Cm, Sn · Sm ∈ A.   (4.89)
Let m, n ∈ N, generic. By using Remark 4.27, we have for t ∈ [−π, π]:
sin(nt) cos(mt) = ½ [sin((n + m)t) + sin((n − m)t)],
cos(nt) cos(mt) = ½ [cos((n + m)t) + cos((n − m)t)],
sin(nt) sin(mt) = ½ [cos((n − m)t) − cos((n + m)t)].
Since sin is odd and cos is even, these identities, together with the arbitrariness of m, n and t, show (4.89).
2. By Remark 4.25, it just remains to prove that A separates points of [−π, π], i.e.,
∀t1, t2 ∈ [−π, π], t1 ̸= t2, ∃ρ ∈ A : ρ(t1) ̸= ρ(t2).   (4.90)
Let t1, t2 ∈ [−π, π] such that t1 ̸= t2, generic.
a) Let's consider the case of t1 ∈ ]−π, 0[ and t2 ∈ [0, π] ∪ {−π}. We take ρ = sin ∈ A, so that ρ(t1) < 0 ≤ ρ(t2).
b) Let's consider the case of t1, t2 ∈ ]−π, 0[. If t2 ̸= −π − t1, we take ρ = sin ∈ A, so that ρ(t1) ̸= ρ(t2). If t2 = −π − t1, we take ρ = cos ∈ A, so that ρ(t1) ̸= ρ(t2), because cos is strictly increasing in ]−π, 0[.
Since t1, t2 were chosen arbitrarily, points a) and b) prove (4.90). ■
Corollary 4.7 (Trigonometric Fourier system) The system F is an orthogonal basis of L2([−π, π]).
Proof. Since C([−π, π]) is dense in L2([−π, π]), to prove that F is an orthogonal basis of L2([−π, π]), it's enough to prove that A is dense in C([−π, π]) in the ∥ · ∥2-norm, i.e., we have to prove that
∀u ∈ C([−π, π]), ∀ϵ > 0, ∃w ∈ A : ∥u − w∥2 < ϵ.   (4.91)
Let u ∈ C([−π, π]) and ϵ > 0, generic. By (4.56), we know that there is c > 0 such that
∀v ∈ C([−π, π]) : ∥v∥2 ≤ c ∥v∥∞.   (4.92)
By Theorem 4.23, there is w ∈ A such that
∥u − w∥∞ < ϵ/c.   (4.93)
Then, by (4.92) and (4.93), we have that
∥u − w∥2 ≤ c ∥u − w∥∞ < c · (ϵ/c) = ϵ.
Since u and ϵ were chosen arbitrarily, we have proved (4.91).
■
Remark 4.28 (Trigonometric Fourier series) Now, thanks to Corollary 4.7, it holds the classical Fourier series, i.e., for all u ∈ L2(−π, π):
u = (a0/2) 1f + Σ_{n=1}^{+∞} an Cn + Σ_{n=1}^{+∞} bn Sn,
which implies that
u(t) = a0/2 + Σ_{n=1}^{+∞} an cos(nt) + Σ_{n=1}^{+∞} bn sin(nt),   a.e. t ∈ [−π, π],
where, by Remark 4.13, for n ∈ N:
a0 = (u, 1f/2) / ∥1f/2∥2² = 2 (u, 1f) / ∥1f∥2² = (1/π) ∫_{−π}^{π} u(t) dt,
an = (u, Cn) / ∥Cn∥2² = (1/π) ∫_{−π}^{π} u(t) cos(nt) dt,
bn = (u, Sn) / ∥Sn∥2² = (1/π) ∫_{−π}^{π} u(t) sin(nt) dt.
In this context, Parseval equality becomes
a0²/2 + Σ_{n=1}^{+∞} (an² + bn²) = (1/π) ∫_{−π}^{π} |u(t)|² dt.
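As a small illustration of the coefficient formulas, take the test function u(t) = t (a choice of mine, not from the text). Its cosine coefficients vanish and bn = 2(−1)^{n+1}/n; in Maxima:
(% i1) makelist(integrate(t*cos(n*t), t, -%pi, %pi)/%pi, n, 1, 5);   /* a_1,...,a_5: all zero */
(% i2) makelist(integrate(t*sin(n*t), t, -%pi, %pi)/%pi, n, 1, 5);   /* b_1,...,b_5 = 2, -1, 2/3, -1/2, 2/5 */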
Example 4.12 Let’s use the trigonometric Fourier system to approximate the function f : [−π, π] −→ R, given by (% i16) f(t):= exp(t)*sin(8*t); The classical Fourier coefficients are given by (% i17) a0: (2*Prod(f,One)) / N(One)ˆ2, expand; −π
8%e 65π
π
−
8%e 65π
(a0)
(% i18) a(n):= (Prod(f,C[n])) / N(C[n])ˆ2; (% i19) b(n):= (Prod(f,S[n])) / N(S[n])ˆ2; The m-truncated Fourier-Legendre series is given by (% i20) f[m](t):= a0/2 + sum(a(n)*C[n](t),n,1,m)+sum(b(n)*S[n](t),n,1,m); m
fm (t) :=
m
X X a0 + a(n) Cn (t) + b(n) Sn (t) 2 n=1 n=1
(% o20)
Let’s define the metric of L2 ([−π, π]) to be computed numerically: (% i21) D(u,v):= sqrt(quad qag((u(t)-v(t))ˆ2,t,-%pi,%pi,3)); Then the distance from u to the approximations u2 and u3 are (% i22) D(f,f[2])[1]; 11.17889165099482
(% o22)
11.02457768446077
(% o23)
(% i23) D(f,f[3])[1];
In Figure 4.7 are presented u together with the approximations f3 and f10 . Observe that, visually, f10 is very close to f .
Figure 4.7.: The function [−π, π] ∋ t 7→ f (t) = et sin(t) ∈ R (blue) together with the truncated Fourier series f3 (red) and f10 (green).
Finally, let’s see in a graphic how decreases the distance between u and um as m increases. (% i25) d(n):= [ n, D(f,f[n])[1] ]; (% i27) plot2d([discrete, [ d(1),d(2),d(3),d(4),d(5),d(6),d(7),d(8),d(9),d(10), d(11),d(12),d(13),d(14),d(15),d(16),d(17),d(18),d(19),d(20) ]]); See Figure 4.8.
4.7.3. Fourier-Hermite series
In Section 4.7.1 we used the canonical basis of P([−1, 1]) to produce, with help of the Gram-Schmidt scheme, a Hilbert basis for L2([−1, 1]). On the other hand, we can not use the canonical basis of P(R) to generate a Hilbert basis of L2(R). Therefore, we shall modify the canonical polynomials so that the resulting functions do belong to L2(R). So, we consider the system G = {wn / n ∈ N∗}, where for each n ∈ N∗, the function wn : R −→ R is given by
wn(t) = t^n · e^{−t²/2}.
Figure 4.8.: The value of ∥f − fm ∥2 decreases as m increases.
By applying the Gram-Schmidt scheme to G, we obtain the Hermite-Fourier system M = {en / n ∈ N∗}, where for n ∈ N∗:
en(t) = (2^n · n! √π)^{−1/2} e^{−t²/2} Hn(t),   t ∈ R.   (4.94)
en is known as the n-th Hermite function and Hn is referred to as the n-th Hermite polynomial. See Figure 4.9. By Gram-Schmidt and (4.94), it follows that
∫_R Hm(t) Hn(t) · e^{−t²} dt = 0, if m ̸= n;   ∫_R Hn(t)² · e^{−t²} dt = 2^n n! √π.
Figure 4.9.: The first Hermite functions: e0 (blue), e1 (red), e2 (green).
The Hermite polynomials can be obtained with the formula:
Hn(t) = (−1)^n e^{t²} · dⁿ/dtⁿ (e^{−t²}) = n! Σ_{j=0}^{N} (−1)^j (2^{n−2j} / (j! (n − 2j)!)) · t^{n−2j},
where N = n/2 if n is even, and N = (n − 1)/2 if n is odd. Hermite polynomials can also be produced by recurrence. For t ∈ R:
H0(t) = 1;   H_{n+1}(t) = 2t · Hn(t) − Hn′(t).   (4.95)
Let’s apply (4.95) to find the first Hermite polynomials: (% i1) H[0](t):=1; (% i2) for n:0 thru 4 step 1 do define(H[n+1](t), 2*t*H[n](t)-diff(H[n](t),t)), expand; (% i3) for n:0 thru 5 step 1 do display(H[n](t)); H0 (t) = 1 H1 (t) = 2t H2 (t) = 4t2 − 2 H3 (t) = 8t3 − 12t H4 (t) = 16t4 − 48t2 + 12 H5 (t) = 32t5 − 160t3 + 120t (% o3) Remark 4.29 Given n ∈ N, the Hermite polynomial Hn is an eigenfunction of the Hermite operator, H : C∞ (R) −→ C∞ (R), given by H[u](t) = −u′′ (t) + 2t u′ (t), associated to the eigenvalue λn = 2n. The n-Hermite function verifies F[en ] = (−i)n F[en ], where F : L2 (R; C) → L2 (R; C) is the Fourier transform (see e.g. [18, Sec. 5.6]). Theorem 4.24 (Hermite-Fourier system as a Hilbert basis) M is a Hilbert basis for L2 (R). The student can find a proof of this result in [2]. Now, thanks to Theorem 4.24, it holds the Hermite-Fourier series, i.e., for all u ∈ L2 (R): +∞ X u= βn e n , n=1
which, implies that u(t) =
+∞ X
βn en (t),
a.e. t ∈ R.
n=1
where, βn = (u, en ).
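A minimal sketch of how a Fourier-Hermite coefficient can be computed; the test function u(t) = e^{−t²} is my own choice, and e0 denotes the first Hermite function e0(t) = π^{−1/4} e^{−t²/2} from (4.94).
(% i4) e0(t):= exp(-t^2/2)/%pi^(1/4);
(% i5) integrate(e0(t)^2, t, minf, inf);             /* = 1, so e0 is normalized */
(% i6) u(t):= exp(-t^2);
(% i7) beta0: integrate(u(t)*e0(t), t, minf, inf);   /* beta_0 = (u, e_0) */
(% i8) float(beta0);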
4.8. Convex sets, functions and hyperplanes In many applications, the concept of convexity plays a significant role. In this section, we shall introduce these ideas and state some simple results.
Definition 4.5 (Convex set) Let V be a linear space and C ⊆ V . We say that C is a convex set iff ∀u, v ∈ C, ∀t ∈ [0, 1] :
tu + (1 − t)v ∈ C.
Next we present some interesting properties of convex sets.
Theorem 4.25 (Basic properties of convex sets) Let V be a linear space. Let A, B ⊆ V, convex. Then
1. A + B is convex.
2. A + A = 2A.
3. Any finite convex combination of elements of A belongs to A, i.e.,
∀u1, ..., un ∈ A, ∀t1, ..., tn ∈ [0, 1] : Σ_{k=1}^{n} tk = 1 ⇒ Σ_{k=1}^{n} tk uk ∈ A.
4. If W is a linear space and T ∈ L(V, W), then T(A) is convex.
The proof of this result is easy and is required as an exercise at the end of the chapter. Let's remark that
A + B = {u = v + w / v ∈ A ∧ w ∈ B},  A + A = {u = v + w / v, w ∈ A},  2A = {2u / u ∈ A}.
Theorem 4.26 (Ball. Adherence of a convex set.) Let V be a normed space. Then, A ⊆ V convex implies that the closure of A is convex. Every ball in V is convex.
The proof of this result is required as an exercise at the end of the chapter. Let's remark that, by Theorem 4.26, given a ball B(u0, r) ⊆ V, we have that
∀u1, u2 ∈ B(u0, r), ∀t ∈ [0, 1] : ∥t u1 + (1 − t) u2 − u0∥ < r.   (4.96)
Theorem 4.27 (Intersection of convex sets) Let V be a normed space and (Aλ)λ∈Λ a family of convex subsets of V. Then ∩_{λ∈Λ} Aλ is convex.
The proof of this result is required as an exercise at the end of the chapter.
Remark 4.30 (Convex hull) Let V be a linear space and A ⊆ V. The convex hull of A, denoted Conv(A), is defined as the intersection of all the convex subsets of V which contain A. It can be proved that v ∈ Conv(A) iff v is a finite convex combination of elements of A, i.e.,
Conv(A) = { u = Σ_{k=1}^{n} tk vk / v1, ..., vn ∈ A, t1, ..., tn ∈ [0, 1], Σ_{j=1}^{n} tj = 1 }.
Definition 4.6 (Affine hyperplane) Let E be a linear space and f : E −→ R a linear functional. Given α ∈ R, we say that the set
H = [f = α] = {u ∈ E / f(u) = α}
is an affine hyperplane (or simply hyperplane). In this case, we say that f = α is the equation of H.
It's clear that any hyperplane is a convex set. In particular, the kernel of a linear functional is a hyperplane.
Proposition 4.2 Let E be a linear space, f : E −→ R a linear functional and u0 ∈ E \ Ker(f). Then, E = Ker(f) ⊕ ⟨{u0}⟩.
Proof. We have to prove that
∀u ∈ E, ∃!(z, α) ∈ Ker(f) × R : u = z + α u0.   (4.97)
1. Let u ∈ E, generic. Let's define
z = u − (f(u)/f(u0)) u0.
Then
f(z) = f(u) − (f(u)/f(u0)) f(u0) = 0,
so that z ∈ Ker(f) and u = z + α u0, with α = f(u)/f(u0) ∈ R.
2. Let's take α1 ∈ R and z1 ∈ Ker(f) such that u = z + α u0 = z1 + α1 u0. Then, f(u) = α f(u0) = α1 f(u0), whence α = α1 as well as z = z1.
Since u was chosen arbitrarily, we have proved (4.97). ■
Remark 4.31 (Codimension of the kernel) In the context of Proposition 4.2, we have that codim(Ker(f)) = 1, which means that any hyperplane is just a translation of Ker(f), i.e., H = [f = α] = u0 + Ker(f), where u0 is any vector such that f(u0) = α.
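A concrete instance of the decomposition (4.97), under assumptions of my own: E = C([0, 1]), f(u) = ∫_0^1 u(t) dt, u(t) = t² and u0 = 1f. The sketch computes z = u − (f(u)/f(u0)) u0 and checks that f(z) = 0.
(% i1) f(w):= integrate(w(t), t, 0, 1);
(% i2) u(t):= t^2;
(% i3) u0(t):= 1;
(% i4) alpha: f(u)/f(u0);                 /* = 1/3 */
(% i5) define(z(t), u(t) - alpha*u0(t));  /* z(t) = t^2 - 1/3 */
(% i6) f(z);                              /* = 0, so z belongs to Ker(f) and u = z + alpha*u0 */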
Definition 4.7 (Convex functional) Let E be a linear space. A functional φ : E −→ R is said to be convex iff
∀u, v ∈ E, ∀t ∈ [0, 1] : φ(t u + (1 − t) v) ≤ t φ(u) + (1 − t) φ(v).
We say that φ is concave if −φ is convex.
The properties stated in the following proposition are required as an exercise in the following section.
Proposition 4.3 (Basic properties of convex functionals) Let E be a linear space and φ, ϕ : E −→ R. 1. If φ is convex, then the epigraph of φ, epi φ = {(u, λ) ∈ E × R / φ(u) ≤ λ}, is convex in the product space E × R, and conversely. 2. If φ is convex, then for every λ ∈ R, [φ ≤ λ] = {u ∈ E / φ(u) ≤ λ} is convex. 3. If φ and ϕ are convex, then φ + ϕ is convex.
4.9. Problems
Problem 4.1 Let (V, ∥ · ∥) be a normed space and A, B ⊆ V. Prove that if either A or B is open then A + B is open.
Problem 4.2 Let (V, ∥ · ∥) be a normed space.
1. Prove that ∀u, v ∈ V : | ∥u∥ − ∥v∥ | ≤ ∥u − v∥.
2. Prove that the norm is a continuous mapping.
Problem 4.3 Consider the context of Remark 4.3. Does the following statement hold?
Tα = Tβ ⇒ ∥ · ∥α ∼ ∥ · ∥β.
Problem 4.4
1. Prove that ∀u ∈ C([a, b]) : ∥u∥1 ≤ √(b − a) ∥u∥2.
2. Prove that ∀u ∈ C([a, b]) : ∥u∥2 ≤ √(b − a) ∥u∥∞.
3. Prove that T1 ⊆ T2 ⊆ T∞, where Tk is the topology induced by the norm ∥ · ∥k, k = 1, 2, ∞.
Problem 4.5 Let (V, ∥ · ∥) be a normed space and W a subspace of V with dim(W) < +∞. Prove that W is complete and closed.
Problem 4.6 Consider a continuous function f : [0, 1] −→ R. For n ∈ N, the n-th Bernstein polynomial Bn[f] : [0, 1] −→ R is defined by
Bn[f](t) = Σ_{k=0}^{n} f(k/n) · C(n, k) · t^k (1 − t)^{n−k},
where C(n, k) denotes the binomial coefficient.
1. (*) Prove that
∥Bn[f] − f∥∞ −→ 0, as n −→ +∞.   (4.98)
2. Prove that
∥Bn[f] − f∥2 → 0, as n → +∞;  ∥Bn[f] − f∥1 → 0, as n → +∞.   (4.99)
3. By using Maxima explain, with graphics and computations, the convergences (4.98) and (4.99). Consider the cases
f(x) = 10x² + sin(12πx),   f(x) = x + sin(12πx).
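For part 3 of Problem 4.6, one possible Maxima starting point is sketched below; the function B and the plotting choices are mine, not prescribed by the problem.
(% i1) B(n, f, t):= sum(f(k/n)*binomial(n,k)*t^k*(1-t)^(n-k), k, 0, n);
(% i2) f(x):= 10*x^2 + sin(12*%pi*x);
(% i3) expand(B(3, f, t));                        /* a low-order Bernstein polynomial of f */
(% i4) plot2d([f(t), B(10, f, t), B(50, f, t)], [t, 0, 1]);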
Problem 4.7 Let (V, ∥ · ∥) be a normed space with dim(V ) < +∞, and A ⊆ V . Prove that A is compact iff it’s closed and bounded. Problem 4.8 Let p ≥ 1. Prove that the normed space lp (R) is a Banach space. Hint. Use the strategy mentioned in Remark 3.12 adapting the scheme used in Example 3.20. Problem 4.9 Let 1 < p < +∞. Prove that ∀u, v ∈ C([a, b]) :
∥u + v∥p ≤ ∥u∥p + ∥v∥p .
Hint. Use H¨ older inequality for integrals and adapt the proof of the Minkowski inequality for finite sums, Theorem 4.7. Problem 4.10 Let 1 ≤ p ≤ q ≤ +∞. 1. Prove that L1 ([a, b]) ⊇ Lp ([a, b]) by showing that ∥·∥1 is dominated by ∥·∥p . 2. Prove that Lp ([a, b]) ⊇ Lq ([a, b]) by showing that ∥·∥p is dominated by ∥·∥q . 3. Prove that Lq ([a, b]) ⊇ (C([a, b]), ∥ · ∥∞ ) by showing that ∥ · ∥q is dominated by ∥ · ∥∞ . 4. Prove that for the corresponding topologies, T1 ⊆ Tp ⊆ Tq . Problem 4.11 Find a function x ∈ L2 ([0, 1]) such that x2 ∈ / L2 ([0, 1]). Problem 4.12 On the linear space C1 ([a, b]) consider the functional ∥ · ∥C 1 : C1 ([a, b]) → R, given by ∥u∥C 1 = ∥u∥∞ + ∥u′ ∥∞ . 1. Prove that ∥ · ∥C 1 is a norm. 2. Prove that (C1 ([a, b]), ∥ · ∥C 1 ) is a Banach space. Hint. Use the completeness of (C([a, b]), ∥ · ∥∞ ), the Fundamental Theorem of Calculus and the fact that any normed space is Hausdorff. Problem 4.13 Consider the functional space X = (Cn ([a, b]), ∥ · ∥C n ). 1. Prove that X is complete. 2. Is the linear subspace (C∞ ([a, b]) closed?
Problem 4.14 In the functional space L∞(R), consider the linear subspaces
C∞(R) = {u ∈ C(R) / lim_{|x|→+∞} u(x) = 0},
C0(R) = {u ∈ C(R) / ∃a > 0, ∀|x| > a : u(x) = 0}.
Prove that the closure of C0(R) in L∞(R) is C∞(R).
Problem 4.15 Let p ≥ 1 and a < b.
1. Prove that the functional ∥ · ∥1,p : C1([a, b]) −→ R, given by ∥u∥1,p = ∥u∥p + ∥u′∥p, is a norm.
2. Prove that the Sobolev norm ∥ · ∥1,p is equivalent to the norm given by ∥u∥*_{1,p} = (∥u∥p^p + ∥u′∥p^p)^{1/p}.
Problem 4.16 Let I = ]−1, 1[. Consider the function u : I −→ R given by u(x) = |x|.
1. Prove that for every p ∈ [1, +∞[, u ∈ W1,p(I).
2. Prove that the weak derivative of u is given by u′(x) = −1 if −1 < x < 0, and u′(x) = 1 if 0 ≤ x < 1.
Problem 4.17 Does the sequence (xn)n∈N ⊆ C([0, 1]) converge?
xn(t) = t^n − t^{n+1};   xn(t) = t^n − t^{2n};   xn(t) = t^{n+1}/(n+1) − t^{n+2}/(n+2).
Problem 4.18 Let k ∈ N. Let x, xn, y ∈ Ck([a, b]), n ∈ N. Prove that
lim_{n→+∞} xn = x  ⇒  lim_{n→+∞} xn y = x y.
Problem 4.19 Prove that P([a, b]) is dense in a) L2([a, b]), b) H1([a, b]). Is P([a, b]) dense in W1,p(R^N) for p ≥ 1?
Problem 4.20 By using Maxima, compute the angles of the H1([0, 1])-triangle defined by the functions x1, x2, x3, given by
x1(t) = sin(t),   x2(t) = e^t,   x3(t) = t² − 1.
Problem 4.21 Let V be a Hilbert space and M ⊆ V . 1. Prove that M ⊥ = {u ∈ V / ∀v ∈ M : (u, v) = 0} is a closed linear subspace of V . 2. Prove that M ⊆ (M ⊥ )⊥ . 3. Prove that if M ⊆ N , then N ⊥ ⊆ M ⊥ . Problem 4.22 Let U be a closed subspace of a Hilbert space H. Prove that U = U ⊥⊥ .
Problem 4.23 Let H be a Hilbert space and (xn )n∈N , (yn )n∈N ⊆ B(0, 1) such that (xn , yn ) → 1, as n → +∞. Prove that lim ∥xn − yn ∥ = 0.
n→+∞
Problem 4.24 Let (V, ∥ · ∥) be a normed space. Prove that absolute convergence implies convergence of a series iff V is a Banach space. Problem 4.25 Prove that the normed space l∞ (R) is a Banach space. Problem 4.26 Let Υ be the subset of RN with only finitely many nonzero terms. (m) Consider the sequence (υm )m∈N ⊆ Υ, where for each m ∈ N, υm = xn n∈N
is given by x(m) = n
δnm . n2
1. Prove that Υ is a non-closed linear subspace of l∞ (R). 2. Prove that the sequence of partial sums associated with (υm )m∈N is absolutely convergent but it is not convergent. Problem 4.27 Let (V, ∥ · ∥) be a infinite-dimensional normed space and S = {un / n ∈ N} ⊆ V . Prove that S is a Schauder basis of V iff ∀v ∈ V, ∃!(αn )n∈N ∈ RN :
v=
+∞ X
αn un .
n=1
Problem 4.28 In the space l∞ (R) let’s consider the subsets = {em / m ∈ N} C and B = {ηm / m ∈ N}, where for each m ∈ N, em = e(m) and ηm = n n∈N ηn(m) , are given by n∈N
( e(m) n
= δmn ,
ηn(m)
=
1, if n ≤ m, 0, if n > m.
By using (4.62), prove that B and C are Schauder basis for l∞ (R). Problem 4.29 Let H be a separable Hilbert space. Prove that H is isomorphic and isometric to l2 (R). Problem 4.30 Let (E, ∥ · ∥) be a Banach algebra. 1. Prove that the multiplication is a continuous mapping. 2. Let A be a subalgebra of E. Prove that A is Banach subalgebra of E. Problem 4.31 Let (X, T ) be a topological space. Prove that (Cb (X), ∥ · ∥∞ ) is a Banach algebra. Hint. Apply the strategy mentioned in Remark 3.12 and adapt the demonstration present in Example 3.21.
170
Chapter 4. Banach and Hilbert spaces
Problem 4.32 Consider the function u : [−1, 1] −→ R given by u(t) = e−|t| . By using Maxima, find um a linear combination of Legendre polynomials such that ∥u − um ∥2 < ϵk . Consider ϵ1 = 0.1 and ϵ2 = 0.01. Problem 4.33 Consider the function u : [−π, π] −→ R given by u(t) = e−|t| . By using Maxima, find um a truncated Fourier series such that ∥u − um ∥2 < ϵk . Consider ϵ1 = 0.1 and ϵ2 = 0.01. Problem 4.34 Consider the function u : R −→ R given by u(t) = e−|t| . By using Maxima, find um a linear combination of the Hermite functions such that ∥u − um ∥2 < ϵk . Consider ϵ1 = 0.1 and ϵ2 = 0.01. Problem 4.35 Consider the functions wn : [0, +∞[−→ R, given by w0 (t) = e−t/2 ,
wn (t) = tn e−t/2 ,
t ∈ [0, +∞[, n ∈ N.
We know that S = {wn / n ∈ N∗ } is a Schauder basis for L2 ([0, +∞[). For n ∈ N∗ , the Laguerre polynomial Ln : [0, +∞[→ R is given by Ln (t) =
(n) et · tn e−t . n!
1. Apply the Gram-Schmidt scheme to S to obtain the first ten elements of the ˆ n /n ∈ N∗ }, referred to as the Laguerre-Fourier system. Hilbert basis G = {L 2. For n = 1, ..., 10, verify that ˆ n (t) = e−t/2 Ln (t), L
t ∈ [0, +∞[.
3. Consider the linear differential operator G : C∞ ([0, +∞[) −→ C∞ ([0, +∞[), given by G[u](t) = t u′′ (t) + (1 − t) u′ (t), t ∈ [0, +∞[. Prove that for n ∈ N∗ , Ln is an eigenfunction of G associated to the eigenvalue λn = −n. Problem 4.36 Prove that the system of functions whose formulas are 2πk(t − a) 2πk(t − a) 1, sin , cos , k ∈ N, b−a b−a is orthogonal in H1 ([a, b]). Problem 4.37 Consider the set H10 ([a, b]) = x ∈ H1 ([a, b]) / x(a) = x(b) = 0 . 1. Prove that H10 ([a, b]) is a closed linear subspace of H1 ([a, b]). ⊥ 2. Find H10 ([a, b]) . Problem 4.38 Consider the set M = x ∈ H1 ([a, b]) / x(a) = x(b) . 1. Prove that M is a closed linear subspace of H1 ([a, b]). 2. Find M ⊥ .
4.9. Problems
171
Problem 4.39 Find the set M = {u ∈ C([0, 1]) / ∥u − u0 ∥∞ = d(K, u0 )}, where 1. u0 (t) = 1, K = {u ∈ C([0, 1]) / u(0) = 0}; 2. u0 (t) = t, K = P0 ([0, 1]); 3. u0 (t) = t2 , K = P1 ([0, 1]). Problem 4.40 Consider the set M =
Z
2
x ∈ L ([0, 1]) :
1
x(t)dt = 0 .
0 2
1. Prove that M is a closed linear subspace of L ([0, 1]). 2. Find explicitly M ⊥ . 3. Find d(M, x0 ), where x0 (t) = t2 . Problem 4.41 In the space L2 ([0, 1]), consider the element f : [0, 1] −→ R, given by f (t) = t3 . For k = 0, 1, 2: 1. Find the projection of f into the subspace Pk ([0, 1]). 2. Find d (f, Pk ([0, 1]))). Problem 4.42 Prove that any hyperplane is a convex set. Problem 4.43 Let V be a linear space. Let A, B ⊆ V , convex. Prove that 1. 2. 3. 4.
A + B is convex. A + A = 2A. Any finite convex combination of elements of A, belong to A. If W is a linear space and T ∈ L(V, W ), then T (A) is convex.
Problem 4.44 Let V be a linear space and A ⊆ V . Prove that v ∈ Conv(A) iff v is a finite convex combination of elements of A. Problem 4.45 Let V be a normed space and A ⊆ V convex. Prove that A is convex. Problem 4.46 Let M the set of functions x ∈ L2 ([0, 1]) such that x(t) ∈ [−1, 1],
a.e. t ∈ [0, 1].
Prove that M is convex. Is M closed? Problem 4.47 Let E be a linear space and φ, ϕ : E → R. Prove that 1. if φ is convex, then the epigraph of φ, epi φ = {(u, λ) ∈ E × R / φ(u) ≤ λ}, is convex in the product space E × R, and conversely; 2. if φ is convex, then for every λ ∈ R, [φ ≤ λ] = {u ∈ E / φ(u) ≤ λ} is convex; 3. if φ and ϕ are convex, then φ + ϕ is convex.
5. Fundamental theorems of Functional Analysis 5.1. Introduction The main results of this chapter are the so-called fundamental theorems of Functional Analysis: Riesz-Fr´echet representation theorem, Hahn-Banach theorem, the uniform boundedness principle, the open mapping theorem and the closed graph theorem. All of them involve linear operators or functionals so that we suggest the student to review Section 1.7 before starting; there it was introduced the concept of linear operator from an algebraic point of view. Given the normed spaces (V, ∥·∥V ) and (W, ∥·∥W ) and a linear subspace U ⊆ V , the notation A : U ⊆ V → W, implies that the space U is endowed with the norm ∥ · ∥V . As we shall see, in many useful situations, U is a dense proper subspace of V , i.e., U ̸= V and U = V , and we shall say that the operator A has a dense domain. Example 5.1 (Position operator in Quantum Mechanics) Let’s consider the linear operator Q : D(Q) ⊆ L2 (R) −→ L2 (R), given by Q[u](t) = t u(t), where D(Q) =
{ u ∈ L2(R) / ∫_R t² u²(t) dt < +∞ }. Q is known as the position
operator in the context of Quantum Mechanics; it corresponds to the observable position of a particle. Since C0(R) ⊆ D(Q) ⊆ L2(R), it follows that Q has a dense domain.
Example 5.2 (Momentum operator in Quantum Mechanics) Let's consider the linear operator P : D(P) ⊆ L2(R; C) → L2(R; C), given by
P[u](t) = −iℏ u′(t),
where D(P) = H1(R; C). Here ℏ = h/(2π) is Planck's reduced constant. P is known as the momentum operator in the context of Quantum Mechanics; it corresponds to the observable momentum of a particle. Since C0(R; C) ⊆ D(P) ⊆ L2(R; C), it follows that P has a dense domain.
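To see that D(Q) is a proper subspace of L2(R), one can test a concrete function; the choice u(t) = 1/√(1 + t²) is mine and serves only as an illustration: u ∈ L2(R) but t u(t) ∉ L2(R).
(% i1) u(t):= 1/sqrt(1+t^2);
(% i2) integrate(u(t)^2, t, minf, inf);        /* = %pi, so u is in L2(R) */
(% i3) integrate(t^2*u(t)^2, t, minf, inf);    /* Maxima reports a divergent integral: u is not in D(Q) */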
Definition 5.1 (Bounded linear operator / functional) Let (V, ∥ · ∥V ) and (W, ∥ · ∥W ) be normed spaces and T ∈ L(V, W ). We say that T is a bounded linear operator iff ∃c > 0, ∀u ∈ V :
∥T (u)∥W ≤ c ∥u∥V .
(5.1)
Whenever W = R, we say that T is a bounded linear functional. Because of (5.1), if G ⊆ V is bounded then T (G) is also bounded. It’s not difficult to prove that L(V, W ) = {T ∈ L(V, W ) / T is bounded}, is a linear subspace of L(V, W ). In the case W = V , we shall simply denote L(V ) = L(V, V ). Whenever W = R we denote by V ′ = L(V, R), the dual space of V . Theorem 5.1 (Norm of a bounded operator) The functional ∥·∥ : L(V, W ) −→ R, given by ∥T ∥ = inf(OT ), (5.2) where OT = {c > 0 / ∀u ∈ V : ∥T (u)∥W ≤ c∥u∥V },
(5.3)
is a norm. Remark 5.1 (A common abuse of notation) If it does not cause confusion, we shall denote ∥ · ∥ for both ∥ · ∥V and ∥ · ∥W . Proof. 1. Let’s prove that ∀T ∈ L(V, W ) :
∥T ∥ = 0 ⇐⇒ T = 0.
(5.4)
Let T ∈ L(V, W ), generic. a) Let’s assume that ∥T ∥ = 0. Then(5.2) implies that ∀u ∈ V :
∥T (u)∥W = 0,
and, since ∥ · ∥W is a norm, T (u) = 0, for every u ∈ V , i.e., T = 0. b) Let’s assume T = 0 so that OT =]0, +∞[. Therefore, (5.2) implies that ∥T ∥ = 0. Since T was chosen arbitrarily, we have proved (5.4). 2. Let’s prove that ∀λ ∈ R, ∀T ∈ L(V, W ) :
∥λ · T ∥ = |λ| · ∥T ∥.
(5.5)
Let λ ∈ R and T ∈ L(V, W ), generic. If λ = 0 then (5.5) is immediate. So let’s assume that λ = ̸ 0.
5.1. Introduction
175
a) Let’s prove that ∀α ∈ OλT :
∥T ∥ ≤
α . |λ|
(5.6)
Let α ∈ OλT , generic. Then, for all u ∈ V : ∥(λT )(u)∥ ≤ α∥u∥
and ∥T (u)∥ ≤
α ∥u∥, |λ|
α ∈ OT and, therefore, ∥T ∥ ≤ α/|λ|. Since α was chosen |λ| arbitrarily, we have proved (5.6). b) By Corollary 1.1, whence
∀ϵ > 0, ∃α0 ∈ OλT :
∥λT ∥ ≥ α0 − ϵ.
(5.7)
By (5.6) and (5.7), it follows that ∀ϵ > 0 :
∥T ∥ ≤
∥λT ∥ + ϵ , |λ|
whence ∥T ∥ ≤ ∥λT ∥/|λ|, i.e., ∥λT ∥ ≥ |λ| ∥T ∥.
(5.8)
By interchanging the roles of OT and OλT , and proceeding as in points a) and b), we obtain that ∥λT ∥ ≤ |λ| ∥T ∥ which together with (5.8), proves (5.5). 3. Let’s prove that ∀T, S ∈ L(V, W ) :
∥T + S∥ ≤ ∥T ∥ + ∥S∥.
(5.9)
Let T, S ∈ L(V, W ), generic. Let u ∈ V , c1 ∈ OT and c2 ∈ OS , generic. Then ∥(T + S)(u)∥ = ∥T (u) + S(u)∥ ≤ ∥T (u)∥ + ∥S(u)∥ ≤ c1 ∥u∥ + c2 ∥u∥ = (c1 + c2 )∥u∥. Since u, c1 and c2 were chosen arbitrarily, we have proved that c1 + c2 ∈ OT +S , for every c1 ∈ OT and every c2 ∈ OS , and so ∀c1 ∈ OT , ∀c2 ∈ OS :
∥T + S∥ ≤ c1 + c2 ,
and ∥T + S∥ ≤ inf(OT ) + inf(OS ) = ∥T ∥ + ∥S∥. Since T and S were chosen arbitrarily, we have proved (5.9). ■ Remark 5.2 (Bounded operators are Lipschitz continuous) In the context of Theorem 5.1, observe that ∀T ∈ B(V, W ), ∀u ∈ V :
∥T (u)∥W ≤ ∥T ∥ ∥u∥V .
Also, by the linearity of T , ∀u, v ∈ V :
∥T (u) − T (v)∥ ≤ ∥T ∥ · ∥u − v∥,
(5.10)
so that T is Lipschitz continuous. In particular, by the Banach fixed point Theorem, Theorem 3.14, if V is a Banach space, V = W and ∥T ∥ < 1, then T is contractive and, therefore, has a unique fixed point, i.e., ∃!u0 ∈ V :
T (u0 ) = u0 .
Remark 5.3 It’s important to have in mind the following alternative ways to compute the norm of an operator T ∈ L(V, W ): ∥T ∥ = sup u̸=0
∥T (u)∥W = sup ∥T (u)∥. ∥u∥V ∥u∥=1
(5.11)
Remark 5.4 (Computing the norm of an operartor) Let V and W be linear spaces. In practice, given a linear operator T : V → W that we suspect that is bounded, we follow these steps: 1. We prove the boundedness of T by finding a constant c1 > 0 such that ∀u ∈ V :
∥T (u)∥W ≤ c1 ∥u∥V .
(5.12)
this implies that c1 ∈ OT and immediately gives the upper estimate ∥T ∥ ≤ c1 .
(5.13)
In most cases, it’s not possible to compute exactly the value of ∥T ∥ and we have to deal with what we have. 2. However, in applications it’s nice when it’s possible to improve the estimate (5.13). For this, one reviews or changes the argument used to obtain (5.12) to find 0 < c2 ≤ c1 such that ∀u ∈ V :
∥T (u)∥W ≤ c2 ∥u∥V ,
(5.14)
so that c2 ∈ OT . This immediately gives the upper estimation ∥T ∥ ≤ c2 .
(5.15)
3. In the few cases where it’s possible to compute exactly the value of ∥T ∥, we pick an appropriate u0 ∈ V such that ∥u0 ∥ = 1 and ∥T (u0 )∥ = c2 , so that, by (5.11), we would have ∥T ∥ ≥ c2 , which, together with (5.15), implies that ∥T ∥ = c2 . Next theorem states that L(V, W ) = L(V, W ) ∩ C(V, W ). If V or W is finitedimensional, then L(V, W ) = L(V, W ). Theorem 5.2 (Continuity and boundedness) Let V and W be normed spaces and T ∈ L(V, W ). Then T is bounded iff it’s continuous.
Proof. i) Let’s assume that T is bounded. a) By point (5.10), T is Lipschitz continuous so that T is continuous. b) In spite of the previous point, let’s provide a second way to show T is continuous, i.e., that ∀u0 ∈ V, ∀ϵ > 0, ∃δ > 0 : ∥u − u0 ∥ < δ ⇒ ∥T (u) − T (u0 )∥ < ϵ.
(5.16)
Since T is bounded, there is c > 0 such that ∀u ∈ V :
∥T (u)∥ ≤ c∥u∥.
Let u0 ∈ V and ϵ > 0, generic. Let’s choose 0 < δ < (5.17), for u ∈ B(u0 , δ) we have that
(5.17) ϵ . Then, by c
∥T (u) − T (u0 )∥ = ∥T (u − u0 )∥ ≤ c∥u − u0 ∥ < cδ < ϵ. Since u0 and ϵ were chosen arbitrarily, we have proved (5.16). ii) Let’s assume that T is continuous. We have to prove that T is bounded, i.e., ∃c > 0, ∀u ∈ V : ∥T (u)∥ ≤ c∥u∥. (5.18) Let’s reason by Reduction to Absurdity. So let’s assume that (5.18) is false, i.e., ∀c > 0, ∃u ∈ V : ∥T (u)∥ > c∥u∥. Therefore, we can choose a sequence (un )n∈N ⊆ V such that ∀n ∈ N :
∥T (un )∥ > n ∥un ∥.
Now, by putting for each n ∈ N, vn =
1 un , we have that ∥vn ∥ = 1/n. n∥un ∥
Then we have that lim vn = 0,
n→+∞
and ∥T (vn )∥ =
1 1 ∥T (un )∥ ≥ n∥un ∥ = 1 n∥un ∥ n∥un ∥
(5.19)
(5.20)
Points (5.19) and (5.20) contradict the continuity of T (see Theorem 2.18). ■ Remark 5.5 It’s important to keep in mind that if a linear operator, T : V −→ W , is continuous at a particular point u0 ∈ V , then it’s continuous. Remark 5.6 Let E be a normed space and M a linear subspace of E. Then, E ′ ⊆ M ′ . In particular, if ∥ · ∥β is a norm on E, dominated by the norm ∥ · ∥α , then, as it was mentioned in Theorem 4.1, Eα ⊆ Eβ , where Eα = (E, ∥ · ∥α ) and Eβ = (E, ∥ · ∥β ),
and Eβ′ ⊆ Eα′ . Even more if ∀u ∈ E :
∥u∥β ≤ ∥u∥α ,
then ∀η ∈ Eβ′ :
∥η∥Eα′ ≤ ∥η∥Eβ′ .
The composition of two linear operators (whenever it exists) is usually written as ST instead of S ◦ T . Corollary 5.1 (Composition of bounded linear operators ) Let V, W and U be normed spaces. If T ∈ L(V, W ) and S ∈ L(W, U ), then ST ∈ L(V, U ). Moreover, ∥ST ∥ ≤ ∥S∥ ∥T ∥. (5.21) The proof of this corollary is easy so we require it as an exercise at the end of the chapter. Remark 5.7 (L(V ) as a Banach algebra) As a consequence of Corollary 5.1 and Theorem 5.4,1 if V is a Banach space then (L(V ), +, ·, ◦) is a Banach algebra with unity Id, the identity mapping on V . Observe that if T ∈ L(V ), then ∀n ∈ N :
∥T n ∥ ≤ ∥T ∥n .
(5.22)
Also keep in mind that ∥Id∥ = 1. Theorem 5.3 (Extension of a bounded operator with dense domain) Let T : U ⊆ V −→ W be a bounded linear operator with dense domain. Assume that W is a Banach space. Then there exists T˜ ∈ L(V, W ) which is an extension of T and verifies ∥T˜∥ = ∥T ∥.
Proof. 1. Let u ∈ V , generic. Since U = V , there exists a sequence (un )n∈N ⊆ U such that lim un = u. (5.23) n→+∞
Since T is bounded, we have, for n, m ∈ N, that ∥T (un ) − T (um )∥ = ∥T (un − um )∥ ≤ ∥T ∥ ∥un − um ∥. Since m and n were arbitrary, the last relation, together with (5.23), imply that (T (un ))n∈N ⊆ W is a Cauchy sequence and, therefore, convergent to some element denoted T˜(u) ∈ W , as W is complete. It’s not difficult to show that T˜(u) is well defined, i.e., that it does not depend of the sequence converging to u. 1 Which
shall be stated and proved in Section 5.2.
2. By point 1), we are able to define T˜ : V −→ W by T˜(u) = lim T (un ), n→+∞
where (un )n∈N ⊆ U is any sequence converging to u. 3. Let’s prove that T˜ is linear, i.e., ∀λ ∈ R, ∀u, v ∈ V :
T˜(λu + v) = λT˜(u) + T˜(v).
(5.24)
Let λ ∈ R and u, v ∈ V , generic. Let’s take sequences (un )n∈N , (vn )n∈N ⊆ U such that lim un = u, lim vn = v. n→+∞
n→+∞
Therefore, the sequence (λun + vn )n∈N ⊆ V converges to λu + v. Then we have that T˜(λu + v) = lim T (λun + vn ) = lim [λT (un ) + T (vn )] n→+∞
n→+∞
= λ lim T (un ) + lim T (vn ) = λT˜(u) + T˜(v). n→+∞
n→+∞
Since u and v, λ were chosen arbitrarily, we have proved (5.24). 4. Let’s use the notation (5.3). Let’s prove that OT˜ ⊆ OT , because it would immediately follow that ∥T ∥ = inf(OT ) ≤ inf(OT˜ ) = ∥T˜∥. Let α ∈ OT˜ , generic. Then ∀u ∈ V :
∥T˜(u)∥ ≤ α∥u∥,
so that, using the fact that T is a restriction of T˜: ∀u ∈ U :
∥T˜(u)∥ = ∥T (u)∥ ≤ α∥u∥,
so that α ∈ OT . Since α was chosen arbitrarily, we have proved OT˜ ⊆ OT . 5. Let’s prove that ∥T ∥ ∈ OT˜ , because it would immediately follow that ∥T˜∥ = inf(OT˜ ) ≤ ∥T ∥. Our goal corresponds to proving that ∀u ∈ V :
∥T˜(u)∥ ≤ ∥T ∥ ∥u∥.
(5.25)
Let u ∈ V , generic. Let’s pick (un )n∈N ⊆ U such that un −→ u, as n −→ +∞. We have that ∀n ∈ N :
∥T (un )∥ ≤ ∥T ∥ ∥un ∥,
then, by letting n −→ +∞, we get ∥T˜(u)∥ ≤ ∥T ∥ ∥u∥, thanks to the continuity of the norm and Theorem 2.18. Since u was chosen arbitrarily, we have proved (5.25).
■
Example 5.3 (An integral operator on C([a, b])) Let K ∈ C([a, b] × [a, b]). Let’s consider the linear operator T : C([a, b]) −→ C([a, b]) given by Z
b
T [u](t) =
K(t, τ ) u(τ ) dτ. a
Remember that, by default, C([a, b]) is endowed with the norm ∥ · ∥∞ . Let’s prove that T ∈ L(C([a, b])), i.e., ∃c > 0, ∀u ∈ C([a, b]) :
∥T [u]∥∞ ≤ c ∥u∥∞ .
(5.26)
Let’s take c > ∥K∥∞ · (b − a), with ∥K∥∞ = supt,τ ∈[a,b] |K(t, τ )|. Let u ∈ C([a, b]), generic. Then, Z Z b b ∥T [u]∥∞ = sup K(t, τ ) u(τ ) dτ ≤ sup |K(t, τ )| |u(τ )| dτ t∈[a,b] a t∈[a,b] a Z b ≤ sup |K(t, τ )| · ∥u∥∞ · dτ = c · ∥u∥∞ . t,τ ∈[a,b]
a
Since u was chosen arbitrarily, we have proved (5.26) and ∥T ∥ ≤ ∥K∥∞ · (b − a). Example 5.4 (Derivation as a non-continuous operator) Let’s consider the linear operator D : (C1 ([0, 1]), ∥ · ∥∞ ) → (C([0, 1]), ∥ · ∥∞ ), given by D[u] = u′ . For each n ∈ N, let’s consider the function un : [0, 1] → R, where un (t) = tn . It’s clear that (un )n∈N ⊆ C1 ([0, 1]) and ∀n ∈ N :
∥un ∥∞ = 1 ∧ ∥D[un ]∥∞ = n.
Therefore, by having in consideration (5.11), the last relation shows that the operator D is not bounded. Example 5.5 (Derivation as a continuous operator) Let’s consider the linear operator Λ : H1 (R) → L2 (R), given by Λ[u] = u′ . Since for all u ∈ H1 (R), ′
Z
∥Λ[u]∥2 = ∥u ∥2 =
1/2 |u (t)| dt ′
2
R
Z ≤
2
′
2
|u(t)| + |u (t)|
1/2 dt = ∥u∥1,2 ,
R
it follows that Λ ∈ L H1 (R), L2 (R) , and ∥Λ∥ ≤ 1.
Example 5.6 (Norm of the definite integral) On C([a, b]) we consider the linear functional G : C([a, b]) −→ R, given by Z b G(u) = u(t)dt. a
For u ∈ C([a, b]), we have that Z Z Z b b b |u(t)|dt ≤ sup |u(t)| · dt = (b − a) ∥u∥∞ . u(t)dt ≤ |G(u)| = a t∈[a,b] a a Since u was chosen arbitrarily, we have proved that G ∈ (C([a, b]))′ and ∥G∥ ≤ b − a.
(5.27)
Now, let’s take u0 : [a, b] −→ R, given by u0 (t) = 1. It’s clear that ∥u0 ∥∞ = 1 and |G(u0 )| = b − a so that ∥G∥ ≥ b − a, which, together with (5.27), proves that ∥G∥ = b − a. Example 5.7 (The definite integral on Lp ([a, b])) Let p ≥ 1. On Lp ([a, b]) we consider the linear functional F : Lp ([a, b]) −→ R, given by Z b F (u) = u(t)dt. a p
For a generic u ∈ L ([a, b]), we have, by H¨older’s inequality, that Z Z b !1/p′ b ′ |F (u)| = u(t)dt ≤ ∥u∥p dt = (b − a)1/p ∥u∥p , a a ′
so that ∥F ∥ ≤ (b − a)1/p .
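The bound of Example 5.6 can also be explored numerically. In the following sketch the interval [0, 1] and the test functions are assumptions of mine; note that |G(u)| ≤ (b − a) ∥u∥∞ in each case, with equality attained for u = 1f.
(% i1) G(w):= integrate(w(t), t, 0, 1);          /* the definite integral on [0,1] */
(% i2) u1(t):= 1; u2(t):= sin(10*t); u3(t):= t^2 - 1/2;
(% i3) [G(u1), float(G(u2)), G(u3)];             /* G(u1) = 1 attains (b-a)*||u1||_inf */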
5.2. Some properties of L(V, W ) Let’s show that the space of operators L(V, W ) is a Banach space provided the completeness of W . Theorem 5.4 (Completeness of L(V, W )) Let V be a normed space and W a Banach space. Then L(V, W ) is a Banach space. Proof. We have to prove that every Cauchy sequence of L(V, W ) is convergent. Let’s use the strategy described in Remark 3.12. 1. Let (Tn )n∈N ⊆ L(V, W ) be a generic Cauchy sequence. Then for every ϵ > 0, there is N ∈ N such that m, n > N implies that ∥Tn − Tm ∥ < ϵ. Therefore, for every u ∈ V , ∥Tn (u) − Tm (u)∥ = ∥(Tn − Tm )(u)∥ ≤ ∥Tn − Tm ∥ ∥u∥ < ϵ∥u∥,
(5.28)
so that for every u ∈ V , the sequence (Tn (u))n∈N ⊆ W is Cauchy. Since W is complete, for every u ∈ V , there is a unique element T (u) ∈ W such that lim Tn (u) = T (u).
n→+∞
(5.29)
2. By (5.29), we have found an operator T : V −→ W . Let’s prove that T is linear. Let u, v ∈ V and λ ∈ R, generic. We have, by the linearity of Tn , that T (λu + v) = lim Tn (λu + v) = lim [λTn (u) + Tn (v)] n→+∞
n→+∞
= λ lim Tn (u) + lim Tn (v) = T (u) + T (v). n→+∞
n→+∞
Since u, v and λ were chosen arbitrarily, we have proved that T is linear. 3. Let’s retake point 1. By (5.29), we can let m −→ +∞ in (5.28) to obtain, for n > N and all u ∈ V , ∥(Tn − T )(u)∥ ≤ ϵ∥u∥,
(5.30)
which implies that Tn − T ∈ L(V, W ), for n > N . Since L is a linear space and Tn ∈ L(V, W ), it follows that T ∈ L(V, W ). 4. By (5.30), we have, for n > N , that ∥Tn − T ∥ < ϵ. Since ϵ was chosen arbitrarily, this proves that Tn −→ T , as n −→ +∞. ■ As an immediate consequence we have: Corollary 5.2 (Completeness of the dual space) Let V be a normed space. Then V ′ is a Banach space. Let V, W and U be normed spaces. In Section 1.7 we saw that A ∈ L(V, W ) bijective ⇒ A−1 ∈ L(W, V ). Moreover, if A ∈ L(V, W ) and B ∈ L(W, U ) are bijective then (BA)−1 ∈ L(U, V ) and (BA)−1 = A−1 B −1 . Since IdV ∈ L(V ) and IdW ∈ L(W ) with ∥IdV ∥ = ∥IdW ∥ = 1, Corollary 5.1 implies that for T ∈ B(V, W ): T −1 ∈ L(W, V )
⇒
∥T −1 ∥ =
1 . ∥T ∥
However, in principle we can not ensure that T −1 ∈ L(W, V ). This is true whenever V and W are Banach spaces, see Corollary 5.13. For bounded operators working on normed spaces (not necessarily complete), we have the following result. Theorem 5.5 (Sufficient condition for existence of a bounded inverse) Let V, W be normed spaces and T ∈ L(V, W ) onto. If ∃b > 0, ∀u ∈ V :
∥T (u)∥ ≥ b∥u∥,
then T −1 does exist and belongs to L(W, V ).
(5.31)
5.2. Some properties of L(V, W )
183
Proof. Let’s take u ∈ Ker(T ), generic. Then, by (5.31), 0 = ∥T (u)∥ ≥ b∥u∥ ≥ 0, so that u = 0. Since u was arbitrary, we have shown that Ker(T ) = {0} so that, by Theorem 1.14, T is injective. By hypothesis, T is surjective so that, by Theorem 1.15, T −1 ∈ L(W, V ). Then (5.31) can be rewritten as ∃b > 0, ∀w ∈ W :
1 ∥w∥ ≥ ∥T −1 (w)∥, b
so that T −1 ∈ L(W, V ). ■ Next, we prove that a bounded operator which is close enough to a bijective bounded operator is also bijective. Theorem 5.6 (Invertibility region) Let V and W be Banach spaces and T0 ∈ L(V, W ) bijective. Then ∀D ∈ L(V, W ) :
∥D∥ < ∥T0 ∥ ⇒ T = T0 + D has a bounded inverse.
Proof. Let D ∈ L(V, W ) with ∥D∥ < ∥T0 ∥, generic. We have to prove that T = T0 + D is bijective which, by Remark 1.6, is equivalent to prove that ∀w ∈ W, ∃!u0 ∈ V :
T (u0 ) = w.
(5.32)
Let w ∈ W , generic. The following equations are equivalent T (u) = w,
u ∈ V,
T0 (u) + D(u) = w,
u ∈ V,
u + A(u) = v,
u ∈ V,
where v = T0−1 (w), and A = T0−1 D ∈ L(V ). By (5.21), we have that ∥A∥ ≤ ∥T0−1 ∥ ∥D∥ < 1.
(5.33)
Let’s consider the mapping F : V → V , given by F (u) = v − A(u). Thanks to (5.33), we have that F is contractive so that, by Banach fixed point theorem, there is a unique u0 ∈ V such that F (u0 ) = u0 , i.e., u0 + A(u0 ) = v. Since w was chosen arbitrarily, we have proved (5.32). We conclude by the arbitrariness of D. ■ Remark 5.8 For V and W Banach spaces we denote L× (V, W ) = {T ∈ L(V, W ) / T is bijective}. Then Theorem 5.6 implies that L× (V, W ) is an open subset of L(V, W ).
184
Chapter 5. Fundamental theorems of Functional Analysis
Corollary 5.3 (Inverse by a geometric series) Let V be a Banach space and A ∈ L(V ) such that ∥A∥ < 1. Then Id − A is bijective and (Id − A)−1 =
+∞ X
Ak .
(5.34)
k=0
Proof. By taking T = Id, the conditions of Theorem 5.6 are satisfied so that (Id − A)−1 exists and belongs to L(V ). By (5.22), we have that lim An = 0.
(5.35)
n→+∞
For all n ∈ N, we have that Id − A
n+1
= (Id − A) ·
n X
Ak ,
k=0
so, by (5.35) and letting n −→ +∞, we get Id = (Id − A) ·
+∞ X
Ak ,
k=0
which is equivalent to (5.34).
■
Remark 5.9 If in Corollary 5.3, we put B = Id − A, then B −1 =
+∞ X
(Id − B)k .
k=0
Next, we state that a bounded linear operator is completely defined by a basis of its domain. Theorem 5.7 (Bounded operator and Schauder basis) Let V and W be normed spaces and T ∈ L(V, W ). Assume that S = {un / n ∈ N} is a Schauder basis of V . Then, ⟨{T (un ) /n ∈ N}⟩ = Im(T ). Proof. Let w ∈ Im(T ), generic. We have to prove that w can be written as a linear combination (perhaps infinite) of elements of {T (un ) / n ∈ N}. Let’s pick v ∈ V such that T (v) = w. Since S is a Schauder basis of V , there is a unique +∞ X sequence (αn )n∈N ⊆ R such that v = αn un . Therefore, n=1
w = T (v) =
+∞ X
αn T (un ).
(5.36)
n=1
Since w was chosen arbitrarily, we are done.
■
Remark 5.10 (Schauder basis determines a bounded linear operator) Point (5.36) in the proof of Theorem 5.7 shows that the operator T is completely determined by the images of the Schauder basis S. This is completely analogous to what happens in finite dimensional linear spaces.
5.3. Continuous embeddings
185
5.3. Continuous embeddings Let (V, ∥ · ∥α ) and (U, ∥ · ∥β ) be normed space with U ⊆ V . Let’s suppose that ∥ · ∥α dominates ∥ · ∥β , i.e., ∃c > 0, ∀u ∈ U :
∥u∥β ≤ c∥u∥α .
By Theorem 4.1, we have that (U, ∥·∥α ) ⊆ (U, ∥·∥β ), and the embedding operator (or inclusion operator ) Idα,β : (U, ∥ · ∥α ) −→ (U, ∥ · ∥β ), given by Idα,β (u) = u, belongs to L ((U, ∥ · ∥α ), (U, ∥ · ∥β )). This situation is usually referred to by saying that (U, ∥ · ∥α ) is continuously embedded in (U, ∥ · ∥β ). In this setting it holds Tβ ⊆ Tα , where Tβ and Tα are the topologies induced by ∥ · ∥β and ∥ · ∥α , respectively. In Section 1.7, an algebraic isomorphism was defined as a bijective linear operator. Now, we adapt this idea for normed spaces. Definition 5.2 (Isomorphism of normed spaces) Let V and W be normed spaces. We say that T ∈ L(V, W ) is an isomorphism iff it’s bijective and ∀u ∈ V :
∥T (u)∥W = ∥u∥V .
In this case, we say that V and W are isomorphic. In particular, we have that ∥T ∥ = 1 and T is an isometry. Definition 5.3 (Generalized embedding) Let V and W be normed spaces. We say that V is embeddable in W iff there exists an isomorphism T between V and a subspace of W . In this case, we say that T is an embedding of V into W. Example 5.8 (Embedding of Sobolev spaces) Let p ≥ 1. From Section 4.4.5, we have that ∀u ∈ W1,p ([a, b]) :
∥u∥p ≤ ∥u∥p + ∥u′ ∥p = ∥u∥1,p .
Therefore, the operator I : W1,p ([a, b]) −→ Lp ([a, b]), given by I(u) = u, is continuous. Therefore, the embedding W1,p ([a, b]) ⊆ Lp ([a, b]) is continuous. Example 5.9 (Embedding of Lebesgue spaces) Let’s assume that 1 ≤ p < q < +∞. In Section 4.4.3, we mentioned that Lq ([a, b]) ⊆ Lp ([a, b]). Let’s prove that this is actually a continuous embedding, i.e., that the inclusion operator Idq,p : Lq ([a, b]) −→ Lp ([a, b]), given by Idq,p (u) = u, belongs to L (Lq ([a, b]), Lp ([a, b])). So we have to prove that ∃cq,p > 0, ∀u ∈ Lq (a, b) :
∥u∥p ≤ cq,p ∥u∥q .
(5.37)
Let’s take cq,p = (b − a)(q−p)/(qp) .
(5.38)
186
Chapter 5. Fundamental theorems of Functional Analysis
Let u ∈ Lq (a, b), generic. Since 1 ≤ p < q, we can take r > 0 such that 1 1 1 = + p q r
1 q−p = . r qp
or
Let’s recall that for s > 1, H¨ older inequality can be written as ′
∀h ∈ Ls (a, b), ∀g ∈ Ls (a, b) : So by taking
q , p
s=
∥hg∥1 ≤ ∥h∥s · ∥g∥s′ .
h = |u|p ,
we have
(5.39)
g = 1f ,
q−p 1 = , ′ s q
and, by (5.39), Z ∥Idq,p (u)∥p = ∥u∥p =
!1/p
b p
!p/q
b
Z
(|u(t)|p )
=
≤ { ∥ |u|p ∥s · ∥1f ∥s′ }
|u(t)| dt
1/p
a
q/p
Z ·
dt
a
!(q−p)/q 1/p
b
dt
a
h i1/p = ∥u∥pq · (b − a)p/r = cq,p ∥u∥q . Since u was chosen arbitrarily, we have proved (5.37). As a result, we have that ∥Idq,p ∥ ≤ (b − a)(q−p)/(qp) . Example 5.10 (An integral operator on Lp ([a, b])) Let p ≥ 1 and K ∈ C([a, b]× [a, b]). Let’s consider the linear operator T : Lp ([a, b]) −→ Lp ([a, b]) given by Z b T [u](t) = K(t, τ ) u(τ ) dτ. a p
Let’s prove that T ∈ L(L ([a, b])), i.e., ∃c > 0, ∀u ∈ Lp (a, b) :
∥T [u]∥p ≤ c ∥u∥p .
Let’s take c > ∥K∥∞ · (b − a), with ∥K∥∞ =
sup |K(t, τ )|. Let u ∈ Lp (a, b), t,τ ∈[a,b]
generic. Then we have that Z p b p ∥T [u]∥p = K(t, τ ) u(τ ) dτ dt a a !p Z b Z b ≤ |K(t, τ )| · |u(τ )| dτ dt Z
b
a
≤
∥K∥p∞
a
Z
b
Z
!p
b
|u(τ )| dτ
· a
(5.40)
a
= ∥K∥p∞ · ∥u∥p1 · (b − a),
dt
5.3. Continuous embeddings
187
whence, by using (5.37) and (5.38), cp,1 = (b − a)(p−1)/p , it follows that ∥T [u]∥p ≤ ∥K∥∞ · (b − a)1/p · ∥u∥1 ≤ ∥K∥∞ · (b − a)1/p · cp,1 · ∥u∥p = ∥K∥∞ · (b − a) · ∥u∥p . Since u was chosen arbitrarily, we have proved (5.40) and ∥T ∥ ≤ ∥K∥∞ · (b − a). Remark 5.11 In Corollary 5.2, we stated that the dual space V ′ of a normed space V , is a Banach space. In particular, this implies that a non-complete normed space e 1 ([a, b]) can not be isomorphic to the dual of any normed space. For example, L can not be isomorphic to a dual space. Example 5.11 (l∞ (R) ∼ = (l1 (R))′ ) Let’s prove that ′ l1 (R) ∼ = l∞ (R).
(5.41)
1. Let’s consider the canonical basis of l1 (R), C = {en / n ∈ N}, where, for each n ∈ N: e(m) = (δmn ) = (0, 0, ..., 0, 1, 0, ...). n Let’s recall that any x = (xn )n∈N ∈ l1 (R) can be written as x=
+∞ X
xn en ,
(5.42)
n=1
so that C is a Schauder basis of l1 (R). 2. Given F ∈ (l1 (R))′ , we have ∀n ∈ N :
|F (en )| ≤ ∥F ∥ ∥en ∥1 = ∥F ∥,
and sup |F (en )| ≤ ∥F ∥, so that hF = (F (en ))n∈N ∈ l∞ (R). Let’s also n∈N
observe that, by Remark 5.10 and (5.42), we have that F (x) =
+∞ X
xn F (en ).
n=1
3. What we have shown let us define a mapping J : (l1 (R))′ −→ l∞ (R) by J (F ) = hF = (F (en ))n∈N .
(5.43)
Let’s prove that J is linear, i.e., that ∀λ ∈ R, ∀F, G ∈ (l1 (R))′ :
J (λF + G) = λ J (F ) + J (G).
(5.44)
Let λ ∈ R and F, G ∈ (l1 (R))′ , generic. Then, J (λF + G) = hλF +G = ((λF + G)(en ))n∈N = (λF (en ) + G(en ))n∈N = λ(F (en ))n∈N + (G(en ))n∈N = λhF + hG = λ J (F ) + J (G). Since λ, F and G were chosen arbitrarily, we have proved (5.44).
188
Chapter 5. Fundamental theorems of Functional Analysis
4. Let’s prove that J is injective, i.e., ∀F, G ∈ (l1 (R))′ :
J (F ) = J (G) ⇒ F = G.
(5.45)
Let F, G ∈ (l1 )∗ such that J (F ) = J (G), generic. Then hF = hG so that ∀n ∈ N : F (en ) = G(en ), and ∀n ∈ N :
(F − G)(en ) = 0,
which implies that F = G. 5. Let’s prove that J is onto, i.e., ∀y ∈ l∞ (R), ∃G ∈ (l1 (R))′ :
J (G) = y.
(5.46)
Let y = (yn )n∈N ∈ l∞ (R), generic. Let’s define G : l1 (R) −→ R by G(x) = G ((xn )n∈N ) =
+∞ X
xm ym .
m=1
Since ∀n ∈ N :
G(en ) =
+∞ X
δmn yn = yn ,
m=1
it follows that J (G) = (G(en ))n∈N = (yn )n∈N = y.
(5.47)
Therefore, to prove that J is onto it remains to show that G ∈ (l1 (R))′ , i.e., that there is some c > 0 such that ∀x ∈ l1 (R) :
|G(x)| ≤ c ∥x∥1 .
(5.48)
Let’s take c = ∥y∥∞ . Let x = (xn )n∈N ∈ l1 (R), generic. Then +∞ +∞ X X |G(x)| = xn yn ≤ |xn yn | n=1
≤ sup |yn | · n∈N
n=1
+∞ X
|xn | = ∥y∥∞ · ∥x∥1 .
n=1
Since x was chosen arbitrarily, we have proved (5.48). 6. Until now, we have proved that J is an algebraic isomorphism. To prove (5.41), we still have to show that J is an isomorphism, i.e., ∀F ∈ (l1 (R))′ :
∥J (F )∥ = ∥F ∥.
(5.49)
Let F ∈ (l1 (R))′ , generic. We have that ∥J (F )∥ = ∥hF ∥∞ = sup |F (en )|. n∈N
(5.50)
5.4. Riesz-Fr´echet representation theorem
189
In other hand, for a generic x = (xn )n∈N ∈ l1 (R), we have that |F (x)| = F ≤
+∞ X
+∞ X n=1
! +∞ X xn F (en ) xn en = n=1
|xn F (en )| ≤ sup |F (en )| · n∈N
n=1
+∞ X
|xn |
n=1
= ∥hF ∥∞ · ∥x∥1 ,
(5.51)
whence, ∥hF ∥∞ ≤ ∥F ∥.
(5.52)
By (5.50) and (5.52), it follows that ∥J (F )∥ ≤ ∥F ∥.
(5.53)
Now, by assuming x ̸= 0, we have from (5.51) that |F (x)| ≤ ∥hF ∥∞ = ∥J (F )∥, ∥x∥1 so that ∥F ∥ = sup x̸=0
|F (x)| ≤ ∥J (F )∥, ∥x∥1
which, together with (5.53), show that ∥J (F )∥ = ∥F ∥. Since F was chosen arbitrarily, we have proved (5.49). ′ ′ ∼ p′ Example 5.12 (lp (R) ∼ l (R). = (lp (R))′ ) Let p > 1. We have that (lp (R)) = The proof of this fact is required as an exercise at the end of the chapter.
5.4. Riesz-Fr´ echet representation theorem In practical terms, it’s very important to know the general form of bounded linear functionals. When the domain is a Hilbert space, this representation is quite simple. Let’s first state a simple and useful result. Lemma 5.1 (Equality by using the inner-product) Let V be an inner-product space and u1 , u2 ∈ V . If ∀w ∈ V :
(w, u1 ) = (w, u2 ),
then u1 = u2 . In particular, if ∀w ∈ V : then u = 0.
(w, u) = 0,
(5.54)
190
Chapter 5. Fundamental theorems of Functional Analysis
Proof. From (5.54), we have that ∀w ∈ V :
(u1 − u2 , w) = 0.
So, for w = u1 − u2 , we get ∥u1 − u2 ∥2 = 0 so that u1 − u2 = 0 and u1 = u2 . ■ Theorem 5.8 (Riesz-Fr´ echet representation theorem) Let H be a Hilbert space and φ ∈ H ′ . Then there is a unique v ∈ H such that ∀u ∈ H :
φ(u) = (u, v),
(5.55)
and ∥φ∥H ′ = ∥v∥H . Idea of the proof. Let’s assume that Theorem 5.8 holds. In this case, ∀u ∈ Ker(φ) :
0 = φ(u) = (u, v),
so that the vector v we are looking for, should belong to (Ker(φ))⊥ . We also need a computation creating a link between φ(u) and (u, v). Proof. If φ = 0, then we take v = 0. Let’s assume that φ ̸= 0. Then there is some z0 ∈ [Ker(φ)]⊥ \ {0}. (5.56) 1. Let’s prove that ∃v ∈ H, ∀u ∈ H :
φ(u) = (u, v).
(5.57)
a) Let’s find the vector v. Let’s take some u0 ∈ H and write w = φ(u0 )z0 − φ(z0 )u0 , so that φ(w) = 0 and w ∈ Ker(φ).
(5.58)
By (5.56) and (5.58), we have that 0 = (w, z0 ) = (φ(u0 )z0 − φ(z0 )u0 , z0 ) = φ(u0 ) ∥z0 ∥2 − φ(z0 )(u0 , z0 ), so that φ(u0 ) = (u0 , v), where v =
φ(z0 ) z0 . ∥z0 ∥2
b) Now let’s prove that ∀u ∈ H :
φ(u) = (u, v).
(5.59)
Let u ∈ H, generic. Let’s use the trick of point a). Let’s write wu = φ(u)z0 − φ(z0 )u, so that φ(wu ) = 0 and wu ∈ Ker(φ).
(5.60)
By (5.56) and (5.60), we have that 0 = (wu , z0 ) = (φ(u)z0 − φ(z0 )u, z0 ) = φ(u) ∥z0 ∥2 − φ(z0 )(u, z0 ), so that φ(u) = (u, v). Since u was chosen arbitrarily, we have proved (5.59) as well as (5.57).
5.4. Riesz-Fr´echet representation theorem
191
2. Let’s assume that there is v˜ such that ∀u ∈ H :
φ(u) = (u, v˜).
∀u ∈ H :
(u, v˜ − v) = 0,
Then so, by Lemma 5.1, v˜ = v. 3. By CBS inequality we have, for any u ∈ H that |φ(u)| = |(u, v)| ≤ ∥v∥ ∥u∥, whence ∥φ∥ ≤ ∥v∥. 1 Now, by taking u0 = v, we have that ∥u0 ∥ = 1 and ∥v∥ 1 |φ(u0 )| = v, v = ∥v∥, ∥v∥
(5.61)
so that ∥φ∥ = sup |φ(u)| ≥ ∥v∥, which, together with (5.61), implies that ∥u∥=1
∥φ∥ = ∥v∥. ■ Remark 5.12 (H ∼ = H ′ ) Let H be a Hilbert space and w ∈ H. The mapping η : H −→ R, given by η(u) = (u, w), belongs to H ′ . In fact, by CBS inequality, for any u ∈ H: |η(u)| = |(u, w)| ≤ ∥w∥ ∥u∥. Therefore, the operator R : H ′ −→ H determined by the Riesz-Fr´echet Theorem, Theorem 5.8, given by R(φ) = v, where (5.62) holds, is an isomorphism. Therefore, H and H ′ are isomorphic: H∼ = H ′.
(5.62)
Almost always is applied the identification (5.62). Example 5.13 Let’s recall that the canonical basis of Rn is C = {e1 , e2 , ..., en }, where 0 .. . ek = (k-th row) 1 .. . 0 The canonical basis of (Rn )′ is C ′ = {e′1 , e′2 , ..., e′n } = {dx1 , dx2 , ..., dxn }, where ⟨e′i , ej ⟩ = δij ,
i, j = 1, ..., n.
As a consequence of the Riesz-Fr´echet representation theorem we have that ∀i ∈ In , ∀u ∈ Rn :
⟨e′i , u⟩ = eti u.
(5.63)
192
Chapter 5. Fundamental theorems of Functional Analysis
Example 5.14 (Bounded linear functionals on L2 (I)) Let I ⊆ R measurable. ′ Theorem 5.8 and Remark 5.12 show that L2 (I) ∼ = L2 (I), and that any F ∈ ′ 2 L (I) is given by Z F (u) =
u(t)f (t)dt, I
for some f ∈ L2 (I). Example 5.15 (Bounded linear functionals on H1 (I)) Let I ⊆ R, measurable. ′ Theorem 5.8 and Remark 5.12 show that H1 (I) ∼ = H1 (I), and that any F ∈ ′ H1 (I) is given by Z F (u) = [u′ (t)f ′ (t) + u(t)f (t)] dt, I
for some f ∈ H1 (I).
5.5. Hahn-Banach theorem In general terms, in an extension problem, one considers a mathematical object (e.g. a function) defined on a set Z, which has some “nice” property, and one wants to extend such an object to X ⊇ Z in such a way that the extension still has the mentioned property. The Hahn-Banach theorem provides a solution to the problem of extending a linear functional defined on a linear subspace, which is controlled by a Minkowski functional (e.g. a norm), to the linear space, keeping the control provided by the Minkowski functional. This theorem is as general as useful. Before taking on it, let’s write it for a Hilbert space, where it is a simple consequence of the Riesz-Fr´echet representation theorem. Corollary 5.4 (Extension of bounded linear functionals on a Hilbert) Let H be a Hilbert space, Z a closed linear subspace of H and η ∈ Z ′ . Then there is η˜ ∈ H ′ , an extension of η, such that ∥η∥Z ′ = ∥˜ η ∥H ′ . The proof is easy and is required as an exercise at the end of the chapter. A functional p : V −→ R is said to be a Minkowski functional on the linear space V iff it’s 1. Subadditive: ∀u, v ∈ V : p(u + v) ≤ p(u) + p(v); 2. Positive-homogeneous: ∀α > 0, ∀u ∈ V : p(αu) = α p(u). Theorem 5.9 (Hahn-Banach theorem) Let V be a linear space, U a linear subspace of V , and p a Minkowski functional on V . Assume that η ∈ L(U, R) is such that ∀u ∈ U : η(u) ≤ p(u).
5.5. Hahn-Banach theorem
193
Then there is η˜ ∈ L(V, R), an extension of η, such that ∀u ∈ V :
η˜(u) ≤ p(u).
Proof. To show the existence of η˜, we take several steps to build the conditions of Zorn’s lemma. 1. Let’s define E , the set of all linear extensions of η which are controlled by the Minkowski functional. Then g ∈ E iff g : Dom(g) ⊆ V −→ R is linear and ∀u ∈ Dom(g) : g(u) ≤ p(u). Since η ∈ E , E ̸= ∅. 2. On E we define an order relation by gh
⇐⇒
h is an extension of g,
i.e., Dom(g) ⊆ Dom(h) and ∀u ∈ Dom(g) :
g(u) = h(u).
3. Now, let’s prove that E is inductive, i.e., that every chain contained in E is bounded from above. So, let Q ⊆ E be totally ordered, generic. We have to prove that ∃˜ g ∈ E , ∀g ∈ Q : g g˜. (5.64) a) Let’s consider the set D=
[
Dom(g).
(5.65)
g∈Q
It’s clear that, for every g ∈ Q, Dom(g) ⊆ D. Let’s prove that D is a linear subspace of V , i.e., that ∀λ ∈ R, ∀u, v ∈ D :
λu + v ∈ D.
(5.66)
Let λ ∈ R and u, v ∈ D, generic. By (5.65), there are g1 , g2 ∈ Q such that u ∈ Dom(g1 ) and v ∈ Dom(g2 ). Since Q is a chain, either g1 g2 or g2 g1 . Without loss of generality, we can assume that g1 g2 , so that Dom(g1 ) ⊆ Dom(g2 ) and u, v ∈ Dom(g2 ). Since Dom(g2 ) is a linear space we have that λu + v ∈ Dom(g2 ) ⊆ D. Since λ, u and v were chosen arbitrarily, we have proved (5.66). b) Let’s define g˜ : D −→ R by g˜(u) = g(u), where g is such that Dom(g) ⊇ u. The fact that Q is a chain helps to prove that g˜ is well defined. Let’s prove that g˜ is linear, i.e., ∀λ ∈ R, ∀u, v ∈ D :
g˜(λu + v) = λ g˜(u) + g˜(v).
(5.67)
Let λ ∈ R and u, v ∈ D, generic. By (5.65), there are g1 , g2 ∈ Q such that u ∈ Dom(g1 ) and v ∈ Dom(g2 ). Since Q is a chain, either g1 g2 or g2 g1 . Without loss of generality, we can assume that g1 g2 , so that Dom(g1 ) ⊆ Dom(g2 ) and u, v ∈ Dom(g2 ). Since g2 is linear we have that g˜(λu + v) = g2 (λu + v) = λ g2 (u) + g2 (v) = λ˜ g (u) + g˜(v). Since λ, u and v were chosen arbitrarily, we have proved (5.67).
194
Chapter 5. Fundamental theorems of Functional Analysis
From points a) and b), we have (5.64). Since Q was chosen arbitrarily, we have proved that E is inductive. 4. By points 1), 2) and 3), we can apply Zorn’s lemma so that E has a maximal element η˜. Since η˜ belong to E , it’s an extension of η such that ∀u ∈ Dom(˜ η) :
η˜(u) ≤ p(u).
(5.68)
5. To finish the proof, we just need to show that V = Dom(˜ η ). Let’s reason by Reduction to Absurdity. So let’s assume that Dom(˜ η ) ⊊ V , so that there is some u0 ∈ V \ Dom(˜ η ). It’s clear that u0 ̸= 0. We define D1 = ⟨D ∪ {u0 }⟩, so that ∀u ∈ D1 , ∃!(v, t) ∈ D × R : u = v + t u0 . Let’s define h : D1 −→ R by h(u) = η˜(v) + tα, where α ∈ R. It’s clear that h is a proper linear extension of η˜. Now we have to choose α in such a way that h∈E
(5.69)
because this would contradict the maximality of η˜, allowing us to conclude. 6. Let’s observe that (5.69) is equivalent to ∀t ∈ R, ∀v ∈ Dom(˜ η) :
η˜(v) + tα ≤ p(v + tu0 ).
(5.70)
For t = 0, (5.70) is immediate. For t > 0, (5.70) is equivalent to ∀v ∈ Dom(˜ η) :
η˜(v) + α ≤ p(v + u0 ),
∀v ∈ Dom(˜ η) :
α ≤ p(v + u0 ) − η˜(v).
i.e., (5.71)
For t < 0, (5.70) is equivalent to ∀v ∈ Dom(˜ η) :
η˜(v) − α ≤ p(v − u0 ),
∀v ∈ Dom(˜ η) :
η˜(v) − p(v − u0 ) ≤ α.
i.e., (5.72)
So, to get (5.71) and (5.72) it’s enough to find α such that [˜ η (v) − p(v − u0 )] ≤ α ≤
sup v∈Dom(˜ η)
inf w∈Dom(˜ η)
[p(w + u0 ) − η˜(w)] . (5.73)
Let v, w ∈ Dom(˜ η ), generic. We have that η˜(v)+˜ η (w) = η˜(v+w) ≤ p(v+w) = p((v−u0 )+(w+u0 )) ≤ p(v−uo )+p(w+u0 ), whence, by the arbitrariness of v and w: ∀v ∈ Dom(˜ η ), ∀w ∈ Dom(˜ η) :
η˜(v) − p(v − u0 ) ≤ p(w + u0 ) − η˜(w)
and sup v∈Dom(˜ η)
[˜ η (v) − p(v − u0 )] ≤
inf w∈Dom(˜ η)
so that we can choose α verifying (5.73).
[p(w + u0 ) − η˜(w)]
5.5. Hahn-Banach theorem
195
■ A consequence of Hahn-Banach theorem is that Corollary 5.4 also holds for normed spaces and the closedness of the subspace is no longer required: Corollary 5.5 (Hahn-Banach for normed spaces) Let X be a normed space, Z a linear subspace of X and η ∈ Z ′ . Then there is η˜ ∈ X ′ , an extension of η, such that ∥η∥Z ′ = ∥˜ η ∥X ′ . Proof. Use p : X −→ R defined by p(u) = ∥η∥Z ′ ∥u∥. ■ Another consequence of Hahn-Banach theorem is that X ′ has enough elements to distinguish points of X: Corollary 5.6 (Existence of linear functionals) Let X be a normed space and u0 ∈ X \ {0}. Then there is η ∈ X ′ such that ∥η∥ = ∥u0 ∥,
η(u0 ) = ∥u0 ∥2 .
(5.74)
The proof of this result is required as an exercise at the end of the chapter. Remark 5.13 (Existence of linear functionals) In the context of Corollary 5.6, observe that the linear functional 1 ν= η, ∥u0 ∥ verifies ∥ν∥ = 1 and ν(u0 ) = ∥u0 ∥. Remark 5.14 (Duality product) In a normed space E, given u ∈ E and η ∈ E ′ , it’s usual to write η(u) = ⟨η, u⟩. Here ⟨·, ·⟩ : E ′ × E → R is referred to as the duality product in E ′ , E. Remark 5.15 (Duality map) In the context of Corollary 5.6, the functional η is not necessarily unique. However, if E ′ is strictly convex, then uniqueness holds. Therefore, in this case, the functional η can be seen as a version of u0 but living in the dual space: we can write (5.74) as ∥η∥ = ∥u0 ∥,
⟨η, u0 ⟩ = ∥u0 ∥2 ,
so that the duality map, F : E −→ E ′ , given by F (u0 ) = η, is an embedding. Any Hilbert space is strictly convex; the same happens with the Lebesgue spaces Lp (I), p > 1. Given η ∈ V ′ , we saw that ∥η∥ is computed by means of |η(u)| . u̸=0 ∥u∥
∥η∥ = sup
Let’s present now a dual version of formula (5.75).
(5.75)
196
Chapter 5. Fundamental theorems of Functional Analysis
Corollary 5.7 (Computation of the norm with help of the dual) Let V be a normed space. Then, ∀u ∈ V :
∥u∥ = max η̸=0
|η(u)| = max |η(u)|. ∥η∥ ∥η∥≤1
(5.76)
Proof. Let u ∈ V , generic. We know that ∀η ∈ V ′ :
|η(u)| ≤ ∥η∥ ∥u∥,
so that ∀η ∈ V ′ \ {0} :
∥u∥ ≥
whence ∥u∥ ≥ sup η̸=0
|η(u)| , ∥η∥
|η(u)| . ∥η∥
(5.77)
Now, by Corollary 5.6, we pick η0 ∈ V ′ such that ∥u∥ = ∥η0 ∥ and η0 (u) = ∥u∥2 , so that |η0 (u)| = ∥u∥, ∥η0 ∥ which, together with (5.77) and the arbitrariness of u, shows the first equality of (5.76). The second equality in (5.76) is easy to show so we omit it. ■ Remark 5.16 (Equality by using the duality product) In Lemma 5.1, we proved that if U is an inner-product space and v ∈ U , then [∀w ∈ U :
(w, v) = 0]
⇒
v = 0.
As a consequence of Corollary 5.7, we have that if V is a normed space and u ∈ V , then [∀η ∈ V ′ : ⟨η, u⟩ = 0] ⇒ u = 0.
5.6. The dual of C([a, b]). The Riemann-Stieljes integral. Let a < b. Let’s denote by M the set of finite meshings of the interval [a, b], i.e., M = {p = (t0 , t1 , ..., tn ) / a = t0 < t1 < ... < tn−1 < tn = b}. Observe that if p ∈ M, then there is n ∈ N such that p ∈ [a, b]n . We say that x : [a, b] −→ R is a function of bounded variation on [a, b] iff ∃c > 0, ∀p = (t0 , t1 , ..., tn ) ∈ M :
n X
|x(tk ) − x(tk−1 )| < c.
k=1
In this case, we call total variation of x to the value Var(x) = sup p∈M
n X k=1
|x(tk ) − x(tk−1 )|.
5.6. The dual of C([a, b]). The Riemann-Stieljes integral.
197
We denote by BV([a, b]) the set of functions of bounded variation on [a, b]. In Exercise 3.5, the student had to prove that 1. BV([a, b]) is a linear space; 2. Var(·) : BV([a, b]) −→ R is a seminorm; 3. C1 ([a, b]) ⊆ BV([a, b]) and Var(x) = ∥x′ ∥1 =
∀x ∈ C1 ([a, b]) :
Z
b
|x′ (t)|dt;
a
4. the functional ∥ · ∥ : BV([a, b]) → R, given by ||x||BV = |x(a)| + Var(x) is a norm. Let’s fix now w ∈ BV([a, b]). (t0 , t1 , ..., tn ) ∈ M we consider ∆(p) = max{tk − tk−1 / k ∈ In }
For a given u ∈ C([a, b]) and every p =
and S(p; u) =
n X
u(tk ) [w(tk ) − w(tk−1 )] .
k=1
This value is well defined because n X |S(p; u)| = u(tk ) [w(tk ) − w(tk−1 )] k=1
≤
n X
|u(tk )| · |w(tk ) − w(tk−1 )| ≤ ∥u∥∞ Var(w).
(5.78)
k=1
Now, the Riemann-Stieljes integral of u over [a, b] with respect to w is defined by Z b u(t) dw(t) = lim S(p; u). ∆(p)→0
a
Point (5.78) implies that Z b u(t) dw(t) ≤ ∥u∥∞ Var(w). a By point 3 above, if w ∈ C1 ([a, b]), then Z b Z u(t) dw(t) = a
(5.79)
b
u(t) w′ (t)dt,
a
where the integral of the right side is a Riemann integral. Theorem 5.10 (Riesz theorem for (C([a, b]))′ ) Let σ ∈ (C([a, b]))′ . Then, there is w ∈ BV([a, b]) such that Z ⟨σ, u⟩ =
b
u(t) dw(t), a
198
Chapter 5. Fundamental theorems of Functional Analysis
and ∥σ∥ = Var(w). In [17], the student can find a proof of this result as an application of Hahn-Banach theorem. Observe that (5.79) immediatly provides the estimation ∥σ∥ ≤ Var(w). As a consequence of Theorem 5.10, g (C([a, b]))′ ∼ b]), = BV([a, g where BV([a, b]) is the Banach space formed by the functions x ∈ BV([a, b]) which are continuous from the right and such that x(a) = 0.
5.7. Geometric forms of Hahn-Banach theorem In this section, we shall present a couple of very useful consequences of the HahnBanach theorem. They are usually referred to as the first and second geometric forms of the Hahn-Banach theorem. Let’s start with a result that characterizes a closed hyperplanes. Proposition 5.1 (Closed hyperplane) Let E be a linear space, η : E −→ R a linear functional and α ∈ R. Then H = [η = α] is closed iff η ∈ E ′ . Proof. If η ∈ E ′ or η = 0, it is immediate that H is closed. Then let’s assume that H is closed and that η ̸= 0. We have to prove that η ∈ E ′ . We shall do this by proving that ∃c > 0, ∀z ∈ B(0, 1) : |f (z)| ≤ c. (5.80) 1. Since H is closed, H c is open. Let’s pick u0 ∈ H c and η ̸= 0. Without loss of generality we can assume that η(u0 ) < α.
(5.81)
The point u0 is interior in H c so that there is r > 0 such that B(u0 , r) ⊆ H c .
(5.82)
2. Let’s prove that ∀u ∈ B(u0 , r) :
⟨η, u⟩ < α.
(5.83)
Let’s reason by Reduction to Absurdity. So let’s assume that there is some u1 ∈ B(u0 , r) such that ⟨η, u1 ⟩ > α. (5.84) Let’s consider the curve γ : [0, 1] −→ E given by γ(t) = tu0 + (1 − t)u1 , which is clearly continuous and, by (4.96), ∀t ∈ [0, 1] :
γ(t) ∈ B(u0 , r).
(5.85)
Then the function η ◦ γ : [0, 1] −→ R is continuous and, by (5.81) and (5.84), η ◦γ(0) < α and η ◦γ(1) < α. Then, by the Mean Value Theorem for continuous functions, there should be some t∗ ∈ [0, 1] such that η◦γ(t∗ ) = α, which, by (5.85), contradicts (5.82).
5.7. Geometric forms of Hahn-Banach theorem
199
3. Since B(u0 , r) = u0 + r · B(0, 1), (5.83) is equivalent to ∀z ∈ B(0, 1) :
⟨η, u0 + rz⟩ = ⟨η, u0 ⟩ + r⟨η, z⟩ < α,
i.e., α − ⟨η, u0 ⟩ , r By (5.81), α − η(u0 ) > 0 so that we actually have that ∀z ∈ B(0, 1) :
⟨η, z⟩
0 / u ∈ C . α Let’s show that the gauge of a convex set is actually a Minkowski functional as it was defined in Section 5.5. Lemma 5.2 (Minkowski functional of a convex set) Let E be a normed space, C ⊆ E an open convex set with 0 ∈ C. Then pC , the gauge of C, is a Minkowski functional, i.e., ∀λ > 0, ∀u ∈ E : ∀u, v ∈ E :
pC (λu) = λ pC (u);
(5.87)
pC (u + v) ≤ pC (u) + pC (v).
(5.88)
200
Chapter 5. Fundamental theorems of Functional Analysis
Moreover, ∃M > 0, ∀u ∈ E :
0 ≤ pC (u) ≤ M ∥u∥;
(5.89)
C = {u ∈ E / pC (u) < 1}.
(5.90)
Proof. 1. Let’s prove (5.87). So let λ > 0 and u ∈ E, generic. a) We have that Aλu = λ Au .
(5.91)
In fact, by using (5.86), we have that λ 1 Aλu = α > 0 / u ∈ C = α > 0 / u∈C α α/λ α α 1 1 = λ · > 0/ u∈C =λ· > 0/ u∈C λ α/λ λ α/λ 1 = λ · β > 0 / u ∈ C = λ · Au . β b) We have, by (5.91), that pC (λu) = inf(Aλu ) = inf(λ · Au ) = λ · inf(Au ) = λ · pC (u). Since u and λ were chosen arbitrarily, we have proved (5.87). 2. Let’s prove (5.90). Let’s denote D = {u ∈ E / pC (u) < 1} a) Let’s prove that C ⊆ D. We have that pC (0) = 0 < 1 so that 0 ∈ D. Let u ∈ C \ {0}, generic. Since C is open there is some β > 0 such ˜ that B(u, β) ⊆ C. Then, by choosing 0 < β˜ < β and ϵ = β/∥u∥ we have that β˜ 1 u+ u= u ∈ B(u, β) ⊆ C, ∥u∥ 1/(1 + ϵ) 1 so that pC (u) = inf(Au ) ≤ < 1, whence u ∈ D. 1+ϵ b) Let’s prove that D ⊆ C. Let u ∈ D, generic. Then there is α > 0 such that 1 u ∈ C. (5.92) pC (u) < α < 1 and α Since C is convex and 0 ∈ C, we have, by (5.92), that 1 α u + (1 − α)0 = u ∈ C. α 3. Let’s prove (5.89). We have that pC (0) = 0 = ∥0∥. Let u ∈ E \ {0}, generic. Since C is open and 0 ∈ C, there is r > 0 such that B(0, r) ⊆ C. Then, for all ϵ > 0 we have that r 1 u= u ∈ B(0, r) ⊆ C, (1 + ϵ)∥u∥ (1 + ϵ)∥u∥/r
5.7. Geometric forms of Hahn-Banach theorem
201
(1 + ϵ)∥u∥ , whence, by letting ϵ −→ 0, we get (5.89) with r M = 1/r, as u was chosen arbitrarily. 4. Let’s prove (5.88). Let u, v ∈ E, generic. We have, for ϵ > 0, that 1 pC (u) pC u = 0. Hence η˜ ∈ E ′ . We have that η˜(u0 ) = 1,
(5.96)
and, by (5.90), ∀u ∈ C :
η˜(u) ≤ pC (u) < 1.
(5.97)
From (5.96) and (5.97), we get (5.93). ■ Proof. [of Theorem 5.11] Let’s consider the set C = A − B. Since A ∩ B = ∅, it follows that 0 ∈ / C. By point 1 of Theorem 4.25, A − B is convex.
(5.98)
Since A is open, for every v ∈ E, A − v is open, so that [ C =A−B = (A − v)
(5.99)
v∈B
is open. By (5.98) and (5.99), we can apply Lemma 5.3. Then there exists η˜ ∈ E ′ such that for every u ∈ C, η˜(u) < η˜(0) = 0, i.e., ∀u1 ∈ A, ∀u2 ∈ B :
η˜(u1 − u2 ) < 0,
∀u1 ∈ A, ∀u2 ∈ B :
η˜(u1 ) < η˜(u2 ).
so that Then by choosing α ∈ R such that sup η˜(u1 ) ≤ α ≤ inf η˜(u2 ), we have that u1 ∈A
the hyperplane [˜ η = α] separates A and B.
u2 ∈B
■
Theorem 5.12 (Hahn-Banach, second geometric form) Let E be a normed space and A, B ⊆ E non-void convex sets such that A ∩ B = ∅. Assume that A is closed and B compact. Then there is a closed hyperplane that strictly separates A and B.
5.7. Geometric forms of Hahn-Banach theorem
203
Proof. We have to prove that there is η ∈ E ′ and α ∈ R such that H = [η = α] strictly separates A and B, i.e., that for some ϵ > 0, ∀u ∈ A :
η(u) ≤ α − ϵ,
(5.100)
∀v ∈ B :
η(v) ≥ α + ϵ.
(5.101)
1. Let δ > 0. Then the sets Aδ = A + BE (0, δ) and Bδ = B − BE (0, δ) are convex, open and non empty. 2. Let’s prove that ∃δ0 > 0, ∀δ ∈]0, δ0 [:
Aδ ∩ Bδ = ∅.
(5.102)
Let’s proceed by Reduction to Absurdity. Then let’s assume the existence of sequences (εn )n∈N ⊆ R+ and (zn )n∈N ⊆ E such that lim εn = 0, and n→+∞
∀n ∈ N :
zn ∈ Aεn ∩ Bεn .
For each n ∈ N, we have that zn = un + αn = vn − βn , with un ∈ A, vn ∈ B and αn , βn ∈ BE (0, εn ). Then ∥ un − vn ∥< 2εn .
(5.103)
Since B is compact, we can extract from (vn )n∈N a subsequence (vnk )k∈N such that lim vnk = v ∈ B, which, together with (5.103), implies that k→+∞
lim ∥ unk − v ∥= 0, i.e.,
k→+∞
lim unk = v.
k→+∞
Since A is closed, v ∈ A and, consequently, v ∈ A ∩ B which is a contradiction. 3. Let’s take δ ∈]0, δ0 [. Now, by Theorem 5.11, there exist η ∈ E ′ and α ∈ R such that the closed hyperplane [η = α] separates Aδ and Bδ : ∀u ∈ A, v ∈ B, ∀z1 , z2 ∈ BE (0, 1) :
η(u + δz1 ) ≤ α ≤ η(v − δz2 ),
i.e., ∀u ∈ A, v ∈ B, ∀z1 , z2 ∈ BE (0, 1) :
η(u) + δη(z1 ) ≤ α ≤ η(v) − δη(z2 ).
From here we get that ∀u ∈ A, v ∈ B :
η(u) + δ∥η∥ ≤ α ≤ η(v) − δ∥η∥.
Then, by choosing ϵ = δ∥η∥ > 0, we get (5.100) and (5.101). ■ Remark 5.17 Observe that the geometric forms of Hahn-Banach provide sufficient conditions to separate disjoint convex sets. This is useful in an infinite-dimensional situation. In the finite-dimensional situation always is possible to separate two disjoint convex sets. Let’s finish this section with a useful consequence of the second geometric form of Hahn-Banach.
204
Chapter 5. Fundamental theorems of Functional Analysis
Corollary 5.8 (Non-dense linear subspace) Let E be a normed space and F ⊆ E a linear subspace such that F ̸= E. Then there exists η ∈ E ′ \ {0} such that ∀u ∈ F : ⟨η, u⟩ = 0. The proof is required as an exercise at the end of the chapter. Remark 5.18 (Dense linear subspace) A statement which is equivalent to Corollary 5.8 is as follows. A linear subspace U ⊆ E is dense in E iff ∀η ∈ E ′ :
η ↾U = 0 ⇒ η = 0.
5.8. Adjoint operator Let’s consider two normed spaces X and Y , T ∈ L(X, Y ) and g ∈ Y ′ . Then, by Corollary 5.1, gT ∈ X ′ . This means that we can define an operator T ′ : Y ′ → X ′ by T ′ (g) = gT. (5.104) We shall say that T ′ is the adjoint operator of T . From(5.104), it immediately follows that ∀u ∈ X, ∀g ∈ Y ′ : ⟨g, T (u)⟩ = ⟨T ′ (g), u⟩, (5.105) which is actually a characterization of the adjoint of T . Proposition 5.2 (Basic properties of the adjoint operator) Let X and Y be normed spaces. Then ∀T, S ∈ L(X, Y ) :
(T + S)′ = T ′ + S ′ ;
(5.106)
(λT )′ = λT ′ ;
(5.107)
∀λ ∈ R, ∀T ∈ L(X, Y ) :
′
(IdX ) = IdX ′ . The proof of this result is easy and is required as an exercise at the end of the chapter. Theorem 5.13 (Norm of the adjoint) Let X and Y be normed spaces and T ∈ L(X, Y ). Then T ′ ∈ L(Y ′ , X ′ ) and ∥T ′ ∥ = ∥T ∥. Proof. 1. Let’s prove that T ′ is linear, i.e., ∀λ ∈ R, ∀f, g ∈ Y ′ :
T ′ (λf + g) = λT ′ (f ) + T ′ (g).
(5.108)
Let λ ∈ R and f, g ∈ Y ′ , generic. By using (5.105), we have, for a generic u ∈ X, that ⟨T ′ (λf + g), u⟩ = ⟨λf + g, T (u)⟩ = λ ⟨f, T (u)⟩ + ⟨g, T (u)⟩ = λ ⟨T ′ (f ), u⟩ + ⟨T ′ (g), u⟩ .
5.8. Adjoint operator
205
Since u was chosen arbitrarily, we have proved that T ′ (λf + g) = λT ′ (f ) + T ′ (g). Since f and g were chosen arbitrarily, we have proved (5.108). 2. Let’s prove that ∥T ′ ∥ ≤ ∥T ∥. Let u ∈ X and g ∈ Y ′ , generic. By (5.105), we have that | ⟨T ′ (g), u⟩ | = | ⟨g, T (u)⟩ | ≤ ∥g∥ ∥T ∥ ∥u∥. Since u was chosen arbitrarily, the last implies that ∥T ′ (g)∥ ≤ ∥T ∥ ∥g∥, which, by the arbitrariness of g, allows us to conclude. 3. To finish the proof, we have to show that ∥T ∥ ≤ ∥T ′ ∥, i.e., that ∀u ∈ X :
∥T (u)∥ ≤ ∥T ′ ∥ ∥u∥.
(5.109)
Let u ∈ X, generic. If u ∈ Ker(T ), then (5.109) holds immediately. Therefore, let’s assume that u ∈ X \ Ker(T ). Then v0 = T (u) ∈ Y \ {0}. Now, by Remark 5.13, we pick g0 ∈ Y ′ such that ∥g0 ∥ = 1,
g0 (v0 ) = g(T (u)) = ∥v0 ∥ = ∥T (u)∥.
Then, ∥T (u)∥ = |g(T (u))| = | ⟨g0 , T (u)⟩ | = | ⟨T ′ (g0 ), u⟩ | ≤ ∥T ′ ∥ ∥g0 ∥ ∥u∥ = ∥T ′ ∥ ∥u∥, whence ∀u ∈ X \ Ker(T ) : and
∥T (u)∥ ≤ ∥T ′ ∥, ∥u∥
∥T (u)∥ ≤ ∥T ′ ∥. u̸=0 ∥u∥
∥T ∥ = sup
■ Remark 5.19 (Adjoint of the inverse and of a composition product) It’s easy to prove that if T ∈ L(X, Y ) and S ∈ L(Y, Z), then (ST )′ = T ′ S ′ . If T is bijective and T −1 ∈ L(Y, X), then (T ′ )−1 = (T −1 )′ .
(5.110)
Example 5.16 (Adjoint in finite dimension) Let’s recall that any T ∈ L(Rn , Rm ) has a unique representation as T (u) = A u, where A ∈ Mmn (R). Here u =
206 Chapter 5. Fundamental theorems of Functional Analysis u1 .. = Rd , and, therefore, T ′ . . Since for all d ∈ N, Rd is a Hilbert space, (Rd )′ ∼ un can be considered as an element of L(Rm , Rn ). Actually, it can be proved that v1 T ′ (v) = At v where At ∈ Mnm (R) denotes the transpose of A. Here v = ... . vm Then the properties (5.106)-(5.110) are the typical known properties of the transposition of matrices.
5.9. Reflexivity and separability (I) In this section, we introduce the concept of reflexivity and show a first group of properties, some related to the concept of separability. In Section 6.3, we shall consider more properties related to weak topologies. Remember that a metric space (X, d) is said to be separable iff there is some D ⊆ X countable and dense. In Theorem 4.15, we stated that a Hilbert space is separable iff it has a Hilbert basis. In Theorem 4.19, we showed that, for p ≥ 1 and I ⊆ R measurable, the space Lp (I) is separable and, therefore, L2 (I) has a Hilbert basis. Let’s recall that for a normed space E, its dual E ′ is complete and so it’s a Banach space. It’s immediate that the second dual (or bidual) of E, denoted E ′′ is also a Banach space. Sometimes, whenever E is complete, we can identify E with E ′′ in a particular way; in this case, we shall say that E is reflexive. We shall make very clear this because reflexivity is a very important property. Let E be a normed space and u ∈ E, fixed. We consider the linear mapping ψu : E ′ −→ R given by ⟨ψu , η⟩E ′′ ,E ′ = ⟨η, u⟩E ′ ,E .
(5.111)
′′
It’s easy so show that ψu ∈ E and that ∥ψu ∥E ′′ = ∥u∥E .
(5.112)
As u was arbitrary in (5.112), the canonical mapping, J : E −→ E ′′ , given by J(u) = ψu , is an isometry that let us identify each u ∈ E with ψu ∈ E ′′ . By doing this, we could write (5.111) as ∀u ∈ E, ∀η ∈ E ′ :
⟨u, η⟩ = ⟨η, u⟩ .
Definition 5.6 (Reflexivity) Let E be a normed space. We say that E is reflexive iff the canonical mapping J is an isomorphism. In Definition 5.6, the use of the canonical mapping is imperative as there are non-reflexive spaces that are isomorphic to their biduals (but not thru J). Remark 5.20 It’s clear that a reflexive space has to be a Banach space. Therefore, a non-reflexive space can not be isomorphic to the dual of some normed space. It’s also clear that a finite-dimensional normed space is reflexive.
5.9. Reflexivity and separability (I)
207
Theorem 5.14 (Reflexivity of a Hilbert space) Let H be a Hilbert space. Then H is reflexive. Proof. We have to prove that J : H −→ H ′′ is surjective, i.e., ∀ψ ∈ H ′′ , ∃u0 ∈ H :
ψ = J(u0 ),
i.e., ∀ψ ∈ H ′′ , ∃u0 ∈ H, ∀η ∈ H ′ :
⟨ψ, η⟩H ′′ ,H ′ = ⟨η, u0 ⟩H ′ ,H .
(5.113)
Let’s consider the isomorphism R : H ′ −→ H provided by the Riesz-Fr´echet theorem (see Remark 5.12). Then v = R(φ)
⇐⇒
∀u ∈ H :
⟨φ, u⟩H ′ ,H = (u, v).
It’s easy to check that the functional (·, ·)1 : H ′ × H ′ → R, given by (φ, η)1 = (R(φ), R(η)),
(5.114)
is an inner-product. Since H is complete, the Riesz-Fr´echet theorem shows that (H ′ , (·, ·)1 ) is also a Hilbert space. Let ψ ∈ H ′′ , generic. It’s not difficult to show that ψ also belongs to (H ′ , (·, ·)1 )′ . By the Riesz-Fr´echet theorem, there is a unique φ0 ∈ H ′ such that ∀η ∈ H ′ :
⟨ψ, η⟩ = (η, φ0 )1 = (R(η), R(φ0 )).
(5.115)
Now, let’s take u0 = R(φ0 ) ∈ H. Then, by considering (5.114), point (5.115) becomes ∀η ∈ H ′ : ⟨ψ, η⟩H ′′ ,H ′ = ⟨η, u0 ⟩H ′ ,H . Since η was chosen arbitrarily, we have proved (5.113).
■
Example 5.17 (lp (R) is reflexive) Let 1 < p < +∞. By adapting the scheme ′ ′ used in Example 5.11, it can be proved that (lp (R)) ∼ = lp (R). As a consequence, the space lp (R) is reflexive. Next, we shall prove that a separable normed space with a non-separable dual space can not be reflexive; this is the case of l1 (R) and L1 (I). We shall need the following result. Lemma 5.4 (Existence of a functional) Let Y be a proper linear subspace of a normed space X. Let u0 ∈ X \ Y and δ = d(Y, u0 ). Then there exists η˜ ∈ X ′ such that ∥˜ η ∥ = 1,
Y ⊆ Ker(˜ η ),
η˜(u0 ) = δ.
(5.116)
The proof of this result is required as an exercise at the end of the chapter. The idea is to define a bounded linear functional η : ⟨Y ∪ {u0 }⟩ −→ R by η(v + αu0 ) = αδ, where v ∈ Y and α ∈ R. After proving that η verifies (5.116), it can be applied Theorem 5.5 to obtain η˜.
208
Chapter 5. Fundamental theorems of Functional Analysis
Theorem 5.15 (Dual separability implies separability) Let E be a normed space. If E ′ is separable then E is separable. Proof. Let’s assume that E ′ is separable. Then S ′ = {η ∈ E ′ / ∥η∥ = 1} is also separable. Then we can pick M = {ηn / n ∈ N} ⊆ S ′ such that M = S ′ . By (5.11), there is a sequence (un )n∈N ⊆ E such that ∀n ∈ N :
∥un ∥ = 1 ∧ |ηn (un )| ≥
1 . 2
Let Y = ⟨{un / n ∈ N}⟩. Y is separable because the set of finite linear combinations of {un / n ∈ N} with rational coefficients is dense in ⟨{un / n ∈ N}⟩. We finish the proof by showing that E = Y . Let’s proceed by Reduction to Absurdity. Then let’s assume that Y ⊊ E. Since Y is closed, by Lemma 5.4, there is η ∈ S ′ such that Y ⊆ Ker(η). Since for each n ∈ N, un ∈ Y , it follows that 1 ≤ |ηn (un )| = |ηn (un ) − η(un )| = |(ηn − η)(un )| ≤ ∥ηn − η∥ ∥un ∥ 2 so that ∥ηn − η∥ ≥ 1/2, contradicting the condition of M being dense in S ′ . ■ Corollary 5.9 (Sufficient condition for non-reflexivity) Let E be a separable normed space such that E ′ is not separable. Then E is not reflexive. Proof. Let’s assume that E is reflexive. Then, via J, E ′′ is separable. Theorem 5.15 then implies that E ′ is separable, a contradiction. ■ Theorem 5.16 (Reflexivity of the dual) Let E be a reflexive Banach space. Then E ′ is reflexive. Proof. Let J : E −→ E ′′ and J˜ : E ′ −→ E ′′′ be the canonical mappings. We know that J(E) = E ′′ and denote ψu = J(u), u ∈ E. We have to prove that ˜ ′ ) = E ′′′ , i.e., J(E ˜ ∀φ ∈ E ′′′ , ∃η ∈ E ′ : φ = J(η), which means ∀φ ∈ E ′′′ , ∃η0 ∈ E ′ , ∀ψ ∈ E ′′ :
⟨φ, ψ⟩E ′′′ ,E ′′ = ⟨ψ, η0 ⟩E ′′ ,E ′ .
(5.117)
′′′
Let φ ∈ E , generic. Let’s define η0 : E −→ R by η0 (u) = ⟨φ, ψu ⟩E ′′′ ,E ′′ . ′
(5.118)
′′
It’s easy to show that η0 ∈ E . Let ψ ∈ E , generic. Since E is reflexive, for some u0 ∈ E we have that ψ = ψu0 and ∀η ∈ E ′ :
⟨ψu0 , η⟩E ′′ ,E ′ = ⟨η, u⟩E ′ ,E .
(5.119)
Then, by (5.118) and (5.119), we get ⟨φ, ψ⟩E ′′′ ,E ′′ = ⟨φ, ψu0 ⟩E ′′′ ,E ′′ = ⟨η0 , u0 ⟩E ′ ,E = ⟨ψu0 , η0 ⟩E ′′ ,E ′ = ⟨ψ, η0 ⟩E ′′ ,E ′ . Since φ and ψ were chosen arbitrarily, we have proved (5.117).
■
5.10. Uniform boundedness principle
209
5.10. Uniform boundedness principle In Remark 3.14, we mentioned how Baire’s [theorem is usually applied. If E ̸= ∅ is a complete metric space such that E = Xn , where, for each n ∈ N, Xn ⊆ E n∈N
is closed. Then there exists some n0 ∈ N such that int(Xn0 ) ̸= ∅. As we shall see, the uniform boundedness principle, also called Banach-Steinhaus theorem, is obtained by applying Baire’s theorem. Theorem 5.17 (Uniform boundedness principle) Let E and F be Banach spaces and (Tλ )λ∈Λ ⊆ L(E, F ) such that ∀u ∈ E, ∃cu > 0, ∀λ ∈ Λ :
∥Tλ (u)∥ < cu .
(5.120)
Then, ∃c > 0, ∀λ ∈ Λ :
∥Tλ ∥ ≤ c.
(5.121)
Remark 5.21 (From pointwise to uniform estimation) Observe that (5.120) is a pointwise estimate that can be written as ∀u ∈ E :
sup ∥Tλ (u)∥ < +∞. λ∈Λ
In other hand, (5.121) is a uniform estimate that can be written as ∃c > 0, ∀λ ∈ Λ, ∀u ∈ E :
∥Tλ (u)∥ < c ∥u∥.
(5.122)
Xn = {u ∈ E / ∀λ ∈ Λ : ∥Tλ (u)∥ ≤ n},
(5.123)
Proof. We have to prove (5.121). 1. For n ∈ N, we define
which is clearly closed. From (5.120), it follows that E =
[
Xn . Then, by
n∈N
Baire’s theorem, there is n0 ∈ N such that int(Xn0 ) ̸= ∅. Then there exist u0 ∈ Xn0 and r > 0 such that B(u0 , r) ⊆ Xn0 .
(5.124)
Since B(u0 , r) = u0 + r B(0, 1), (5.124) is equivalent to ∀z ∈ B(0, 1) :
∥Tλ (u0 + rz)∥ ≤ n0 ,
i.e., ∀λ ∈ Λ, ∀z ∈ B(0, 1) :
Tλ 1 u0 + Tλ (z) ≤ n0
r r
(5.125)
210
Chapter 5. Fundamental theorems of Functional Analysis
2. Now, let’s choose c = 2n0 /r. Let λ ∈ Λ and z ∈ B(0, 1), generic. Then, by using (5.125), (5.124) and (5.123), we have that
1 1
∥Tλ (z)∥ = Tλ (z) + Tλ u0 − Tλ (u0 )
r r
1 1 n0 n0
≤
Tλ (z) + Tλ r u0 + r ∥Tλ (u0 )∥ ≤ r + r = c. (5.126) Since z was chosen arbitrarily, (5.126) implies that ∥Tλ ∥ = sup ∥Tλ (z)∥ ≤ c, ∥z∥≤1
which, by the arbitrariness of λ, shows (5.121). ■ Example 5.18 (Non-completeness by the uniform boundedness principle) We consider the linear space P(R) equipped with the norm given by the greatest size of its coefficients: ∥p∥ = max |αj |, j=0,...,Np
where p(t) =
Np X
αj tj . Here Np = deg(p). Let’s consider (νn )n∈N ⊆ (P(R))′ ,
j=0
where ⟨νn , 0f ⟩ = 0 and ⟨νn , p⟩ =
n−1 X
αj .
j=0
Now, for a fixed p ∈ P(R), let’s denote cp = ∥p∥(Np + 1) > 0. Then, for all n ∈ N: n−1 X Np X X n−1 |νn (p)| = αj ≤ |αj | ≤ ∥p∥ 1 = ∥p∥ (Np + 1) = cp . j=0 j=0 j=0 Since p was chosen arbitrarily, we have proved that ∀p ∈ P(R), ∃cp > 0, ∀n ∈ N :
|νn (p)| < cp .
If (P(R), ∥ · ∥) were complete we would have the uniform estimation ∃c > 0, ∀n ∈ N :
∥νn ∥ ≤ c.
Let’s prove that this is not the case, so that (P(R), ∥ · ∥) is not complete. For each m ∈ N, let’s consider pm ∈ P(R) given by pm (t) =
m X
tj .
k=0
Then, for each n ∈ N: ∥νn ∥ ≥
|νn (pn )| = n, ∥pn ∥
so that (∥νn ∥)n∈N ⊆ R is not bounded.
5.10. Uniform boundedness principle
211
Corollary 5.10 (Pointwise limit defines a bounded operator) Let E and F be Banach spaces and (Tn )n∈N ⊆ L(E, F ). Let’s assume that for every u ∈ E, there is an element T (u) ∈ F such that T (u) = lim Tn (u). n→+∞
(5.127)
Then, sup ∥Tn ∥ < +∞,
(5.128)
n∈N
T ∈ L(E, F ), ∥T ∥ ≤ lim inf ∥Tn ∥.
(5.129)
n→+∞
Remark 5.22 In the context of the Corollary 5.10, observe that it’s not necessarily true that (Tn )n∈N converges to T . That’s one of the meanings of (5.129). Proof. 1. From (5.127), it immediately follows that T is linear. 2. For each u ∈ E, the sequence (Tn (u))n∈N ⊆ F is convergent, so that it’s also bounded. Then (5.128) follows from the uniform boundedness principle, Theorem 5.17. Then, by (5.122), there is c > 0 such that ∀n ∈ N, ∀u ∈ E :
∥Tn (u)∥ ≤ c ∥u∥,
whence, by letting n → +∞, ∀u ∈ E :
∥T (u)∥ ≤ c∥u∥,
so that T ∈ L(E, F ). 3. For all n ∈ N, we have, for every u ∈ E, that ∥Tn (u)∥ ≤ ∥Tn ∥ ∥u∥, so that ∀u ∈ E :
lim inf ∥Tn (u)∥ ≤ lim inf ∥Tn ∥ ∥u∥, n→+∞
n→+∞
and, consequently, ∀u ∈ E :
∥T (u)∥ ≤ lim inf ∥Tn ∥ ∥u∥, n→+∞
which implies (5.129). ■ The following two corollaries are quite useful in applications and are obtained easily by applying the uniform boundedness principle. Corollary 5.11 (Weakly bounded implies bounded) Let E be a Banach space and B ⊆ E such that ∀η ∈ E ′ :
⟨η, B⟩ = {⟨η, v⟩ / v ∈ B} is bounded in R.
(5.130)
212
Chapter 5. Fundamental theorems of Functional Analysis
Then, B is bounded. Proof. For each v ∈ B, let’s consider ψv = J(v) ∈ E ′′ , where J is the canonical mapping. By (5.130), for every η ∈ E ′ , there is cη > 0 such that ∀v ∈ B :
| ⟨ψv , η⟩ | = | ⟨η, v⟩ | ≤ cη .
Since J preserves the norm, the uniform boundedness principle applied to the family (ψv )v∈B provides ∃c > 0, ∀v ∈ B :
∥ψv ∥E ′′ = ∥v∥E ≤ c,
i.e., B is bounded.
■
Corollary 5.12 (Weakly-∗ bounded implies bounded) Let E be a Banach space and D ⊆ E ′ such that ∀u ∈ E :
⟨D, u⟩ = {⟨η, u⟩ / η ∈ D} is bounded in R.
(5.131)
Then, D is bounded. The proof is required as an exercise at the end of the chapter.
5.11. The open mapping theorem and the closed graph theorem Let (X, T ) and (Y, G) be topological spaces. We say that f : X −→ Y is an open mapping iff ∀A ∈ T : f (A) ∈ G. The main result of this section, the open mapping theorem, states that a surjective bounded linear operator is open. To prove it, we need a couple of lemmas. Lemma 5.5 Let E and F be Banach spaces and T ∈ L(E, F ) surjective. Then there exists c > 0 such that T (BE (0, 1)) ⊇ BF (0, 2c).
(5.132)
Proof. 1. For each n[ ∈ N let’s write Xn = n · T (BE (0, 1)). Since T is onto we have that F = Xn . By Baire’s theorem (see Remark 3.14), there is n0 ∈ N n∈N
such that int(Xn0 ) ̸= ∅. 2. By point 1, we have that int T (BE (0, 1)) ̸= ∅, so that there exist c > 0 and v0 ∈ F such that BF (v0 , 4c) ⊆ T (BE (0, 1)).
(5.133)
5.11. The open mapping theorem and the closed graph theorem
213
3. Obviously v0 ∈ T (BE (0, 1)). Let’s show that −v0 ∈ T (BE (0, 1))
(5.134)
Let’s pick a sequence (wn )n∈N ⊆ T (BE (0, 1)) such that wn −→ v0 as n −→ +∞. For each n ∈ N, there is un ∈ BE (0, 1) such that T (un ) = wn , and it’s clear that −un ∈ BE (0, 1). Then lim T (−un ) = − lim T (un ) = − lim wn = −v0 ,
n→+∞
n→+∞
n→+∞
whence follows (5.134). 4. By adding (5.133) to (5.134), and using Theorems 4.25 and 4.26, it follows that BF (0, 4c) ⊆ T (BE (0, 1)) + T (BE (0, 1)) = 2 · T (BE (0, 1)), whence it follows (5.132). ■ Lemma 5.6 Let E and F be Banach spaces and T ∈ L(E, F ). Assume that there exists c > 0 such that (5.132) holds. Then, T (BE (0, 1)) ⊇ BF (0, c).
Proof. We have to prove that ∀v ∈ BF (0, c), ∃u ∈ BE (0, 1) :
T (u) = v.
(5.135)
Let v ∈ BF (0, c), generic. We shall construct the element u for (5.135) to hold. 1. Let’s first observe that (5.132) means that T (BE (0, 1)) is dense in BF (0, 2c), so that ∀r > 0 :
T (BE (0, r)) is dense in BF (0, 2cr).
(5.136)
2. Then, by taking r = 1/2 in (5.136), we have that T (BE (0, 1/2)) is dense in BF (0, c). Since v ∈ BF (0, c), we have that ∀ϵ > 0, ∃z ∈ BE (0, 1/2) :
∥v − T (z)∥ ≤ ϵ.
(5.137)
Therefore, for ϵ = c/2 in (5.137), we choose z1 ∈ B(0, 1/2) such that ∥v − T (z1 )∥ < c/2. 3. Then, by taking r = 1/4 in (5.136), we have that T (BE (0, 1/4)) is dense in BF (0, c/2). Since v − T (z1 ) ∈ BF (0, c/2), we have that ∀ϵ > 0, ∃z ∈ BE (0, 1/4) :
∥(v − T (z1 )) − z∥ ≤ ϵ.
(5.138)
Therefore, for ϵ = c/4 in (5.138), we choose z2 ∈ B(0, 1/4) such that ∥(v − T (z1 )) − T (z2 )∥ < c/4.
214
Chapter 5. Fundamental theorems of Functional Analysis
4. Proceeding like in 1 and 2 we create a sequence (zn )n∈N ⊆ E such that
! n
X c 1
zk < n . (5.139) ∀n ∈ N : ∥zn ∥ < n ∧ v − T
2
2 k=1
By (5.139) and Remark 4.12, the series associated to (zn )n∈N is absolutely +∞ X convergent and we can choose u = zn , so that, again by (5.139), T (u) = n=1
v.
■ Theorem 5.18 (Open mapping theorem) Let E and F be Banach spaces and T ∈ L(E, F ) surjective. Then, 1. there exists c > 0 such that T (BE (0, 1)) ⊇ BF (0, c);
(5.140)
2. T is an open mapping. Proof. 1. This point is readily proved by Lemmas 5.5 and 5.6. 2. Let U ⊆ E, open. Let’s prove that T (U ) is open, i.e., that T (U ) ⊆ int(T (U )).
(5.141)
Let v0 ∈ T (U ), generic. Then there is some u0 ∈ E such that T (u0 ) = v0 . Since U is open, we can pick r > 0 such that BE (u0 , r) ⊆ U , i.e., u0 + r · BE (0, 1) ⊆ U , which implies that v0 + r · T (BE (0, 1)) ⊆ T (U ). Then, by using (5.140), it follows that BF (v0 , rc) ⊆ T (U ), i.e., v0 is an interior point of T (U ). Since v0 was chosen arbitrarily, we have proved (5.141). ■ For bounded operators working on Banach spaces, we have the following important result. Corollary 5.13 (The inverse of a bounded operator is bounded) Let E and F be Banach spaces and T ∈ L(E, F ) bijective. Then T −1 ∈ L(F, E).
Proof. By the open mapping theorem and point 2 of Theorem 2.17, it follows that T −1 ∈ L(F, E). ■ A second consequence of the open mapping theorem is that two comparable Banach norms are equivalent.
5.11. The open mapping theorem and the closed graph theorem
215
Corollary 5.14 (Equivalence of Banach norms) Let E be a linear space such that (E, ∥ · ∥1 ) and (E, ∥ · ∥2 ) are Banach spaces. If the norms ∥ · ∥1 and ∥ · ∥2 are comparable, then they are equivalent. We require the proof of this result as an exercise at the end of the chapter. Remark 5.23 (Product norm) Let X and Y be normed spaces. The usual norm on the linear space X × Y , called the product norm, is given by ∥(u, v)∥ = ∥u∥X + ∥v∥Y . It’s clear that if X and Y are Banach spaces then X × Y is also a Banach space. Not all the linear operators which appear in applications of Functional Analysis are bounded. For example, the position and momentum operators of Quantum Mechanics, presented in Examples 5.1 and 5.2, are not bounded. However, almost all the linear operators relevant to applications are closed linear operators. Definition 5.7 (Closed linear operator) Let X and Y be normed spaces and T : D(T ) ⊆ X → Y a linear operator. We say that T is a closed linear operator if its graph, G (T ) = {(u, T (u)) / u ∈ D(T )}, is closed in X × Y . Remark 5.24 (Characterization of a linear closed operator) Let X and Y be normed spaces and T : D(T ) ⊆ X → Y a linear operator. Then T is closed iff for all sequence (un )n∈N ⊆ D(T ) such that lim un = u
n→+∞
∧
lim T (un ) = v,
n→+∞
it holds u ∈ D(T )
∧
T (u) = v.
Theorem 5.19 (Closed graph theorem) Let (X, ∥ · ∥X ) and (Y, ∥ · ∥Y ) be Banach spaces and T : D(T ) ⊆ X → Y a closed linear operator. Assume that D(T ) is closed in X. Then T ∈ L(D(T ), Y ). Proof. 1. Observe that (D(T ), ∥ · ∥X ) is a Banach space because D(T ) is closed. 2. On D(T ) the graph norm is defined by ∥u∥ = ∥u∥X + ∥T (u)∥Y .
(5.142)
3. By using the fact that G (T ) is closed, it’s possible to prove that (D(T ), ∥ · ∥) is a Banach space. 4. From (5.142), it immediately follows that ∀u ∈ D(T ) :
∥u∥X ≤ ∥u∥,
216
Chapter 5. Fundamental theorems of Functional Analysis so that ∥ · ∥ dominates ∥ · ∥X on D(T ). Then Corollary 5.14 implies that the norms ∥ · ∥ and ∥ · ∥X are equivalent on D(T ). Then there exists c > 0 such that ∀u ∈ D(T ) : ∥u∥ ≤ c∥u∥X , whence ∀u ∈ D(T ) :
∥T (u)∥Y ≤ ∥u∥X + ∥T (u)∥Y = ∥u∥ ≤ c∥u∥X ,
i.e., T is bounded. ■ Remark 5.25 Observe that the inverse of Theorem 5.19 is true because the graph of any continuous function is closed. Example 5.19 (Differentiation as a closed non-bounded operator) Let’s consider the operator D : C1 ([0, 1]) ⊆ C([0, 1]) −→ C([0, 1]), given by D[u] = u′ . Here, both the domain and codomain of D are endowed with the norm ∥ · ∥∞ . In Example 5.4, we showed that D is not bounded. Let’s use Remark 5.24 to prove that D is a closed operator. So let’s take a generic sequence (un )n∈N ⊆ C1 ([0, 1]) such that lim un = u and lim u′n = v, n→+∞
n→+∞
and let’s prove that u ∈ C1 ([0, 1]) and that u′ = v. Since convergence in ∥ · ∥∞ is uniform convergence, we have, by applying the Fundamental Theorem of Calculus, for t ∈ [0, 1] that Z t Z t Z t v(τ )dτ = lim u′n (τ )dτ = lim u′n (τ )dτ = u(t) − u(0), 0
0 n→+∞ 1
n→+∞
0
′
which shows that u ∈ C ([0, 1]) and that u = v. Corollary 5.15 (Closed operator) Let (X, ∥ · ∥X ) and (Y, ∥ · ∥Y ) be normed spaces and T : D(T ) ⊆ X → Y . Then: 1. If D(T ) is a closed subset of X, then T is closed. 2. If T is closed and Y is Banach, then D(T ) is a closed subset of X.
5.12. Problems Problem 5.1 Let (V, ∥ · ∥V ) and (W, ∥ · ∥W ) be normed spaces. Prove that L(V, W ) = {T ∈ L(V, W ) / T is bounded}, is a linear subspace of L(V, W ). Problem 5.2 Prove point (5.63). Problem 5.3 Let (V, ∥ · ∥V ) and (W, ∥ · ∥W ) be normed spaces and T ∈ L(V, W ). Prove that ∥T (u)∥W ∥T ∥ = sup = sup ∥T (u)∥. ∥u∥V u̸=0 ∥u∥=1
5.12. Problems
217
Problem 5.4 Let (V, ∥ · ∥V ) and (W, ∥ · ∥W ) be normed spaces and T ∈ L(V, W ). Prove that if T is continuous at a particular point u0 ∈ V , then T ∈ L(V, W ). Problem 5.5 Let V, W and U be normed spaces. Prove that if T ∈ L(V, W ) and S ∈ L(W, U ), then ST ∈ L(V, U ), and that ∥ST ∥ ≤ ∥S∥ ∥T ∥. Problem 5.6 Let V be a Banach space. Prove that L(V ) endowed with the composition product is a Banach algebra with unity. Problem 5.7 Consider the linear operator T : l∞ (R) → l∞ (R), given by x n , x = (xn )n∈N . T [x] = n n∈N 1. Prove that T ∈ L(l∞ (R)). 2. Try to compute ∥T ∥. 3. Prove that Im(T ) is not closed in l∞ (R). Problem 5.8 Let B(R) denote the space of bounded real functions, endowed with norm ∥ · ∥∞ . Consider the delay operator T : B(R) −→ B(R), given by T [x](t) = x(t − τ ), for some delay τ > 0. 1. Prove that T ∈ L(B(R)). 2. Try to compute ∥T ∥. Problem 5.9 Consider the linear operator T : C([0, 1]) −→ C([0, 1]), given by Z t T [f ](t) = f (τ ) dτ. 0
1. 2. 3. 4.
Prove that T ∈ L(C([0, 1])). Try to compute ∥T ∥. Find Im(T ). Is T is injective? If so, find the inverse of the operator T˜ : C([0, 1]) −→ Im(T ), given by T˜[f ] = T [f ], and determine if T˜−1 is continuous.
Problem 5.10 On C([0, 1]) consider the operators T and S, given by Z 1 S[u](t) = t · u(τ ) dτ, T [u](t) = t · u(t). 0
1. 2. 3. 4.
Prove that T ∈ L(C([0, 1])) and compute ∥T ∥. Prove that S ∈ L(C([0, 1])) and compute ∥S∥. Do S and T commute? Try to compute ∥ST ∥ and ∥T S∥.
Problem 5.11 Let V be a Banach space and A ∈ L(V ) such that ∥A∥ < 1. By using Banach fixed point theorem and the corresponding iteration, prove that Id − A is bijective and that (Id − A)−1 =
+∞ X k=0
Ak .
218
Chapter 5. Fundamental theorems of Functional Analysis
Problem 5.12 Consider on C([0, 1]) the linear operator Ak . 1. 2. 3. 4. 5.
Prove that Ak ∈ L(C([0, 1])). Try to compute ∥Ak ∥. Find Ker(Ak ). Is Ak surjective? If Ak is bijective, find A−1 k . Z t x(τ )dτ ; A2 [x](t) = t2 · x(0); A1 [x](t) =
A3 [x](t) = x(t2 ).
0
Problem 5.13 Consider on L2 ([0, 1]) the linear operator Ak . 1. 2. 3. 4. 5.
Prove that Ak is bounded. Try to compute ∥Ak ∥. Find Ker(Ak ). Is Ak surjective? If Ak is bijective, find A−1 k . Z 1 A1 [x](t) = t · x(τ )dτ ;
Z A2 [x](t) =
0
t
x(τ ) dτ ; 0
( x(t), t ∈ [0, λ] A3 [x](t) = , 0, t ∈ (λ, 1]
where 0 < λ < 1.
Problem 5.14 Consider the linear operator A. 1. 2. 3. 4. 5.
Prove that A is bounded. Try to compute ∥A∥. Find Ker(A). Is A surjective? If A is bijective, find A−1 . A : C1 ([a, b]) −→ C([a, b]),
A[x](t) = x′ (t);
A : H1 ([0, 1]) −→ H1 ([0, 1]),
A[x](t) = t · x(t);
A : H1 ([0, 1]) −→ L2 ([0, 1]),
A[x](t) = t · x(t).
Problem 5.15 Let X and Y be normed spaces and A, B ∈ L(X, Y ) such that Ker(A) = Ker(B) and Im(A) = Im(B). Is it true that A = B? Problem 5.16 Consider the formula A[x](t) = φ(t) x(t). Find conditions on the function φ : [0, 1] −→ R for A to belong to L(X). 1. X = C([0, 1]); e 2 ([0, 1]). 2. X = L Problem 5.17 Let φ0 , ..., φn ∈ C([a, b]). Consider the linear operator A : Cn ([a, b]) −→ C([a, b]), given by n X A[u](t) = φk (t) u(k) (t). k=1
5.12. Problems
219
1. Prove that A is bounded. 2. Try to compute ∥A∥. Problem 5.18 Let E be a Banach space and T ∈ L(E). Assume that ∃β > 0, ∀u ∈ E :
∥T (u)∥ ≥ β∥u∥.
1. Prove that Im(T ) is closed in E. 2. Consider E = C([a, b]) and T [u](t) = φ(t) u(t). Which conditions should verify φ ∈ C([a, b]) \ {0} for Im(T ) to be closed in C([a, b])? Problem 5.19 Let α ≥ 0. Let’s consider the linear space ( Cα =
x ∈ C([0, +∞[) / ∥x∥α =
sup
)
αt
e |x(t)| < +∞ .
t∈[0,+∞[
1. Prove that ∥ · ∥α is a norm on Cα . 2. Prove that (Cα , ∥ · ∥α ) is a Banach space. 3. Consider the function y : [0, +∞[−→ R, given by y(t) = tσ e−γt , where γ > α and σ ≥ 0. Prove that y ∈ Cα and that σ σ ∥y∥α = e−σ . γ−α 4. Let β ≥ 0 and λ ∈ Cβ . Consider the operator Λ : Cα −→ Cα+β , given by Λ[x](t) = λ(t) x(t). Prove that Λ ∈ L(Cα , Cα+β ) and that ∥Λ∥ = ∥λ∥β . 5. Let β > α ≥ γ. Consider the operator A : Cα −→ Cγ , given by Z t A[x](t) = e−β(t−τ ) x(τ ) dτ. 0
Prove that A ∈ L(Cα , Cγ ) and that 1 if γ = α, β − α, ∥A∥ = 1/(β−α) (α − γ)α−γ , if γ < α. (β − γ)β−γ Problem 5.20 Let X and Y be Banach spaces. Let A ∈ L(X, Y ) such that given y ∈ Y , there is x ∈ X verifying ||Ax − y|| ≤ α||y||
and ||x|| ≤ β||y||,
for some constants α ∈]0, 1[ and β > 0. Prove that for all y ∈ Y , the equation Ax = y,
x ∈ X,
has a solution x0 ∈ X such that ||x0 || ≤
β ||y||. 1−α
220
Chapter 5. Fundamental theorems of Functional Analysis
Problem 5.21 In the space C([−1, 1]), consider the operators T and S given by T [x](t) =
1 [x(t) + x(−t)] ; 2
S[x](t) =
1 [x(t) − x(−t)] . 2
1. Prove that T ∈ L(C([−1, 1])) and compute ∥T ∥. 2. Prove that S ∈ L(C([−1, 1])) and compute ∥S∥. 3. Are T and S projection operators? Problem 5.22 Let X and Y be linear spaces. We say that the functional h : X ×Y −→ R is bilinear on X ×Y iff for every u ∈ X and v ∈ Y , h(u, ·) ∈ L(Y, R) and h(·, v) ∈ L(X, R). If X and Y are normed spaces, we say that h is bounded iff ∃c > 0, ∀(u, v) ∈ X × Y : |h(u, v)| ≤ c∥u∥ ∥v∥. We denote by BI(X × Y ) the set of bounded bilinear functionals on X × Y . 1. Prove that BI(X × Y ) is a linear space. 2. Prove that the functional ∥ · ∥ : BI(X × Y ) → R, given by ∥h∥ = inf(Oh ), where Oh = {c > 0 / ∀(u, v) ∈ X × Y : |h(u, v)| ≤ c∥u∥ ∥v∥}, is a norm. 3. Prove that for all h ∈ BI(X × Y ): ∥h∥ =
|h(u, v)| = sup |h(u, v)|. sup ∥u∥ ∥v∥ ∥u∥ = 1 u ̸= 0 ∥v∥ = 1 v ̸= 0
4. Assume that (·, ·) is an inner-product on X. Prove that (·, ·) ∈ BI(X × X). 5. Assume now that X and Y are Hilbert spaces. Prove that there is a unique S ∈ L(X, Y ) such that ∀(u, v) ∈ X × Y :
h(u, v) = (S(u), v),
and ∥S∥ = ∥h∥. Problem 5.23 Let H be a Hilbert space. Prove that H ∼ = H ′′ . Problem 5.24 Let A ∈ L(H), where H is a separable Hilbert space. Prove that ∥A∥ = inf
x,y∈H
|(Ax, y)| . ||x|| · ||y||
Problem 5.25 Let H be a Hilbert space, Z a closed linear subspace of H and η ∈ Z ′ . Prove that there is η˜ ∈ H ′ , an extension of η, such that ∥η∥Z ′ = ∥˜ η ∥H ′ , 1. without using the Hahn-Banach theorem; 2. using the Hahn-Banach theorem.
5.12. Problems
221
Problem 5.26 Let X be a normed space, Z a linear subspace of X and η ∈ Z ′ . 1. Prove that there is η˜ ∈ X ′ , an extension of η, such that ∥η∥Z ′ = ∥˜ η ∥X ′ . 2. Prove that for any u0 ∈ X \ {0} there is η ∈ X ′ such that η(u0 ) = ∥u0 ∥2 .
∥η∥ = ∥u0 ∥, Hint. Apply point 1 for Z = ⟨{u0 }⟩.
Problem 5.27 Let X be a normed space separable, Z a linear subspace of X and η ∈ Z ′ . Without using Zorn’s lemma nor Hahn-Banach theorem, prove that there is η˜ ∈ X ′ , an extension of η, such that ∥η∥Z ′ = ∥˜ η ∥X ′ . Problem 5.28 Let E be a normed space, C ⊆ E an open convex set and u0 ∈ / C. Assume that 0 ∈ / C. Prove that there exists η ∈ E ′ such that ∀u ∈ C :
η(u) < η(u0 ).
Idea. Use the result for the case 0 ∈ C which was presented in Section 5.7. Problem 5.29 Let E be a normed space and u0 ∈ E \ {0}. Prove that there is ν ∈ E ′ such that ∥ν∥ = 1/∥u0 ∥ and ⟨ν, u0 ⟩ = 1. Problem 5.30 Let E be a normed space and F, U ⊆ E linear subspaces. 1. Prove that if F ̸= E, then there exists η ∈ E ′ \ {0} such that ∀u ∈ F :
⟨η, u⟩ = 0.
Hint. Use a geometric form of Hahn-Banach theorem. 2. Prove that U is dense in E iff ∀η ∈ E ′ :
η ↾F = 0 ⇒ η = 0.
Problem 5.31 Consider on C([−1, 1]) the linear functional ηk . 1. 2. 3. 4.
Prove that ηk is bounded. Try to compute ∥ηk ∥. Try to find explicitly Ker(ηk ). g ([−1, 1]) such that the action of ηk is written Search for a function w ∈ BV as a Riemann-Stieljes integral. Does w belong to C1 ([−1, 1])? η1 (x) = η3 (x) =
n X
x(−1) + x(1) ; 3
αk x(tk ),
η2 (x) = 2[x(1) − x(0)];
where αk ∈ R and tk ∈ [−1, 1] are given;
k=1
η4 (x) =
1 [x(ϵ) + x(−ϵ) − 2x(0)], 2ϵ
for some ϵ ∈ [−1, 1];
222
Chapter 5. Fundamental theorems of Functional Analysis 1
Z
Z
η5 (x) =
1
η6 (x) = −x(0) +
x(t)dt;
x(t)dt; −1
0
Z
0
1
Z x(t)dt −
η7 (x) =
x(t)dt;
−1
0
n X 1 k η8 (x) = x(t)dt − x ; 2n + 1 n −1 Z
1
k=−n
Z
1
t x(t)dt;
η9 (x) =
x(−1) + x(1) + 2
η10 (x) =
−1
Z
1
t x(t)dt. −1
Problem 5.32 On C([0, 1]) consider the linear functional ηk . 1. Is ηk continuous? g ([0, 1]) such that the action 2. If the answer is yes, search for a function w ∈ BV of ηk is written as a Riemann-Stieljes integral. Does w belong to C1 ([0, 1])? Z
1
η1 (u) =
u
√ t dt;
(5.143)
0
Z η2 (u) =
1
u(t2 )dt;
(5.144)
0
Z η3 (u) = lim
n→+∞
1
u (tn ) dt.
(5.145)
0
Problem 5.33 For which values of p ≥ 1, do the formulas (5.143), (5.144) and (5.143) define a continuous functional on Lp ([0, 1])? Problem 5.34 Consider on E the linear functional η. 1. Prove that η is bounded. 2. Try to compute ∥η∥. 3. Try to find explicitly Ker(η). Z
1
η(x) =
t x(t)dt,
E = L1 ([−1, 1]);
t x(t)dt,
E = L2 ([−1, 1]);
−1 Z 1
η(x) = −1 1
Z η(x) =
t−1/3 x(t)dt,
E = L2 ([−1, 1]).
−1
Problem 5.35 Consider on E the linear functional η. 1. Prove that η is bounded. 2. Try to compute ∥η∥. 3. Try to find explicitly Ker(η).
5.12. Problems
223
For x = (xn )n∈N ⊆ R: E = l2 (R);
η(x) = x1 + x2 , η(x) =
+∞ X xn , n n=1
E = l2 (R);
+∞ X xn , E = l1 (R); n n=1 +∞ X 1 xn , E = l1 (R); η(x) = 1− n n=1
η(x) =
η(x) = x1 + x2 , η(x) =
+∞ X
E = m;
21−n xn ,
E = c0 ;
n=1
η(x) = lim xn ,
E = c.
n→+∞
Here m, c and c0 are, respectively, the set sequences which are bounded, converging and converging to zero, endowed with the norm ∥ · ∥∞ . Problem 5.36 Let u, v, w ∈ C([a, b]) and C10 ([a, b]) = {x ∈ C1 ([a, b]) / x(a) = x(b) = 0}. ′ 1. Prove that F ∈ C ([a, b]) , where F (x) = 1
b
Z
u(t)x′ (t)dt.
a
′ 2. Prove that G ∈ C ([a, b]) , where G(x) = 1
Z
b
[v(t)x(t) + w(t)x′ (t)] dt.
a
3. Prove that ∀x ∈ C10 ([a, b]) :
F (x) = 0,
implies that u ∈ P0 ([a, b]). 4. Prove that ∀x ∈ C10 ([a, b]) :
G(x) = 0,
1
′
implies that w ∈ C ([a, b]) and w = v. Problem 5.37 Let X be a normed space and F : X −→ R linear. Prove that F ∈ X ′ iff for all c ∈ R the sets Fc = {u ∈ X / ⟨F, u⟩ < c} and F c = {u ∈ X / ⟨F, u⟩ > c} are open. Problem 5.38 Let E be a normed space and η : E −→ R linear. Prove that η ∈ E ′ iff Ker(η) is closed. Problem 5.39 Let M =
Z
0
x ∈ C([−1, 1]) /
Z
1
x(t)dt = −1
0
1. Prove that M is a closed linear subspace of C([−1, 1]). ′ 2. Find η ∈ (C([−1, 1])) such that M = Ker(η).
x(t)dt .
224
Chapter 5. Fundamental theorems of Functional Analysis
3. Assume that x ∈ C([−1, 1]) \ M . Show that there is no y ∈ M such that d(x, M ) = ∥x − y∥. Problem 5.40 Consider η : C([−1, 1]) −→ R, given by ⟨η, x⟩ = x(0). 1. Prove that η ∈ (C([−1, 1]))′ . Z
1
2. Find g ∈ BV ([−1, 1]) such that ⟨η, x⟩ =
x(t) dg(t). −1
Z
1
1
3. Prove that there is no f ∈ C ([−1, 1]) such that ⟨η, x⟩ =
x(t) f (t) d(t). −1
Problem 5.41 Consider η : C([−1, 1]) −→ R, given by Z 1 x(−1) + x(1) ⟨η, x⟩ = + t x(t) dt. 2 −1 1. Prove that η ∈ (C([−1, 1]))′ . Z
1
2. Find g ∈ BV ([−1, 1]) such that ⟨η, x⟩ =
x(t) dg(t). −1
Problem 5.42 Consider η : H1 ([−1, 1]) −→ R, given by Z 1 ⟨η, x⟩ = [x(t) sin(t) + x′ (t) cos(t)] . −1
1. Prove that η ∈ (H ([−1, 1]))′ . 2. Compute ∥η∥. 1
Problem 5.43 Let X and Y be normed spaces. Prove that ∀T, S ∈ L(X, Y ) :
(T + S)′ = T ′ + S ′ ;
∀λ ∈ R, ∀T ∈ L(X, Y ) :
(λT )′ = λT ′ ; (IdX )′ = IdX ′ ;
∀n ∈ N, ∀T ∈ L(X, Y ) :
(T n )′ = (T ′ )n .
Problem 5.44 Let X and Y be normed spaces. Prove that if T ∈ L(X, Y ) and S ∈ L(Y, Z), then (ST )′ = T ′ S ′ . Problem 5.45 Let X be a normed space. Prove that if T ∈ L(X, Y ) is bijective with T −1 ∈ L(Y, X), then (T ′ )−1 = (T −1 )′ . Problem 5.46 Let X and Y be normed spaces. Prove that Φ : L(X, Y ) −→ L(Y ′ , X ′ ), given by Φ(A) = A′ , is continuous. Problem 5.47 Let H be a separable Hilbert space and A ∈ L(H). Identify H ∼ = H ′ and denote A∗ = A′ . Prove that Ker(AA∗ ) = Ker(A∗ ); Im(AA∗ ) = Im(A);
Ker(A∗ A) = Ker(A); ∥AA∗ ∥ = ∥A∥2 ;
(Im(A))⊥ = Ker(A∗ );
(Im(A∗ ))⊥ = Ker(A);
(Ker(A))⊥ = Im(A∗ );
(Ker(A∗ ))⊥ = Im(A).
5.12. Problems
225
Problem 5.48 Let X be a Banach space and A ∈ L(X). Prove that (Im(A))⊥ = Ker(A′ ). The orthogonality is understood in the sense of the duality product ⟨·, ·⟩X ′ ,X . Problem 5.49 Identify L2 ([0, 1]) with its dual. Consider the linear operator A : L2 ([0, 1]) −→ L2 ([0, 1]). 1. Prove that A ∈ L(L2 (0, 1)). 2. Find A∗ = A′ . Z
t
x(τ ) dτ ;
A[x](t) =
A[x](t) = t x(t);
0
Z A[x](t) = t
1
Z x(τ )dτ ;
1
A[x](t) =
0
tx(t)dt. 0
Problem 5.50 Identify (l1 (R))′ ∼ = l∞ (R). Consider the linear operator A : l1 (R) −→ 1 l (R). 1. Prove that A ∈ L(l1 (R)). 2. Try to compute ∥A∥. 3. Find A′ . For x = (x1 , x2 , ...) ∈ l1 (R), consider A(x) = (x1 , x2 , ..., xn , 0, 0, 0, ...); A(x) = (λ1 x1 , λ2 x2 , ..., λn xn , ...),
where ∥(λn )∥∞ ≤ 1;
A(x) = (0, x1 , x2 , ..., xn , ...); A(x) = (x2 , x3 , ..., xn , ...). Problem 5.51 Let X and Y be normed spaces, T ∈ L(X, Y ) and ∅ ̸= M ⊆ X. The annihilator of M is defined as M ⊥ = {η ∈ X ′ / ∀u ∈ M : ⟨η, u⟩ = 0}. 1. Prove that M ⊥ is a closed linear subspace of X ′ . ⊥ = Ker(T ′ ). 2. Prove that Im(T ) 3. Let U1 , U2 be two different linear subspaces of X. Prove that U1⊥ ̸= U2⊥ . Problem 5.52 Let X and Y be normed spaces, T ∈ L(X, Y ) and ∅ = ̸ B ⊆ X ′. ⊥ The annihilator of B is defined as B = {u ∈ X / ∀η ∈ B : ⟨η, u⟩ = 0}. 1. Prove that B ⊥ is a closed linear subspace of X. ⊥ 2. Prove that Im(T ) ⊆ (Ker(T ′ )) . Problem 5.53 Show that a reflexive space is a Banach space. Problem 5.54 Let 1 < p < +∞. ′ ′ 1. Prove that (lp (R)) ∼ = lp (R). Hint. Adapt the scheme used in Example 5.11.
2. Prove that l^p(R) is reflexive.
Problem 5.55 Let Y be a proper closed linear subspace of a normed space X. Let u₀ ∈ X \ Y and δ = d(Y, u₀).
1. Prove that there is η̃ ∈ X′ such that
∥η̃∥ = 1,    Y ⊆ Ker(η̃),    η̃(u₀) = δ.    (5.146)
Hint. The idea is to define a bounded linear functional η : ⟨Y ∪ {u₀}⟩ −→ R by η(v + αu₀) = αδ, where v ∈ Y and α ∈ R. Prove that η verifies (5.146) and then apply Theorem 5.5 to obtain η̃.
2. Prove that there is ν ∈ X′ such that
∥ν∥ = 1/δ,    Y ⊆ Ker(ν),    ν(u₀) = 1.
3. Give a different, simpler proof of point 1 when X is a Hilbert space.
Problem 5.56 Let E be a normed space. Prove that if E is reflexive then E′ is reflexive.
Problem 5.57 Prove that a closed linear subspace of a reflexive Banach space is reflexive.
Problem 5.58 Prove that a Banach space E is reflexive iff E′ is reflexive. Hint. Use Exercise 5.57.
Problem 5.59 Let X be a normed space and M ⊆ X. Prove that u₀ ∈ cl(⟨M⟩) iff ∀η ∈ X′ :
η ↾M = 0 ⇒ η(u0 ) = 0.
Problem 5.60 Let E be a Banach space and D ⊆ E ′ such that ∀u ∈ E :
⟨D, u⟩ = {⟨η, u⟩ / η ∈ D} is bounded in R.
Prove that D is bounded. Hint. Adapt the proof of Corollary 5.11 to use the uniform boundedness principle. Problem 5.61 Let E be a linear space such that (E, ∥ · ∥1 ) and (E, ∥ · ∥2 ) are Banach spaces. Assume that the norms ∥ · ∥1 and ∥ · ∥2 are comparable. Prove that ∥ · ∥1 and ∥ · ∥2 are equivalent. Problem 5.62 Let X and Y be linear spaces and T ∈ L(X, Y ). Prove that G (T ), the graph of T , is a linear subspace of X × Y . Problem 5.63 Let X and Y be Banach spaces. Prove that X × Y is a Banach space.
Problem 5.64 Let X and Y be normed spaces and T : D(T) ⊆ X → Y a linear operator. Prove that T is closed iff for every sequence (u_n)_{n∈N} ⊆ D(T) such that
lim_{n→+∞} u_n = u  ∧  lim_{n→+∞} T(u_n) = v,
it holds
u ∈ D(T)  ∧  T(u) = v.
Problem 5.65 Redo the proof of the closed graph theorem, Theorem 5.19, and prove the points that were just mentioned.
Problem 5.66 Let E and F be normed spaces.
1. Prove that the following formulas define norms on E × F:
∥(u, v)∥_∞ = max{∥u∥_E, ∥v∥_F},    ∥(u, v)∥₂ = (∥u∥²_E + ∥v∥²_F)^{1/2}.
2. Prove that the norms ∥·∥_∞ and ∥·∥₂ are equivalent to the norm given by ∥(u, v)∥₁ = ∥u∥_E + ∥v∥_F.
Problem 5.67 If the inverse T⁻¹ of a closed linear operator exists, show that T⁻¹ is a closed linear operator.
Problem 5.68 Let X and Y be normed spaces and T : D(T) ⊆ X → Y a closed linear operator. Assume that (u_n)_{n∈N}, (w_n)_{n∈N} ⊆ D(T) are such that
lim_{n→+∞} u_n = lim_{n→+∞} w_n
and that (T(u_n))_{n∈N}, (T(w_n))_{n∈N} ⊆ Y are convergent sequences. Prove that
lim_{n→+∞} T(u_n) = lim_{n→+∞} T(w_n).
Problem 5.69 Let X and Y be normed spaces and T : D(T) ⊆ X → Y a closed linear operator. Show that Ker(T) is a closed subspace of X.
Problem 5.70 For each n ∈ N, consider A_n : l²(R) −→ l²(R) given by
A_n(x) = (x_{n+1}, x_{n+2}, x_{n+3}, ...),    x = (x₁, x₂, ...).
1. Prove that for each n ∈ N, A_n ∈ L(l²(R));
2. Prove that lim_{n→+∞} A_n = 0.
3. For each n ∈ N, find A′_n.
4. Is it true that lim_{n→+∞} A′_n = 0?
Problem 5.71 Let X be a Banach space and A ∈ L(X). Assume that for every t ∈ R,
ϕ(t) = Σ_{k=0}^{+∞} λ_k t^k converges.
1. Prove that (S_n)_{n∈N} = (Σ_{k=0}^{n} λ_k A^k)_{n∈N} ⊆ L(X) converges to ϕ(A) in L(X).
2. Prove that ∥e^A∥ ≤ e^{∥A∥}.
3. Find e^{Id}, where Id is the identity operator.
4. Prove that the series Σ_{k=0}^{+∞} A^k converges in L(X) iff there is k₀ ∈ N ∪ {0} such that ∥A^{k₀}∥ < 1.
6. Weak topologies, reflexivity and separability
In this chapter, we study the so-called weak topologies on a normed space E and its dual space E′, and their properties related to separability, reflexivity, the Hausdorff condition, etc. These spaces are already equipped with strong topologies, T_{∥·∥_E} and T_{∥·∥_{E′}}, which are the topologies induced by the norms ∥·∥_E and ∥·∥_{E′}, respectively. The weak topologies are useful because a topology with fewer open sets has more compact sets; this, in turn, makes it possible to solve optimization problems by results like the one presented in Remark 2.22. Before studying this chapter, we recommend the student to review the material presented in Section 2.7.
6.1. Weak convergence and topology σ(E, E′)
Let E be a Banach space. Let's consider the elements of E′ as a family of functions, that is, let's consider the family (η)_{η∈E′}. Let's recall that a particular η₀ belongs to E′ iff the mapping η₀ : E −→ R is linear and continuous, i.e.,
∃c > 0, ∀u ∈ E : |⟨η₀, u⟩| ≤ c∥u∥.    (6.1)
Definition 6.1 (Weak topology, σ(E, E′)) The weak topology σ(E, E′) on E is the initial topology associated to the family (η)_{η∈E′}. Therefore, σ(E, E′) is the weakest topology for which all the elements of E′ are still continuous mappings.
Definition 6.1 immediately implies that
σ(E, E′) ⊆ T_{∥·∥},    (6.2)
i.e., all the weak open sets are open in the topology induced by the norm.
Remark 6.1 (Finite and infinite dimension) If dim(E) < +∞, then the weak topology σ(E, E′) coincides with the topology induced by the norm, i.e., we get equality in (6.2). Otherwise, if dim(E) = +∞, then σ(E, E′) ⊊ T_{∥·∥}, so that there are ∥·∥-open subsets of E which are not σ(E, E′)-open (see Example 6.2).
Remark 6.2 (Weak neighborhoods) For future purposes, let's denote by N_{∥·∥}(u) and N_w(u) the sets of strong and weak neighborhoods of u ∈ E, i.e., in the topologies T_{∥·∥} and σ(E, E′), respectively.
Remark 6.3 (The term "bounded") In the framework of (E, σ(E, E′)), the term "continuous" (operator) is better than "bounded" (operator), since boundedness, (6.1), requires the use of a norm.
Observe that given u₀ ∈ E, ϵ > 0, η ∈ E′ and a = ⟨η, u₀⟩, the following set is an open neighborhood of u₀ in the topology σ(E, E′):
V_{u₀}(η; ϵ) = η⁻¹(]a − ϵ, a + ϵ[)    (6.3)
  = {u ∈ E / a − ϵ < ⟨η, u⟩ < a + ϵ}    (6.4)
  = {u ∈ E / |⟨η, u − u₀⟩| < ϵ}.    (6.5)
Since the bounded open intervals form a topological basis of (R, U), by Remark 2.19, a basis of σ(E, E′) is built by the finite intersections of sets of the form (6.5). Consequently, an open fundamental system of u₀ ∈ E is formed by the sets of the form V_{u₀}(η₁, ..., η_n; ϵ₁, ..., ϵ_n) = ∩_{k=1}^{n} V_{u₀}(η_k; ϵ_k). Actually, there is a simpler kind of fundamental system of neighborhoods:
Proposition 6.1 (An open fundamental system in σ(E, E′)) Let E be a normed space and u₀ ∈ E. Then, in the weak topology σ(E, E′), an open fundamental system of u₀ is F_w(u₀) = {V_{u₀}(η₁, ..., η_n; ϵ) / η_k ∈ E′ ∧ ϵ > 0}, where
V_{u₀}(η₁, ..., η_n; ϵ) = ∩_{k=1}^{n} V_{u₀}(η_k; ϵ) = {u ∈ E / ∀k ∈ I_n : |⟨η_k, u − u₀⟩| < ϵ}.
The proof of this result is required as an exercise at the end of the chapter.
Proposition 6.2 (σ(E, E′) is Hausdorff) Let E be a Banach space. Then the weak topology σ(E, E′) is Hausdorff.
Proof. Thanks to Theorem 2.19, it's enough to prove that the family (η)_{η∈E′} separates points of E, i.e.,
∀u, v ∈ E, u ≠ v, ∃η ∈ E′ : ⟨η, u⟩ ≠ ⟨η, v⟩.
Let u, v ∈ E, u ≠ v, generic. Let's consider the disjoint compact convex sets A = {u} and B = {v}. By the second geometric form of the Hahn-Banach theorem, Theorem 5.12, there are η ∈ E′ and α ∈ R such that [η = α] strictly separates A and B, i.e.,
∃ϵ > 0 : ⟨η, u⟩ ≤ α − ϵ ∧ ⟨η, v⟩ ≥ α + ϵ,
whence ⟨η, u⟩ ≠ ⟨η, v⟩. Since u and v were chosen arbitrarily, we are done.
■
By Proposition 6.2 and Theorem 2.10, a sequence (u_n)_{n∈N} ⊆ E which converges in the weak topology σ(E, E′) has a unique limit u ∈ E. In this case, we denote
lim_{n→+∞} u_n = u, in σ(E, E′),
or
u_n ⇀ u, as n −→ +∞, in σ(E, E′),    (6.6)
or, simply, u_n ⇀ u, if there is no confusion. Since for a Hilbert space H it holds H ≅ H′, we also write u_n ⇀ u, weakly in H.
Remark 6.4 (Strong and weak convergence) Whenever (6.6) holds, we say that (u_n)_{n∈N} weakly converges to u. On the other hand, when (u_n)_{n∈N} converges to u in the norm of E, lim_{n→+∞} ∥u_n − u∥ = 0, we say that (u_n)_{n∈N} strongly converges to
u. This terminology is consistent with Definition 2.2.
Convergence in the weak topology σ(E, E′) is characterized in the following result:
Proposition 6.3 (Characterization of the weak convergence) Let E be a Banach space. Then u_n ⇀ u, in σ(E, E′), iff
∀η ∈ E′ : lim_{n→+∞} ⟨η, u_n⟩ = ⟨η, u⟩.
The proof follows immediately from Theorem 2.20. In the following result, we relax the condition of Proposition 6.3.
Theorem 6.1 (Weak convergence and Schauder basis) Let E be a Banach space such that E′ has a Schauder basis S = {η_m / m ∈ N}. Then, u_n ⇀ u, in σ(E, E′), iff
∀m ∈ N : lim_{n→+∞} ⟨η_m, u_n⟩ = ⟨η_m, u⟩.
The proof is required as an exercise at the end of the chapter.
Remark 6.5 (Weak convergence and total sets) The hypothesis of Theorem 6.1 can be relaxed by asking S to be total, i.e., cl(⟨S⟩) = E′. A proof of this can be found in [14].
Example 6.1 (Weak convergence in C([a, b]) is pointwise convergence) Let t₀ ∈ [a, b]. The t₀-translated Dirac's delta is the linear functional δ_{t₀} : C([a, b]) −→ R, given by ⟨δ_{t₀}, u⟩ = u(t₀). We have that
∀u ∈ C([a, b]) :
|⟨δ_{t₀}, u⟩| = |u(t₀)| ≤ max_{t∈[a,b]} |u(t)| = ∥u∥_∞,
so that δ_{t₀} ∈ (C([a, b]))′ and ∥δ_{t₀}∥ ≤ 1. Moreover, since 1_f ∈ C([a, b]), ∥1_f∥_∞ = 1 and ⟨δ_{t₀}, 1_f⟩ = 1, it follows, by (5.11), that ∥δ_{t₀}∥ = 1.
Now, let's consider S = {δ_{t₀} / t₀ ∈ [a, b]}. It can be proved that S is total, i.e., cl(⟨S⟩) = (C([a, b]))′. Then the condition of Remark 6.5 is fulfilled and u_n ⇀ u, in σ(C([a, b]), (C([a, b]))′), iff
∀t₀ ∈ [a, b] : lim_{n→+∞} ⟨δ_{t₀}, u_n⟩ = ⟨δ_{t₀}, u⟩,
iff
∀t₀ ∈ [a, b] : lim_{n→+∞} u_n(t₀) = u(t₀),
i.e., weak convergence in C([a, b]) is nothing but pointwise convergence.
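As a quick illustration of the previous example, let's check in Maxima (with a sequence chosen here only for illustration, not taken from the text) that u_n(t) = t^n − t^{2n}, defined on [0, 1], converges pointwise to 0 while it does not converge strongly: its maximum value on [0, 1] equals 1/4 for every n, attained at t = 2^{−1/n}.

assume(n > 0)$
u(n, t) := t^n - t^(2*n);
limit(u(n, 9/10), n, inf);                 /* -> 0 : pointwise convergence at t0 = 9/10 */
u(n, 1);                                   /* -> 0 : the sequence vanishes at t0 = 1 */
integrate(u(n, t), t, 0, 1);               /* -> 1/(n+1) - 1/(2*n+1) */
limit(integrate(u(n, t), t, 0, 1), n, inf);  /* -> 0 : testing against x |-> int_0^1 x(t) dt */
float(u(10, (1/2)^(1/10)));                /* -> 0.25, the maximum of u_10 */
float(u(50, (1/2)^(1/50)));                /* -> 0.25, the maximum of u_50 */

According to the example above, the pointwise convergence u_n(t₀) → 0 means that u_n ⇀ 0 weakly in C([0, 1]), although ∥u_n∥_∞ = 1/4 for every n, so the convergence is not strong.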
Remark 6.6 (Weak convergence in a separable Hilbert space) By Theorem 4.15, a separable Hilbert space H has a Hilbert basis B = {e_m / m ∈ N}. By the Riesz-Fréchet representation theorem, H ≅ H′, so that H satisfies the conditions of Theorem 6.1. Therefore, u_n ⇀ u, weakly in H, iff
∀m ∈ N : lim_{n→+∞} (e_m, u_n) = (e_m, u).    (6.7)
Let's now see that (6.7) has a nice interpretation. Since B is a Hilbert basis of H, we can write
u = Σ_{m=1}^{+∞} α_m e_m = Σ_{m=1}^{+∞} (u, e_m) e_m,
u_n = Σ_{m=1}^{+∞} α_m^{(n)} e_m = Σ_{m=1}^{+∞} (u_n, e_m) e_m.
Since B was arbitrary, we come to the conclusion that weak convergence of (un )n∈N to u in H means convergence of the coordinates of un to the corresponding coordinates of u in any Hilbert basis. Remark 6.7 (Weakly and strong boundedness) Let’s recall that in Corollary 5.11, we stated that a subset B of a Banach space E, is bounded iff it’s weakly bounded i.e., ∀η ∈ E ′ :
⟨η, B⟩ = η(B) = {⟨η, v⟩ / v ∈ B} is bounded in R.
Proposition 6.4 (Strong implies weak convergence) Let E be a Banach space, (un )n∈N ⊆ E and u ∈ E. Assume that (un )n∈N strongly converges to u. Then, (un )n∈N weakly converges to u. Proof. 1. By Theorem 2.11, that (un )n∈N strongly converges to u means that ∀U ∈ N∥·∥ (u), ∃N ∈ N :
n > N ⇒ un ∈ U.
Since σ(E, E ′ ) ⊆ T∥·∥ , it follows, by Proposition 6.1, that ∀U ∈ Fw (u), ∃N ∈ N :
n > N ⇒ un ∈ U,
so that, again by Theorem 2.11, (un )n∈N weakly converges to u. 2. The thesis also follows from the relation ∀n ∈ N :
| ⟨η, un − u⟩ | ≤ ∥η∥ ∥un − u∥. ■
Proposition 6.5 (Properties of weak convergence) Let E be a Banach space and (un )n∈N ⊆ E a sequence that weakly converges to u ∈ E. We have that 1. (un )n∈N is bounded and ∥u∥ ≤ lim inf ∥un ∥; n→+∞
2. if (ηn )n∈N ⊆ E ′ strongly converges to η ∈ E ′ , then lim ⟨ηn , un ⟩ = ⟨η, u⟩ .
n→+∞
(6.8)
Proof. 1. The boundedness of the sequence (un )n∈N follows from Corollary 5.11. We have that ∀η ∈ E ′ , ∀n ∈ N : | ⟨η, un ⟩ | ≤ ∥η∥ ∥un ∥, so that, by taking inferior limit, we get ∀η ∈ E ′ :
| ⟨η, u⟩ | ≤ ∥η∥ · lim inf ∥un ∥, n→+∞
whence, by Corollary 5.7, ∥u∥ =
max | ⟨η, u⟩ | ≤ max ∥η∥ · lim inf ∥un ∥ ≤ lim inf ∥un ∥. n→+∞ n→+∞ η ∈ E′ η ∈ E′ ∥η∥ ≤ 1 ∥η∥ ≤ 1
2. For all n ∈ N, we have that | ⟨ηn , un ⟩ − ⟨η, u⟩ | = | ⟨ηn , un ⟩ − ⟨η, un ⟩ + ⟨η, un ⟩ − ⟨η, u⟩ | ≤ | ⟨ηn − η, un ⟩ | + | ⟨η, u − un ⟩ | ≤ ∥ηn − η∥ · ∥un ∥ + | ⟨η, u − un ⟩ |, whence, by letting n −→ +∞, we get (6.8). ■ Proposition 6.6 (Continuity in weak topology) Let (Z, T ) be a topological space and E a Banach space. Then, ψ : Z −→ (E, σ(E, E ′ )) is continuous iff ∀η ∈ E ′ : η ◦ ψ ∈ C(Z). The proof is an immediate consequence of Theorem 2.21. Example 6.2 (A closed set which is not weakly closed) Let E be an infinitedimensional Banach space. We know that the sphere S = S(0, 1) = {u ∈ E / ∥u∥ = 1} is closed in (E, ∥ · ∥), i.e., cl∥·∥ (S) = S. However, S is not closed in the weak topology σ(E, E ′ ). Actually, we shall prove that clσ(E,E ′ ) (S) = B E (0, 1), which, in particular, implies that BE (0, 1) ∈ / σ(E, E ′ ).
(6.9)
Proof. Let’s prove that BE (0, 1) ⊆ clσ(E,E ′ ) (S). Then, by Proposition 6.1 and Remark 2.5, we have to prove that ∀u0 ∈ BE (0, 1), ∀V ∈ Fw (u0 ) :
V ∩ S ̸= ∅.
(6.10)
Let u0 ∈ BE (0, 1) and V ∈ Fw (u0 ), generic. Then, by Proposition 6.1, we can assume that there are some η1 , ..., ηk ∈ E ′ and ϵ > 0 such that V = Vu0 (η1 , ..., ηk ; ϵ) = {u ∈ E / ∀j ∈ Ik : | ⟨ηj , u − u0 ⟩ | < ϵ}. Since E is infinite-dimensional, there is some y0 ∈ E \ {0} such that ⟨ηj , y0 ⟩ = 0,
j = 1, ..., k.
(6.11)
Now, we define g : [0, +∞[ −→ R by g(t) = ∥u₀ + t y₀∥. Then g ∈ C([0, +∞[) and, as u₀ ∈ B_E(0, 1),
g(0) = ∥u₀∥ < 1  and  lim_{t→+∞} g(t) = +∞.
Therefore, there is some t0 > 0 such that g(t0 ) = ∥u0 + t0 y0 ∥ = 1 so that u0 + t0 y0 ∈ S.
(6.12)
Moreover, by (6.11), we have, for every j ∈ I_k, |⟨η_j, (u₀ + t₀y₀) − u₀⟩| = 0 < ϵ, so that
u₀ + t₀y₀ ∈ V.    (6.13)
By (6.12), (6.13) and the arbitrariness of u₀ and V, we have proved (6.10). Until now, we have proved that B̄_E(0, 1) ⊆ cl_{σ(E,E′)}(S). It's easy to check that if ∥u∥ > 1, then u does not belong to cl_{σ(E,E′)}(S). ■
Example 6.3 (A non strongly convergent sequence which weakly converges) Let H be a separable Hilbert space with Hilbert basis B = {e_m / m ∈ N}. It's easy to show that e_m ⇀ 0, weakly in H, but (e_m)_{m∈N} does not converge strongly to 0 because ∥e_m∥ = 1, for every m ∈ N. Therefore, 0 ∈ cl_{σ(H,H′)}(S(0, 1)), in agreement with point (6.9) in Example 6.2.
Example 6.4 (Fourier-Legendre basis) In Section 4.7.1, we saw that the Legendre polynomials form a Hilbert basis for L²([−1, 1]), the Fourier-Legendre basis: E = {P_n / n ∈ N}. By Example 6.3, we have that P_n ⇀ 0, weakly in L²([−1, 1]); see the Maxima check below.
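The following Maxima computation (a small numerical check; the test function u(x) = e^x is chosen here only for illustration and is not part of the original example) is consistent with Example 6.4: the coordinates of a fixed element of L²([−1, 1]) with respect to the normalized Legendre polynomials tend to zero.

load("orthopoly")$
/* inner products ( sqrt((2*n+1)/2)*P_n , exp ) in L^2([-1,1]) for n = 0,...,6 */
makelist(float(sqrt((2*n+1)/2) * integrate(legendre_p(n, x) * exp(x), x, -1, 1)), n, 0, 6);
/* the values decrease rapidly towards 0, in agreement with P_n ⇀ 0 */

Here legendre_p comes from Maxima's orthopoly package; the rapid decay of these inner products is exactly the coordinatewise convergence described in Remark 6.6.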
Let's show now that, for a convex set, being weakly closed is equivalent to being strongly closed.
Theorem 6.2 (Closed convex sets) Let E be a Banach space and D ⊆ E convex. Then D is weakly closed iff it's strongly closed.
Proof. Since σ(E, E′) ⊆ T_{∥·∥}, we just have to prove that if D is strongly closed then it's weakly closed or, equivalently, that D^c is weakly open. If D = E, then D is weakly closed. Then let's assume that D ≠ E is strongly closed. Let u₀ ∈ D^c, generic. Let's show that
u₀ ∈ int_{σ(E,E′)}(D^c).    (6.14)
By the second geometric form of Hahn-Banach theorem, there are η ∈ E ′ and α ∈ R such that the hyperplane [η = α] strictly separates D and {u0 }. Then, ∀v ∈ D :
⟨η, u0 ⟩ < α < ⟨η, v⟩ .
−1
The set V = η (] − ∞, α[) = {u ∈ E / ⟨η, u⟩ < α}, is a weakly open neighborhood of u0 and V ∩ D = ∅, so that V ⊆ Dc . This shows (6.14). ■ Corollary 6.1 Let E be a Banach space. Assume that un ⇀ u, weakly in σ(E, E ′ ). Let’s denote A = {un / n ∈ N}. Then, there is (yn )n∈N ⊆ Conv(A) ⊆ E such that lim ∥yn − u∥ = 0.
n→+∞
In Corollary 6.1, we are using the definition of convex hull, see Remark 4.30. Remark 6.8 Let (X, T ) be a topological space, S ⊆ X, f : S −→ R and x0 ∈ S. Let’s recall that f is lower semicontinuous (l.s.c.) iff ∀ ϵ > 0, ∃V ∈ T :
x ∈ V ∩ S ⇒ f (x0 ) − ϵ < f (x);
and that f is upper semicontinuous (u.s.c.) iff ∀ ϵ > 0, ∃V ∈ T :
x ∈ V ∩ S ⇒ f (x) < f (x0 ) + ϵ.
Let E be a normed space, φ : E −→ R, u0 ∈ E. Since σ(E, E ′ ) ⊆ T∥·∥ , by (2.43) and (2.44), it immediately follows that 1. if φ is weakly l.s.c. at u0 , then φ is strongly l.s.c. at u0 ; 2. if φ is weakly u.s.c. at u0 , then φ is strongly u.s.c. at u0 ; 3. if φ is weakly continuous at u0 , then φ is strongly continuous at u0 . Corollary 6.2 (Weakly l.s.c.) Let E be a Banach space Assume that φ : E −→ R is convex and strongly l.s.c.. Then φ is weakly l.s.c. The proof is required as an exercise at the end of the chapter. The idea is to use Proposition 4.3. Example 6.5 Let E be a Banach space. Since for every u, v ∈ E and t ∈ [0, 1], | ∥u∥ − ∥v∥ | ≤ ∥u − v∥, ∥tu + (1 − t)v∥ ≤ t∥u∥ + (1 − t)∥v∥, it follows that ∥ · ∥ is continuous and convex. Therefore, ∥ · ∥ is also weakly l.s.c.
Theorem 6.3 (Continuity of linear operators) Let E and F be Banach spaces and T ∈ L(E, F ). We denote Ew = (E, σ(E, E ′ )) and Fw = (F, σ(F, F ′ )) Then T ∈ L(E, F ) iff T : Ew → Fw is continuous. The proof is required as an exercise at the end of the chapter. The idea is to use the closed graph theorem for one of the implications.
6.2. Weak ∗ convergence and topology σ(E ′ , E) Let E be a Banach space. Until now, we have got two useful topologies on E ′ :1) T∥·∥E′ , the strong topology induced by the norm ∥ · ∥E ′ , and 2) σ(E ′ , E ′′ ), the weak topology. In this section, we shall introduce a topology even weaker, the weak ∗ topology, σ(E ′ , E). As it was already mentioned, a reason to search for topologies with less and less number of open sets is because it increases the number of compact sets - which are nice for solving e.g. optimization problems. Let’s use the notation of Section 5.9. The canonical mapping J : E −→ E ′′ is an isometry which is not always surjective, i.e., not always E is reflexive. In any case, we have E∼ (6.15) = J(E) ⊆ E ′′ . We consider J(E), the image of J, as a family of functions: (ψu )u∈E or (J(u))u∈E . Let’s recall that a particular ψu0 belongs to J(E) iff the mapping ψu0 : E ′ −→ R, given by ⟨ψu0 , η⟩E ′′ ,E ′ = ⟨η, u0 ⟩E ′ ,E , is linear and continuous, ∃c > 0, ∀η ∈ E ′ :
| ⟨ψu0 , η⟩ | = | ⟨η, u0 ⟩ | ≤ c∥η∥E ′ .
Definition 6.2 (Weak ∗ topology, σ(E ′ , E)) The weak ∗ topology, σ(E ′ , E) on E ′ , is the initial topology associated to the family (ψu )u∈E = (J(u))u∈E ⊆ E ′′ . Therefore, σ(E ′ , E) is the weakest topology for which all the elements of J(E) ∼ =E are still continuous mappings. Since J(E) ⊆ E ′′ , it follows that σ(E ′ , E) ⊆ σ(E ′ , E ′′ ) ⊆ T∥·∥E′ ,
(6.16)
i.e., all the weak ∗ open sets are open in the weak topology of E ′ , which in its turn are open in the topology induced by the norm of E ′ . Remark 6.9 The identification (6.15) justifies the use of the notation σ(E ′ , E) instead of σ(E ′ , J(E)). Remark 6.10 (Weak topology for functionals) The ∗ in the term “weak ∗ topology” has to do with the notation E ∗ which is also common for the dual of E. Sometimes the weak ∗ topology σ(E ′ , E) is referred to as the weak topology for functionals.
Remark 6.11 (Reflexivity and finite dimension) Whenever E is a reflexive Banach space, σ(E′, E) = σ(E′, E′′). This is the case of Hilbert spaces, as was proved in Theorem 5.14. If dim(E) < +∞, then σ(E′, E) = σ(E′, E′′) = T_{∥·∥_{E′}}.
Given η₀ ∈ E′, ϵ > 0, u ∈ E and a = ⟨ψ_u, η₀⟩ = ⟨η₀, u⟩, the following set is an open neighborhood of η₀ in the topology σ(E′, E):
V_{η₀}(u; ϵ) = ψ_u⁻¹(]a − ϵ, a + ϵ[) = {η ∈ E′ / a − ϵ < ⟨ψ_u, η⟩ < a + ϵ} = {η ∈ E′ / |⟨η − η₀, u⟩| < ϵ}.    (6.17)
Since the bounded open intervals form a topological basis of (R, U), by Remark 2.19, a basis of σ(E′, E) is built by the finite intersections of sets of the form (6.17). Consequently, an open fundamental system of neighborhoods of η₀ ∈ E′ is formed by sets of the form V_{η₀}(u₁, ..., u_n; ϵ₁, ..., ϵ_n) = ∩_{k=1}^{n} V_{η₀}(u_k; ϵ_k). Actually, there is a simpler kind of fundamental system of neighborhoods:
Proposition 6.7 (An open fundamental system in σ(E′, E)) Let E be a Banach space and η₀ ∈ E′. Then, in the weak ∗ topology σ(E′, E), an open fundamental system of η₀ is F_{w∗}(η₀) = {V_{η₀}(u₁, ..., u_n; ϵ) / u_k ∈ E ∧ ϵ > 0}, where
V_{η₀}(u₁, ..., u_n; ϵ) = ∩_{k=1}^{n} V_{η₀}(u_k; ϵ) = {η ∈ E′ / ∀k ∈ I_n : |⟨η − η₀, u_k⟩| < ϵ}.
The proof of this result is required as an exercise at the end of the chapter.
Proposition 6.8 (σ(E′, E) is Hausdorff) Let E be a Banach space. Then the weak ∗ topology σ(E′, E) is Hausdorff.
Proof. Thanks to Theorem 2.19, we just need
∀η, ν ∈ E′, η ≠ ν, ∃u ∈ E : η(u) ≠ ν(u),
which is trivially true. ■
By Proposition 6.8 and Theorem 2.10, a sequence (η_n)_{n∈N} ⊆ E′ which converges in the weak ∗ topology σ(E′, E) has only one limit η ∈ E′. In this case, we denote
lim_{n→+∞} η_n = η, in σ(E′, E),
or
η_n ⇀∗ η, in σ(E′, E),    (6.18)
or, simply, η_n ⇀ η, if there is no confusion. In this case, we say that (η_n)_{n∈N} weakly ∗ converges to η.
Remark 6.12 (Weakly-∗ and strong boundedness) Let’s recall that, in Corollary 5.12, we stated that a subset D of E ′ , E Banach, is bounded iff it’s weakly -∗ bounded, i.e., ∀u ∈ E :
⟨D, u⟩ = {⟨η, u⟩ / η ∈ D} is bounded in R.
Convergence in the weak ∗ topology σ(E ′ , E) is characterized in the following result. Proposition 6.9 (Characterization of weak ∗ convergence) Let E be a ∗ Banach space. Then ηn ⇀ η, in σ(E ′ , E), iff ∀u ∈ E :
lim ⟨ηn , u⟩ = ⟨η, u⟩ .
n→+∞
The proof follows immediately from Theorem 2.20. In the following result, we relax the condition of Proposition 6.9.
Theorem 6.4 (Weak ∗ convergence and Schauder basis) Let E be a Banach space which has a Schauder basis S = {u_m / m ∈ N}. Then η_n ⇀∗ η, in σ(E′, E), iff
∀m ∈ N : lim_{n→+∞} ⟨η_n, u_m⟩ = ⟨η, u_m⟩.
The proof is required as an exercise at the end of the chapter.
Remark 6.13 (Weak ∗ convergence and total sets) The hypothesis of Theorem 6.4 can be relaxed by asking S to be total, i.e., cl(⟨S⟩) = E.
Example 6.6 (Dirac's delta as a weak ∗ limit) Let a ≥ 1. In Section 5.6, we saw that the action of a functional η ∈ (C([−a, a]))′ can be expressed as a Riemann-Stieltjes integral,
⟨η, u⟩ = ∫_{−a}^{a} u(t) dw(t),    (6.19)
where w belongs to BV([−a, a]), the space of functions of bounded variation. Here ∥η∥ = Var(w). As it was mentioned, C¹([−a, a]) ⊆ BV([−a, a]), and if w ∈ C¹([−a, a]), then (6.19) can be written as
⟨η, u⟩ = ∫_{−a}^{a} u(t) w′(t) dt.
We also saw that it holds the isomorphism (C([−a, a]))′ ≅ B̃V([−a, a]), where B̃V([−a, a]) is the Banach space formed by the functions x ∈ BV([−a, a]) which are continuous from the right and such that x(−a) = 0. Observe that Q_m([−a, a]) = {u ∈ C^m([−a, a]) / u(−a) = 0} ⊆ B̃V([−a, a]).
In Example 6.1, we saw that Dirac's delta, δ : C([−a, a]) −→ R, given by ⟨δ, u⟩ = u(0),
belongs to (C([−a, a]))′ with ∥δ∥ = 1. Now, for each n ∈ N, let's pick ϕ_n ∈ C([−a, a]) such that
∀t ∈ [−a, a] : ϕ_n(t) ≥ 0;    (6.20)
∀|t| > 1/n : ϕ_n(t) = 0;    (6.21)
∫_{−a}^{a} ϕ_n(t) dt = 1.    (6.22)
We take now Φ_n : [−a, a] −→ R, given by
Φ_n(t) = ∫_{−a}^{t} ϕ_n(τ) dτ.
It's clear that
∀n ∈ N : Φ_n ∈ Q₁([−a, a]) ⊆ B̃V([−a, a]) ≅ (C([−a, a]))′.
Then, we have a sequence (ν_n)_{n∈N} ⊆ (C([−a, a]))′ where
⟨ν_n, u⟩ = ∫_{−a}^{a} u(t) dΦ_n(t) = ∫_{−a}^{a} u(t) ϕ_n(t) dt.    (6.23)
Now, let's prove that ν_n ⇀∗ δ, in σ((C([−a, a]))′, C([−a, a])), i.e., that
∀u ∈ C([−a, a]) : lim_{n→+∞} ⟨ν_n, u⟩ = ⟨δ, u⟩ = u(0).    (6.24)
Proof. Let u ∈ C([−a, a]), generic. By using the Mean Value Theorem for integrals, we have, for each n ∈ N,
⟨ν_n, u⟩ = ∫_{−a}^{a} u(t) ϕ_n(t) dt = ∫_{−1/n}^{1/n} u(t) ϕ_n(t) dt = u(α_n) · ∫_{−1/n}^{1/n} ϕ_n(t) dt = u(α_n),
with αn ∈ [−1/n, 1/n], whence, lim ⟨νn , u⟩ = lim u(αn ) = u(0) = ⟨δ, u⟩ .
n→+∞
n→+∞
Since u was chosen arbitrarily, we have proved (6.24).
■
Remark 6.14 (May the Force be with us...) The property that we have proved in Example 6.6 is usually described in Engineering in an intuitive but wrong way. It's considered the existence of a "function" δ : [−a, a] −→ R such that
δ(t) = +∞ if t = 0,    δ(t) = 0 if t ∈ [−a, a] \ {0},    ∫_{−a}^{a} δ(t) dt = 1,
and
∀u ∈ C([−a, a]) : ∫_{−a}^{a} u(t) δ(t) dt = u(0).
In most cases, engineers obtain right answers in their computations where Dirac's delta is involved, but they do not really know what they are doing.
Example 6.7 Let’s work in Maxima a particular case of Example 6.6. Then let’s consider Dirac’s delta δ : C([−1, 1]) −→ R, given by (% i1)
δ(u):= u(0);
so that (% i2)
δ(cos); 1
(% o2)
3
(% o5)
and for f : [−1, 1] −→ R, given by (% i4)
f(t):= exp(t/2)+2;
we have (% i5)
δ(f);
Let’s create a sequence (φn )n∈N verifying (6.20), (6.21) and (6.22). Let’s consider as a model the quadratic function given by (% i6)
g(t):=(1-tˆ2);
For each n ∈ N we define φn : [−1, 1] −→ R by (% i8) (% i9)
h[n](t):= (3/4)*n*g(n*t); ϕ[n](t):= if abs(t)>1/n then 0 else h[n](t); ϕn (t) := if |t| ≥
1 then 0 else hn (t) n
(% o9)
Condition (6.22) holds: (% i11) integrate(h[n](t),t,-1/n,1/n); 1
Figure 6.1.: The functions φ1 (blue), φ2 (red) and φ3 (green).
Now, for each n ∈ N, let’s define the functional given by (6.24):
(% o11)
6.2. Weak ∗ convergence and topology σ(E ′ , E)
241
(% i12) ν(u):= integrate(u(t)*h[n](t),t,-1/n,1/n); 1 n
Z ν(u) :=
−1 n
u(t)hn (t)dt
(% o12)
Let’s verify that (6.24) holds for a couple of functions: (% i13) ν(cos); 3n 4 sin
1 n
n2 − 4 cos 4
1 n
n
(% o13)
(% i14) limit(ν(cos),n,inf); 1
(% o14)
3
(% o18)
(% i18) limit(ν(f),n,inf);
Proposition 6.10 (Weak implies weak ∗ convergence) Let E be a Banach space, (ηn )n∈N ⊆ E ′ and η ∈ E ′ . Assume that (ηn )n∈N weakly converges to η. Then (ηn )n∈N weakly ∗ converges to η. This last result is an immediate consequence of (6.16). Proposition 6.11 (Properties of weak ∗ convergence) Let E be a Banach space and (ηn )n∈N ⊆ E ′ a sequence that weakly ∗ converges to η ∈ E ′ . We have that 1. (ηn )n∈N is bounded and ∥η∥ ≤ lim inf ∥ηn ∥; n→+∞
2. if (un )n∈N ⊆ E strongly converges to u ∈ E, then lim ⟨ηn , un ⟩ = ⟨η, u⟩ .
n→+∞
The proof is required as an exercise at the end of the chapter. The idea is to adapt the proof of Proposition 6.5. Proposition 6.12 (Continuity and weak ∗ convergence) Let (Z, T ) be a topological space and E a Banach space. Then g : Z −→ (E ′ , σ(E ′ , E)) is continuous iff ∀u ∈ E : ψu ◦ g ∈ C(Z). The proof is an immediate consequence of Theorem 2.21. Next, we show that a linear functional on E ′ which is weakly ∗ continuous can be characterized in a Riesz-like way.
Theorem 6.5 (Weakly ∗ continuous linear functionals) Let E be a Banach space. Assume that the linear functional ψ : (E ′ , σ(E ′ , E)) −→ R is continuous. Then, there exists u0 ∈ E such that ∀η ∈ E ′ :
⟨ψ, η⟩ = ⟨η, u0 ⟩ .
To build a proof of Theorem 6.5 we will use the following result, required as an exercise at the end of the chapter. Lemma 6.1 (Linear dependence of functionals) Let X be a linear space and η, η1 , ..., ηk : X −→ R linear. Assume that ∀u ∈ X :
[∀j ∈ Ik : ⟨ηj , u⟩ = 0] ⇒ ⟨η, v⟩ = 0.
Then, η ∈ ⟨{η1 , ..., ηk }⟩, i.e., there exist λ1 , ..., λk ∈ R such that k X
η=
λj ηj .
j=1
Proof. [of Theorem 6.5] Since ψ is weakly ∗ continuous at zero, ∀ϵ > 0, ∃Uϵ ∈ Fw∗ (0) :
η ∈ Uϵ ⇒ | ⟨ψ, η⟩ | < ϵ.
(6.25)
Let’s consider ϵ = 1 in (6.25). Since U1 ∈ Fw∗ (0), it has the form U1 = V0 (u1 , ..., un ; δ) = {η ∈ E ′ / ∀j ∈ In : | ⟨η, uj ⟩ | < δ, } for some u1 , ..., un ∈ E and δ > 0. Now, we claim that for a generic η ∈ E ′ \ {0},
[∀j ∈ In : ψuj , η = 0] ⇒ ⟨ψ, η⟩ = 0, (6.26) where ψuj = J(uj ), j = 1, ..., n. Then Lemma 6.1 implies that there exist λ1 , ..., λn ∈ R such that n X ψ= λj ψuj , (6.27) j=1
Let’s choose u0 =
n X
λj uj . Then, by (6.27), we get, for an arbitrary η ∈ E ′ , that
j=1
ψ(η) =
* n X j=1
+ λj ψuj , η
=
n X
n
X λj ψuj , η = λj ⟨η, uj ⟩ = ⟨η, u0 ⟩ ,
j=1
j=1
and we are done. Let’s prove by Reduction to Absurdity that (6.26) holds. So let’s suppose that
⟨ψ, η⟩ = ̸ 0 and ∀j ∈ In : ψuj , η = 0, (6.28) Point (6.28) and the arbitrariness of η imply that ∀η ∈ E ′ \ {0} :
| ⟨ψ, η⟩ | = a > 0,
which implies that (6.25) fails for 0 < ϵ < a, a contradiction. To show this, take some u ∈ Uϵ \ {0}. ■
Corollary 6.3 (Weakly ∗ closed hyperplanes) Let E be a Banach space and H ⊆ E ′ a hyperplane which is closed in σ(E ′ , E). Then there are u0 ∈ E \ {0} and α ∈ R such that H = {η ∈ E ′ / ⟨η, u0 ⟩ = α}. Proof. 1. Since H is an hyperplane in E ′ , there are α ∈ R and φ ∈ L(E ′ , R) \ {0} such that H = {η ∈ E ′ / ⟨φ, η⟩ = α}. 2. In virtue of Theorem 6.5, we just need to show that φ is weakly ∗ continuous. For this, it’s enough to show that φ is weakly ∗ continuous at 0 ∈ E ′ .
(6.29)
3. Let’s prove (6.29). Let η0 ∈ H c . Since H c is open in σ(E ′ , E), there are u1 , ..., un ∈ E and ϵ > 0 such that η0 ∈ V = Vη0 (u1 , ..., un ; ϵ) = {η ∈ E ′ / ∀k ∈ In : | ⟨η − η0 , uk ⟩ | < ϵ}. Since V is convex we have that, for every η ∈ V , either ⟨φ, η⟩ < α or ⟨φ, η⟩ > α. Without loss of generality, let’s assume that ∀η ∈ V :
⟨φ, η⟩ < α.
Then, by putting W = V − η0 , we get that W is a neighborhood of 0 and ∀ν ∈ W :
⟨φ, ν⟩ < α − ⟨φ, η0 ⟩ .
Since W = −W , we also obtain ∀ν ∈ W :
| ⟨φ, ν⟩ | < |α − ⟨φ, η0 ⟩ |.
The last shows (6.29). ■ Remark 6.15 (Closed convex sets) We do not have a result like Theorem 6.2 for the weak ∗ topology σ(E ′ , E). Then a strongly closed convex set could not be weakly ∗ closed. We finish this section by stating a very important compactness result, BanachAlouglu-Bourbaki theorem. Remark 6.16 Given a normed space W , we shall denote its closed unit ball by B W = {v ∈ W / ∥v∥ ≤ 1}. Theorem 6.6 (Banach-Alouglu-Bourbaki) Let E be a Banach space. Then B E ′ is compact in the weak ∗ topology σ(E ′ , E). A totally general proof of this result can be found in [4]. We shall provide a proof only for the much simpler case of E separable. For this we need the following important result.
Theorem 6.7 (Metrizability of B E ′ ) Let E be a Banach space. Then E is separable iff B E ′ is metrizable in the weak ∗ topology σ(E ′ , E). A proof of this result can be found in [4, Th.3.28]. The first implication is quite similar to the first implication appearing in the proof of Theorem 6.9 below but changing the roles of E and E ′ . Let’s just show a metric for B E ′ . Let S = {un / n ∈ N} be dense in B E , in the sense of the norm ∥ · ∥E . The functional [·] : E ′ → R, given by +∞ X 1 | ⟨η, un ⟩ |, [η] = n 2 n=1 is a norm such that ∀η ∈ E ′ :
[η] ≤ ∥η∥.
Let’s denote by d the metric induced by [·] on E ′ . Then σ(E ′ , E) coincides with the topology induced by [·] when both are restricted to B E ′ . Proof. [of Theorem 6.6 for E separable] By Theorem 6.7, B E ′ is metrizable. Therefore, it’s enough to show that B E ′ is sequentially compact. Let (ηm )m∈N ⊆ B E ′ , generic. We have to prove that (ηm )m∈N has a subsequence which is weakly ∗ convergent in σ(E ′ , E). Since E is separable there is D = {un / n ∈ N} ⊆ E dense. It’s clear that D is also total: ⟨D⟩ = E. We have that ∀m ∈ N : | ⟨ηm , u1 ⟩ | ≤ ∥ηm ∥ ∥u1 ∥ ≤ ∥u1 ∥, (1)
so that (⟨ηm , u1 ⟩)m∈N Then we pick (ηm )m∈N , subsequence of D⊆ R is bounded. E (1) (ηm )m∈N , such that ηm , u 1 ⊆ R is convergent. We have that m∈N
D E (1) (1) ∥ ∥u2 ∥ ≤ ∥u2 ∥, ηm , u2 ≤ ∥ηm
∀m ∈ N : D
(1)
E
(2)
⊆ R is bounded. Then we pick (ηm )m∈N , subsequence E (1) (2) of (ηm )m∈N , such that ηm , u2 ⊆ R is convergent. Working in this so that
ηm , u 2
m∈ND
m∈N
(n)
(n−1)
way, D for every nE∈ N, we pick (ηm )m∈N , subsequence of (ηm )m∈N , such (m) (n) that ηm , u n ⊆ R is convergent. Then (ηm )m∈N is a subsequence of m∈N
(ηm )m∈N such that ∀n ∈ N :
D E (m) ( ηm , un )m∈N ⊆ R (m)
is convergent.
Therefore, by Remark 6.13, it follows that (ηm )m∈N is weakly ∗ convergent in σ(E ′ , E). ■ As a consequence of Theorems 6.6 and 6.7, we have the following very useful result.
6.3. Reflexivity and separability (II)
245
Corollary 6.4 Let E be a separable Banach space and (ηn )n∈N ⊆ E ′ , bounded. Then there exists a subsequence (ηnk )k∈N that converges in the weak ∗ topology σ(E ′ , E). The proof is required as an exercise at the end of the chapter.
6.3. Reflexivity and separability (II) In this section, we continue with the study that was initiated in Section 5.9 about spaces which are separable or reflexive. Let’s state a characterization of reflexive spaces in terms of the compactness of the closed unit ball. Theorem 6.8 (Kakutani’s reflexivity criterion) Let E be a Banach space. Then E is reflexive iff B E is compact in the weak topology σ(E, E ′ ). To prove Theorem 6.8 we shall use Lemma 6.3, below. Lemma 6.2 Let E be a Banach space, η1 , ...ηk ∈ E ′ and γ1 , ..., γk ∈ R. Then the following properties are equivalent: ∀ϵ > 0, ∃uϵ ∈ B E , ∀j ∈ Ik : ∀β1 , ..., βk ∈ R :
| ⟨ηj , uϵ ⟩E ′ ,E − γj | < ϵ;
k k
X X
β j ηj βj γ j ≤
.
j=1 j=1
(6.30) (6.31)
A proof of Lemma 6.2 can be found in [4]. Lemma 6.3 (Density of J(E)) Let E be a Banach space. Then J(B E ) is dense in B E ′′ in the weak topology σ(E ′′ , E ′ ). Therefore J(E) is dense in E ′′ in σ(E ′′ , E ′ ), i.e., J(E) = E ′′ ,
in σ(E ′′ , E ′ ).
Proof. We have to prove that clσ(E ′′ ,E ′ ) (J(B E )) ⊇ B E ′′ , i.e., ∀ψ ∈ B E ′′ , ∀V ∈ Fw∗ (ψ) :
V ∩ J(B E ) ̸= ∅.
(6.32)
Let ψ ∈ B E ′′ and V ∈ Fw∗ (ψ). By Proposition 6.7, there are η1 , ..., ηn ∈ E ′ and ϵ > 0 such that V = Vψ (η1 , ..., ηn ; ϵ) = {θ ∈ E ′′ / ∀j ∈ In : | ⟨ψ − θ, ηj ⟩E ′′ ,E ′ | < ϵ}. To prove that V ∩ J(B E ) ̸= ∅, we have to find an element u ∈ B E such that ψu = J(u) ∈ V , i.e., for j ∈ In , ⟨ψu − ψ, ηj ⟩E ′′ ,E ′ = ⟨ηj , u⟩E ′ ,E − ⟨ψ, ηj ⟩E ′′ ,E ′ < ϵ,
246
Chapter 6. Weak topologies, reflexivity and separability
which can be written as
⟨ηj , u⟩E ′ ,E − γj < ϵ,
(6.33)
where γj = ⟨ψ, ηj ⟩E ′′ ,E ′ . Since ψ ∈ B E ′′ , ∥ψ∥E ′′ ≤ 1 and for generic β1 , ..., βn ∈ R we get * + n n n X X X ψ, = β η β ⟨ψ, η ⟩ = β γ j j j j E ′′ ,E ′ j j j=1 j=1 j=1 E ′′ ,E ′
X
X
n
n
≤ β η ≤ ∥ψ∥E ′′ β η j j . j j
j=1
j=1
(6.34)
By (6.34) and Lemma 6.2, we get an element u ∈ B E verifying (6.33). We conclude (6.32), by the arbitrariness of ψ and V . ■ Proof. [of Theorem 6.8] i) Let’s assume that E is reflexive. By Theorem 6.6, is compact in σ(E ′′ , E ′ ).
B E ′′
(6.35)
Since B E ′′ = J(B E ) and E ′′ ∼ = E, it follows from (6.35) that B E is compact in σ(E, E ′ ). ii) Let’s assume that B E is compact in σ(E, E ′ ). (6.36) By Theorem 6.3, we have that J : (E, σ(E, E ′ )) −→ (E ′′ , σ(E ′′ , E ′′′ )) ′′
′′′
′′
(6.37)
′
is continuous. Since σ(E , E ) ⊇ σ(E , E ), point (6.37) implies that J : (E, σ(E, E ′ )) −→ (E ′′ , σ(E ′′ , E ′ )) is also continuous. Therefore, by (6.36) and Theorem 2.25, it follows that J(B E ) is compact (in particular closed) in σ(E ′′ , E ′ ).
(6.38)
Moreover, by Lemma 6.3, J(B E ) is dense in B E ′′ in the weak topology σ(E ′′ , E ′ ).
(6.39)
From (6.38) and (6.39), we get B E ′′ = J(B E ), and, therefore, E = J(E), i.e., E is reflexive. ■ In the next result, we state that reflexivity is a property inherited from closed subspaces. Proposition 6.13 (Reflexivity of closed subspaces) Let E be a reflexive Banach space and M = M a linear subspace of E. Then M is reflexive. The proof of Proposition 6.13 is required as a guided exercise at the end of the chapter. With help of this proposition we get the following corollaries that complement Theorems 5.16 and 5.15.
6.3. Reflexivity and separability (II)
247
Corollary 6.5 (Reflexivity of the dual) Let E be a Banach space. Then E is reflexive iff E ′ is reflexive. Proof. i) That the reflexivity of E implies the reflexivity of E ′ was proved in Theorem 5.16. ii) Let’s assume that E ′ is reflexive. Then, by Theorem 5.16, E ′′ is reflexive. Since J(E) is strongly closed in E ′′ , Proposition 6.13 implies that J(E) is reflexive. As a consequence, E is reflexive. ■ Corollary 6.6 (Reflexivity and separability of the dual) Let E be a Banach space. Then E is separable and reflexive iff E ′ is separable and reflexive. The proof of Corollary 6.6 is required as an exercise at the end of the chapter. In Theorem 6.7, we stated that a Banach space E is separable iff B E ′ is metrizable in the weak ∗ topology σ(E ′ , E). Next, we write a dual version of this result. Theorem 6.9 (Metrizability of B E ) Let E be a Banach space. Then E ′ is separable iff B E is metrizable in the weak topology σ(E, E ′ ). Remark 6.17 Let E be a reflexive Banach space such that E ′ is separable. Then, by Corollaries 6.5 and 6.6, it follows that E is separable and E ′ is reflexive. By Theorem 6.7, B E ′′ = J(B E ) is metrizable in the weak ∗ topology σ(E ′′ , E ′ ),
(6.40)
and denote by d : B E ′′ ×B E ′′ → R a corresponding metric. By (6.40) and E ∼ = E ′′ ′ it follows that B E is metrizable in the weak topology σ(E, E ), and we could use as metric ρ : B E × B E −→ R, given by ρ(u, v) = d(J(u), J(v)). Proof. i) Let’s assume that E ′ is separable. Then there exists D = {ηm / m ∈ N} ⊆ B E ′ , dense in B E ′ . Let’s define [·] : E −→ R by [u] =
+∞ X 1 | ⟨ηm , u⟩ |. m 2 m=1
The functional [·] is a norm on E and ∀u ∈ E :
[u] ≤ ∥u∥.
Then, by Theorem 4.1, we have that T[·] ⊆ T∥·∥ . Let d be the metric induced by [·]. We shall prove that T[·] = σ(E, E ′ ), on B E .
248
Chapter 6. Weak topologies, reflexivity and separability a) Let’s prove that σ(E, E ′ ) ⊆ T[·] , on B E , i.e., ∀u0 ∈ B E , ∀V ∈ Fw (u0 ) :
V is [·] − open in B E ,
which is equivalent to ∀u0 ∈ B E , ∀V ∈ Fw (u0 ), ∃r > 0 :
Bd (u0 , r) ∩ B E ⊆ V.
Let u0 ∈ B E and V ∈ Fw (u0 ). By Proposition 6.1, V ∩ B E = Vu0 (ξ1 , ..., ξk ; ϵ) ∩ B E = {u ∈ B E / ∀i ∈ Ik : | ⟨ξi , u − u0 ⟩ | < ϵ}, for some ξ1 , ..., ξk ∈ E ′ and ϵ > 0. Without loss of generality, we may assume that ∥ξi ∥ ≤ 1, for every i = 1, ..., k. We have to find some r > 0 such that Bd (u0 , r) ∩ B E ⊆ V ∩ B E . (6.41) Since D is dense, ∀i ∈ Ik , ∃mi ∈ N :
∥ξi − ηmi ∥ < ϵ/4.
Now, we pick r > 0 so that ∀i ∈ Ik :
2mi r < ϵ/2.
Let’s now see that (6.41) holds. Let’s take u ∈ Bd (uu , r) ∩ B E so that [u − u0 ] < r and ∥u∥ ≤ 1. We have, for every i ∈ Ik , 1 | ⟨ηmi , u − u0 ⟩ | < r, 2mi and, therefore, | ⟨ξi , u − u0 ⟩ | = | ⟨ξi − ηni , u − u0 ⟩ + ⟨ηni , u − u0 ⟩ | ϵ ϵ ≤ ∥ξi − ηmi ∥ · 2 + 2mi · r < + = ϵ, 2 2 so that u ∈ V . b) Let’s prove that T[·] ⊆ σ(E, E ′ ), on B E , i.e., ∀u0 ∈ B E , ∀r > 0, ∃V ∈ Fw (u0 ) :
V ∩ B E ⊆ Bd (u0 , r)B E .
Let u0 ∈ B E and r > 0. Let’s pick ϵ = r/2, k ∈ N such that 1 r < 2k−1 2 and V = Vu0 (η1 , ..., ηk ; ϵ) = {u ∈ E / ∀i ∈ Ik : | ⟨ηi , u − u0 ⟩ | < ϵ}. For u ∈ V ∩ B E it holds [u − u0 ] =
k +∞ X X 1 1 | ⟨η , u − u ⟩ | + | ⟨ηm , u − u0 ⟩ | m 0 m 2 2m m=1 m=k+1
0, ∃δ > 0, ∀u, v ∈ B E : ∥u − v∥ > ϵ ⇒ (6.43) 2 Grossly speaking, condition (6.43) requires the closed unit ball B E to be not only convex but actually “round”. We shall prove that Hilbert and Lp spaces are uniformly convex. Remark 6.18 Sometimes uniform convexity is used in the following form, clearly equivalent to (6.43),
u + v
≥ 1 − δ ⇒ ∥u − v∥ ≤ ϵ ∀ϵ > 0, ∃δ > 0, ∀u, v ∈ B E : (6.44) 2
250
Chapter 6. Weak topologies, reflexivity and separability
Figure 6.2.: The p norms in R2 . Source: https://wiki.math.ntnu.no
Example 6.8 (Norms on R2 ) Let’s recall that on R2 we have the collection of norms: p ∥(x, y)∥∞ = max{|x|, |y|}, ∥(x, y)∥p = p |x|p + |y|p , p ≥ 1. The norm ∥ · ∥2 is strictly convex but the norms ∥ · ∥1 and ∥ · ∥∞ are not. Example 6.8 clearly shows that uniform convexity is a geometric property which depends on the norm. Theorem 6.11 (Milman-Pettis) Let E be an uniformly convex Banach space. Then E is reflexive. Proof. 1. Since J(B E ) is strongly closed, to show that J(B E ) = B E ′′ it’s enough to prove that J(B E ) is dense in B E ′′ , i.e., ∀ψ ∈ B E ′′ , ∀ϵ > 0, ∃u ∈ B E :
∥ψ − J(u)∥ ≤ ϵ.
(6.45)
Let ψ ∈ B E ′′ and ϵ > 0. Without loss of generality, we can assume that ∥ψ∥ = 1. We pick δ > 0 so that (6.43) holds. Since 1 = ∥ψ∥ = sup | ⟨ψ, η⟩E ′′ ,E ′ |, ∥η∥=1
there is some η0 ∈ E ′ such that ∥η0 ∥ = 1 and
δ ⟨ψ, η0 ⟩E ′′ ,E ′ > 1 − . 2
(6.46)
2. The set V = {ξ ∈ E ′′ / | ⟨ξ − ψ, η0 ⟩E ′′ ,E ′ | < δ/2}
(6.47)
belongs to Nσ(E ′′ ,E ′ ) (ψ). Since, by Lemma 6.3, J(B E ) is dense in B E ′′ in the weak topology σ(E ′′ , E ′ ), it follows that V ∩ J(B E ) ̸= ∅, whence there exists u ∈ B E such that J(u) ∈ V. (6.48) 3. Let’s prove that ∥ψ − J(u)∥ ≤ ϵ. Let’s assume that ∥ψ − J(u)∥ > ϵ, so that ψ ∈ W , where c W = B(J(u), ϵ)c = J(u) + ϵ B E ′′ . (6.49)
6.4. Uniformly convex spaces
251
By Theorem 6.6, B E ′′ is compact, and therefore closed, in the weak ∗ topology σ(E ′′ , E ′ ). This implies that W is open and belongs to Nσ(E ′′ ,E ′ ) (ψ). By Lemma 6.3, we have that (V ∩ W ) ∩ J(B E ) ̸= ∅, whence there exists v ∈ B E such that J(v) ∈ V ∩ W.
(6.50)
From (6.47), (6.48) and (6.50), we get | ⟨η0 , u⟩E ′ ,E − ⟨ψ, η0 ⟩E ′′ ,E ′ | < δ/2, | ⟨η0 , v⟩E ′ ,E − ⟨ψ, η0 ⟩E ′′ ,E ′ | < δ/2,
(6.51)
which imply that 2 ⟨ψ, η0 ⟩E ′′ ,E ′ < ⟨η0 , u + v⟩E ′ ,E + δ < ∥η0 ∥ ∥u + v∥ + δ = ∥u + v∥ + δ, which, together with (6.46) and (6.43), provide
u + v
2 > 1 − δ and ∥u − v∥ ≤ ϵ, which it’s absurd, as J(v) ∈ W and (6.49) imply ϵ < ∥J(u) − J(v)∥ = ∥J(u − v)∥ = ∥u − v∥. ■ The following property allows us to recover strong convergence from weak convergence. Theorem 6.12 (From weak to strong convergence) Let E be an uniformly convex Banach space, u ∈ E and (un )n∈N ⊆ E such that un ⇀ u,
weakly in σ(E, E ′ ),
lim sup ∥un ∥ ≤ ∥u∥.
(6.52)
n→+∞
Then, lim ∥u − un ∥ = 0.
n→+∞
(6.53)
Idea of the proof. To get (6.53) we need to show that ∀ϵ > 0, ∃N ∈ N :
n > N ⇒ ∥u − un ∥ < ϵ.
For this, we fix a generic ϵ > 0 and choose N such that δ = 1/N is compatible with (6.44). With this in mind the coming proof can be better understood.
252
Chapter 6. Weak topologies, reflexivity and separability
Proof. The case of u = 0 is easy. So let’s assume that u ̸= 0. We write λn = max{∥un ∥, ∥u∥},
vn =
1 un , λn
and v =
1 u, ∥u∥
so that λn −→ ∥u∥, as n −→ +∞, and vn ⇀ v, weakly in σ(E, E ′ ). Then, by Proposition 6.5,
vn + v
(6.54) ∥v∥ ≤ lim inf n→+∞ 2 Since ∥v∥ = 1 and ∥vn ∥ ≤ 1, n ∈ N, it follows, from (6.52) and (6.54), that
vn + v
= 1, lim n→+∞ 2 which, by (6.44), implies that ∥vn − v∥ −→ 0, as n −→ +∞ as well as (6.53). ■
6.5. Problems Problem 6.1 Let E be a Banach space and u0 ∈ E. Prove that in the weak topology σ(E, E ′ ) an open fundamental system of neighborhoods of u0 is F(u0 ) = {Vu0 (η1 , ..., ηn ; ϵ) / ηk ∈ E ′ ∧ ϵ > 0}, where Vu0 (η1 , ..., ηn ; ϵ) = {u ∈ E / ∀k ∈ In : | ⟨ηk , u − u0 ⟩ | < ϵ}. Problem 6.2 Let E be a normed space with dim(E) < +∞. Prove that the weak topology σ(E, E ′ ) coincides with the topology induced by the norm. Problem 6.3 Let E be a Banach space 1. Assume that E ′ has a Schauder basis S ′ = {ηm / m ∈ N}. Prove that un ⇀ u, in σ(E, E ′ ) iff ∀m ∈ N :
lim ⟨ηm , un ⟩ = ⟨ηm , u⟩ .
n→+∞
∗
2. Assume that E has a Schauder basis S = {um /m ∈ N}. Prove that ηn ⇀ η in σ(E ′ , E) iff ∀m ∈ N :
lim ⟨ηn , um ⟩ = ⟨η, um ⟩ .
n→+∞
Problem 6.4 Let H be a separable Hilbert space and B = {em / m ∈ N} ⊆ H an orthonormal set. Prove that (em )m∈N converges weakly to 0 but not strongly. Problem 6.5 Let E be a Banach space. Assume that un ⇀ u, weakly in σ(E, E ′ ). Let’s denote A = {un / n ∈ N}. Prove that there is (vn )n∈N ⊆ Conv(A) such that lim ∥vn − u∥ = 0. n→+∞
Problem 6.6 Let E be a Banach space Assume that φ : E −→ R is convex and strongly l.s.c. Prove that φ is weakly l.s.c. Hint. Use Proposition 4.3.
6.5. Problems
253
Problem 6.7 Let E and F be Banach spaces and T ∈ L(E, F ). We denote Ew = (E, σ(E, E ′ )) and Fw = (F, σ(F, F ′ )) Prove that T ∈ L(E, F ) iff T : Ew → Fw is continuous. Hint. Use the closed graph theorem. Problem 6.8 Let E be a Banach space, K ⊆ E strongly compact and (un )n∈N ⊆ K such that un ⇀ u, weakly in σ(E, E ′ ). Prove that un → u, strongly. Problem 6.9 Let (X, T ) be a topological space and E a Banach space. Assume that f, g : (X, T ) −→ (E, σ(E, E ′ )) are continuous. 1. Prove that f + g : (X, T ) −→ (E, σ(E, E ′ )) is continuous. 2. Let h ∈ C(X). Prove that h · f : (X, T ) −→ (E, σ(E, E ′ )) is continuous. Problem 6.10 Let E be a Banach space and η0 ∈ E ′ . Prove that in the weak ∗ topology σ(E ′ , E) an open fundamental system of η0 is Fw∗ (η0 ) = {Vη0 (u1 , ..., un ; ϵ) / uk ∈ E ∧ ϵ > 0}, where Vη0 (u1 , ..., un ; ϵ) = {η ∈ E ′ / ∀k ∈ In : | ⟨η − η0 , uk ⟩ | < ϵ}. Problem 6.11 Let X be a linear space and η, η1 , ..., ηk : X −→ R linear. Assume that for v ∈ X, [∀j ∈ Ik : ηj (u) = 0] ⇒ η(v) = 0. Prove that η ∈ ⟨{η1 , ..., ηk }⟩. Problem 6.12 Let (E, ∥ · ∥) be a reflexive Banach space and M = M a linear subspace of E. 1. Prove that for the space (M, ∥ · ∥), the topology σ(M, M ′ ) coincides with Tσ(E,E ′ ) , the topology induced by σ(E, E ′ ). Hint. Consider Hahn-Banach’s theorem. 2. Prove that B M is compact in σ(M, M ′ ). 3. Prove that M is reflexive. Problem 6.13 Let E be a Banach space. Prove that E is separable and reflexive iff E ′ is separable and reflexive. Problem 6.14 Using as guide the proof of [4, Th.3.28], write your own detailed proof of Theorem 6.7, i.e., that a Banach space E is separable iff B E ′ is metrizable in the weak ∗ topology σ(E ′ , E). Problem 6.15 (Completing the proof of Theorem 6.9) Let E be a Banach space such that B E is d-metrizable in the weak topology σ(E, E ′ ). For n ∈ N, write Un = Bd (0, 1/n) ∩ B E . Let Vn ∈ Nσ(E,E ′ ) (0) such that Vn = {u ∈ E / ∀η ∈ Φn : | ⟨η, u⟩ | < ϵn } ⊆ Un , ′
where ϵn > 0, Φn ⊆ E with Φn finite. Let D =
+∞ [ n=1
that the following is false: F = E′,
in T∥·∥E′ .
Φn and F = ⟨D⟩. Assume
254
Chapter 6. Weak topologies, reflexivity and separability
1. Prove that there exist ψ ∈ E ′′ and η0 ∈ E ′ such that ∥ψ∥ = 1,
⟨ψ, η0 ⟩E ′′ ,E ′ > 1
∀η ∈ F :
and
⟨ψ, η⟩E ′′ ,E ′ = 0.
2. Let W = {u ∈ B E / | ⟨η0 , u⟩E ′ ,E | < 1/2}. Prove that there exists n0 ∈ N such that Vn0 ⊆ W . 3. Prove that there exists u1 ∈ B E such that ∀η ∈ Φn0 :
| ⟨η, u1 ⟩E ′ ,E − ⟨ψ, η⟩E ′′ ,E ′ | < ϵn0 , | ⟨η0 , u1 ⟩E ′ ,E − ⟨ψ, η0 ⟩E ′′ ,E ′ | < 1/2.
(6.55)
4. Prove that u1 ∈ Vn0 and ⟨η0 , u1 ⟩E ′ ,E > 1/2. 5. Conclude that E ′ is separable. Problem 6.16 Let E be a separable Banach space and (ηn )n∈N ⊆ E ′ , bounded. Prove that there exists a subsequence (ηnk )k∈N that converges in the weak ∗ topology σ(E ′ , E). Problem 6.17 Let’s recall that on R2 , the norm ∥ · ∥3 is given by p ∥(x, y)∥3 = 3 |x|3 + |y|3 . Is (R2 , ∥ · ∥3 ) a strictly convex space? Problem 6.18 For each n ∈ N, let xn : [0, 1] −→ R, given by xn (t) = n2 te−nt . Also consider x : [0, 1] −→ R, given by x(t) = 0. 1. Prove that (xn )n∈N weakly converges in C([0, 1]) to x. 2. Prove that (xn )n∈N does not converge to x in L2 ([0, 1]). 3. Does (xn )n∈N strongly converge in C([0, 1]) to x? Problem 6.19 Let 1 ≤ p ≤ +∞, x ∈ lp (R) and (xn )n∈N ⊆ lp (R) such that ′ xn ⇀ x weakly in σ(lp , lp ). 1. Prove that (xn )n∈N is bounded in lp (R). 2. Prove that ∀i ∈ N : lim ξn,i = ξi , n→+∞
where x = (ξi )i∈N and, for each n ∈ N, xn = (ξn,i )i∈N . Problem 6.20 Let 1 < p ≤ +∞ and xn ⊆ lp (R) bounded in lp (R). For each n ∈ N, xn = (ξn,i )i∈N Assume that ∀i ∈ N, ∃ξi ∈ R :
lim ξn,i = ξi .
n→+∞
′
Prove that x = (ξi )i∈N ∈ lp (R) and that xn ⇀ x weakly in σ(lp , lp ). Problem 6.21 Consider the sequence (un )n∈N ⊆ L2 ([0, 2π]), given by un (t) =
1 sin(nt), π
t ∈ [0, 2π], n ∈ N.
6.5. Problems
255
1. Prove that un ⇀ 0, as n → +∞. 2. Prove that (un )n∈N does not converge strongly. Problem 6.22 Let H be a Hilbert space, u ∈ H and (un )n∈N ⊆ H such that un ⇀ u and ∥un ∥ → ∥u∥, as n → +∞. Prove that un → u, strongly. Problem 6.23 Let 1 < p < +∞, x ∈ Lp ([a, b]) and (xn )n∈N ⊆ Lp ([a, b]). Prove that xn ⇀ x iff (xn )n∈N is ∥ · ∥p -bounded and Z ∀β ∈ [a, b] :
n→+∞
β
Z
a
β
x(t)dt.
xn (t)dt =
lim
a
Problem 6.24 Consider the functionals η0 , ηϵ : C1 ([−1, 1]) −→ R, 0 < |ϵ| < 1, given by x(ϵ) − x(−ϵ) ⟨η0 , x⟩ = x′ (0), ⟨ηϵ , x⟩ = . 2ϵ 1. 2. 3. 4.
Prove that η0 , ηϵ ∈ [C1 ([−1, 1])]′ , 0 < |ϵ| < 1. Compute ∥η0 ∥ and ∥ηϵ ∥, 0 < |ϵ| < 1. ∗ Prove that ηϵ ⇀ η0 , as ϵ → 0. Does it hold ηϵ → η0 , strongly?
Problem 6.25 Let E be a Banach space and (un )n∈N ⊆ E such that un ⇀ u n 1X un . Prove that pn ⇀ u weakly in σ(E, E ′ ). For each n ∈ N, we write pn = n k=1 weakly in σ(E, E ′ ).
Part II.
An introduction to the Calculus of Variations
257
7. Calculus on normed spaces
The idea behind this chapter is to present tools, definitions and results of calculus on normed spaces, taking advantage of the material worked in the previous chapters. Here, grossly speaking, the main reference is [5]. The student will really learn the material by working on a good number of exercises after studying the new concepts. For this, a number of problems solved in detail are included in this document. At the end of each section, a list of exercises is presented; most of them are selected and (in many cases) adapted to modern notation and terminology from several sources. In Chapter 8 we will focus on the most important points of classical Calculus of Variations, taking full advantage of the formalism built in this chapter.
7.1. What Calculus of Variations is about Let’s now consider a couple of problems that will help us to describe what Calculus of Variations is about. The first can be dealt with classical one-variable Calculus and the other requires Calculus of Variations.
7.1.1. A problem from classical Calculus
A kind of optimization problem of basic Calculus can be treated following these steps:
a) understanding the situation,
b) mathematical modeling,
c) defining the mathematical problem, and
d) solving the problem.
The situation
An expeditionary team is training so that its members can survive and perform a task in quasi-desertic zones. In order to keep the team in good health condition, each member has to take exactly a volume V of a special drink every three hours. For this we need to design a circular cylindrical can using the least amount of material.
Mathematical modeling
1. Drawing of the situation. The following drawings help to understand the current situation.
2. Target or dependent variable. We want to minimize
Figure 7.1.: A simplified scheme of a can.
S: the surface area of the circular cylinder.
3. Control variable. We shall control the target variable with r: the radius of the cylinder.¹
4. Relation between variables. The surface area, see Figure 7.1, is given by
S = S_L + 2S_T.    (7.1)
We have that S_T = πr² and S_L = ah = 2πrh. On the other hand, since the volume is constant, V = S_T h = πr²h, we have that h = V/(πr²). By replacing all this in (7.1) we get
S = 2πr² + 2V/r.
5. Mathematical model. In our framework, the mathematical model corresponds to the function
S : ]0, +∞[ −→ R,  r ⟼ S(r) = 2πr² + 2V/r.
Observe that S belongs to C²(]0, +∞[) and is bounded from below: inf_{r∈]0,+∞[} S(r) ≥ 0.
Mathematical problem
The problem can be mathematically written as
Find r₀ ∈ ]0, +∞[ such that S(r₀) = inf_{r∈]0,+∞[} S(r),    (P')
or, simply,
inf{S(r) / r ∈ ]0, +∞[}.    (P)
In words, the goal is to minimize the function S.
¹ It could be chosen a or h as well.
Solving the problem
Now we apply the classical Calculus machinery to solve (P).
1. Derivatives. For r ∈ ]0, +∞[, we have
S′(r) = 4πr − 2V/r²,    S″(r) = 4π + 4V/r³.    (7.2)
2. Critical points of S. By using (7.2), we solve the equation
S′(r) = 0,  r ∈ ]0, +∞[,    (7.3)
i.e., we find the set of critical points of S, K(S) = {r ∈ ]0, +∞[ / S′(r) = 0}. It's easy to verify that K(S) = {r₀}, where
r₀ = (V/(2π))^{1/3}.
3. Local solution. Since S″(r₀) = 12π > 0, we have that r₀ is a point of local minimum of S.
4. Solution. Since ∀r ∈ ]0, +∞[ : S″(r) > 0, it follows that S is strictly convex in its entire domain. Therefore r₀ is the unique global minimum point of S and the minimum surface area is S(r₀) = 3(2π)^{1/3} V^{2/3}.
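These computations can be reproduced in Maxima; the short session below is only a quick symbolic check of the steps above, not part of the original solution:

assume(V > 0)$
S(r) := 2*%pi*r^2 + 2*V/r;
dS : diff(S(r), r);                        /* S'(r) = 4*%pi*r - 2*V/r^2 */
solve(dS = 0, r);                          /* the only real root is r = (V/(2*%pi))^(1/3) */
r0 : (V/(2*%pi))^(1/3)$
radcan(subst(r = r0, diff(S(r), r, 2)));   /* -> 12*%pi > 0 */
radcan(S(r0));                             /* equals 3*(2*%pi)^(1/3)*V^(2/3) */

The positive second derivative at r₀ and the value of S(r₀) agree with points 3 and 4 above.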
7.1.2. A situation to deal with Calculus of Variations
We present a problem that cannot be attacked using classical Calculus. We will not be searching for a number that minimizes some target variable; instead we will be looking for infinitely many points, those corresponding to the trajectory of a "minimizer curve". Let's follow the steps mentioned in Section 7.1.1.
The situation
What makes the slide attractive to the kids? Of course it's the velocity that a child takes thanks to gravity. As it's well known, a good slide does not follow the line between the highest position and the lowest one - where the father happily receives the mother's most beloved rocket. It's the shape of the slide that can make things boring or exciting, see Figure 7.2. Actually, the nicest shape for the children's slide corresponds to the solution of a famous mathematical problem, the brachistochrone. The problem was first considered by Galileo (around 1638) and consists in finding the minimum-time path between two points that is followed by a point mass moving under the influence of gravity without friction.
Figure 7.2.: Several shapes for a slide. Source: http://www.etudes.ru
Figure 7.3.: A simple scheme for the slide.
Mathematical modeling
1. Drawing of the situation. Figure 7.3 helps to understand the current situation.
2. Target or dependent variable. We want to minimize T: the time it takes a point mass to travel from A to B.
3. Control or independent variable. We shall control the target variable with u(·): the function which describes the path.
4. Relation between variables. The velocity is given by
v = ds/dt  or  dt = ds/v.    (7.4)
At the infinitesimal level, Pythagoras still lives: ds = √(dx² + dy²), as well as the law of conservation of energy: (1/2)mv² = mgy. By replacing this into (7.4) we get
dt = √((1 + u′(x)²)/(2g u(x))) dx,
so that
T = (1/√(2g)) ∫₀^{x_b} √((1 + u′(x)²)/u(x)) dx.
5. Mathematical model. The mathematical model corresponds to the functional
T : E −→ R,  u ⟼ T(u) = (1/√(2g)) ∫₀^{x_b} √((1 + u′(x)²)/u(x)) dx,
whose domain is E, the set of all the functions in C¹([0, x_b]) that verify the boundary conditions presented in Figure 7.3. Observe that the functional T is bounded from below: inf_{u∈E} T(u) ≥ 0.
Mathematical problem
The problem can be mathematically written as
Find u₀ ∈ E such that T(u₀) = inf_{u∈E} T(u),    (P')
or, equivalently,
inf{T(u) / u ∈ E}.    (P)
In words, the goal is to minimize the functional T.
To solve the problem...
In this chapter we shall learn a machinery similar to that of Calculus which was applied in Section 7.1.1. So, we will introduce concepts like derivatives, critical points, local minima and convexity for functionals, i.e., for functions of infinitely many variables. This is what the Calculus of Variations is about. For the moment let's mention that, instead of (7.3), the candidates to solve (P') should verify an ordinary differential equation known as the Euler-Lagrange equation. The actual solution of (P') is a segment of a cycloid which, in parametric form, can be written as
x(θ) = (θ − sin(θ))/(4gC²),  y(θ) = (1 − cos(θ))/(4gC²),  θ ∈ [0, θ_b],
where C = C(x_b, y_b) is a positive constant.
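Even before having any machinery to minimize T, we can evaluate it at particular curves. The following Maxima sketch (with the specific data x_b = y_b = 1 and g = 9.8 chosen here only for illustration) computes the travel time along the straight line u(x) = x joining (0, 0) and (1, 1):

assume(g > 0)$
/* T evaluated at the straight line u(x) = x, so u'(x) = 1 */
Tline : 1/sqrt(2*g) * integrate(sqrt((1 + 1^2)/x), x, 0, 1);
radcan(Tline);                   /* -> 2/sqrt(g) */
float(subst(g = 9.8, Tline));    /* approximately 0.639 seconds */

According to the discussion above, the cycloid through the same endpoints is the minimizer of T, so it gives a smaller travel time than the straight line.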
7.1.3. Euler’s finite-difference method Let’s reinforce a bit our recent claim that Calculus of Variations is like classical Calculus but for infinitely many variables. Let’s consider the functional J : X −→ R Z u 7−→ J(u) = a
b
F (x, u(x), u′ (x)) dx,
264
Chapter 7. Calculus on normed spaces
where the domain is X = u ∈ C1 ([a, b]) / u(a) = A ∧ u(b) = B , and the function F ∈ C1 ([a, b] × R × R) is given. Let’s assume that J is bounded from below. We are interested in the problem ( Find u0 ∈ X such that (7.5) J(u0 ) = inf J(u). u∈X
For a meshing p = (x₀, ..., x_{n+1}) ∈ R^{n+2}, a = x₀ < x₁ < x₂ < ... < x_n < x_{n+1} = b, we consider the discretization of J given by
J_disc(y) = Σ_{i=1}^{n+1} F(x_i, y_i, (y_i − y_{i−1})/h) h,
where y₀ = A, y_{n+1} = B, y = (y₁, ..., y_n) ∈ R^n, and
y_i = u(x_i),  i = 1, ..., n,    h = x_i − x_{i−1},  i = 1, ..., n + 1.
Observe that we have replaced the curve u by the polygonal line with vertices
(x_i, y_i),  i = 0, 1, ..., n, n + 1.
Then, hopefully, for n ∈ N big enough, J(u) ≈ J_disc(y), and the value in (7.5) can be approximated by inf{J_disc(y) / y ∈ R^n}, which actually corresponds to a classical Calculus problem. The strategy just described is known as Euler's finite-difference method.
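As a minimal illustration of the method (with the very simple integrand F(x, u, p) = p², the interval [0, 1], the boundary data A = 0, B = 1 and n = 3 interior nodes, all chosen here only for the sake of the example), we can build J_disc symbolically in Maxima and minimize it by solving ∇J_disc = 0:

/* Euler's finite-difference method for F(x,u,p) = p^2 on [0,1], u(0)=0, u(1)=1, n = 3 */
F(x, u, p) := p^2;
h : 1/4$
Jdisc : h*( F(1*h, y1, (y1 - 0)/h)
          + F(2*h, y2, (y2 - y1)/h)
          + F(3*h, y3, (y3 - y2)/h)
          + F(4*h, 1,  (1  - y3)/h) );
eqs : [diff(Jdisc, y1) = 0, diff(Jdisc, y2) = 0, diff(Jdisc, y3) = 0];
solve(eqs, [y1, y2, y3]);        /* -> [[y1 = 1/4, y2 = 1/2, y3 = 3/4]] */

The discrete minimizer lies on the straight line u(x) = x, which is precisely the minimizer of ∫₀¹ u′(x)² dx subject to u(0) = 0 and u(1) = 1; for this simple F the finite-difference solution is exact at the nodes.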
7.2. A couple of important things Let’s recall some concepts which will be useful along the chapter.
7.2.1. Normed and Banach algebras
In Definition 1.16 we introduced the concept of algebra. In Section 4.7.2 we quickly introduced the concept of Banach algebra to provide context to the very important Stone-Weierstrass theorem. Let's recall the concepts, as we will be dealing with these objects.
Let E be an algebra on which a norm ∥·∥ is defined. We say that E is a normed algebra iff
∀u, v ∈ E : ∥uv∥ ≤ ∥u∥ ∥v∥.    (7.6)
In many situations the algebra E has a unit 1. In this case, the condition ∥1∥ = 1 is also required. Let's remark that (7.6) implies that the multiplication ∗ : E × E → E is continuous, i.e.,
∀(u₀, v₀) ∈ E × E, ∀ϵ > 0, ∃δ > 0 : (∥u − u₀∥ < δ ∧ ∥v − v₀∥ < δ) ⇒ ∥uv − u₀v₀∥ < ϵ.
From (7.6) it follows that

$$\forall n \in \mathbb{N},\ \forall u \in E: \quad \|u^n\| \le \|u\|^n.$$
If E is complete as a normed space, then we say that E is a Banach algebra.

Example 7.1 In Corollary 4.6 it was stated that (C([a, b]), +, ·, ∗) is a commutative Banach algebra with unit when it's endowed with the ∥·∥∞-norm. Here the unit is the constant function 1_f : [a, b] → R, given by 1_f(t) = 1. The set of elements which have multiplicative inverses is

$$C([a, b])^\times = \{f \in C([a, b]) \,/\, \forall t \in [a, b]: f(t) \neq 0\}.$$
Example 7.2 In the course of Linear Algebra the student learned that (M_n(R), +, ·, ∗) is a non-commutative algebra with unit Id = (δ_ij), the identity matrix. Here the set of elements which have multiplicative inverses is M_n(R)^× = {A ∈ M_n(R) / det(A) ≠ 0}. Also recall that M_n(R) ≅ L(R^n), so that a norm that makes M_n(R) a Banach algebra is given by

$$\|A\| = \sup_{x \in \mathbb{R}^n \setminus \{0\}} \frac{|Ax|}{|x|},$$

where |·| is any norm on R^n.

Theorem 7.1 (X^× is open) Let X be a Banach algebra with unit 1, and x ∈ X such that ∥x − 1∥ < 1. Then x is invertible and, actually,

$$x^{-1} = \sum_{n=0}^{+\infty} (1 - x)^n.$$
As a consequence, X^× is open.

Proof. For y ∈ X and N ∈ N we have that

$$1 - y^{N+1} = (1 - y)\cdot(y^N + y^{N-1} + \dots + y + 1) = (1 - y)\cdot \sum_{n=0}^{N} y^n,$$

whence, by putting y = 1 − x and letting N → +∞, we get

$$1 = x \cdot \sum_{n=0}^{+\infty} (1 - x)^n.$$

Here the series converges in X because ∥(1 − x)^n∥ ≤ ∥1 − x∥^n with ∥1 − x∥ < 1, and X is complete. We leave to the student the proof that X^× is open. ■

Example 7.3 Let E and F be normed spaces. Let's write

$$I(E, F) = \{T \in \mathcal{L}(E, F) \,/\, T \text{ is invertible}\}.$$
It's easy to check that (L(E), +, ·, ◦) is a non-commutative normed algebra with unity. Here ◦ denotes the composition product of operators. The unity is the identity operator Id : E → E. Recall that

$$\forall T, S \in \mathcal{L}(E): \quad \|TS\| \le \|T\|\, \|S\|.$$

A consequence of Theorem 7.1 is that L(E)^× = I(E, E) is open in L(E). Actually, it can be proved that I(E, F) is open in L(E, F).
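Theorem 7.1 can also be explored numerically in the Banach algebra (M_n(R), ∥·∥) of Example 7.2. The sketch below, assuming NumPy, truncates the Neumann series for an arbitrarily chosen matrix close to the identity and compares it with the exact inverse.

```python
# Illustrative check of Theorem 7.1 in M_n(R): if ||X - I|| < 1, then
# X^{-1} = sum_{k>=0} (I - X)^k (Neumann series).
import numpy as np

n = 4
rng = np.random.default_rng(0)
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # scaled so that ||I - X|| < 1 here

S = np.zeros((n, n))
P = np.eye(n)                                        # current power (I - X)^k, starting at k = 0
for k in range(60):                                  # truncated series
    S += P
    P = P @ (np.eye(n) - X)

print(np.linalg.norm(S - np.linalg.inv(X)))          # should be tiny
```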
7.2.2. Exponential mapping

Let E be a Banach algebra with unit. For m ∈ N we define S_m : E → E by

$$S_m(u) = \sum_{n=0}^{m} \frac{1}{n!}\, u^n.$$

Then the exponential mapping exp : E → E is given by

$$\exp(u) = e^u = \sum_{n=0}^{+\infty} \frac{1}{n!}\, u^n.$$

It's clear that ∥e^u∥ ≤ e^{∥u∥}, for every u ∈ E.

Theorem 7.2 (exp is continuous) Let E be a Banach algebra. It holds

$$\forall u, v \in E: \quad \|e^u - e^v\| \le e^M \|u - v\|,$$

where M = max{∥u∥, ∥v∥}.

The proof of Theorem 7.2 is required as an exercise at the end of the chapter.

Theorem 7.3 Let E be a Banach algebra with unity and u, v ∈ E commuting elements. Then e^{u+v} = e^u e^v.

The proof of Theorem 7.3 is required as an exercise at the end of the chapter. As a consequence of the previous result we have that

$$\forall u \in E: \quad e^u \in E^\times.$$
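As a sanity check of the definition, the partial sums S_m can be computed in M_n(R) and compared against a reference implementation. The sketch assumes SciPy's expm as that reference and an arbitrarily chosen test matrix.

```python
# Truncated exponential series S_m(A) = sum_{k=0}^m A^k / k! in M_n(R),
# compared with SciPy's matrix exponential (illustrative sketch).
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-1.0, 0.0]])    # a rotation generator, chosen arbitrarily

def S(m, A):
    term = np.eye(A.shape[0])              # A^0 / 0!
    total = term.copy()
    for k in range(1, m + 1):
        term = term @ A / k                # builds A^k / k! incrementally
        total += term
    return total

for m in (2, 5, 10, 20):
    print(m, np.linalg.norm(S(m, A) - expm(A)))   # error decreases rapidly with m
```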
7.2.3. Small o

Let's assume that 0 ∈ O ⊆ E, with O open, and that g : O ⊆ E → F is such that g(0) = 0. We say that g is a small o of h, denoted g(h) = o(h), iff g(h) = ∥h∥ ϵ(h), for some mapping ϵ : B(0, r) ⊆ E → F such that $\lim_{h \to 0} \epsilon(h) = 0$. This is equivalent to the following conditions:

$$\lim_{h \to 0} \frac{1}{\|h\|}\, g(h) = 0, \qquad \lim_{h \to 0} \frac{\|g(h)\|}{\|h\|} = 0.$$
7.2.4. Uniform continuity. Canonical basis of R^n

Along this subsection, E and F shall denote normed spaces. As usual, we write C(E, F) for the space of all continuous operators from E into F; in particular, C(E) = C(E, R). Then T belongs to C(E, F) iff

$$\forall u_0 \in E,\ \forall \epsilon > 0,\ \exists \delta = \delta(u_0, \epsilon) > 0,\ \forall u \in E: \quad \|u - u_0\| < \delta \Rightarrow \|T(u) - T(u_0)\| < \epsilon.$$

Let's recall that T is uniformly continuous iff

$$\forall \epsilon > 0,\ \exists \delta = \delta(\epsilon) > 0,\ \forall u_0, u \in E: \quad \|u - u_0\| < \delta \Rightarrow \|T(u) - T(u_0)\| < \epsilon. \qquad (7.7)$$

The term uniform, in (7.7), means that δ no longer depends on the point u_0.
7.2.5. Problems

Problem 7.1 Let E be a normed algebra. Prove that the internal multiplication is a continuous mapping.

Problem 7.2 Prove Theorem 7.2. For this, prove first that

$$\forall n \in \mathbb{N} \cup \{0\},\ \forall u, v \in E: \quad \|u^n - v^n\| \le n M^{n-1} \|u - v\|,$$

and write

$$u^n - v^n = u^{n-1}(u - v) + u^{n-2}(u - v)v + u^{n-3}(u - v)v^2 + \dots + (u - v)v^{n-1}.$$

Problem 7.3 Let E be a Banach algebra. Define sin : E → E and prove its continuity.

Problem 7.4 Let E be a Banach algebra with unity and u, v ∈ E commuting elements. Prove that e^{u+v} = e^u e^v.

Problem 7.5 Let X be a Banach algebra with unity and x ∈ X. Prove that

$$e^x = \lim_{n \to +\infty} \left(1 + \frac{1}{n}\, x\right)^n.$$
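Problem 7.5 can be explored numerically before it is proved. The sketch below, assuming NumPy and SciPy and an arbitrary 2×2 test matrix, compares (I + A/n)^n with the matrix exponential.

```python
# Numerical exploration of Problem 7.5 in M_2(R): (I + A/n)^n -> e^A as n grows.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.2, -1.0], [0.5, 0.1]])     # arbitrary test matrix

def euler_power(n, A):
    return np.linalg.matrix_power(np.eye(2) + A / n, n)

for n in (10, 100, 1000, 10000):
    print(n, np.linalg.norm(euler_power(n, A) - expm(A)))   # error behaves like O(1/n)
```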
7.3. The differential

In this section we shall introduce the idea of differential of an operator working between normed spaces. As a motivation for the coming definitions, let's recall that for f ∈ C^n([a, b]) and x_0, x ∈ [a, b] it holds

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\, (x - x_0)^k + o(|x - x_0|^n).$$
By putting ϵ = x − x_0 and x = x_0 + ϵ, we get

$$f(x_0 + \epsilon) - f(x_0) = \sum_{k=1}^{n} \frac{f^{(k)}(x_0)}{k!}\, \epsilon^k + o(|\epsilon|^n).$$

In particular, for n = 1, f(x_0 + ϵ) − f(x_0) = f'(x_0) ϵ + o(|ϵ|), i.e.,

$$f(x_0 + \epsilon) - f(x_0) = f'(x_0)\, \epsilon + |\epsilon|\, w(\epsilon), \qquad \text{where } \lim_{\epsilon \to 0} w(\epsilon) = 0, \text{ in } \mathbb{R}.$$
7.3.1. Directional derivatives and Gateaux differential

Let E and F be normed spaces and f : O ⊆ E → F, where O is open. Let u ∈ O be a point and h ∈ E a direction. See Figure 7.4.

Figure 7.4.: Scheme for the directional derivative.

Whenever it exists, the limit

$$\partial_h f(u) = \lim_{t \to 0} \frac{1}{t}\, [f(u + th) - f(u)],$$

is referred to as the directional derivative of f at the point u in the direction h. For λ ∈ R^*, we have that

$$\partial_{\lambda h} f(u) = \lim_{t \to 0} \frac{1}{t}\, [f(u + t\lambda h) - f(u)] = \lambda \lim_{\alpha \to 0} \frac{1}{\alpha}\, [f(u + \alpha h) - f(u)] = \lambda\, \partial_h f(u).$$
Remark 7.1 Keep in mind that

$$\partial_h f(u) = \frac{d}{dt} f(u + th)\Big|_{t=0}, \qquad (7.8)$$

and that ∂_h f(u) ∈ F, i.e., the directional derivative belongs to Cod(f).

If ∂_h f(u) does exist for every direction h ∈ E and there exists f'_G(u) ∈ L(E, F) such that

$$\forall h \in E: \quad \partial_h f(u) = f'_G(u)\, h, \qquad (7.9)$$

then we say that f is Gateaux (or weakly) differentiable at u. Since the bounded linear operator that verifies (7.9) is unique, we refer to the operator f'_G(u) as the Gateaux differential of f at u (or the weak differential of f at u). So, in this case, the directional derivatives can be computed by using the Gateaux differential.
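A directional derivative like (7.8) can be approximated numerically by evaluating a functional along the line t ↦ u + th. The sketch below assumes NumPy and uses the illustrative functional J(u) = ∫₀¹ u(t)² dt (studied later, in Example 7.6) with arbitrarily chosen u and h; the central difference in t is compared with the value predicted by the Gateaux differential 2∫ u h dt obtained in that example.

```python
# Numerical approximation of the directional derivative (7.8)
# for J(u) = int_0^1 u(t)^2 dt (see Example 7.6); u and h are arbitrary samples.
import numpy as np

ts = np.linspace(0.0, 1.0, 2001)

def J(u_vals):
    return np.trapz(u_vals**2, ts)                 # composite trapezoidal rule

u = np.sin(np.pi * ts)                             # point u (arbitrary choice)
h = ts**2                                          # direction h (arbitrary choice)

eps = 1e-6
num = (J(u + eps * h) - J(u - eps * h)) / (2 * eps)   # d/dt J(u + t h) at t = 0
pred = 2 * np.trapz(u * h, ts)                     # Gateaux differential of Example 7.6
print(num, pred)                                   # the two numbers should agree closely
```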
7.3.2. Fréchet differential

Let E and F be normed spaces, O ⊆ E open and f : O ⊆ E → F. Let u ∈ O be a point. See Figure 7.5. If there exists ϕ ∈ L(E, F) such that

$$\forall h \in E: \quad u + h \in O \Rightarrow f(u + h) - f(u) = \phi(h) + o(h), \qquad (7.10)$$

then we say that f is differentiable or Fréchet differentiable at u.

Figure 7.5.: Scheme for the Fréchet differential.
Remark 7.2 If f is differentiable at every point of O_1 ⊆ O, we say that f is differentiable on O_1. If f is differentiable at every point of its domain we simply say that it is differentiable.

Remark 7.3 In (7.10), h = Δu is sometimes called the increment of the argument, and the left-hand side of (7.10), f(u + h) − f(u), is referred to as the increment of the functional (corresponding to the increment h).

Proposition 7.1 The operator ϕ ∈ L(E, F) in (7.10) is unique.
Proof. Let φ ∈ L(E, F) such that

$$f(u + h) - f(u) = \varphi(h) + o(h), \qquad (7.11)$$

for all h ∈ E such that u + h ∈ O. We have to prove that ϕ = φ, i.e.,

$$\forall v \in E: \quad \phi(v) = \varphi(v). \qquad (7.12)$$

Let v ∈ E, generic. Since O is open, there exists r > 0 such that B(u, r) = u + B(0, r) ⊆ O. Then, from (7.10) and (7.11), we get

$$\forall h \in B(0, r): \quad \phi(h) + \|h\|\epsilon_1(h) = \varphi(h) + \|h\|\epsilon_2(h), \qquad (7.13)$$

where the functions ϵ_1, ϵ_2 : B(0, r) → F are such that

$$\lim_{h \to 0} \epsilon_1(h) = \lim_{h \to 0} \epsilon_2(h) = 0. \qquad (7.14)$$

1. If v = 0, then, by linearity, we immediately have ϕ(v) = φ(v) = 0.
2. If v ≠ 0 then we choose N ∈ N such that for every n ∈ N with n > N,

$$h_n = \frac{1}{n} \cdot \frac{v}{\|v\|} \in B(0, r),$$

so that (7.13) provides

$$\phi(h_n) - \varphi(h_n) = \|h_n\| \left[\epsilon_2(h_n) - \epsilon_1(h_n)\right]$$

and

$$\phi(v) - \varphi(v) = \|v\| \cdot \left[\epsilon_2(h_n) - \epsilon_1(h_n)\right].$$

By letting n → ∞ in the last relation we obtain, thanks to (7.14), that ϕ(v) = φ(v). Since v was chosen arbitrarily, we have proved (7.12).
■
Remark 7.4 (Notation for the differential) Thanks to Proposition 7.1, we rewrite (7.10) as

$$f(u + h) - f(u) = f'(u)h + o(h), \qquad (7.15)$$

and refer to the operator f'(u) ∈ L(E, F) as the differential of f at u. The continuous linear operator f'(u) is sometimes referred to as the strong differential of f at u. In the literature you can also find the notations f'(u) = df(u) = Df(u).

Remark 7.5 (Variation) If f is a functional, then f'(u) ∈ E' is also called the (first) variation of f at u. In this case, (7.15) could be written as f(u + h) − f(u) = ⟨f'(u), h⟩ + o(h). In the literature one can also find the notations f'(u)h = δf(u; h) = δf(h), with the last equality making full sense whenever the point u is preassumed.

Remark 7.6 It's clear that differentiability implies Gateaux differentiability which, in its turn, implies the existence of all the directional derivatives at a point.

As in one-dimensional Calculus, we have the following result.

Proposition 7.2 (Differentiability and continuity) Let E and F be normed spaces, O ⊆ E open and f : O ⊆ E → F differentiable at u ∈ O. Then f is continuous at u.

Proof. We have that

$$\forall h \in E: \quad u + h \in O \Rightarrow f(u + h) = f(u) + f'(u)h + o(h).$$

Then, since f'(u) is a continuous linear operator and o(h) → 0, as h → 0, we have that

$$\lim_{h \to 0} f(u + h) = f(u), \qquad \text{i.e.,} \qquad \lim_{x \to u} f(x) = f(u),$$

so that f is continuous at u.
■
Proposition 7.3 (Differential of a constant operator) Let E and F be normed spaces, O ⊆ E open and f : O ⊆ E → F a constant operator. Then

$$\forall u \in O: \quad f'(u) = 0.$$

The reciprocal of the last proposition is not true, i.e., having a null differential does not imply that the function is constant.

Proposition 7.4 (Differential of a linear operator) Let E and F be normed spaces and f ∈ L(E, F). Then

$$\forall u \in E: \quad f'(u) = f.$$

The proofs of Propositions 7.3 and 7.4 are required as exercises at the end of the section. Now let's show that the process of taking the differential is linear.

Proposition 7.5 (Linearity) Let E and F be normed spaces and O ⊆ E open. Let λ ∈ R and f, g : O ⊆ E → F be differentiable at u ∈ O. Then λf + g is differentiable at u and

$$(\lambda f + g)'(u) = \lambda f'(u) + g'(u).$$

Proof. That f and g are differentiable at u ∈ O means

$$\forall h \in E: \quad u + h \in O \Rightarrow f(u + h) - f(u) = f'(u)(h) + \|h\|\epsilon_1(h), \qquad (7.16)$$
$$\forall h \in E: \quad u + h \in O \Rightarrow g(u + h) - g(u) = g'(u)(h) + \|h\|\epsilon_2(h), \qquad (7.17)$$

with ϵ_1(h) → 0 and ϵ_2(h) → 0, as h → 0. We have to prove that

$$\forall h \in E: \quad u + h \in O \Rightarrow (\lambda f + g)(u + h) - (\lambda f + g)(u) = \varphi(h) + \|h\|\epsilon(h), \qquad (7.18)$$

where φ = λf'(u) + g'(u) ∈ L(E, F) and ϵ(h) → 0, as h → 0. Let h ∈ E such that u + h ∈ O. Then, by (7.16) and (7.17), we have that

$$(\lambda f + g)(u + h) - (\lambda f + g)(u) = \lambda [f(u + h) - f(u)] + [g(u + h) - g(u)]$$
$$= \lambda \cdot [f'(u)(h) + \|h\| \cdot \epsilon_1(h)] + [g'(u)(h) + \|h\| \cdot \epsilon_2(h)] = \varphi(h) + \|h\| \cdot \epsilon(h),$$

where, clearly, ϵ(h) = λ ϵ_1(h) + ϵ_2(h) → 0, as h → 0. Since h was chosen arbitrarily, we have proved (7.18). ■

Proposition 7.6 (Differential of a product) Let E be a normed space, F a commutative normed algebra and O ⊆ E open. Let f, g : O → F be differentiable at u ∈ O. Then f · g is differentiable at u and

$$(f \cdot g)'(u) = g(u)\, f'(u) + f(u)\, g'(u). \qquad (7.19)$$
Before proving Proposition 7.6, let's observe that f · g : O → F is given by

$$(f \cdot g)(x) = f(x) \cdot g(x), \qquad x \in O.$$

In (7.19), g(u) f'(u) ∈ L(E, F) is given by

$$[g(u)\, f'(u)](x) = g(u) \cdot f'(u)x, \qquad x \in E,$$

and f(u) g'(u) works in the same way.

Proof. Since f and g are differentiable at u, we have that

$$\forall h \in E: \quad u + h \in O \Rightarrow f(u + h) - f(u) = f'(u)\, h + \|h\|\, \epsilon_1(h), \qquad (7.20)$$
$$\forall h \in E: \quad u + h \in O \Rightarrow g(u + h) - g(u) = g'(u)\, h + \|h\|\, \epsilon_2(h), \qquad (7.21)$$

where, for small enough r > 0, the functions ϵ_1, ϵ_2 : B(0, r) ⊆ E → F are such that

$$\lim_{h \to 0} \epsilon_1(h) = 0, \qquad \lim_{h \to 0} \epsilon_2(h) = 0.$$

1. Let's consider the linear operator ϕ : E → F, given by ϕ = g(u) f'(u) + f(u) g'(u). By using the triangle inequality, (7.6) and the boundedness of f'(u) and g'(u) we have, for c > ∥g(u)∥ ∥f'(u)∥_{L(E,F)} + ∥f(u)∥ ∥g'(u)∥_{L(E,F)} ≥ 0, and a generic x ∈ E, that

$$\|\phi(x)\| = \|g(u)\, f'(u)x + f(u)\, g'(u)x\| \le \|g(u)\|\, \|f'(u)x\| + \|f(u)\|\, \|g'(u)x\|$$
$$\le \|g(u)\|\, \|f'(u)\|_{L(E,F)}\, \|x\| + \|f(u)\|\, \|g'(u)\|_{L(E,F)}\, \|x\|.$$

Therefore ϕ ∈ L(E, F).

2. For h ∈ B(0, r) ⊆ E we have that

$$(fg)(u + h) - (fg)(u) = f(u + h)\, g(u + h) - f(u)\, g(u) \pm f(u + h) g(u)$$
$$= f(u + h)\, [g(u + h) - g(u)] + g(u)\, [f(u + h) - f(u)]$$
$$= g(u)\, [f'(u)h + \|h\|\, \epsilon_1(h)] + f(u)\, [g'(u)h + \|h\|\, \epsilon_2(h)] + [f(u + h) - f(u)]\, [g'(u)h + \|h\|\, \epsilon_2(h)]$$
$$= \phi(h) + \omega(h),$$

where ω : B(0, r) ⊆ E → F is given by

$$\omega(h) = g(u)\|h\|\, \epsilon_1(h) + f(u)\|h\|\, \epsilon_2(h) + [f'(u)h + \|h\|\, \epsilon_1(h)]\, [g'(u)h + \|h\|\, \epsilon_2(h)]. \qquad (7.22)$$
Since

$$0 \le \lim_{h \to 0} \frac{\|f'(u)h \cdot g'(u)h\|}{\|h\|} \le \lim_{h \to 0} \frac{\|f'(u)h\| \cdot \|g'(u)h\|}{\|h\|} \le \lim_{h \to 0} \frac{\|f'(u)\|\, \|h\|\, \|g'(u)\|\, \|h\|}{\|h\|} = 0,$$

it follows that ω(h) = o(h). Since h was chosen arbitrarily, we have proved that

$$\forall h \in E: \quad u + h \in O \Rightarrow (fg)(u + h) - (fg)(u) = \phi(h) + o(h),$$

whence (7.19) follows. ■
7.3.3. Gateaux or not-Gateaux? That's the question

In this section we consider a very simple functional. We analyze its differentiability properties with and without using the concept of Gateaux differential. Let a < b. Let's analyze the differentiability of the functional I : C^1([a, b]) → R given by

$$I(x) = \left(1 + x'^2(a)\right)^{1/2}.$$

We can use Gateaux differentiability and formula (7.8), or not. As was said, both approaches are important and useful in concrete situations. In this particular situation the second approach works better; however, which one is best cannot be foreseen in most cases.

I) In this first approach we shall use Gateaux differentiability and (7.8).

a) Let x, h ∈ C^1([a, b]) and λ ∈ R. We have

$$I(x + \lambda h) = \left(1 + x'^2(a) + 2\lambda x'(a) h'(a) + \lambda^2 h'^2(a)\right)^{1/2},$$

so that

$$\frac{d}{d\lambda} I(x + \lambda h) = \left(1 + x'^2(a) + 2\lambda x'(a) h'(a) + \lambda^2 h'^2(a)\right)^{-1/2} \left(h'(a) x'(a) + \lambda h'^2(a)\right),$$

and, by (7.8),

$$\partial_h I(x) = x'(a) \cdot \left(1 + x'^2(a)\right)^{-1/2} h'(a). \qquad (7.23)$$

b) Now let's define a functional Ψ : C^1([a, b]) → R by

$$\Psi(y) = x'(a)\left(1 + x'^2(a)\right)^{-1/2} y'(a).$$

If we prove that Ψ belongs to C^1([a, b])', the dual of C^1([a, b]), then I is Gateaux differentiable and, by (7.23),

$$I'_G(x)h = x'(a) \cdot \left(1 + x'^2(a)\right)^{-1/2} h'(a).$$
Ψ is clearly linear. So let's prove that Ψ is bounded, i.e.,

$$\exists c > 0,\ \forall y \in C^1([a, b]): \quad |\Psi(y)| \le c\|y\|_{C^1}. \qquad (7.24)$$

Let's pick $c > |x'(a)|\left(1 + x'^2(a)\right)^{-1/2}$. Then, for y ∈ C^1([a, b]),

$$|\Psi(y)| = |x'(a)|\left(1 + x'^2(a)\right)^{-1/2} |y'(a)| \le c\, \|y'\|_\infty \le c\, \|y\|_{C^1}.$$

Since y was chosen arbitrarily, we have proved (7.24).

c) To prove that I is Fréchet differentiable at x and that

$$I'(x)h = x'(a) \cdot \left(1 + x'^2(a)\right)^{-1/2} h'(a),$$

we need to show that g(h) = o(h), where

$$g(h) = I(x + h) - I(x) - I'_G(x)h = \left(1 + (x + h)'^2(a)\right)^{1/2} - \left(1 + x'^2(a)\right)^{1/2} - x'(a) \cdot \left(1 + x'^2(a)\right)^{-1/2} h'(a).$$

However, we stop here because the remainder almost corresponds to the whole second approach...

II) In this second approach we shall directly consider Fréchet differentiability.

a) Let's observe that

$$I(x) = f(x'(a)), \qquad x \in C^1([a, b]), \qquad (7.25)$$

where f : R → R is given by f(t) = (1 + t^2)^{1/2}.

b) It's clear that f ∈ C^∞(R). Then, for t, γ ∈ R,

$$f(t + \gamma) - f(t) = f'(t)\gamma + \frac{1}{2} f''(t)\gamma^2 + o(\gamma^2).$$

We have that

$$f'(t) = t\left(1 + t^2\right)^{-1/2}, \qquad f''(t) = \left(1 + t^2\right)^{-1/2} - t^2\left(1 + t^2\right)^{-3/2}, \qquad (7.26)$$

so that

$$f(t + \gamma) - f(t) = t\left(1 + t^2\right)^{-1/2}\gamma + \frac{1}{2}\left[\left(1 + t^2\right)^{-1/2} - t^2\left(1 + t^2\right)^{-3/2}\right]\gamma^2 + \gamma^2\, \tilde{\epsilon}(\gamma), \qquad (7.27)$$

where

$$\tilde{\epsilon}(\gamma) \longrightarrow 0, \qquad \text{as } \gamma \longrightarrow 0. \qquad (7.28)$$

c) Let x, h ∈ C^1([a, b]). By (7.27) and (7.25) we have

$$I(x + h) - I(x) = f(x'(a) + h'(a)) - f(x'(a))$$
$$= x'(a)\left(1 + x'^2(a)\right)^{-1/2} h'(a) + \frac{1}{2}\left[\left(1 + x'^2(a)\right)^{-1/2} - x'^2(a)\left(1 + x'^2(a)\right)^{-3/2}\right] h'^2(a) + h'^2(a)\, \tilde{\epsilon}(h'(a)). \qquad (7.29)$$
d) We take the linear part of (7.29). Let's prove that Φ : C^1([a, b]) → R, given by

$$\Phi(y) = x'(a)\left(1 + x'^2(a)\right)^{-1/2} y'(a),$$

is bounded and linear. Φ is clearly linear. So let's prove that Φ is bounded, i.e.,

$$\exists c > 0,\ \forall y \in C^1([a, b]): \quad |\Phi(y)| \le c\|y\|_{C^1}. \qquad (7.30)$$

Let's pick $c > |x'(a)|\left(1 + x'^2(a)\right)^{-1/2}$. Then, for y ∈ C^1([a, b]),

$$|\Phi(y)| = |x'(a)|\left(1 + x'^2(a)\right)^{-1/2} |y'(a)| \le c\, \|y'\|_\infty \le c\, \|y\|_{C^1}.$$

Since y was chosen arbitrarily, we have proved (7.30).

e) Now we take the remainder part of (7.29),

$$g(h) = \frac{1}{2}\left[\left(1 + x'^2(a)\right)^{-1/2} - x'^2(a)\left(1 + x'^2(a)\right)^{-3/2}\right] h'^2(a) + h'^2(a)\, \tilde{\epsilon}(h'(a)).$$

By using (7.28), we have that

$$\frac{|g(h)|}{\|h\|_{C^1}} \le \left[\frac{1}{2}\left|\left(1 + x'^2(a)\right)^{-1/2} - x'^2(a)\left(1 + x'^2(a)\right)^{-3/2}\right| + |\tilde{\epsilon}(h'(a))|\right] \cdot \frac{h'^2(a)}{\|h\|_{C^1}}$$
$$\le \left[\frac{1}{2}\left|\left(1 + x'^2(a)\right)^{-1/2} - x'^2(a)\left(1 + x'^2(a)\right)^{-3/2}\right| + |\tilde{\epsilon}(h'(a))|\right] \cdot \|h\|_{C^1}.$$

By the last and the fact that ∥h∥_{C^1} → 0 implies that |h'(a)| → 0, it follows that g(h) = o(h).

f) By points d) and e) and the arbitrariness of x it follows that I is differentiable and

$$\forall x, h \in C^1([a, b]): \quad I'(x)h = x'(a)\left(1 + x'^2(a)\right)^{-1/2} h'(a).$$
7.3.4. Differentiability of a generalized vector field

Let's recall from our course of Calculus that a mapping f = (f_1, ..., f_m) : Ω ⊆ R^n → R^m is usually referred to as a vector field. We know that the differentiability properties of f are linked to those of f_k, k = 1, ..., m. The following result extends those properties to a much more general setting.

Proposition 7.7 Let (E, ∥·∥) and (F_k, ∥·∥_k), k = 1, ..., p, be normed spaces and f : O ⊆ E → F = F_1 × F_2 × ... × F_p, where O is open. We write f = (f_1, ..., f_p), where f_k : O → F_k is the k-th coordinate mapping. Let a ∈ O. Then f is differentiable at a iff all the coordinate mappings are differentiable at a and, in this case, f'(a) = (f_1'(a), ..., f_p'(a)).
Proof. Let's recall that

$$\|w\| = \sum_{k=1}^{p} \|w_k\|_{F_k}, \qquad w = (w_1, ..., w_p) \in F.$$

Also, for k ∈ {1, ..., p}, the projection mapping P_k, given by P_k(w) = w_k, belongs to L(F, F_k).

1. Let's assume that f_k is differentiable at a ∈ O, k = 1, ..., p. We have to prove that

$$\forall h \in E: \quad a + h \in O \Rightarrow f(a + h) - f(a) = \varphi(h) + \|h\|\epsilon(h), \qquad (7.31)$$

where φ = (f_1'(a), ..., f_p'(a)) and ϵ(h) → 0, as h → 0. Let h ∈ E such that a + h ∈ O. Then

$$f(a + h) - f(a) = (f_1(a + h) - f_1(a), ..., f_p(a + h) - f_p(a)) = \left(f_1'(a)h + \|h\|\, \epsilon_1(h), ..., f_p'(a)h + \|h\|\, \epsilon_p(h)\right) = \varphi(h) + \|h\| \cdot \epsilon(h),$$

where

$$\epsilon_k(h) \longrightarrow 0, \quad \text{as } h \longrightarrow 0, \qquad k = 1, 2, ..., p. \qquad (7.32)$$

Point (7.32) implies that ϵ(h) → 0, as h → 0. Since h was chosen arbitrarily, it remains to prove that φ ∈ L(E, F), i.e.,
p X
∥φ(y)∥F ≤ c∥y∥E .
(7.33)
∥fk′ (a)∥. Then, for y ∈ E, we have that
k=1
∥φ(y)∥F = ∥(f1′ (a), ..., fp′ (a)) · y∥F =
p X
∥fk′ (a)y∥Fk
k=1
≤
p X
! ∥fk′ (a)y∥
· ∥y∥E ≤ c∥y∥E .
k=1
Since y was chosen arbitrarily, we have proved (7.33). 2. Let’s assume that f is differentiable at a. Let k ∈ {1, ..., p}. Let h ∈ E such that a + h ∈ O. We have that fk (a + h) − fk (a) = (Pk ◦ f )(a + h) − (Pk ◦ f )(a) = Pk (f (a + h) − f (a)) = Pk (f ′ (a)h + ∥h∥ · ϵ(h)) = (Pk ◦ f ′ )(a)h + ∥h∥ (Pk ◦ ϵ)(h). Since Pk ∈ L(F, Fk ), we have that Pk ◦ f ′ (a) ∈ L(E, Fk ) and ϵk (h) = (Pk ◦ ϵ)(h) −→ 0, as h −→ 0. The arbitrariness of h proves that fk is differentiable at a and fk′ (a) = Pk ◦ f ′ (a),
k = 1, ..., p. ■
7.3. The differential
277
7.3.5. Examples Example 7.4 Let E and F be normed spaces and f : E −→ F such that ∥f (u)∥ ≤ ∥u∥2 .
∀u ∈ E :
(7.34)
Let’s show that f is differentiable at a = 0. Point (7.34) implies that f (0) = 0, as well as ∥f (h)∥ ∥h∥2 0 ≤ lim ≤ = 0, h→0 ∥h∥ ∥h∥ so that f (h) = o(h) and, therefore, we get f (h) = f (h) − f (0) = Oop (h) + o(h), where Oop denotes the zero operator from E into F . Then f is differentiable at a = 0 and f ′ (0) = Oop . Example 7.5 Let b ∈ Rn , A ∈ Mn (R) symmetric and f : Rn −→ R, given by f (x) =
1 t x Ax − bt x. 2
√ Let’s equip Rn with the Euclidean norm, given by |x|2 = xt x. Let’s show that f is differentiable. Let’s first observe that f = η + ν, where η, ν : Rn → R are given 1 by η(x) = xt Ax and ν(x) = −bt x. Since our framework is finite-dimensional, 2 the linearity of ν immediately implies that ν ∈ (Rn )′ and, by Proposition 7.3, ν ′ (a) = ν.
∀a ∈ Rn :
(7.35)
Therefore, having in consideration Proposition 7.5, it remains to show that η is differentiable. Let a ∈ Rn , generic. We have, for h ∈ Rn , that 1 η(a + h) − η(a) = at Ah + ht Ah. 2 ′ ′ ′ Then ηG (a) ∈ (Rn ) is given by ηG (a)h = at Ah. Recalling that Mn (R) ∼ = L(Rn ) and using (3.6), we get 1 t 1 t t 1 h Ah = (A h) h = (Ah)t h ≤ 1 |Ah|2 |h|2 ≤ |A|2 |h|22 , 2 2 2 2
1 t h Ah = o(h). Since a was chosen arbitrarily, we have proved that η is 2 differentiable and this, together with (7.35), shows that f is differentiable and so that
f ′ (a) = at A − bt . Example 7.6 Consider the functional J : C([a, b]) −→ R, given by Z J(u) = a
b
u2 (t)dt.
278
Chapter 7. Calculus on normed spaces
Let’s show that J is differentiable. By adjusting a bit the result of Example 7.4, we have that J is differentiable at u = 0. Let u, h ∈ C([a, b]) with u ̸= 0, generic. Then Z Z b
b
J(u + h) − J(u) = 2
h2 (t)dt.
u(t)h(t)dt + a
a
1. Let’s consider the linear functional ϕ : C([a, b]) −→ R given by b
Z ϕ(v) = 2
u(t)v(t)dt. a
′
Let’s show that ϕ ∈ [C([a, b])] , i.e., ∃c > 0, ∀v ∈ C([a, b]) :
|ϕ(v)| ≤ c∥v∥∞ .
(7.36)
Let’s take c = 2∥u∥∞ (b − a) > 0. Let v ∈ C([a, b]), generic. Then Z Z b b |u(t)v(t)|dt ≤ ∥u∥∞ (b−a)∥v∥∞ = c∥v∥∞ . |ϕ(v)| = 2 u(t)v(t)dt ≤ 2 a a Since v was chosen arbitrarily, we have proved (7.36). As a consequence, we ′ have that ϕ ∈ [C([a, b])] . 2. Let’s write Z b g(h) = h2 (t)dt. a
Then, ∥h∥2∞ (b − a) |g(h)| ≤ lim = 0, h→0 h→0 ∥h∥∞ ∥h∥∞ lim
so that g(h) = o(h). Since u was chosen arbitrarily, we have shown that J is differentiable. Actually we have that Z b ∀u, h ∈ C([a, b]) : J ′ (u)h = 2 u(t)h(t)dt. a
Example 7.7 (An integrand depending on time and position) Let’s consider a functional that obtains its values by using integration of a function whose arguments could be interpreted as time t and position u(t). Let f ∈ C1 ([a, b] × R) with formula f (t, s) and consider the functional J : C([a, b]) −→ R, given by Z J(u) =
b
f (t, u(t))dt. a
Let’s show that J is differentiable and that ∀u, h ∈ C([a, b]) :
′
Z
J (u)h = a
b
∂f (t, u(t)) h(t)dt. ∂s
(7.37)
7.3. The differential
279
Proof. 1. Since f ∈ C1 ([a, b] × R) we have, for t ∈ [a, b], s, γ2 ∈ R and γ1 ∈ R such that t + γ1 ∈ [a, b], the Taylor expansion ∂f ∂f (t, s) γ1 + (t, s) γ2 + o(|(γ1 , γ2 )|), ∂t ∂s
f (t + γ1 , s + γ2 ) − f (t, s) = so that
f (t, s + γ2 ) − f (t, s) =
∂f (t, s) γ2 + o(|γ2 )|). ∂s
(7.38)
2. Let u, h ∈ C([a, b]). We have, by using (7.38), Z
b
J(u + h) − J(u) =
[f (t, u(t) + h(t)) − f (t, u(t))] dt a
Z
b
= a
∂f (t, u(t)) h(t) + |h(t)| · ϵ˜(h(t)) dt, ∂s
where ϵ˜(γ) −→ 0,
as γ −→ 0.
(7.39)
3. Let’s consider the functional Φ : C([a, b]) −→ R, given by Z
b
Φ(y) = a
∂f (t, u(t)) y(t) dt. ∂s
Let’s prove that Φ ∈ C([a, b])′ . Since Φ is linear, we have to show that ∃c > 0, ∀y ∈ C([a, b]) :
|Φ(y)| ≤ c∥y∥∞ .
(7.40)
b
∂f We pick c > ∂s (t, u(t)) dt. Then, for y ∈ C([a, b]) it holds a Z b ∂f |Φ(y)| = (t, u(t)) y(t) dt a ∂s Z b ∂f ≤ ∥y∥∞ · ∂s (t, u(t)) dt ≤ c∥y∥∞ . a Z
Since y was chosen arbitrarily, we have proved (7.40). 4. Let’s consider Z
b
|h(t)| ϵ˜(h(t)) dt.
g(h) = a
Let’s prove that lim
h→0
|g(h)| = 0, i.e., ∥h∥∞
∀µ > 0, ∃α > 0 :
∥h∥∞ < α ⇒
|g(h)| < µ. ∥h∥∞
(7.41)
280
Chapter 7. Calculus on normed spaces Let’s recall that (7.39) means that ∀ν > 0, ∃β > 0 :
|γ| < β ⇒ |˜ ϵ(γ)| < ν.
Let µ > 0 and ν = µ/(b − a). Then, by having in mind that ∀t ∈ [a, b] :
|h(t)| ≤ ∥h∥∞ ,
we get, for ∥h∥∞ < α = β(ν), Z 1 b |g(h)| |h(t)| ϵ˜(h(t)) dt = ∥h∥∞ ∥h∥∞ a Z b Z b µ ≤ |˜ ϵ(h(t))| dt ≤ dt = µ. b−a a a Since µ was chosen arbitrarily, we have proved (7.41). 5. From iii) and iv) and the arbitrariness of u, it follows that J is differentiable as well as (7.37). ■ Example 7.8 Let α, β, γ ∈ R, f ∈ C([a, b]) and X = C2 ([a, b]). Consider the functional J : X −→ R, given by Z b β γ α 2 I(u) = u (t) + u′4 (t) + u′′2 (t) + f (t)u(t) dt. 2 4 2 a Let’s determine if I is differentiable. Proof. Let u, h ∈ X. We have Z α b I(u + h)−I(u) = 2u(t)h(t) + h2 (t) dt 2 a Z b ′3 β + 4u (t)h′ (t) + 6u′2 (t)h′2 (t) + 4u′ (t)h′3 (t) + h′4 (t) dt 4 a Z Z b γ b ′′ ′′ ′′2 + 2u (t)h (t) + h (t) dt + f (t)h(t) dt. (7.42) 2 a a 1. Let’s consider the linear part of (7.42). So let’s define Φ : X −→ R by Z
b
Φ(y) =
αu(t)y(t) + βu′3 (t)y ′ (t) + γu′′ (t)y ′′ (t) + f (t)y(t) dt.
a
Let’s prove that Φ ∈ X ′ , i.e., ∃M > 0, ∀y ∈ X : For Z M> a
|Φ(y)| ≤ M ∥y∥C 2 .
b
|α||u(t)| + |β||u′3 (t)| + |γ||u′′ (t)| + |f (t)| dt
(7.43)
7.3. The differential
281
and any y ∈ X, it holds b
Z
|α||u(t)| + |β||u′3 (t)| + |γ||u′′ (t)| + |f (t)| dt
|Φ(y)| ≤ ∥y∥C 2 · a
≤ M ∥y∥C 2 , so that (7.43) is verified. 2. Let’s consider the nonlinear part of (7.42): b
Z
g(h) = a
γ β ′2 α 2 h (t) + 6u (t)h′2 (t) + 4u′ (t)h′3 (t) + h′4 (t) + h′′2 (t) 2 4 2
dt.
Therefore |g(h)| ≤
∥h∥2C 2
Z a
b
|γ| |α| |β| ′2 + 6|u (t)| + 4|u′ (t)| · ∥h∥C 2 + ∥h∥2C 2 + 2 4 2
dt,
whence g(h) = o(h), in X. 3. From a), b) and the arbitrariness of u it follows that I is differentiable on X and, for u, h ∈ X, Z
′
b
I (u)h =
αu(t)h(t) + βu′3 (t)h′ (t) + γu′′ (t)h′′ (t) + f (t)h(t) dt.
a
■ Example 7.9 (A Gateaux-friendly functional) This example is very instructive. Here we deal with a functional whose directional derivatives and Gateaux differential are easy to compute. At the same time, Fr´echet differentiability holds but it’s tricky to prove it. So, let’s consider the functional J : C1 ([0, π]) −→ R, given by Z π J(u) = u′ (t) sin(u(t)) dt. 0
We shall 1) analize the existence of directional derivatives of J; 2) analize the Gateaux differentiability of J; 3) analize the Fr´echet differentiability of J. Proof. 1. For u, h ∈ C1 ([0, π]) and λ ∈ R, we have that Z J(u + λh) =
π
[u′ (t) + λh′ (t)] · sin (u(t) + λh(t)) dt,
0
d J(u + λh) = dλ Z π
{h′ (t) sin(u(t) + λh(t)) + [u′ (t) + λh′ (t)] cos(u(t) + λh(t)) · h(t)} dt,
= 0
282
Chapter 7. Calculus on normed spaces so that ∂h J(u) =
d J(u + λh)|λ=0 dλ Z π
[sin(u(t)) · h′ (t) + u′ (t) cos(u(t)) · h(t)] dt.
=
(7.44)
0
Therefore, for every point u ∈ C1 ([0, π]) and direction h ∈ C1 ([0, π]), there exists ∂h J(u) and it’s given by (7.44). 2. Let u ∈ C1 ([0, π]). By considering (7.44), we define Φ : C1 ([0, π]) −→ R by Z π Φ(y) = [sin(u(t)) · y ′ (t) + u′ (t) cos(u(t)) · y(t)] dt. (7.45) 0
Φ is clearly linear. Let’s prove that Φ ∈ C1 ([0, π])′ , i.e., ∃c > 0, ∀y ∈ C1 ([0, π]) :
|Φ(y)| ≤ c∥y∥C 1 .
(7.46)
Let’s pick c = π · (1 + ∥u′ ∥∞ ). Let y ∈ C1 ([0, π]). Then, by using (7.45), Z π |Φ(y)| ≤ [| sin(u(t))| · |y ′ (t)| + |u′ (t)| | cos(u(t))| · |y(t)|] dt 0 Z π ≤ [∥y ′ ∥∞ + ∥u′ ∥∞ · ∥y∥∞ ] dt ≤ ∥y∥C 1 (1 + ∥u′ ∥∞ ) · π = c∥y∥C 1 . 0
Since y was chosen arbitrarily, we have proved (7.46). Then, for every u, h ∈ C1 ([0, π]), Z π ′ JG (u)h = ∂h J(u) = [sin(u(t)) · h′ (t) + u′ (t) cos(u(t)) · h(t)] dt. 0
3. Let u, h ∈ C1 ([0, π]). We have that sin(x0 + ϵ) − sin(x0 ) = cos(x0 ) ϵ + |ϵ| · w(ϵ), where w(ϵ) −→ 0, as ϵ −→ 0. Then, Z π J(u + h)−J(u) = [u′ (t) cos(u(t)) h(t) + u′ (t) |h(t)| w(h(t))] dt 0 Z π + [sin(u(t)) h′ (t) + cos(u(t)) h(t) h′ (t)] dt 0 Z π + [h′ (t) |h(t)| w(h(t))] dt. (7.47) 0
a) The linear part of (7.47) corresponds to Φ(h), where Φ was already shown to belong to C1 ([0, π])′ , see point (7.45). b) The nonlinear part of (7.47) is Z π [u′ (t) |h(t)| w(h(t)) + cos(u(t)) h(t) h′ (t) g(h) = 0
+ h′ (t) |h(t)| w(h(t))] dt.
7.3. The differential
283
Then, π
Z
[∥h∥∞ ∥u′ ∥∞ |w(h(t))| + ∥h∥∞ ∥h′ ∥∞
|g(h)| ≤ 0
+ ∥h′ ∥∞ ∥h∥∞ |w(h(t))|] dt.
(7.48)
In (7.48) it’s quite clear that Z π 1 [∥h∥∞ ∥h′ ∥∞ + ∥h′ ∥∞ ∥h∥∞ |w(h(t))|] dt = 0, lim h→0 ∥h∥C 1 0 so that, to prove that g(h) = o(h) in C1 ([0, π]), we have to show that Z π lim |w(h(t))| dt = 0, in C1 ([0, π]), h→0
0
i.e., ∀α > 0, ∃δ˜ > 0 :
∥h∥C 1
< δ˜ ⇒
Z
π
|w(h(t))| dt < α.
(7.49)
0
We know that ∀γ > 0, ∃δ = δ(γ) > 0 :
|ϵ| < δ ⇒ |w(ϵ)| < γ.
(7.50)
˜ Let α > 0. We choose γ = α/π. Then, for δ˜ = δ(γ) and ∥h∥C 1 < δ, it follows that Z π Z π |w(h(t))| dt < γ dt = γπ = α. 0
0
Since α was chosen arbitrarily, we have proved (7.49) c) By points a) and b), J is differentiable and, for every u, h ∈ C1 ([0, π]), Z π ′ J (u)h = [sin(u(t)) · h′ (t) + u′ (t) cos(u(t)) · h(t)] dt. 0
■
7.3.6. Problems Problem 7.6 1. Prove Proposition 7.3. 2. Prove that the reciprocal of Proposition 7.3 is not true, i.e., find an example of a non-constant function which has a null differential. Problem 7.7 Let E and F be normed spaces and f ∈ L(E, F ). Prove that f ′ (u) = f , for every u ∈ E. Problem 7.8 Redo Example 7.5 by using (7.8). What happens if the Euclidean norm is replaced by other norm on Rn ? Is the functional f still differentiable?
284
Chapter 7. Calculus on normed spaces
Problem 7.9 Let’s consider the normed space (Mn (R), ∥ · ∥) where ∥ · ∥ is given in Example 7.3. Prove that the operator T : Mn (R) −→ Mn (R), given by T (x) = X t X, is differentiable and find the formula for the differential. Problem 7.10 Let’s consider the function f : R2 −→ R, given by ! ! 3 3 x 0 x − y , if ̸= , x f = x2 + y 2 y 0 y 0, otherwise. 1. Prove that f has all the partial derivatives. 2. Prove that f is Gateaux differentiable. 3. Prove that f is not differentiable at a =
0 0
.
Problem 7.11 Let f : O ⊆ (E, ∥ · ∥1 ) −→ (F, ∥ · ∥2 ) differentiable at a ∈ O, where O is ∥ · ∥1 -open. Assume also that ∥ · ∥3 and ∥ · ∥4 are other norms on E and F , respectively. 1. Prove that if ∥ · ∥1 ∼ ∥ · ∥3 and ∥ · ∥2 ∼ ∥ · ∥4 , then f : O ⊆ (E, ∥ · ∥3 ) → (F, ∥ · ∥4 ) is also differentiable at a and the differential is the same. 2. Assume that ∥ · ∥1 is dominated by ∥ · ∥3 and that ∥ · ∥2 is dominated by ∥ · ∥4 . Is f : O ⊆ (E, ∥ · ∥3 ) → (F, ∥ · ∥4 ) still differentiable at a? 3. Assume that ∥ · ∥1 dominates ∥ · ∥3 and that ∥ · ∥2 dominates ∥ · ∥4 . Is f : O ⊆ (E, ∥ · ∥3 ) → (F, ∥ · ∥4 ) still differentiable at a? Problem 7.12 Let (X, ∥ · ∥) and (Yk , ∥ · ∥k ), k = 1, ..., p, be normed spaces and T : O ⊆ X → Y = Y1 × Y2 × ... × Yp , where O is open. We write T = (T1 , ..., Tp ), where Tk : O ⊆ X → Yk is the k-th coordinate mapping. Let u ∈ O. Prove that T is differentiable at u iff all the coordinate mappings are differentiable at u and that, in this case, T ′ (u) = (T1′ (u), ..., Tp′ (u)). Problem 7.13 Consider the functional J : C([0, 1]) −→ R, given by 1
Z
u2 (t)dt.
J(u) = 0
Consider u0 , hα ∈ C([0, 1]), given by u0 (t) = 2t and hα (t) = αt2 , α ∈ R. Compare the values of J(u0 + hα ) − J(u0 ) and J ′ (u0 )hα , for α ∈ {1, −0.1, 0.01}. Problem 7.14 Consider the functional J : C([0, 1]) → R, given by Z J(u) =
1
t u3 (t)dt.
0
1. Prove that J is differentiable at u0 ∈ C([0, 1]) where u0 (t) = et . 2. Consider hα ∈ C([0, 1]), given by hα (t) = αt, α ∈ R. Compare the values of J(u0 + hα ) − J(u0 ) and J ′ (u0 )hα , for α ∈ {1, 0.1, 0.01}.
7.3. The differential
285
3. Is J differentiable? Problem 7.15 Analize the differentiablity of the functional J : X −→ R. 1. 2. 3. 4.
X X X X
= C([a, b]), J(u) = u(a); = C1 ([a, b]), J(u) = u′ (a); p = C1 ([a, b]), J(u) = 1 + u′2 (a); = C([a, b]), J(u) = |u(a)|.
Problem 7.16 Let E be a normed space and J : E −→ R differentiable. Prove that Q : E −→ R, given by Q(u) = J2(u) is also differentiable, and find the formula for the differential of Q. Problem 7.17 Let γ ∈ C2 ([a, b] × R) with formula γ(t, s) and consider the functional I : C([a, b]) −→ R, given by Z b I(u) = γ(t, eu(t) )dt. a
Prove that I is differentiable and find the formula of I ′ (x)h for x, h ∈ C([a, b]). Problem 7.18 Consider the functional J : X −→ R. 1. Find the Gateaux diffferential of J at a generic point u ∈ X. 2. Is J differentiable? Z X = C([a, b]),
J(u) =
b
[t + u(t)] dt; a
X = C1 ([a, b]),
b
Z
2 y (t) + y ′2 (t) dt;
J(y) = a
1 J(x) = x (0) + t x(t) + x′2 (t) dt; Z π 0 2 X = C ([0, π]), J(u) = u′′ (t) cos(u(t)) dt. 1
X = C ([0, 1]),
Z
2
0
Problem 7.19 (*) Let a, b, c ∈ R with a < b, and ϕ ∈ C([a, b]) Consider the functional J : H3 (]a, b[) −→ R, given by Z t1 b a c ′′2 x (t) + x′4 (t) + x2 (t) + ϕ(t)u(t) dt. J(x) = 2 4 2 t0 Here H3 (]a, b[) is a Sobolev space. Is well defined the functional J? In that case, is it differentiable? If so, find the formula for J ′ (x). Hint. First suppose that Dom(J) = C2 ([a, b]). Then use the following result. Let Ω ⊆ RN be open, bounded with boundary of class C 1 . Assume that m − N/p ∈ ]0, +∞[\N. Then Wm,p (Ω) ⊆ Ck (Ω) continuously, where k = [m − N/p]. [d] denotes the integer part of d ∈ R. Problem 7.20 Consider the functional J : X −→ R.
286
Chapter 7. Calculus on normed spaces
1. Find the Gateaux diffferential of J at a generic point u ∈ X. 2. Is J differentiable? Z
1
X = C ([0, 1]), X = C1 ([a, b]),
1
J(u) = 0 Z b
J(y) =
p
1 + u′2 (t) dt;
p y(t) 1 + y ′2 (t) dt;
a
X = C1 ([0, 1]), X ⊆ C1 ([0, 1]),
Z 1 t x(t) + x′2 (t) dt; J(x) = x2 (0) + 0 Z 1s 1 + y ′2 (t) J(u) = dt. y(t) 0
Problem 7.21 Consider the functional J : C1 ([0, 1]) × C1 ([0, 1]) → R, given by Z J(u, v) = 0
1
1 2 ′ ′ u (t)v (t) + v (t) − αu(t) dt, 2
α ∈ R.
1. Find the Gateaux diffferential of J at a generic point. 2. Is J differentiable? Problem 7.22 Let’s recall that on Mn (R) the canonical basis is C = {eij / i, j ∈ In }, where for each i, j ∈ I: eij = (δki δjl )k,l∈In . 1. Show that ∂eij det(X) = γij (X), where γij (X) denotes the ij-cofactor of X. 2. Prove that det : Mn (R) → R is differentiable.
7.4. Chain rule. Class C1 . 7.4.1. Chain rule Along this section we shall assume that E, F and G are normed spaces, O ⊆ E open, U ⊆ F open and that the operators f : O ⊆ E −→ F and g : U ⊆ F −→ G verify f (O) ⊆ U, so that there exists g ◦ f :O −→ G u 7−→ (g ◦ f )(u) = g(f (u)).
Theorem 7.4 (Chain rule) Assume that f (O) ⊆ U, f is differentiable at u ∈ O and g is differentiable at f (u). Then g ◦ f is differentiable at u and (g ◦ f )′ (u) = g ′ (f (u)) ◦ f ′ (u).
(7.51)
7.4. Chain rule. Class C1 .
287
Proof. Since f is differentiable at u, we have that f (u + h) = f (u) + f ′ (u)h + ∥h∥ ϵ1 (h),
(7.52)
for all h ∈ E such that u + h ∈ O, with lim ϵ1 (h) = 0.
h→0
(7.53)
Since g is differentiable at b = f (u), we have that g(b + k) = g(b) + g ′ (b)k + ∥k∥ ϵ2 (k),
(7.54)
for all k ∈ F such that b + k ∈ U, with lim ϵ2 (k) = 0.
k→0
(7.55)
We have to prove that (g ◦ f )(u + h) = (g ◦ f )(u) + g ′ (f (u)) ◦ f ′ (u)h + o(h), i.e., (g ◦ f )(u + h) = g(b) + g ′ (b)f ′ (u)h + o(h),
(7.56)
for all h ∈ E such that u + h ∈ O. Let h ∈ E such that u + h ∈ O, generic. From (7.52) and (7.54) it follows that (g ◦ f )(u + h) = g(f (u + h)) = g (b + f ′ (u)h + ∥h∥ϵ1 (h)) = g(b) + g ′ (b) k + ∥k∥ ϵ2 (k), = g(b) + g ′ (b) f ′ (u)h + w(h), where k = k(h) = f ′ (u)h + ∥h∥ ϵ1 (h), w(h) = ∥h∥ g ′ (b) ϵ1 (h) + ∥k∥ ϵ2 (k). Now, by (7.53) and (7.55) it follows that lim k(h) = 0 and
h→0
lim
h→0
∥w(h)∥ = 0, ∥h∥
so that w(h) = o(h). We conclude by the arbitrariness of h because g ′ (f (u))◦f ′ (u) clearly belongs to L(E, G). ■ Example 7.10 If in Theorem 7.4, E = Rn , F = Rm , and G = Rs , then the chain rule, (7.51), becomes Jg◦f (u) = Jg (f (u)) Jf (u), where Jw (b) denotes the Jacobian matrix of w = (w1 , ..., wp ) : Ω ⊆ Rp −→ Rq at the point b: ∂wi Jw (b) = (b) . ∂xj Remark 7.7 Let O be an open subset of the normed space E, U an open subset of the normed space F and f : O −→ U bijective and differentiable. Then we say that f is a diffeomorphism iff f −1 is also differentiable.
288
Chapter 7. Calculus on normed spaces
Corollary 7.1 (Differential of the inverse) Let O be an open subset of the normed space E, U an open subset of the normed space F and f : O −→ U a diffeomorphism. Then for every x ∈ O, f ′ (x) is an isomorphism and ′ f −1 (f (x)) = f ′ (x)−1 .
(7.57)
The proof of this result is required as an exercise at the end of the chapter.
7.4.2. Mappings of class C1 Let E and F be normed spaces and O ⊆ E open. A function f : O ⊆ E −→ F belongs to the class C1 (O, F ) iff it’s differentiable and the function f ′ : O ⊆E −→ L(E, F ) x 7−→ f ′ (x) is continuous. In this case, if there is no confusion, we simply say that f is of class C1 . Remark 7.8 It’s quite clear that C1 (O, F ) is a linear space. Moreover, whenever F is a commutative algebra with unit (e.g. F = R), then so is C1 (O, F ). Remark 7.9 (C1 in finite dimension) Let Ω ⊆ Rn open and f : Ω −→ R such that all its partial derivatives are defined and continuous on Ω. Then f ∈ C1 (Ω). Example 7.11 (Proving C 1 — an easy situation) In Example 7.6 we showed that the functional J : C([a, b]) −→ R, given by Z b J(u) = u2 (t)dt, a
is differentiable and that, for every u, h ∈ C([a, b]), Z b J ′ (u)h = ⟨J ′ (u), h⟩ = 2 u(t)h(t)dt. a
Let’s prove that J is of class C , i.e., that the mapping J ′ : C([a, b]) → [C([a, b])]′ is continuous. Since J ′ is linear, we have to prove that 1
∃c > 0, ∀u ∈ C([a, b]) :
∥J ′ (u)∥ ≤ c∥u∥∞ ,
(7.58)
where the norm in the left side is the norm in the dual space [C([a, b])]′ . Let’s pick c ≥ 2(b − a) > 0. Let u, v ∈ C([a, b]), generic. Then Z Z b b ′ |u(t)h(t)|dt |J (u)v| = 2 u(t)v(t)dt ≤ 2 a a ≤ ∥u∥∞ ∥v∥∞ (b − a) ≤ c∥u∥∞ ∥v∥∞ . Since v was chosen arbitrarily, the last inequality shows that ∥J ′ (u)∥ ≤ c∥u∥∞ . Since u was chosen arbitrarily, we have proved (7.58).
7.4. Chain rule. Class C1 .
289
Remark 7.10 (Important) In Example 7.11, proving that J ′ is continuous was simple because of its linearity. In most of the interesting cases, this is not the case; proving the continuity of J ′ is normally tricky. See the following example. Example 7.12 (Proving C 1 — a not-so-easy situation) Let V ∈ C∞ ([a, b]) be given. We know that the functional J : C1 ([a, b]) −→ R, given by Z 1 b V (x)u′4 (x)dx, J(u) = 4 a is Fr´echet differentiable and that, for u, h ∈ C∞ ([a, b]), Z b ′ ⟨J (u), h⟩ = V (x)u′3 (x)h(x)dx. a
Let’s prove that J is of class C , i.e., that J ′ is continuous. Let u0 ∈ C1 ([a, b]), generic. We have to prove that 1
∀ϵ > 0, ∃δ = δ(u0 , ϵ) > 0 :
∥u−u0 ∥C 1 < δ ⇒ ∥J(u)−J(u0 )∥(C 1 )′ < ϵ. (7.59)
Let’s observe that the last inequality in (7.59) is equivalent to ∀h ∈ C1 ([a, b]) :
|⟨J(u) − J(u0 ), h⟩| < ϵ∥h∥C 1 .
Let ϵ > 0, generic. We choose δ = δ(u0 , ϵ) > 0 such that ∥V ∥∞ δ (δ + ∥u0 ∥C 1 )2 + ∥u0 ∥C 1 (δ + ∥u0 ∥C 1 + ∥u0 ∥2C 1 ) < ϵ.
(7.60)
(7.61)
1
Let h, u ∈ C ([a, b]). Then, Z b ′3 ′3 |⟨J(u) − J(u0 ), h⟩| = V (x) u (x) − u0 (x) h(x)dx a Z b ≤ ∥V ∥∞ ∥h∥C 1 |u′ (x) − u′0 (x)| · |u′2 (x) + u′ (x)u′0 (x) + u′2 0 (x)|dx a
Z ≤ ∥V ∥∞ ∥h∥C 1 ∥u − u0 ∥C 1
b
|u′2 (x) + u′ (x)u′0 (x) + u′2 0 (x)|dx.
(7.62)
a
Now let’s assume that ∥u − u0 ∥C 1 < δ. For x ∈ [a, b], we have that |u′2 (x)| ≤ ∥u∥2C 1 = ∥(u − u0 ) + u0 ∥2C 1 2
2
≤ [∥u − u0 ∥C 1 + ∥u0 ∥C 1 ] < (δ + ∥u0 ∥C 1 ) , ′
|u
(x)u′0 (x)|
≤ ∥u∥C 1 ∥u0 ∥C 1 ≤ ∥u0 ∥C 1 (∥u − u0 ∥C 1 + ∥u0 ∥C 1 ) ≤ ∥u0 ∥C 1 (δ + ∥u0 ∥C 1 )
|u′2 0 (x)|
(7.63)
≤
∥u0 ∥2C 1
(7.64) (7.65)
From (7.62), (7.63), (7.64) and (7.65), we get |⟨J(u) − J(u0 ), h⟩| ≤ ∥h∥C 1 ∥V ∥∞ δ[(δ + ∥u0 ∥C 1 )2 + ∥u0 ∥C 1 (δ + ∥u0 ∥C 1 ) + ∥u0 ∥2C 1 ] ≤ ϵ∥h∥C 1 . Since h was chosen arbitrarily, we have proved (7.60). Since ϵ was chosen arbitrarily, we have proved (7.59). We conclude by the arbitrariness of u0 .
290
Chapter 7. Calculus on normed spaces
7.4.3. Problems Problem 7.23 Prove Corollary 7.1.
Problem 7.24 Let I ⊆ R open, O ⊆ Rm , f : I −→ Rm and g : O −→ R such that f (I) ⊆ O. Assume that a ∈ I and the existence of f ′ (a) and g ′ (f (a)). Show that m
X ∂g dfk d (g ◦ f )(t) = (f (t)) (t). dt ∂xk dt k=1
Problem 7.25 Let O be an open subset of the normed space E. Prove that C1 (O) is a commutative algebra with unity.
Problem 7.26 Let Ω ⊆ Rn open and f : Ω −→ R. 1. (*) Let a ∈ Ω. Prove that if there is some r > 0 such that for all k ∈ In , ∂f is defined on B(a, r) and it’s continuous at a, then f is differentiable ∂xk at a. 2. Prove that if all the partial derivatives of f are defined and continuous on Ω, then f ∈ C1 (Ω).
Problem 7.27 Let E be a commutative Banach algebra with unity. 1. Prove that q : E −→ E, given by q(u) = u2 is differentiable and find the formula for the differential. Is q of class C1 ? 2. Prove that for every n ∈ N, pn : E → E, given by pn (u) = un is differentiable and find the formula for the differential. Is pn of class C1 ? 3. Try to prove that the exponential function, exp : E −→ E given by
exp(u) = eu =
+∞ X 1 n u , n! n=0
is of class C1 . Find the formula for the differential. 4. Define the functions sin : E −→ E and cos : E −→ E. Try to prove that sin ∈ C1 (E, E). Find the formula for the differential of sin.
Problem 7.28 Let α > 0. We know that the functional J : X −→ R, is differen-
7.5. Critical points and extremum
291
tiable. Determine if J is of class C 1 . 1
X = C ([0, 1]) × C ([0, 1]), Z 1 X = C ([0, π]), J(u) = X = C1 ([a, b]), X = C1 ([0, 1]), X = C1 ([1, 2]),
J(u, v) = 0
1 2 ′3 ′ u (t)v (t) + v (t) − αu(t) dt; 2
π
u′ (t) sin(u(t))dt;
0 Z b
2 t u (t) + u′2 (t) dt; a Z 1 1 1 J(x) = x2 (0) + t x(t) + 2 e−t x′4 (t) dt; α α 0 Z 2 ′2 J(y) = y (x) − 2x y(x) dx;
J(u) =
Z X = C([a, b]),
1
Z
1
1 b
J(u) =
γ(t, u(t))dt. a
7.5. Critical points and extremum 7.5.1. Definitions Along this section we shall assume that X is a non-void set and f : X −→ R. 1. We say that a ∈ X is a point of (global) minimum iff ∀x ∈ X :
f (a) ≤ f (x).
In this case we say that the value f (a) is the minimum of f . 2. We say that a ∈ X is a point of (global) maximum iff ∀x ∈ X :
f (a) ≥ f (x).
In this case we say that the value f (a) is the maximum of f . 3. A point of either minimum or maximum is called a point of (global) extremum. In this case we say that the value f (a) is an extremum of f . Let (X, τ ) be a topological space, Y ⊆ X and f : Y −→ R. We say that a ∈ Y is a point of local minimum of f iff ∃G ∈ N (a), ∀x ∈ G ∩ Y :
f (a) ≤ f (x).
(7.66)
In this case, we say that f (a) is a local minimum of f . In the same way we speak of local maximum and local extremum. Whenever the topology τ is induced by a metric d on X, then (7.66) can be written as ∃r > 0, ∀x ∈ B(a, r) ∩ Y :
f (a) ≤ f (x).
Remark 7.11 (Strict extremum) Whenever a functional has a unique point of (global or local) minimum, we sometimes say that such a point is the point of (global or local) strict minimum and that its image is the strict (global or local) minimum. In the same way we speak of strict maximum.
292
Chapter 7. Calculus on normed spaces
Example 7.13 1. The function tan :] − π/2, π/2[−→ R has no extremum. However, the restriction of tan to any closed interval has both maximum and minimum. 2. The function sin : R −→ R has an infinite number of points of maximum and minimum. The maximum of sin is 1 and its minimum is −1. 3. The function R ∋ x 7−→ f (x) = x2 has strict minimum and no maximum. Let O be an open subset of the normed space E, and f : O ⊆ X −→ R. We say that x ∈ O is a critical point of f iff f is differentiable at x and f ′ (x) = 0. The set of critical points of f is denoted by K(f ), i.e., K(f ) = {x ∈ O / f ′ (x) = 0}.
7.5.2. Necessary condition for an extremum In Section 4.8 we introduced the concepts of convex set and convex function together with some basic properties. We will use these concepts in this section. Let V be a linear space and a, b ∈ V . The segment joining a and b is the set [a, b] = {x = λa + (1 − λ)b}. Then A ⊆ V is convex iff [x, y] ⊆ A, for every x, y ∈ A. We also denote ]a, b[= [a, b] \ {a, b}. Any subset of a linear space V of the form a + U = {a + u / u ∈ U }, where a ∈ V and U is a linear subspace of V , is called an affine subspace of V . It’s not difficult to prove that any affine space is convex. Theorem 7.5 (Euler’s inequality) Let E be a normed space, O ⊆ E open and f : O ⊆ E −→ R. Assume that 1. X ⊆ O is convex; 2. f |X has a local minimum at x ∈ X; 3. f is differentiable at x. Then, ∀y ∈ X :
f ′ (x)(y − x) ≥ 0.
(7.67)
Before proving Theorem 7.5 let’s give the intuition behind it. By the conditions of Theorem 7.5 we have for x, y ∈ X and y = x + h that 0 ≤ f (x + h) − f (x) = f ′ (x)h + o(h), so that what we need is just to remove the second term in the right side. Proof. Let y ∈ X, generic. Since X is convex we have that ∀λ ∈ [0, 1] :
x + λ(y − x) = (1 − λ)x + λy ∈ X.
7.5. Critical points and extremum
293
Since O is open, we choose λ ∈]0, 1[ small enough so that x + λ(y − x) ∈ O. Then, as f is differentiable at x, we get f (x + λ(y − x)) − f (x) = f ′ (x) (λ(y − x)) + o(λ(y − x)) = λf ′ (x)(y − x) + λ ∥y − x∥ ϵ(λ(y − x)),
(7.68)
where ϵ(h) −→ 0, as h −→ 0. Now, since x is a local minimum of f |X , there is some r > 0 such that, for λ ∈]0, min{r, 1}[, we get, by (7.68), 0≤
f (x + λ(y − x)) − f (x) = f ′ (x)(y − x) + ∥y − x∥ ϵ(λ(y − x)). λ
By letting λ −→ 0 in the last relation we obtain ∂y−x f (x) = f ′ (x)(y − x) ≥ 0. Since y was chosen arbitrarily, we have proved (7.67).
■
Corollary 7.2 (Necessary condition for an extremum (I)) Let E be a normed space, O ⊆ E open and f : O ⊆ E −→ R. Let V a linear subspace of E. Assume that 1. X = a + V , for some a ∈ E; 2. f |O∩X has a local extremum at x ∈ O ∩ X; 3. f is differentiable at x. Then, ∀v ∈ V :
f ′ (x)v = 0.
(7.69)
Proof. See Figure 7.6.
Figure 7.6.: The situation of Corollary 7.2.
Let’s assume that x ∈ X is a point of local minimum for f |O∩X . The case of a point of local maximum is handled in a similar way. Let’s reason by Reduction to Absurdity. Let’s assume that (7.69) is false, i.e., there exists v ∈ V such that f ′ (x)v ̸= 0. We can assume that f ′ (x)v > 0.
(7.70)
294
Chapter 7. Calculus on normed spaces
In fact, if f ′ (x)v < 0 we could take −v instead of v. Let’s choose r > 0 such that B(x, r) ⊆ O. The set X1 = X ∩ B(x, r) ⊆ O is convex. Now we choose 0 < λ 0, ∀t ∈]a, b[:
∥f˙(t)∥F ≤ K.
Then, ∥f (b) − f (a)∥F ≤ K (b − a).
7.6. Mean Value Theorem. Connectedness.
297
For the proof, we just apply Theorem 7.9 with g(t) = Kt.
7.6.4. Proof of the Mean Value Theorem Let’s restate the generalized Mean Value Theorem, Theorem 7.7. Let E and F be normed spaces, O ⊆ E open and f : O −→ F differentiable. If [a, b] ⊆ O, then ∥f (b) − f (a)∥ ≤ sup ∥f ′ (x)∥ · ∥b − a∥. x∈]a,b[
Proof.
[of Theorem 7.7] If sup ∥f ′ (x)∥ = +∞, then the result immediately x∈]a,b[
holds. So let’s assume that sup ∥f ′ (x)∥ < +∞. Let’s consider the curve u : x∈]a,b[
[0, 1] −→ E, given by u(t) = (1 − t)a + bt. It’s clear that u is continuous on [0, 1] and differentiable on ]0, 1[, and that u(0) = a and u(1) = b. Therefore g = f ◦ u ∈ C([0, 1], F ) is differentiable on ]0, 1[. Hence, by the Chain Rule and (7.74), we get, for t ∈]0, 1[, g(t) ˙ = 1 · g(t) ˙ = g ′ (t) 1 = f ′ (u(t)) ◦ u′ (t) 1 = f ′ (u(t)) (b − a). Then, ′ ′ ′ ∥g(t)∥ ˙ F = ∥f (u(t)) (b−a)∥F ≤ ∥f (u(t))∥L(R,F ) ∥b−a∥E ≤ sup ∥f (x)∥L(R,F ) ∥b−a∥E , x∈]a,b[
which, by Corollary 7.4, implies that ∥g(1) − g(0)∥F ≤ sup ∥f ′ (x)∥L(R,F ) ∥b − a∥E (1 − 0), x∈]a,b[
i.e., ∥f (b) − f (a)∥F ≤ sup ∥f ′ (x)∥L(R,F ) ∥b − a∥E . x∈]a,b[
■
7.6.5. An important result that depends on connectedness Grossly speaking, in this section we shall prove that on a connected domain a null differential has to come from a constant function, Theorem 7.12. Let (X, τ ) be a topological space. We say that X is a connected (space) iff ∀ U, V ∈ τ \ {∅} :
U ∪ V = X ⇒ U ∩ V ̸= ∅.
A set S ⊆ X is said to be connected if (S, τS ) is connected, where τS = {S ∩ A / A ∈ τ } is the topology induced on S by τ . If a set is not connected, we say that it is disconnected. Remark 7.13 A topological space X is connected iff the only parts of X that are both open and closed are ∅ and X. This is the case of a normed space.
298
Chapter 7. Calculus on normed spaces
Proposition 7.9 (Characterization of a connected set) Let (X, τ ) be a topological space and S ⊆ X. Then, S is connected iff ∀ U, V ∈ τ :
(U ∪ V ⊇ S ∧ U ∩ S ̸= ∅ ∧ V ∩ S ̸= ∅) ⇒ U ∩ V ∩ S ̸= ∅.
Remark 7.14 As a consequence of Proposition 7.9, an open set S ⊆ X is connected iff ∀ U, V ∈ τ \ {∅} : U ∪ V = S ⇒ U ∩ V ̸= ∅. Therefore an open set is disconnected iff it’s the disjoint union of two non-void open sets. A curve in the topological space X is any continuous mapping φ : [α, β] ⊆ R −→ X. The trace of the curve φ is its image, φ([α, β]) = {φ(t) / t ∈ [α, β]}. A path is a curve γ ∈ C([0, 1], X). If there is a meshing q = (t0 , t1 , ..., tp ) ∈ Rp+1 ,
0 = t0 < t1 < ... < tp = 1,
such that γ restricted to each subinterval [ti , ti+1 ] is affine, then we say that the path γ is polygonal. A set S ⊆ X is said to be arcwise connected iff for every a, b ∈ X there exists a curve φ : [α, β] ⊆ R −→ (S, τS ) such that φ(α) = a and φ(β) = b. It’s easy to prove that a convex subset of a normed space is arcwise connected. Theorem 7.10 (Arcwise connected implies connected) Any arcwise connected set in a topological space is connected. The proof of this result is required as an exercise at the end of the chapter. The next result shows, in particular, that an open connected subset of a normed space is arcwise connected. Theorem 7.11 (Open connected implies arcwise connected) If O is an open connected subset of a normed space E and a, b ∈ O, then there is a polygonal path lying in O joining a and b. For a proof of this result, see e.g. [5, Lemma 3.1]. Theorem 7.12 (Null-differential implies constant) Let O be an open connected subset of a normed space E and f : O ⊆ E −→ R differentiable and such that ∀x ∈ O : f ′ (x) = 0. Then, f is a constant function.
Proof. We have to prove that ∀x, a ∈ O :
f (x) = f (a).
(7.75)
7.6. Mean Value Theorem. Connectedness.
299
Let’s take some a, x ∈ O such that a ̸= x. By Theorem 7.11, there is a polygonal path γ connecting a to x. By Theorem 7.8, we have f (γ(ti+1 )) − f (γ(ti )) = 0,
i = 0, 1, ..., p − 1,
which implies that f (x) = f (a). Since a, x were chosen arbitrarily, we have proved (7.75). ■
7.6.6. Problems Problem 7.31 Let E be a normed space and f : E −→ E differentiable. Assume that ∃κ ∈]0, 1[, ∀x ∈ E : ∥f ′ (x) − IdE ∥L(E) ≤ κ. Prove that f is injective and that the inverse image of a bounded set is bounded. Problem 7.32 Let E and F be normed spaces, O ⊆ E open, a ∈ O and f ∈ C(O, F ) which is differentiable on O \ {a}. Assume that there exists T ∈ L(E, F ) such that lim ∥f ′ (x) − T ∥L(E,F ) = 0. x→a
Show that f is differentiable at a and that f ′ (a) = T . Problem 7.33 Let E and F be normed spaces, O ⊆ E open and f : O → F differentiable. Assume that [a, b] ⊆ O. Prove that ∥f (b) − f (a) − f ′ (a)(b − a)∥F ≤ sup ∥f ′ (x) − f ′ (a)∥L(E,F ) ∥b − a∥E . x∈]a,b[
Hint. Consider the operator ϕ : O → F given by ϕ(x) = f (x) − f ′ (a)x. Problem 7.34 Let E and F be normed spaces, O ⊆ E open and connected, and f : O → F differentiable. Prove that if f ′ = 0, then f is a constant mapping. Hint. Adapt the proof of Theorem 7.12. Problem 7.35 Let E and F be normed spaces, O ⊆ E open and connected, and f : O → F differentiable. Assume that ∃α ∈ L(E, F ), ∀x ∈ O :
f ′ (x) = α.
Prove that there is c ∈ F such that ∀x ∈ O :
f (x) = αx + c.
Problem 7.36 Let (X, τ ) be a topological space. 1. Prove that X is connected iff the only parts of X that are both open and closed are ∅ and X. 2. By using point i), prove that a normed space is connected. Problem 7.37 Prove Proposition 7.9.
300
Chapter 7. Calculus on normed spaces
Problem 7.38 Let E be a normed space and S ⊆ E. Prove that if S is convex, then it’s arcwise connected. Problem 7.39 Let X and Y be topological spaces and f ∈ C(X, Y ). Prove that if A ⊆ X is connected, then f (A) is also connected. Problem 7.40 Let (X, τ ) be a topological space. On X a relation ∼c is defined by x ∼c y ⇐⇒ ∃A ⊆ X connected : x, y ∈ A. 1. Prove that ∼c is an equivalence relation on X. 2. Let x ∈ X. Prove that the equivalence class of x, denoted C(x) is the largest connected set containing x. It’s referred as the connected component of x. 3. Prove that X is connected iff for some (and then for all) x ∈ X, X = C(x). 4. Prove that if X is arcwise connected, then it’s connected. Hint. Take some a ∈ X and prove that X = C(a).
7.7. Some useful lemmas In this section we present a couple of lemmas which are very useful. We write I ([a, b]) = {u ∈ C([a, b]) / u(a) = u(b) = 0} , and, for each k ∈ N, I k ([a, b]) the set of functions u ∈ Ck ([a, b]) such that u(m) (a) = u(m) (b) = 0,
m = 0, ..., k − 1
Lemma 7.1 Let α ∈ C([a, b]). Assume that ∀h ∈ I ([a, b]) :
Z
b
α(x)h(x) dx = 0.
(7.76)
a
Then, α = 0. Proof. Let’s proceed by Reduction to Absurdity. So let’s assume that α ̸= 0, i.e., that there is some c ∈ [a, b] for which α(c) ̸= 0. Without loss of generality let’s assume that α(c) > 0. Since α is continuous, we can find values a ≤ x1 ≤ c ≤ x2 ≤ b such that ∀x ∈ [x1 , x2 ] : α(x) > 0. Now we consider the function h0 : [a, b] −→ R such that ( −(x − x1 )(x − x2 ), if x ∈ [x1 , x2 ], h0 (x) = 0, if x ∈ [a, b] \ [x1 , x2 ]. It’s clear that h0 ∈ I ([a, b]). We have that Z b Z x2 α(x)(x − x1 )(x − x2 ) dx > 0, α(x)h0 (x) dx = − a
x1
which contradicts (7.76). We are done.
■
7.7. Some useful lemmas
301
Lemma 7.2 Let α ∈ C([a, b]). Assume that b
Z
∀h ∈ I 1 ([a, b]) :
α(x)h′ (x) dx = 0.
(7.77)
a
Then, α is a constant function, i.e., ∃c ∈ R, ∀x ∈ [a, b] :
α(x) = c.
(7.78)
Proof. Let’s pick c ∈ R such that Z b [α(x) − c] dx = 0, a
and define h0 : [a, b] −→ R by Z
x
[α(t) − c] dt.
h0 (x) =
(7.79)
a
It’s clear that h0 ∈ I 1 ([a, b]). By (7.77) we have that Z b Z b [α(x) − c] h′0 (x) dx = α(x)h′0 (x) − c[h(b) − h(a)] = 0. a
(7.80)
a
In other hand, by (7.79), we get Z b Z b ′ [α(x) − c] h0 (x) dx = [α(x) − c]2 dx. a
(7.81)
a
From (7.80), (7.81) and the continuity of α, it follows that ∀x ∈ [a, b] :
α(x) − c = 0,
so that (7.78) is true.
■
Lemma 7.3 Let α ∈ C([a, b]). Assume that ∀h ∈ I ([a, b]) : 2
Z
b
α(x)h′′ (x) dx = 0.
a
Then α is an affine function, i.e., ∃c0 , c1 ∈ R, ∀x ∈ [a, b] :
α(x) = c0 + c1 x.
The proof of Lemma 7.3 can be worked out as in the proof of Lemma 7.2. Lemma 7.4 Let α, β ∈ C([a, b]). Assume that ∀h ∈ I 1 ([a, b]) :
Z a
b
[α(x)h(x) + β(x)h′ (x)] dx = 0.
(7.82)
302
Chapter 7. Calculus on normed spaces
Then, β ∈ C1 ([a, b]) and β ′ = α. Proof. Let’s define A : [a, b] −→ R by Z A(x) =
x
α(t) dt.
a
It’s clear that A ∈ C1 ([a, b]). By using integration by parts we get Z b Z b A(x)h′ (x) dx. α(x)h(x) dx = − ∀h ∈ I 1 ([a, b]) : a
a
The last and (7.82) provide b
Z
∀h ∈ I 1 ([a, b]) :
[−A(x) + β(x)] h′ (x) dx = 0.
a
Hence, by Lemma 7.2, it follows that ∃c ∈ R, ∀x ∈ [a, b] :
β(x) − A(x) = c.
Therefore, β ∈ C1 ([a, b]) and we are done.
■
Remark 7.15 Let’s remark that, in Lemma 7.4, the integral condition (7.82) provides both the differentiablity of the function β and a formula to compute β ′ .
7.8. Problems Problem 7.41 (*) Let I = [a, b], E a Banach space and A ∈ C(I, E ′ ) such that Z b ⟨A(t), v(t)⟩dt = 0, a
Z for every v ∈ C(I, E) such that
b
v(t) = 0. Prove that A is constant. a
Hint. Argue by contradiction.
Problem 7.42 Let I = [a, b], E a Banach space and A : I −→ E ′ and .u : I −→ E both differentable at t ∈ I. Then Au = ⟨A(·), u(·)⟩ : I −→ R is differentiable at t and d ˙ A(t)u(t) = A(t)u(t) + A(t)u(t). ˙ dt Problem 7.43 (*) Let I = [a, b], E a Banach space and A, B ∈ C(I, E ′ ). Prove that the following statements are equivalent. 1. B is differentiable and B˙ = A. 2. For every γ ∈ C1 (I, E) such that γ(a) = γ(b) = 0, Z b [A(t)γ(t) + B(t)γ(t)] ˙ dt. a
Hint. Use the result of Problem 7.42.
7.9. Partial differentials
303
7.9. Partial differentials Along this section we shall assume that E1 , ..., En and F are normed spaces and that the product space E = E1 × E2 × ... × En , is endowed with the (product) norm ∥x∥E = max ∥xk ∥Ek , k=1,...,n
where x = (x1 , ..., xn ). Let’s recall that a set Ω ⊆ E is open iff for each k ∈ In , there exists Ωk ⊆ Ek open such that Ω1 × Ω2 × ... × Ωn ⊆ Ω.
7.9.1. Differential in terms of partial differentials Let O ⊆ E be an open set, f : O ⊆ E −→ F and a = (a1 , ..., an ) ∈ E. For each k ∈ In we consider a function fa,k = f (a1 , ..., ak−1 , ·, ak+1 , ..., an ) : Ok ⊆ Ek −→ F, where Ok is an open subset of Ek . If fa,k is differentiable at ak , then we call ′ ∂k f (a) = fa,k (ak ) ∈ L(Ek , F ),
the k-th partial differential of f at a. Proposition 7.10 (Differential in terms of partial differentials) If f : O ⊆ E −→ F is differentiable at a = (a1 , ..., an ) ∈ E, then for every k ∈ In , the function fa,k is differentiable and ∀h = (h1 , ..., hn ) ∈ E :
f ′ (a)h =
n X
∂k f (a)hk .
k=1
Proof. Let’s take h = (h1 , ..., hn ) ∈ E, generic. Let’s observe that, for each k ∈ In , fa,k = f ◦ qa,k , where qa,k : Ek −→ E is given by qa,k (y) = (a1 , ..., ak−1 , y, ak+1 , ..., an ). Since qa,k is affine, it’s differentiable and, therefore, so is fa,k . By the Chain Rule, ′ ∂k f (a)hk = f ′ (a) qa,k hk = f ′ (a) (0, ..., hk , ..., 0).
(7.83)
Therefore, by the linearity of f ′ (a), f ′ (a)h = f ′ (a)
n X
(0, ..., hk , ..., 0) =
k=1
n X
∂k f (a)hk .
k=1
■ Remark 7.16 In the context of Proposition 7.10, if f is differentiable, then ∂k f : O → L(Ek , F ) is called the k-th partial differential (mapping).
304
Chapter 7. Calculus on normed spaces
7.9.2. Partial differentials and the class C1 Given u = (u1 , ..., un ) ∈ E, we shall denote u(0) = (0, ...., 0) and u(k) = (u1 , ..., uk , 0, ..., 0),
k = 1, ..., n.
Therefore, u(n) = u and u(k) − u(k−1) = (0, ..., uk , ..., 0),
k ∈ In ,
and ∀k ∈ In :
∥u(k) ∥E ≤ ∥u∥E .
(7.84)
Theorem 7.13 (Partial differentials and the class C1 ) The function f : O ⊆ E → F belongs to C1 (O, F ) iff it has continuous partial differentials defined on O. Proof. I) Let’s assume that f ∈ C1 (O, F ). Let’s take k ∈ In and a ∈ O, generic. We have to prove that ∂k f is continuous at a, i.e., ∀ϵ > 0, ∃δ > 0 :
∥y∥E < δ ⇒ ∥∂k f (a+y)−∂k f (a)∥L(Ek ,F ) < ϵ. (7.85)
Let ϵ > 0, generic. Let’s observe that for hk ∈ Ek and y ∈ E such that a + y ∈ O, we have, by (7.83), that ∥(∂k f (a + y) − ∂k f (a)) hk ∥F = ∥ (f ′ (a + y) − f ′ (a)) (0, ..., hk , ..., 0)∥F ≤ ∥f ′ (a + y) − f ′ (a)∥L(E,F ) · ∥hk ∥Ek . Since hk was arbitrary, the last implies that ∥∂k f (a + y) − ∂k f (a)∥L(Ek ,F ) ≤ ∥f ′ (a + y) − f ′ (a)∥L(E,F )
(7.86)
Now, by the continuity of f ′ , we choose δ > 0 such that ∥y∥E < δ ⇒ ∥f ′ (a + y) − f ′ (a)∥L(E,F ) < ϵ. Hence, by (7.86) we have that ∥y∥E < δ ⇒ ∥∂k f (a + y) − ∂k f (a)∥L(Ek ,F ) < ϵ. Since ϵ was chosen arbitrarily, we have proved (7.85). We conclude by the arbitrariness of a and k. II) Let’s assume that f has continuous partial differentials defined on O. We have to prove that f ∈ C1 (O, F ), i.e., a) f is differentiable at every point a ∈ O, and b) f ′ : O → L(E, F ) is continuous at every point a ∈ O. Let a ∈ O, generic.
7.9. Partial differentials
305
a) We shall prove that, for h = (h1 , ..., hn ) ∈ E small enough, f (a + h) − f (a) − ϕ(h) = G(h), where ϕ(h) =
n X
∂k f (a)hi , and lim
h→0
k=1
∀ϵ > 0, ∃δ > 0 :
1 G(h) = 0, i.e., ∥h∥
∥h∥ < δ ⇒
∥G(h)∥ < ϵ. ∥h∥
(7.87)
Once this is done we will be able to write f ′ (a)h =
n X
∂k f (a)hi .
(7.88)
k=1
Let ϵ > 0, generic. Let’s choose δ > 0 such that B(a, δ) ⊆ O. By (7.84) it follows that h ∈ B(0, δ) ⇒ a + h(k) ∈ B(a, δ), k ∈ In . i. Let’s define g : B(a, δ) −→ F by g(x) = f (x) − f (a) −
n X
∂k f (a) (xk − ak ).
k=1
Then, g(a) = 0 and, for x ∈ B(a, δ), ∂k g(x) = ∂k f (x) − ∂k f (a).
(7.89)
Observe also that, for h ∈ B(0, δ), G(h) = g(a + h).
(7.90)
ii. For k ∈ In , the continuity of ∂k f allows us pick δk ∈]0, δ[ such that h ∈ B(0, δk ) ⇒ ∥∂k f (a + h) − ∂k f (a)∥L(Ek ,F )
0, ∃δ > 0 : 0 < |h| < δ ⇒ ∥F (x + h) − F (x)∥E < ϵ.
(7.98)
7.10. Riemann integral
313
Let’s assume that f ̸= 0.2 Let x ∈ [a, b] and ϵ > 0, generic. Let’s pick 0 < δ < ϵ/∥f ∥E . Then x+h
Z F (x + h) − F (x) =
x
Z f−
Z
x+h
f=
c
c
f, x
so that ∥F (x + h) − F (x)∥E ≤ |h| ·
∥f (t)∥E ≤ |h| · ∥f ∥E < ϵ.
sup t∈[x,x+h]
Since ϵ and x were chosen arbitrarily, we have proved (7.98). 2. Let’s assume that f s continuous at x0 ∈ [a, b]. We have, for h ̸= 0 small enough, that 1 1 [F (x0 + h) − F (x0 )] − f (x0 ) = h h
Z
x+h
[f (t) − f (x0 )] dt. x
Hence
1
[F (x0 + h) − F (x0 )] − f (x0 ) ≤ sup ∥f (t) − f (x0 )∥E
h
t∈[x,x+h] E which, by letting h −→ 0, produces (7.97). ■ Remark 7.21 As a consequence of Theorem 7.16, if f is continuous, then F˙ = f , i.e., f has a primitive. Moreover, if G is another primitive of f , then ∃u ∈ E, ∀x ∈ [a, b] :
G(x) = u + F (x).
Corollary 7.5 (FThC) Let E be a Banach space. If f ∈ C([a, b], E) and F is a primitive of f , then Z x Z x ∀x ∈ [a, b] : F (x) − F (a) = f= dt f (t). a
a
In [5, Sec. 3.4] it’s built the following generalization of the formula for differentiation under the integral sign: Theorem 7.17 (Differentiation under the integral sign) Let a < b. Let E be a normed space, V a Banach space, O ⊆ E open, and f ∈ C([a, b] × O, V ). Assume that the partial differential ∂2 f is defined and continuous on [a, b] × O. Then, the mapping g : O ⊆ E −→ V , given by Z g(x) =
dt f (t, x), a
2 The
case f = 0 follows trivially. Why?
b
314
Chapter 7. Calculus on normed spaces
belongs to C1 (O, V ) and Z
′
g (x) =
b
dt ∂2 f (t, x). a
7.10.4. Problems Problem 7.49 Prove that B([a, b], E) is a Banach space whenever it’s equiped with the norm given by ∥f ∥ = sup ∥f (x)∥E . x∈[a,b]
Problem 7.50 Prove that Chasle’s Law holds on S([a, b], E) as well as on R([a, b], E). Problem 7.51 Let E be a Banach space and Zf ∈ C([a, b], E). Assume that x c ∈ [a, b] and define F : [a, b] −→ R by F (x) = f . Prove that c
1. F˙ = f ; 2. if G is another primitive of f , then ∃u ∈ E, ∀x ∈ [a, b] :
G(x) = u + F (x).
Problem 7.52 Let E be a Banach space, f ∈ C([a, b], E) and F a primitive of f . Prove that Z x ∀x ∈ [a, b] :
F (x) − F (a) =
f a
Problem 7.53 Let E be a Banach space. Assume that α ∈ C([a, b]) and that f : [a, b] −→ E has derivative at t ∈ [a, b]. Prove that αf has derivative at t and that d (αf )(t) = α(t) f˙(t) + α(t) ˙ f (t). dt
7.11. Taylor expansion. Second order conditions for extremum. 7.11.1. Higher-order differentials Let E and F be normed spaces, O ⊆ E open, a ∈ O and f : O ⊆ E −→ F . Let’s take advantage of the alternative notation D1 f (a) = f ′ (a) ∈ L1 (E, F ), where L1 (E, F ) := L(E, F ). If for some r1 ∈]0, +∞[, f is differentiable on V1 = B(a, r1 ) ⊆ O, then it is well defined the function D1 f : V1 ⊆ E −→ L1 (E, F ) If it exists, the second differential of f at a is D2 f (a) = (D1 f )′ (a)
7.11. Taylor expansion. Second order conditions for extremum.
315
and belongs to L2 (E, F ) := L (E, L1 (E, F )) = L (E, L(E, F )) . If for some r2 ∈]0, r1 [, D1 f is differentiable on V2 = B(a, r2 ) ⊆ V1 ⊆ O, then it is well defined the function D2 f : V2 ⊆ E −→ L2 (E, F ). If it exists, the third differential of f at a is D3 f (a) = (D2 f )′ (a) and belongs to L3 (E, F ) := L(E, L2 (E, F )) = L(E, L (E, L(E, F ))). If for some r3 ∈]0, r2 [, D2 f is differentiable on V3 = B(a, r3 ) ⊆ V2 ⊆ V1 ⊆ O, then it is well defined the function D3 f : V3 ⊆ E −→ L3 (E, F ). Let’s generalize the idea just presented. Let k ∈ N \ {1}. If it exists, the k-th differential of f at a is Dk f (a) = (Dk−1 f )′ (a) and belongs to Lk (E, F ) := L(E, Lk−1 (E, F )). If for some rk ∈]0, rk−1 [, Dk−1 f is differentiable on Vk = B(a, rk ) ⊆ Vk−1 ⊆ ... ⊆ V1 ⊆ O, then it is well defined the function Dk f : Vk ⊆ E −→ Lk (E, F ). Remark 7.22 From what has just been presented it’s clear that, to deal with the “pure” concept of a higher differential of f , one needs to handle an object that lives in a very wild space, Lk (E, F ). This shall be smartly handled, via appropiate isomorphisms, in Section 7.11.3 by using the tools of Section 7.11.2. As a side effect we will recover the usual notation, f ′′ (x), f ′′′ (x), ..., f (k) (x).
7.11.2. Multilinear mappings Let E1 , ..., Ek and F be normed spaces. We equip the product space E = E1 × E2 × ... × Ek with the norm ∥x∥E = ∥(x1 , ..., xk )∥E = max{∥x1 ∥E1 , ..., ∥xk ∥Ek }.
(7.99)
We say that an operator w : E −→ F is a k-linear (or multilinear ) mapping iff it’s linear in each variable, i.e., for every i ∈ Ik and every xj ∈ Ej , j = 1, ..., i − 1, i + 1, ..., k, the function w(x1 , ..., xi−1 , · , xi+1 , ..., xk ) : Ei −→ F,
316
Chapter 7. Calculus on normed spaces
is linear: for all λ ∈ R and all y, z ∈ Ei it holds w(x1 , ..., xi−1 , λy + z, xi+1 , ..., xk ) = λ w(x1 , ..., xi−1 , y, xi+1 , ..., xk ) +w(x1 , ..., xi−1 , z, xi+1 , ..., xk ). Whenever, k = 2, we say that w is a bilinear mapping ; if k = 3, we say that w is a trilinear mapping ; if F = R the term “mapping” is changed for either functional or form. Theorem 7.18 (Characterization of a continuous multilinear mapping) Let E1 , ..., Ek and F be normed spaces, E = E1 × ... × Ek and w : E −→ F a k-linear mapping. Then the following conditions are equivalent: 1. 2. 3. 4.
w is continuous; w is continuous at 0; w is bounded on B(0, 1); there exists µ > 0 such that for every x = (x1 , ..., xk ) ∈ E, ∥w(x)∥F ≤ µ∥x1 ∥E1 ∥x2 ∥E2 . . . ∥xk ∥Ek .
(7.100)
The proof of Theorem 7.18 is required as an exercise at the end of the section. Observe that (7.100) means that w is a bounded linear mapping in each of its variables, and implies, using (7.99), that ∥w(x)∥F ≤ µ∥x∥k .
∀x ∈ E :
It’s easy to check that the set of continuous k-linear mappings from E into F , denoted M(E; F ) = M(E1 × E2 × ... × Ek ; F ), is a normed space whenever is endowed with | · | : M(E; F ) → R, given by |w| = inf(Ow ), where Ow is the set of all the values µ > 0 for which (7.100) holds. It also holds the characterization |w| = sup ∥w(x)∥F . ∥x∥E ≤1
Remark 7.23 Whenever U = E1 = E2 = ... = Ek , we write M(U k ; F ) = M(U × .... × U ; F ). For the spaces of multilinear functionals we use the notation M(U k ) = M(U k ; R). Proposition 7.12 Let U be a normed space and F a Banach space. Then, for every k ∈ N, M(U k ; F ) is a Banach space. The proof of Proposition 7.12 is similar to that of Theorem 5.4. Let’s recall that two normed spaces V and W are (normed) isomorphic iff there is a bijective T ∈ L(V, W ) such that ∀u ∈ V :
∥T (u)∥W = ∥u∥V .
In this case we also say that T is an isometric isomorphism.
7.11. Taylor expansion. Second order conditions for extremum.
317
Theorem 7.19 (Isomorphisms for differentials) Let U and F be normed spaces. For all k ∈ N, there is an isometric isomorphism Φk : Lk (U, F ) −→ M(U k ; F ). Let’s get into the context of Theorem 7.19. Case k = 1. We take Φ1 as the identity operator. Case k = 2. Let’s make explicit the isomorphism Φ2 : L2 (U, F ) = L(U, L(U, F )) −→ M(U 2 ; F ) = M(U × U ; F ). For w ∈ L2 (U, F ), the bilinear mapping Φ2 [w] ∈ M(U × U ; F ) will be given by Φ2 [w](x1 , x2 ) = w(x1 ) x2 . (7.101) Case k = 3. Let’s make explicit the isomorphism Φ3 : L3 (U, F ) = L(U, L(U, L(U, F ))) −→ M(U 3 ; F ) = M(U ×U ×U ; F ). For w ∈ L3 (U, F ), the trilinear mapping Φ3 [w] ∈ M(U × U × U ; F ) will be given by Φ3 [w](x1 , x2 , x3 ) = (w(x1 ) x2 ) x3 . (7.102) Case of a general k. Following the idea described for the cases k = 2 and k = 3, it’s found an isomorphism Φk : Lk (U, F ) −→ M(U k ; F ). For w ∈ Lk (U, F ), the k-linear mapping Φk [w] ∈ M(U k ; F ) will be given by Φk [w](x1 , ..., xk ) = (w(x1 ) x2 ) ... xk . It holds ∀w ∈ Lk (U, F ) :
|Φk [w]|M(U k ;F ) ≤ ∥w∥Lk (U,F ) .
The inverse of Φk is denoted Ψk = Φ−1 k . The mappings Φk and Ψk are referred to as the standard isometric isomorphisms. Remark 7.24 (The group Sn ) Given S = {1, 2, ..., n}, a permutation of S is any bijective function σ : S → S. By Sn we denote the set of all the permutations of S. The symmetric group of n letters is (Sn , ◦), where ◦ denotes the composition (product). Let U and F be normed spaces. We say that g ∈ M(U k ; F ) is symmetric iff ∀σ ∈ Sk , ∀x1 , ..., xk ∈ U :
g(x1 , x2 , ..., xk ) = g(xσ(1) , xσ(2) , ..., xσ(k) ).
We write MS (U k ; F ) = {g ∈ M(U k ; F ) / g is symmetric}. It’s easy to prove that MS (U k ; F ) is closed in M(U k ; F ). As a consequence of Proposition 7.12, if F is Banach, then so is MS (U k ; F ).
318
Chapter 7. Calculus on normed spaces
7.11.3. Higher-order differentials and multilinear mappings Let E and F be normed spaces, O ⊆ E open, x0 ∈ O and f : O ⊆ E −→ F . Whenever they exist, we write with help of the standard isometric isomorphisms, f ′ (x0 ) = D1 f (x0 ) ∈ L(E, F ), f ′′ (x0 ) = Φ2 D2 f (x0 ) ∈ M(E 2 ; F ), f ′′′ (x0 ) = Φ3 D3 f (x0 ) ∈ M(E 3 ; F ). So, in general, for k ∈ N, f (k) (x0 ) = Φk Dk f (x0 ) ∈ M(E k ; F ). Since Φk is an isometric isomorphism we say that either f (k) (x0 ) or Dk f (x0 ) is the k-th differential of f at x0 . Theorem 7.20 (Symmetry of the k-th differential) Let k ∈ N. Assume that f is k-th differentiable at x0 ∈ O. Then f (k) (x0 ) ∈ MS (E k ; F ). A proof of this result can be found in [5, Th.4.5 - Cor. 4.3]. Remark 7.25 (Characterizatization of twice-differentiability) Let E and F be normed spaces, O ⊆ E open, x0 ∈ O and f : O ⊆ E −→ F . Then, f is twice differentiable iff there are ϕ1 ∈ L(E, F ) and ϕ2 ∈ MS (E 2 ; F ), such that ∀h ∈ E :
x0 + h ∈ O ⇒ f (x0 + h) = f (x0 ) + ϕ1 (h) + ϕ2 (h, h) + ∥h∥2 · ϵ(h),
where lim ϵ(h) = 0. In this case we can write the Taylor expansion of second h→0
order :
1 f (x0 + h) = f (x0 ) + f ′ (x0 )h + f ′′ (x0 ) (h, h) + o(∥h∥2 ). 2
Remark 7.26 (Computing the second differential) Keep in mind that in many cases the following formula can be used to compute the second differential f ′′ (a)h2 =
d2 f (a + λh)|λ=0 . dλ2
7.11.4. Taylor expansions Let E be a normed space and a vector y ∈ E. In what follows it will be useful the notation y k = (y, ..., y) ∈ E k . The following lemma states a generalization of the basic formula of Calculus: (tn )′ = n tn−1 .
7.11. Taylor expansion. Second order conditions for extremum.
319
Lemma 7.5 Let E and F be normed spaces, ϕ ∈ MS (E k ; F ) and Φ : E → F given by Φ(x) = ϕ(xk ). Then, Φ is differentiable and ∀x, h ∈ E :
Φ′ (x) h = k ϕ(xk−1 , h).
Proof. Let x, h ∈ E, generic. We have that Φ(x + h) = ϕ(x + h, x + h, ..., x + h) = Φ(x) + k ϕ(x
k−1
(k times)
, h) + g(h),
where g(h) =
k−1 X j=2
k ϕ(xk−j , hj ) + ϕ(hk ). j
Since ϕ ∈ MS (E k ; F ) it immediately follows that k ϕ(xk−1 , ·) ∈ L(E, F ).
(7.103)
In other hand, for h ̸= 0, k−1 |ϕ| X k ∥g(h)∥ ≤ ∥x∥k−j ∥h∥jE + ∥h∥kE E ∥h∥ ∥h∥ j=2 j k−1 X k , = |ϕ| ∥x∥k−j ∥h∥j−1 + ∥h∥k−1 E E E j j=2 ∥g(h)∥ = 0, i.e., g(h) = o(h), that, together with (7.103) ∥h∥ and the arbitrariness of x, h let us conclude. ■
which implies that lim
h→0
Theorem 7.21 (Taylor expansion of k-th order) Let E and F be normed spaces, O ⊆ E open, a ∈ O and k ∈ N. If f : O ⊆ E −→ F is (k − 1)differentiable and f (k) (a) exists, then there is δ > 0 such that f (a + y) = f (a) +
k X 1 (j) f (a)(y j ) + o(∥y∥k ), j! j=1
(7.104)
for all y ∈ B(0, δ),
Proof. We use mathematical inducttion. 1. For k = 1 the result holds immediately by the definition of differential. 2. Let’s assume that the statement is true for k − 1 (Inductive Hypothesis) and let’s prove that it also holds for k (Inductive Thesis).
320
Chapter 7. Calculus on normed spaces a) Let’s pick r > 0 such that a + B(0, r) = B(a, r) ⊆ O, and define ϕ : B(0, r) −→ F , by k X 1 (j) ϕ(y) = f (a + y) − f (a) − f (a)(y j ). j! j=1
(7.105)
Then, by applying Lemma 7.5 to (7.105), we get, for y ∈ B(0, r) and h ∈ E, ϕ′ (y)h = f ′ (a+y) h−f ′ (a) h−
k X j=2
1 f (j) (a)(y j−1 , h). (7.106) (j − 1)!
b) Now let’s use the Inductive Hypothesis with respect to the mapping f ′ : O −→ F . Then for y ∈ B(0, r): f ′ (a + y) = f ′ (a) +
k−1 X j=1
1 ′ (j) (f ) (a)(y j ) + o(∥y∥k−1 ), j!
which provides, for h ∈ E: f ′ (a + y)h = f ′ (a)h +
k−1 X j=1
1 (j+1) f (a)(y j , h) + o(∥y∥k−1 ) h. (7.107) j!
From (7.106) and (7.107) it follows that ϕ′ (y)h = o(∥y∥k−1 ) h, so that ϕ′ (y) = o(∥y∥k−1 ).
(7.108)
c) Now let’s prove (7.104), i.e., ϕ(y) = o(∥y∥k ), which is equivalent to ∀ϵ > 0, ∃δ > 0 :
y ∈ B(0, δ) \ {0} ⇒
∥ϕ(y)∥ < ϵ. ∥y∥k
(7.109)
Let ϵ > 0. By (7.108), there is δ ∈]0, r[ such that for y ∈ B(0, δ) \ {0}, |ϕ′ (y)|L(E,F ) < ϵ, ∥y∥k−1 i.e., |ϕ′ (y)|L(E,F ) < ϵ ∥y∥k−1 . Then, by the Mean Value Theorem, Theorem 7.7, we get ∥ϕ(y)∥F = ∥ϕ(y) − ϕ(0)∥F ≤ sup |ϕ′ (x)|L(E,F ) · ∥y − 0∥E < ϵ∥y∥kE . x∈[0,y]
Since ϵ was chosen arbitrarily, we have proved (7.109). ■ Remark 7.27 In the context of Theorem 7.21, if F = R and [a, a + y] ⊆ O, then there is θ ∈]0, 1[ such that f (a + y) = f (a) +
k X 1 (j) 1 f (a)(y j ) + f (k+1) (a + θy)(y k+1 ). (7.110) j! (k + 1)! j=1
7.11. Taylor expansion. Second order conditions for extremum.
321
7.11.5. Second-order conditions for extrema Let E be a normed space. To a bilinear functional B ∈ M(E 2 ) we can always associate a quadratic functional Q : E → R given by Q(x) = B(x, x). We say that the quadratic functional Q is nonnegative iff ∀x ∈ E :
Q(x) ≥ 0;
positive definite iff ∀x ∈ E \ {0} :
Q(x) > 0;
strongly positive or coercive iff Q(x) ≥ α ∥x∥2 .
∃α > 0, ∀x ∈ E :
Remark 7.28 Keep in mind that the second differential of a function at a given point is a quadratic functional defined by a symmetric bilinear operator. Proposition 7.13 (Second-order necessary condition for minimum) Let E be a normed space, O ⊆ E open and f : O ⊆ E −→ R. Assume that f has a second differential at a ∈ O, a point of local minimum. Then f ′′ (a) is nonnegative, i.e., ∀h ∈ E : f ′′ (a)h2 ≥ 0. (7.111) Proof. For h = 0 the result immediately holds. So let’s assume that h ∈ E \ {0}. Since f (a) is a local minimum, there is β > 0 such that ∀t ∈] − β, β[:
f (a) ≤ f (a + th).
(7.112)
As f is twice-differentiable at a, there exists δ > 0 such that ∀y ∈ B(0, δ) :
1 f (a + y) = f (a) + f ′ (a)y + f ′′ (a) y 2 + o(∥y∥2 ) 2 1 ′′ 2 (7.113) = f (a) + f (a) y + o(∥y∥2 ), 2
where we have applied Corollary 7.3. Now we choose t ∈] − β, β[\{0} such that y = th ∈ B(0, δ). Then, by (7.112) and (7.113), we have that 0 ≤ f (a + th) − f (a) =
1 ′′ f (a)(th, th) + o(t2 ∥h∥2 ), 2
whence,
2 · o(t2 ∥h∥2 ) t2 which, by letting t −→ 0, gives 0 ≤ f ′′ (a)h2 . Since h was chosen arbitrarily, we have proved (7.111). ■ 0 ≤ f ′′ (a)(h, h) +
Remark 7.29 In the context of Proposition 7.13, if a is a local maximum, then (7.111) becomes ∀h ∈ E : f ′′ (a)h2 ≤ 0.
322
Chapter 7. Calculus on normed spaces
Proposition 7.14 (A second-order sufficient condition for extremum) Let E be a normed space, O ⊆ E open and f : O ⊆ E → R. Assume that 1. f is twice differentiable and that a ∈ O; 2. a is a critical point of f ; 3. there is r > 0 such that B(a, r) ⊆ O and ∀x ∈ B(a, r), ∀h ∈ E :
f ′′ (x) h2 ≥ 0.
(7.114)
Then, a is a point of local minimum of f . Remark 7.30 Point (7.114) says that f ′′ (x) is non-negative in a ball around x. Proof. We have to prove that there exists β > 0 such that ∀y ∈ B(a, β) :
f (y) ≥ f (a),
i.e., ∀h ∈ B(0, β) :
f (a + h) ≥ f (a).
(7.115)
Let’s choose β = r. Let h ∈ B(0, r), generic. Since a is a critical point of f , by using the Taylor expansion of second order around a in the form (7.110), and (7.114), we get 1 f (a + h) = f (a) + f ′ (a)h + f ′′ (a + θ h) h2 2 1 ′′ = f (a) + f (a + θ h) h2 ≥ f (a), 2 where θ ∈]0, 1[. Since h was chosen arbitrarily, we have proved (7.115).
■
Remark 7.31 In the context of Proposition 7.14, if condition (7.114) is replaced by ∀x ∈ B(a, r), ∀h ∈ E : f ′′ (x) h2 ≤ 0, then the critical point a becomes a point of local maximum.
Proposition 7.15 (Second-order sufficient condition for extremum) Let E be a normed space, O ⊆ E open and f : O ⊆ E −→ R. Assume that 1. f has a second differential at a ∈ O; 2. a is a critical point of f ; 3. f ′′ (a) is strongly positive, i.e. ∃α > 0, ∀h ∈ E :
f ′′ (a) h2 ≥ α∥h∥2 .
Then, a is a point of local strict minimum of f .
(7.116)
7.11. Taylor expansion. Second order conditions for extremum.
323
Proof. Since a is a critical point of f , by the Taylor expansion, (7.104), and (7.116), we have f (a + h) − f (a) =
hα i 1 ′′ f (a) h2 + ϵ(h) ∥h∥2 ≥ + ϵ(h) · ∥h∥2 , 2 2
where ϵ(h) −→ 0, as h −→ 0. Then we choose δ > 0 such that for h ∈ B(0, δ), α |ϵ(h)| < . Then for h ∈ B(0, δ) \ {0}, 4 f (a + h) − f (a) >
α ∥h∥2 > 0, 4
which means that a is a point of local strict minimum.
■
7.11.6. Examples Example 7.14 (Looking for extremums) In this example we shall consider a C 2 -functional to which we can apply several results developed in the course. It even allows an explicit computation of its critical points by using the method of power series for ordinary differential equations. Let’s consider the functional J : C1 ([a, b]) −→ R, given by Z J(u) =
b
2 t u (t) + u′2 (t) dt,
a
where a ≥ 0. We shall a) b) c) d)
prove that J is twice differentiable; write the corresponding Taylor expansion; find the critical points of J; analize if J has local or global extremum.
Proof. 1. Let u, h ∈ C1 ([a, b]). We have that J(u + h) − J(u) = Z b = 2t u(t) h(t) + 2u′ (t) h′ (t) + th2 (t) + h′2 (t) dt + 0.
(7.117)
a
2. The linear-in-h part of (7.117) is associated with the linear functional Φ1 : C1 ([a, b]) −→ R, given by Z Φ1 (w) =
b
[2t u(t) w(t) + 2u′ (t) w′ (t)] dt.
a
Let’s prove that Φ1 ∈ C1 ([a, b])′ , i.e., ∃c > 0, ∀w ∈ C1 ([a, b]) :
|Φ1 (w)| ≤ c∥w∥C 1 .
(7.118)
324
Chapter 7. Calculus on normed spaces Let’s take c > 2∥u∥C 1 (d + 1)(b − a), where d = max{|a|, |b|}. Let w ∈ C1 ([a, b]). Then we have that Z Z b b [2|t| |u(t)| |w(t)| + 2|u′ (t)| |w′ (t)|] dt ... ≤ |Φ1 (w)| = a a Z b ≤ [2d∥u∥∞ ∥w∥∞ + 2∥u′ ∥∞ ∥w′ ∥∞ ] · dt a
≤ 2∥u∥C 1 ∥w∥C 1 (d + 1)(b − a) ≤ c∥w∥C 1 . Since w was chosen arbitrarily, we have proved (7.118). 3. The quadratic-in-h part of (7.117) is associated with the bilinear functional Φ2 : C1 ([a, b]) × C1 ([a, b]) → R, given by b
Z
[t x(t)y(t) + x′ (t)y ′ (t)] dt.
Φ2 (x, y) = a
Φ2 is clearly symmetric: ∀x, y ∈ C1 ([a, b]) :
Φ2 (x, y) = Φ2 (y, x),
so to prove that Φ2 ∈ MS ((C1 ([a, b]))2 ) we have to show that ∃c > 0, ∀x, y ∈ C1 ([a, b]) :
|Φ2 (x, y)| ≤ ∥x∥C 1 ∥y∥C 1 .
(7.119)
Let’s take c = (d + 1)(b − a) > 0. Let x, y ∈ C1 ([a, b]). Then Z b |Φ2 (x, y)| = ... a Z b ≤ [|t| |x(t)| |y(t)| + |x′ (t)| |y ′ (t)|] dt a
≤ (d∥x∥∞ ∥y∥∞ + ∥x′ ∥∞ ∥y ′ ∥∞ ) · (b − a) ≤ (d + 1)(b − a) ∥x∥C 1 ∥y∥C 1 = c ∥x∥C 1 ∥y∥C 1 . Since x and y were chosen arbitrarily, we have proved (7.119). 4. By (7.117), ii) and iii) it follows that J is twice differentiable, because u and h were chosen arbitrarily. Then we have the Taylor expansion ∀u, h ∈ C1 ([a, b]) :
1 J(u + h) = J(u) + J ′ (u)h + J ′′ (u)h2 , 2
where ′
Z
J (u)h = Φ1 (h) =
b
[2t u(t) h(t) + 2u′ (t) h′ (t)] dt,
a
J ′′ (u)h2 = 2Φ2 (h, h) = 2
Z a
b
2 t h (t) + h′2 (t) dt.
(7.120)
7.11. Taylor expansion. Second order conditions for extremum.
325
5. Let’s assume that y ∈ C1 ([a, b]) is a critical point of J, i.e., J ′ (y)h = 0, for every h ∈ C1 ([a, b]), which means Z
1
∀h ∈ C ([a, b]) :
b
[t y(t) h(t) + y ′ (t) h′ (t)] dt = 0,
a
and implies b
Z
∀h ∈ I 1 ([a, b]) :
[t y(t) h(t) + y ′ (t) h′ (t)] dt = 0
(7.121)
a
By (7.121) and Lemma 7.4, it follows that y ∈ C2 ([a, b]) and ∀t ∈ [a, b] :
ty(t) = y ′′ (t).
Then the critical points of J are solutions of the linear equation y ′′ (t) − t y(t) = 0,
t ∈ [a, b].
(7.122)
6. We conjecture that a solution of (7.122) has the form y(t) =
∞ X
ck t k ,
t ∈ [a, b],
(7.123)
k=0
where ck ∈ R, k = 1, 2, ... As it’s well known, the idea is to find the coefficients ck . From (7.123) we get, for t ∈ [a, b], y ′ (t) =
∞ X
k ck tk−1 ,
∞ X
and y ′′ (t) =
k=1
k(k − 1) ck tk−2 .
(7.124)
k=2
Now we replace (7.123) and (7.124) in (7.122). So, ∞ X
k(k − 1) ck tk−2 −
k=2
2c2 +
∞ X
∞ X
ck tk+1 = 0,
k=0
[(k + 2)(k + 1)ck+2 − ck−1 ] tk = 0.
(7.125)
k=1
Since (7.125) is valid for every t ∈ [a, b], it follows that ( c2 = 0, (k + 2)(k + 1)ck+2 − ck−1 = 0, k ∈ N. From (7.126) we get 1 , 3·2 4·1 c6 = c0 , 6! 7·4·1 c9 = c0 , 9! c3 = c0
1 , c5 = 0, 4·3 5·2 c7 = c1 , c8 = 0, 7! 8·5·2 c10 = c1 , c11 = 0. 10! c4 = c1
(7.126)
326
Chapter 7. Calculus on normed spaces Then, the general solution of (7.122) is t ∈ [a, b],
y(t) = c0 y1 (t) + c1 y2 (t), where n−1 Y
y1 (t) =
∞ X n=0
n−1 Y
(3j + 1)
j=0
3n
(3n)!
t
and y2 (t) =
∞ X n=0
(3j + 2)
j=0
(3n + 1)!
t3n+1 .
7. Let’s assume that a ≥ 0, y ∈ C1 ([a, b]) is a critical point of J and that R > 0. By the formula of the second differential we have that Z b 2 ∀z ∈ B(y, R), ∀h ∈ C1 ([a, b]) : J ′′ (z)h2 = t h (t) + h′2 (t) dt ≥ 0. a
Then, by Proposition 7.14, it follows that y is a point of local minimum of J on B(y, R). Since R was arbitrary, y is actually a point of global minimum of J. ■ Example 7.15 (Taylor expansion and points of minimum) In this example we prove that a functional is three times differentiable and find the corresponding Taylor expansion. An explicit integral formula is achieved for the critical points and it’s determined that they are points of minimum. Let V ∈ C∞ ([0, 1]) be strictly positive. Consider the functional J : C1 ([0, 1]) −→ R, given by Z 1 1 2 1 ′4 J(x) = x (0) + t x(t) + V (t) x (t) dt. 2 4 0 We shall a) prove that J is three times differentiable; b) find the formulas of J ′ (x)h, J ′′ (x)h2 and J ′′′ (x)h3 , for x, h ∈ C1 ([0, 1]), and write the corresponding Taylor expansion; c) find the critical points of J; d) determine if J has extremum. Proof. 1. Let x, h ∈ C1 ([0, 1]). We have that Z 1 1 2 x (0) + h2 (0) + 2x(0)h(0) + t [x(t) + h(t)] dt 2 0 Z 1 1 + V (t) x′4 (t) + 4x′3 (t)h′ (t) + 6x′2 (t)h′2 (t) + 4x′ (t)h′3 (t) + h′4 (t) dt, 4 0
J(u + h) =
so that J(u + h) − J(u) = +
1 4
Z 0
1 2 h (0) + x(0)h(0) + 2
Z
1
t h(t) dt
(7.127)
0
1
V (t) 4x′3 (t)h′ (t) + 6x′2 (t)h′2 (t) + 4x′ (t)h′3 (t) + h′4 (t) dt.
7.11. Taylor expansion. Second order conditions for extremum.
327
2. The linear-in-h part of (7.127) is associated with the linear functional Φ1 : C1 ([0, 1]) −→ R, given by Z 1 Z 1 V (t) x′3 (t)w′ (t) dt. t w(t) dt + Φ1 (w) = x(0)w(0) + 0
0
Let’s prove that Φ1 is continuous, i.e., ∃c > 0, ∀w ∈ C1 ([0, 1]) :
|Φ1 (w)| ≤ c∥w∥C 1 .
(7.128)
Let’s take c > |x(0)| + 1 + ∥V ∥∞ ∥x′ ∥3∞ . Let w ∈ C1 ([0, 1]). Then, we have |Φ1 (w)| ≤ |x(0)| ∥w∥∞ + ∥w∥∞ + ∥V ∥∞ ∥x′ ∥3∞ ∥w′ ∥∞ ≤ |x(0)| + 1 + ∥V ∥∞ ∥x′ ∥3∞ ∥w∥C 1 ≤ c∥w∥C 1 . Since w was chosen arbitrarily, we have proved (7.128). 3. The quadratic-in-h part of (7.127) is associated with the bilinear functional Φ2 : C1 ([0, 1]) × C1 ([0, 1]) −→ R, given by Z 1 3 1 Φ2 (u, w) = u(0)w(0) + V (t)x′2 (t)u′ (t)w′ (t) dt. 2 2 0 It’s clear that ∀u, w ∈ C1 ([0, 1]) :
Φ2 (u, w) = Φ2 (w, u).
So, in order to prove that Φ2 ∈ MS (C1 ([0, 1])2 ), we have to show that ∃c > 0, ∀u, w ∈ C1 ([0, 1]) :
|Φ2 (u, w)| ≤ c∥u∥C 1 ∥w∥C 1 .
(7.129)
Let’s take c > (1 + 3∥V ∥∞ ∥x′ ∥∞ )/2. Let u, w ∈ C1 ([0, 1]). Then Z 3 1 1 |Φ2 (u, w)| ≤ |u(0)| |w(0)| + V (t)x′2 (t) |u′ (t)| |w′ (t)| dt 2 2 0 1 3 ≤ ∥u∥∞ ∥w∥∞ + ∥V ∥∞ ∥x′ ∥∞ ∥u′ ∥∞ ∥w′ ∥∞ 2 2 1 3 + ∥V ∥∞ ∥x′ ∥2∞ ∥u∥C 1 ∥w∥C 1 ≤ 2 2 ≤ c ∥u∥C 1 ∥w∥C 1 . Since u and w were chosen arbitrarily, we have proved (7.129). 4. The cubic-in-h part of (7.127) is associated with the multilinear functional Φ3 : C1 ([0, 1]) × C1 ([0, 1]) × C1 ([0, 1]) −→ R, given by Z 1 Φ3 (u, w, y) = V (t)x′ (t)u′ (t)w′ (t)y ′ (t) dt. 0
It’s clear that Φ3 (u, w, y) doesn’t change by permutations of its arguments. Then, to prove that Φ3 ∈ MS (C1 ([0, 1])3 ), we have to show that ∃c > 0, ∀u, w, y ∈ C1 ([0, 1]) : |Φ3 (u, w, y)| ≤ c∥u∥C 1 ∥w∥C 1 ∥y∥C 1 .
(7.130)
328
Chapter 7. Calculus on normed spaces Let’s take c > ∥V ∥∞ ∥x′ ∥∞ . Let u, w, y ∈ C1 ([0, 1]). Then |Φ3 (u, w, y)| ≤ ∥V ∥∞ ∥x′ ∥∞ ∥u′ ∥∞ ∥w′ ∥∞ ∥y ′ ∥∞ ≤ c∥u∥C 1 ∥w∥C 1 ∥y∥C 1 . Since u, w and y were chosen arbitrarily, we have proved (7.130).
5. The remaining part of (7.127) can be associated to a mapping with formula Z 1 1 V (t)h′4 (t) dt. g(h) = 4 0 Let’s prove that g(h) = o(∥h∥3 ).
(7.131)
We have, for h ̸= 0, that |g(h)| ∥V ∥∞ ∥h′ ∥4∞ ∥V ∥∞ ∥h∥C 1 ≤ ≤ , 3 3 ∥h∥C 1 4∥h∥C 1 4 so that (7.131) holds. 6. From (7.127), ii), iii), iv) and v) it follows that J is three times differentiable because x and h were arbitrary. Then the corresponding Taylor expansion is 1 1 J(x + h) = J(x) + J ′ (x)h + J ′′ (x)h2 + J ′′′ (x)h3 g(h), 2 6 where, for x, h ∈ C1 ([0, 1]),
J ′′ (x)h2 = h2 (0) + 3
1
Z
J ′ (x)h = x(0)h(0) +
Z t h(t) dt +
Z
0 1
1
V (t) x′3 (t)h′ (t) dt,
0
V (t) x′2 (t)h′2 (t) dt,
0
J ′′′ (x)h3 = 6
Z
1
V (t) x′ (t)h′3 (t) dt.
0
7. Let’s assume that x ∈ C1 ([0, 1]) is a critical point of J, i.e., J ′ (x)h = 0, for every h ∈ C1 ([0, 1]), so that, in particular, ∀h ∈ I 1 ([0, 1]) :
J ′ (x)h = 0,
i.e., ∀h ∈ I 1 ([0, 1]) :
Z
1
t h(t) + V (t)x′3 (t)h′ (t) dt = 0.
(7.132)
0
By (7.132) and Lemma 7.4, it follows that V x′ ∈ C1 ([0, 1]),
(7.133)
and ∀t ∈ [0, 1] :
[V x′3 ]′ (t) = t.
(7.134)
7.11. Taylor expansion. Second order conditions for extremum.
329
Let’s observe that (7.133) implies that x ∈ C2 ([0, 1]). From (7.134) it follows that 1 V (t) x′3 (t) = t + k1 , t ∈ [0, 1], (7.135) 2 for some constant k1 ∈ R. From (7.135) we get the formula for the critical points of J: Z t x(t) = 0
τ 2 + 2k1 2V (τ )
1/3 dτ + k2 ,
t ∈ [0, 1],
for some k1 , k2 ∈ R. 8. Let’s assume that x ∈ C1 ([0, 1]) is a critical point of J and that R > 0. By the formula of the second differential we have that ∀y ∈ B(x, R), ∀h ∈ C1 ([0, 1]) : Z 1 J ′′ (y)h2 = h2 (0) + 3 V (t)y ′2 (t)h′2 (t)dt ≥ 0. 0
Then, by Proposition 7.14, x is a point of local minimum of J. ■ Example 7.16 (Working on product spaces) In this example, it’s considered a functional defined on a product space. It’s proved that it’s Fr´echet differentiable but not twice differentiable. It’s also found an explicit formula for the critical points. Let’s denote X = C1 ([0, 1]) × C1 ([0, 1]) and consider the functional J : X −→ R, given by Z 1 1 J(u, v) = u′ (t)v ′ (t) + v 2 (t) − αu(t) dt. 2 0 We shall a) prove that J is differentiable but not twice differentiable; b) find the critical points of J. Proof. First, let’s recall that for u, v ∈ C1 ([0, 1]), ∥u∥C 1 = ∥u∥∞ + ∥u′ ∥∞
and ∥(u, v)∥X = ∥u∥C 1 + ∥v∥C 1 .
1. Let u, v, h1 , h2 ∈ C1 ([0, 1]). We have that J(u + h1 , v + h2 ) − J(u, v) = (7.136) Z 1 1 = u′ (t)h′2 (t) + v ′ (t)h′1 (t) + h′1 (t)h′2 (t) + v(t)h2 (t) + h22 (t) − αh1 (t) dt. 2 0 2. The linear-in-(h1 , h2 ) part of (7.136) is associated with the linear functional Φ1 : X → R, given by Z 1 Φ1 (w1 , w2 ) = [u′ (t)w2′ (t) + v ′ (t)w1′ (t) + v(t)w2 (t) − αw1 (t)] dt. 0
330
Chapter 7. Calculus on normed spaces Let’s prove that Φ1 is continuous, i.e., ∃c > 0, ∀(w1 , w2 ) ∈ X :
|Φ1 (w1 , w2 )| ≤ c∥(w1 , w2 )∥X .
(7.137)
Let’s take c > ∥u′ ∥∞ + ∥v∥C 1 + |α|. Let w1 , w2 ∈ C1 ([0, 1]). Then we have that Z 1 |Φ1 (w1 , w2 )| ≤ [|u′ (t)w2′ (t)| + |v ′ (t)w1′ (t)| + |v(t)w2 (t)| + |αw1 (t)|] dt 0
≤ ∥u′ ∥∞ ∥w2′ ∥∞ + ∥v ′ ∥∞ ∥w1′ ∥∞ + ∥v∥∞ ∥w2 ∥∞ + |α| ∥w1 ∥∞ ≤ (∥u′ ∥∞ + ∥v ′ ∥∞ + ∥v∥∞ + |α|) · (∥w1 ∥C 1 + ∥w1 ∥C 2 ) ≤ c∥(w1 , w2 )∥X . Since w1 and w2 were chosen arbitrarily, we have proved (7.137). 3. The nonlinear-in-(h1 , h2 ) part of (7.136)is associated to a mapping with formula Z 1 1 2 ′ ′ g(h1 , h2 ) = h1 (t)h2 (t) + h2 (t) dt. 2 0 We have that 1
1 2 ′ ′ |g(h1 , h2 )| ≤ |h1 (t)| |h2 (t)| + h2 (t) dt 2 0 1 ≤ ∥h′1 ∥∞ ∥h′2 ∥∞ + ∥h2 ∥2∞ 2 3 ≤ ∥(h1 , h2 )∥2X , 2 Z
whence
(7.138)
|g(h1 , h2 )| = 0, ∥(h1 ,h2 )∥X →0 ∥(h1 , h2 )∥X lim
i.e., g(h1 , h2 ) = o(∥(h1 , h2 )∥X ). Therefore, by (7.136) and iii), it follows that J is differentiable at (u, v) and Z 1 ′ J (u, v)(h1 , h2 ) = [u′ (t)h′2 (t) + v ′ (t)h′1 (t) + v(t)h2 (t) − αh1 (t)] dt. 0
4. The quadratic-in-(h1 , h2 ) part of (7.136) is associated with the bilinear functional Φ2 : X × X → R, given by Z 1 1 Φ2 ((w1 , w2 ), (w3 , w4 )) = w2 (t)w4 (t) dt. 2 0 It’s clear that for all (w1 , w2 ), (w3 , w4 ) ∈ X, Φ2 ((w1 , w2 ), (w3 , w4 )) = Φ2 ((w3 , w4 ), (w1 , w2 )) . So, to prove that Φ2 ∈ MS (X 2 ), we have to show that ∃c > 0,∀(w1 , w2 ), (w3 , w4 ) ∈ X :
(7.139)
|Φ2 ((w1 , w2 ), (w3 , w4 )) | ≤ c∥(w1 , w2 )∥X ∥(w3 , w4 )∥X .
7.11. Taylor expansion. Second order conditions for extremum.
331
Take c = 1/2. Let w1 , w2 , w3 , w4 ∈ C1 ([0, 1]). Then Z 1 1 |Φ2 ((w1 , w2 ), (w3 , w4 )) | ≤ |w2 (t)| |w4 (t)| dt 2 0 1 ≤ ∥w2 ∥∞ ∥w4 ∥∞ ≤ c∥(w1 , w2 )∥X ∥(w3 , w4 )∥X . 2 Since w1 , w2 , w3 and w4 were chosen arbitrarily, we have proved (7.139). 5. Let’s observe that (7.136) can be rewritten as J(u+h1 , v+h2 )−J(u, v) = J ′ (u, v)(h1 , h2 )+Φ2 ((h1 , h2 ), (h1 , h2 ))+G(h1 , h2 ), where Z
1
h′1 (t)h′2 (t) dt.
G(h1 , h2 ) = 0
Proving that J is twice differentiable is equivalent to showing that G(h1 , h2 ) = o ∥(h1 , h2 )∥2X , i.e., lim
∥(h1 ,h2 )∥X →0
|G(h1 , h2 )| = 0. ∥(h1 , h2 )∥2X
For each n ∈ N, let Hn : [0, 1] −→ R be given by Hn (t) = lim ∥Hn ∥C 1 ≤ lim
n→+∞
n→+∞
t , so that n
2 = 0. n
Now we choose h1 = h2 = Hn , so that 1 2 G(Hn , Hn ) 1 |G(h1 , h2 )| n = lim = lim ̸= 0. lim = 2 2 16 n→+∞ n→+∞ ∥(Hn , Hn )∥X 16 ∥(h1 ,h2 )∥X →0 ∥(h1 , h2 )∥X n2 Therefore we conclude that J is not twice differentiable. 6. Let’s assume that (u, v) ∈ X is a critical point of J, i.e., J ′ (u, v)(h1 , h2 ) = 0, for every (h1 , h2 ) ∈ X, and, consequently, ∀h1 ∈ C1 ([0, 1]) :
J ′ (u, v)(h1 , 0) = 0,
∀h2 ∈ C1 ([0, 1]) :
J ′ (u, v)(0, h2 ) = 0,
so that, in particular, ∀h1 ∈ I 1 ([0, 1]) :
J ′ (u, v)(h1 , 0) = 0,
∀h2 ∈ I 1 ([0, 1]) :
J ′ (u, v)(0, h2 ) = 0,
i.e., ∀h ∈ I ([0, 1]) : 1
Z
1
[v ′ (t)h′ (t) − αh(t)] dt = 0,
(7.140)
[u′ (t)h′ (t) + v(t)h(t)] dt = 0.
(7.141)
0
∀h ∈ I 1 ([0, 1]) :
Z 0
1
332
Chapter 7. Calculus on normed spaces By Lemma 7.4, (7.140) and (7.141), it follows that u, v ∈ C2 ([0, 1]), and ( u′′ (t) = v(t), t ∈ [0, 1]. v ′′ (t) = −α, Then, a critical point (u, v) ∈ X of J is given by α v(t) = − t2 + k1 t + k2 , 2 u(t) = − α t4 + 1 k1 t3 + 1 t2 + k3 t + k4 , 24 6 2
t ∈ [0, 1],
where k1 , k2 , k3 , k4 ∈ R. ■
7.11.7. Problems Problem 7.54 Consider the normed product space E = E1 × E2 × ... × Ek , where ∥x∥E = ∥(x1 , ..., xk )∥E = max{∥x1 ∥E1 , ..., ∥xk ∥Ek }. Prove that the norms defined by the following formulas are equivalent to the norm ∥ · ∥E :
∥x∥1 =
k X
∥xj ∥Ej ,
1/2 k X ∥x∥2 = ∥xj ∥2Ej ,
j=1
j=1
1/p k X ∥x∥p = ∥xj ∥pEj ,
p > 1.
j=1
Problem 7.55 Prove Theorem 7.18. Consider first the case k = 2. Problem 7.56 Adapt the proof of Theorem 5.4 to prove Proposition 7.12. Problem 7.57 Let U be a normed space and F a Banach space. Prove that for every k ∈ N, M(U k ; F ) is a Banach space. Problem 7.58 Consider the normed product spaces E = E1 × .... × Ek and F = F1 × ... × Fp . Prove that the following mapping is an isometric isomorphism Φ : M(E; F ) −→ M(E; F1 ) × ... × M(E; Fp ) f 7−→ Φ(f ) = (f1 , ..., fp ), where f1 , ..., fp are the coordinate mappings of f . Problem 7.59 Let U and F be normed spaces. Let’s define a mapping Ψ2 : M(U 2 ; F ) = M(U × U ; F ) −→ L2 (U, F ) = L(U, L(U, F )), by Ψ2 [g](x1 )x2 = g(x1 , x2 ).
7.11. Taylor expansion. Second order conditions for extremum.
333
1. Show that Ψ2 is the inverse of Φ2 , given in (7.101). Hint. Observe that g ∈ M(U 2 ; F ), Ψ2 [g] ∈ L(U, L(U, F )) and Ψ2 [g](x1 ) ∈ L(U, F ). 2. Prove that ∀g ∈ M(U 2 ; F ) :
∥Ψ2 [g]∥L2 (U,F ) ≤ |g|M(U 2 ;F ) .
3. Find explicitly Ψ3 , the inverse of Φ3 , defined in (7.102) and prove that ∀g ∈ M(U 3 ; F ) :
∥Ψ3 [g]∥L3 (U,F ) ≤ |g|M(U 3 ;F ) .
Problem 7.60 Let U and F be normed spaces. Prove that MS (U k ; F ) is closed in M(U k ; F ). Problem 7.61 Redo with all the details the proof of Theorem 7.20 as it’s given in [5, Th.4.5 - Cor. 4.3]. Problem 7.62 Prove that the functional J : C1 ([0, 1]) −→ R, given by Z 1 J(y) = ty 2 (t) + y ′2 (t) dt 0
is twice differentiable and write the corresponding Taylor expansion. Problem 7.63 Let f ∈ C([a, b]) and K ∈ C([a, b] × [a, b]) such that ∀s, t ∈ [a, b] :
K(s, t) = K(t, s).
Consider the functional J : C([a, b]) −→ R, given by Z bZ b Z b Z J(φ) = K(s, t)φ(s)φ(t) ds dt + φ2 (s) ds − 2 a
a
a
b
φ(s)f (s) ds.
a
1. Find conditions for u ∈ C([a, b]) to be a critical point of J. 2. Try to find (first and second-order) necessary and sufficient conditions for u to be a point of local minimum of J. Problem 7.64 Consider the functional J : X −→ R. 1. Is J twice differentiable? In that case write the corresponding Taylor expansion. 2. Try to find all the critical points of J. 3. Try to find (first and second-order) necessary and sufficient conditions for a local extremum. b
Z X = C([a, b]),
J(u) =
[t + u(t)] dt; a
X = C1 ([a, b]),
b
Z
J(y) =
y 2 (t) + y ′2 (t) dt;
a
Z 1 J(x) = x2 (0) + t x(t) + x′2 (t) dt; Z π 0 1 X = C ([0, π]), J(u) = u′ (t) sin(u(t)) dt. X = C1 ([0, 1]),
0
334
Chapter 7. Calculus on normed spaces
Problem 7.65 Consider the functional J : X −→ R. 1. Is J twice differentiable? In that case write the corresponding Taylor expansion. 2. Try to find all the critical points of J. 3. Try to find (first and second-order) necessary and sufficient conditions for a local extremum. X = C1 ([0, 1]),
1
Z
p
J(u) =
1 + u′2 (t) dt;
0
Z
1
X = C ([a, b]),
b
J(y) =
p y(t) 1 + y ′2 (t) dt;
a 1
X = C ([0, 1]), X ⊆ C1 ([0, 1]),
Z
1
t x(t) + x′2 (t) dt; 0 Z 1s 1 + y ′2 (t) dt. J(u) = y(t) 0 2
J(x) = x (0) +
Problem 7.66 Consider the functional J : C1 ([0, 1]) × C1 ([0, 1]) −→ R, given by Z J(u, v) = 0
1
1 u′ (t)v ′ (t) + v 2 (t) − αu(t) dt, 2
α ∈ R.
1. Is J twice differentiable? In that case write the corresponding Taylor expansion. 2. Try to find all the critical points of J. 3. Try to find (first and second-order) necessary and sufficient conditions for a local extremum.
8. The Euler-Lagrange equation and other classical topics In this chapter we concentrate on the so-called classical Calculus of Variations, taking full advantage of the formalism built in the previous chapter. For the topics we deal with in this chapter, the works [6], [9], [16], [22]and [21] important complementary sources.
8.1. The elementary problem of the Calculus of Variations 8.1.1. The functional of the elementary problem. Admissible elements. Let F ∈ C2 ([a, b] × R × R). A generic element of the domain of F shall be denoted by (x, u, ξ) ∈ [a, b] × R × R. The partial derivatives of F will be written Fx =
∂F , ∂x
Fu =
∂F , ∂u
Fξ =
∂F , ∂ξ
Fxu =
∂2F , ∂u∂x
Fξu =
∂2F , ∂u∂ξ
etc.
Many mathematical models use a functional J : M ⊆ C1 ([a, b]) −→ R, of the form Z b J(y) = F (x, y(x), y ′ (x)) dx, a
where the set of admissible functions is M = u ∈ C1 ([a, b]) / u(a) = A ∧ u(b) = B , so that M is the affine space that contains the functions of class C1 with fixed end points (a, A) and (b, B). We say that h ∈ C1 ([a, b]) is an admissible increment of y ∈ M iff y + h ∈ M , which is equivalent to require h to belong to the set of admissible increments, I 1 ([a, b]) = {u ∈ C1 ([a, b]) / u(a) = u(b) = 0}. Remark 8.1 In general, the set of admissible functions, M , is not a linear space. The set I 1 ([a, b]) is a linear subspace of C1 ([a, b]) which does not depend on y. We recommend you to review the lemmas presented in Section 7.7 because they are useful results that involve elements of I 1 ([a, b]). Remark 8.2 Observe that M + I 1 ([a, b]) ⊆ M , so that if we add an admissible increment, h, to an admissible function, y, we obtain another admissible function, y + h. 335
336
Chapter 8. The Euler-Lagrange equation and other classical topics
8.1.2. Weak and strong extremum Let E be a linear space, ∥ · ∥p and ∥ · ∥q two norms on E, and O ⊆ E. Let’s assume that ∥ · ∥q dominates ∥ · ∥p , i.e., ∃c > 0, ∀u ∈ E :
∥u∥p ≤ c∥u∥q .
Then, Bq (0, 1) ⊆ Bp (0, c),
(8.1)
where Bq (x, r) = {u ∈ E : ∥x − u∥q ≤ r},
Bp (x, r) = {u ∈ E : ∥x − u∥p ≤ r}.
Now let’s consider a mapping G : O ⊆ E −→ R. Relation (8.1) implies that a point of local extremum of G : (O, ∥ · ∥p ) → R, referred to as point of strong extremum of G, is also a point of local extremum of G : (O, ∥ · ∥q ) → R, referred to as point of weak extremum of G. Let’s recall that ∀u ∈ C1 ([a, b]) :
∥u∥∞ ≤ ∥u∥C1 ,
where, as we well know, and ∥u∥C1 = ∥u∥∞ + ∥u′ ∥∞ .
∥u∥∞ = sup |u(x)|, x∈[a,b]
Therefore, a point of local extremum of J : (M , ∥ · ∥∞ ) → R, referred to as point of strong extremum of J, is also a point of local extremum of J : (M , ∥·∥C1 ) → R, referred to as point of weak extremum of J. Example 8.1 Let M = {u ∈ C1 ([0, π]) / u(0) = u(π) = 0} and J : M −→ R given by Z π J(u) = y 2 (x) (1 − y ′2 (x)) dx, 0
so that here F (x, u, ξ) = u2 (1 − ξ)2 . 1. Let’s show that y = 0 is a point of weak minimum of J, i.e., that in the norm ∥ · ∥C1 , y = 0 is a point of local minimum: ∃α > 0, ∀w ∈ BC1 (0, α) :
J(w) ≥ J(0) = 0.
(8.2)
Let’s pick 0 < α < 1. Let w ∈ BC1 (0, α), generic. Then ∥w∥C1 = ∥w∥∞ + ∥w′ ∥∞ < α < 1, whence, ∀x ∈ [0, π] : ∀x ∈ [0, π] :
|w(x)| ≤ ∥w∥∞ < α < 1, |w′ (x)| ≤ ∥w′ ∥∞ < α < 1.
Therefore we have, for any x ∈ [0, π], that w2 (x) ≥ 0 and 1 − w′2 (x) > 0, so that Z π J(w) = w2 (x) (1 − w′2 (x)) dx > 0 = J(0). 0
Since w was chosen arbitrarily, we have proved (8.2).
8.1. The elementary problem of the Calculus of Variations
337
2. In other hand, by choosing (yn )n∈N ⊆ C1 ([0, π]), 1 yn (x) = √ sin(nx), n √ yn′ (x) = n cos(nt), we find that J(yn ) =
π π π − −→ − < 0, 2n 8 8
as n −→ ∞.
The last let us conclude that y = 0 is not a point of strong minimum because ∀r > 0, ∃n ∈ N :
yn ∈ B∞ (0, r).
8.1.3. Derivation of Euler-Lagrange equation We keep the notation that was used in Sections 8.1.1 and 8.1.2. We are looking for conditions that help us to compute weak extremums of the functional J : M −→ R given by Z b J(y) = F (x, y(x), y ′ (x)) dx, a 2
where F ∈ C ([a, b] × R × R) and the set of admissible functions is M = u ∈ C1 ([a, b]) / u(a) = A ∧ u(b) = B . As every maximization problem can be stated as a minimization one, we focus only in the later case. The elementary problem of the Calculus of Variations (EP) can be mathematically written as ( Find y0 ∈ M such that (EP) J(y0 ) = inf J(u). u∈M
Let y ∈ M and h ∈ I ([a, b]), generic. Let’s compute J ′ (y)h. Since for every x ∈ [a, b] and u, ξ, ε1 , ε2 ∈ R it holds 1
F (x, u + ε1 , ξ + ε2 ) − F (x, u, ξ) = Fu (x, u, ξ) ε1 + Fξ (x, u, ξ) ε2 + o(ε1 ) + o(ε2 ), we find that J ′ (y) h = Z b = [Fu (x, y(x), y ′ (x)) · h(x) + Fξ (x, y(x), y ′ (x)) · h′ (x)] dx.
(8.3)
a
Theorem 8.1 (Euler-Lagrange equation) If y0 ∈ M is an extremum of the functional J, then it’s a solution of the Euler-Lagrange equation Fu (x, y(x), y ′ (x)) −
d Fξ (x, y(x), y ′ (x)) = 0, dx
x ∈ [a, b].
(E-L)
338
Chapter 8. The Euler-Lagrange equation and other classical topics
Proof. By (8.3), we have, for every h ∈ I 1 ([a, b]), that b
Z
[Fu (x, y0 (x), y0′ (x)) h(x) + Fξ (x, y0 (x), y0′ (x)) h′ (x)] dx = 0.
a
Therefore, by Lemma 7.4, it follows (E-L).
■
Remark 8.3 In general (E-L) is a non-linear second order ODE. It’s common to find it in the short form Fu −
d Fξ = 0, dx
x ∈ [a, b].
(E-Ls)
It’s a necessary condition for an extremum of J. Even though it is not a sufficient condition, “the existence of an extremum is often clear from the physical or geometric meaning of the problem, e.g., in the brachistochrone problem”, [9]. Remark 8.4 (Extremal, critical point and extremum) A solution of Euler-Lagrange equation is referred to as an extremal of J. Do not confuse the term extremal with those of extremum or point of extremum which were introduced in Section 7.5. An extremal w ∈ C2 ([a, b]) that verifies the boundary conditions that define M , i.e., u(a) = A, u(b) = B,
is a critical point of J. Keep in mind the scheme for w ∈ C2 ([a, b]): w is extremum of J ⇒ w is a critical point of J ⇒ w is an extremal. Example 8.2 (A functional with extremal but without critical points) Let M = y ∈ C1 ([1, 3]) / y(1) = 1 ∧ y(3) = 9/2 and J : M −→ R given by Z
3
[3x − y(x)] y(x) dx.
J(y) = 1
Here, for (x, u, ξ) ∈ [1, 9/2] × R × R, F (x, u, ξ) = (3x − u) u,
Fu (x, u, ξ) = 3x − 2u and Fξ (x, u, ξ) = 0,
so that the Euler-Lagrange equation is 3x − 2y(x) = 0,
x ∈ [1, 3].
Therefore J has no critical points as the only extremal is given by y(x) =
3x , 2
x ∈ [1, 3],
which does not verify one of the boundary conditions.
8.1. The elementary problem of the Calculus of Variations
339
Example 8.3 (A functional with an infinite number of critical points) Let M = y ∈ C1 ([0, 2π]) / y(0) = 1 ∧ y(2π) = 1 and J : M −→ R given by 2π
Z
′2 y (x) − y 2 (x) dx.
J(y) = 0
Here, for (x, u, ξ) ∈ [0, 2π] × R × R, F (x, u, ξ) = ξ 2 − u2 ,
Fu (x, u, ξ) = −2u and Fξ (x, u, ξ) = 2ξ,
so that the Euler-Lagrange equation is −2y(x) −
d (2y ′ (x)) = 0, dx
x ∈ [0, 2π],
i.e., y ′′ (x) + y(x) = 0,
x ∈ [0, 2π].
The general extremal is y(x) = c1 cos(x) + c sin(x),
x ∈ [0, 2π].
By using the boundary conditions we find that J has an infinite number of critical points: y(x) = cos(x) + c sin(x), x ∈ [0, 2π], where c ∈ R. Example 8.4 (A functional with a unique critical point) Let M = {y ∈ C1 ([1, 2]) / y(1) = 0 ∧ y(2) = −1} and J : M −→ R, given by Z J(y) =
2
′2 y (x) − 2x y(x) dx.
1
Here, for (x, u, ξ) ∈ [1, 2]×R×R, F (x, u, ξ) = ξ 2 −2xu, so that the Euler-Lagrange equation is of the form y ′′ (x) + x = 0,
x ∈ [1, 2].
The general extremal is y(x) = −
x3 + C1 x + C2 , 6
x ∈ [1, 2].
By using the boundary conditions we find that J has only one critical point given by x x3 y(x) = − , x ∈ [1, 2]. 6 6 This example is treated with more detail in a coming section.
340
Chapter 8. The Euler-Lagrange equation and other classical topics
8.1.4. Bernstein’s existence and unicity theorem Let’s state a very general and classical result for second order ODE. Let a < b, A, B ∈ R and W : [a, b] × R × R → R. Again a generic element of the domain of W shall be denoted by (x, u, ξ) ∈ [a, b] × R × R. Let’s consider the boundary value problem ′′ ′ y (x) = W (x, y(x), y (x)), x ∈ [a, b], (8.4) y(a) = A, y(b) = B. Theorem 8.2 (Bernstein’s theorem) Assume the following conditions. 1. For all ξ ∈ R, the functions W (·, ·, ξ), Wu (·, ·, ξ) and Wξ (·, ·, ξ) are continuous. 2. There are k > 0 and non-negative functions α, β : [a, b] × R → R, which are bounded in every finite regiona of the plane, such that ∀(x, u, ξ) ∈ [a, b] × R × R :
Wu (x, u, ξ) > k,
∀(x, u, ξ) ∈ [a, b] × R × R :
|W (x, u, ξ)| ≤ α(x, u) ξ 2 + β(x, u).
Then (8.4) has a unique solution. a Finite
area.
Example 8.5 Let a < b, M = y ∈ C1 ([a, b]) / y(a) = A ∧ y(b) = B and J : M −→ R given by Z b ′2 2 J(y) = y (x) − 1 e−2y (x) dx. a
The Euler-Lagrange equation is of the form y ′′ (x) = 2y(x) (1 + y ′2 (x)),
x ∈ [a, b].
It can be checked that Bernstein’s theorem is applicable with k = 2 and α(x, u) = β(x, u) = 2|u|.
8.1.5. Differentiability of solutions of the Euler-Lagrange equation The following result provides information on the differentiability of an extremal. Theorem 8.3 (Differentiability of extremals) Let F ∈ C2 ([a, b] × R × R) and assume that y ∈ C1 ([a, b]) verifies (E-L): Fu (x, y(x), y ′ (x)) −
d Fξ (x, y(x), y ′ (x)) = 0, dx
x ∈ [a, b].
8.1. The elementary problem of the Calculus of Variations
341
Then y has continuous second derivative around the points x ∈ [a, b] for which Fξξ (x, y(x), y ′ (x)) ̸= 0. Proof. (Sketch) Let’s point out that, whenever is valid, the formula of the Mean Value Theorem for a function G : [a, b] × R × R → R can be written, for (x, u, ξ) ∈ [a, b] × R × R, as G(x + h1 , u + h2 , ξ + h3 ) − G(x, u, ξ) = = Gx (x0 , u0 , ξ0 ) h1 + Gu (x0 , u0 , ξ0 ) h2 + Gξ (x0 , u0 , ξ0 ) h3 , x0 x + h1 x where u0 ∈ y , u + h2 . Therefore ξ0 z ξ + h3
T := Fξ (x + h1 , u + h2 , ξ + h3 ) − Fξ (x, u, ξ) = Fξx (x0 , u0 , ξ0 ) h1 + Fξu (x0 , u0 , ξ0 ) h2 + Fξξ (x0 , u0 , ξ0 ) h3 .
(8.5)
Now we choose u = y(x), ∆x = h1 ,
ξ = y ′ (x),
∆y = h2 = y(x + h1 ) − y(x),
∆y ′ = h3 = y ′ (x + h1 ) − y ′ (x),
so that (8.5) becomes T = Fξ (x + ∆x, y(x + ∆x), y ′ (x + ∆x)) − Fξ (x, y(x), y ′ (x)) = ∆x · Fξx (x0 , u0 , ξ0 ) + ∆y · Fξu (x0 , u0 , ξ0 ) + ∆y ′ · Fξξ (x0 , u0 , ξ0 ),
(8.6)
where u0 = u0 (x) ∈ [y(x), y(x + ∆x)],
ξ0 = ξ0 (x) ∈ [y ′ (x), y ′ (x + ∆x)].
(8.7)
Then ∆y ∆y ′ T = Fξx (x0 , u0 , ξ0 ) + · Fξu (x0 , u0 , ξ0 ) + · Fξξ (x0 , u0 , ξ0 ). ∆x ∆x ∆x
(8.8)
Having in mind that d G(x, y(x), y ′ (x)) = dx G(x + ∆x), y(x + ∆x), y ′ (x + ∆x) − G(x, y(x), y ′ (x)) = lim , ∆x→0 ∆x it follows, from (8.6), that lim
∆x→0
T = Fξx (x, y(x), y ′ (x)) = Fu (x, y(x), y ′ (x)), ∆x
(8.9)
where the second equality comes from Euler-Lagrange’s equation. Since F ∈ C2 ([a, b] × R × R) we have, by (8.7), that lim Fξx (x0 , u0 , ξ0 ) = Fξx (x, y(x), y ′ (x)),
(8.10)
∆y · Fξu (x0 , u0 , ξ0 ) = y ′ (x) · Fξu (x, y(x), y ′ (x)). ∆x
(8.11)
∆x→0
lim
∆x→0
342
Chapter 8. The Euler-Lagrange equation and other classical topics
From (8.9), (8.10), (8.11) and (8.8) we get ∆y ′ · Fξξ (x0 , u0 , ξ0 ) = ∆x→0 ∆x = Fu (x, y(x), y ′ (x)) − Fξx (x, y(x), y ′ (x)) − y ′ (x) · Fξu (x, y(x), y ′ (x)). lim
Now if x is such that Fξξ (x, y(x), y ′ (x)) ̸= 0, then y has second derivative and y ′′ (x) =
Fu (x, y(x), y ′ (x)) − Fξx (x, y(x), y ′ (x)) − y ′ (x) · Fξu (x, y(x), y ′ (x)) . Fξξ (x, y(x), y ′ (x)) ■
8.1.6. Convexity and minimizers Once again we are looking for weak extremums of the functional J : M −→ R given by Z b J(y) = F (x, y(x), y ′ (x)) dx, a
where F ∈ C2 ([a, b] × R × R) and M = u ∈ C1 ([a, b]) / u(a) = A ∧ u(b) = B . As it was shown in Theorem 8.1, a C 2 -extremum of J is a critical point of J, i.e., a solution of the boundary value problem d ′ ′ Fu (x, y(x), y (x)) − dx Fξ (x, y(x), y (x)) = 0, x ∈ [a, b], (8.12) y(a) = A, y(b) = B. We shall prove a result which is a kind of reciprocal for Theorem 8.1. For it we need the concept of convex function. Let S be a convex subset of a linear space V . Let’s recall that a functional η : S ⊆ V → R is convex iff ∀u, v ∈ S, ∀λ ∈ [0, 1] :
η(λ u + (1 − λ) v) ≤ λ η(u) + (1 − λ) η(v).
The functional η is strictly convex iff ∀u, v ∈ S, ∀λ ∈ [0, 1] :
u ̸= v ⇒ η(λ u + (1 − λ) v) < λ η(u) + (1 − λ) η(v).
We shall use the following result brought from the course of Calculus. Lemma 8.1 Let g : RN → R. 1. Assume that g ∈ C1 (RN ). Then g is convex iff ∀z, y ∈ RN :
g(z) ≥ g(y) + ∇g(y) · (z − y).
(8.13)
2. Assume that g ∈ C2 (RN ). Then g is convex iff the Hessian matrix, ∇2 g is positive semi definite, i.e., ∀z, y ∈ RN :
y T ∇2 g(z) y ≥ 0.
8.1. The elementary problem of the Calculus of Variations
343
Now the offered result: Theorem 8.4 (Convexity and E-L equation) Assume that y0 ∈ C2 ([a, b]) is a critical point of J, i.e., a solution of (8.12), and that for every x ∈ [a, b] the function F (x, ·, ·) : R × R → R is convex. Then it holds 1. y0 is a global minimizer of J; 2. if, in addition, for every x ∈ [a, b], F (x, ·, ·) is strictly convex, then J has a unique global minimizer. Proof. Let’s prove point 1. A proof of point 2 can be found e.g. in [6, Th.2.1]. Let x ∈ [a, b] and y ∈ M, generic. Since the function g = F (x, ·, ·) is convex, we have, by (8.13), that F (x, y(x), y ′ (x)) ≥ F (x, y0 (x), y0′ (x))+ +Fu (x, y0 (x), y0′ (x)) [y(x) − y0 (x)] + Fξ (x, y0 (x), y0′ (x)) [y ′ (x) − y0′ (x)], which, by integration on [a, b] implies that, Z b J(u) ≥ J(y0 ) + [Fu (x, y0 (x), y0′ (x)) [y(x) − y0 (x)]] dx+ a
Z +
b
[Fξ (x, y0 (x), y0′ (x)) [y ′ (x) − y0′ (x)]] dx.
(8.14)
a
Integrating by parts in (8.14) and using the boundary condition, which both y and y0 satisfy, we get J(y) ≥ J(y0 )+ Z b d Fξ (x, y0 (x), y0′ (x)) [y(x) − y0 (x)]dx, + Fu (x, y0 (x), y0′ (x)) − dx a whence, as y0 verifies Euler-Lagrange equation and y was arbitrary, it follows that ∀y ∈ M :
J(y0 ) ≤ J(y). ■
Remark 8.5 If in Theorem 8.4, instead of convex, we require F (x, ·, ·) to be concave (−f is convex), then y0 becomes a global maximizer of J.
8.1.7. Second form of the Euler-Lagrange equation The form of the Euler-Lagrange equation implies a different formulation known as Dubois-Reymond equation or as the second form of the Euler-Lagrange equation. Recall that we are looking for weak extremums of the functional J : M ⊆ C1 ([a, b]) −→ R given by Z b J(y) = F (x, y(x), y ′ (x)) dx, a
where F ∈ C ([a, b] × R × R) and M = u ∈ C1 ([a, b]) / u(a) = A ∧ u(b) = B . 2
344
Chapter 8. The Euler-Lagrange equation and other classical topics
Theorem 8.5 (Second form of Euler-Lagrange equation) Assume that y verifies d ′ ′ Fu (x, y(x), y (x)) − dx Fξ (x, y(x), y (x)) = 0, x ∈ [a, b], y(a) = A, y(b) = B. Then y verifies the second form of the Euler-Lagrange equation, i.e., d [F (x, y(x), y ′ (x)) − y ′ (x) · Fξ (x, y(x), y ′ (x))] = dx = Fx (x, y(x), y ′ (x)), x ∈ [a, b].
(8.15)
Equation (8.15) is usually presented in the shorter and easier-to-remember form d [F − y ′ · Fξ ] = Fx , dx
x ∈ [a, b].
(8.16)
The form of (8.16) is particularly useful when x does not explicitly appears in the formula F (x, u, ξ), see Section 8.1.8. Remark 8.6 Keep in mind that a function could verify the second form of EulerLagrange equation without actually being a solution of Euler-Lagrange equation.
8.1.8. Special cases Once again let’s consider a functional J : M ⊆ C1 ([a, b]) −→ R, given by Z b J(y) = F (x, y(x), y ′ (x)) dx, (8.17) a
where M = u ∈ C1 ([a, b]) / y(a) = A ∧ y(b) = B and F ∈ C2 ([a, b] × R × R). Recall that to optimize J one first tries to solve the boundary value problem d ′ ′ Fu (x, y(x), y (x)) − dx Fξ (x, y(x), y (x)) = 0, x ∈ [a, b], (8.18) y(a) = A, y(b) = B. As it was already mentioned, the differential equation appearing in (8.18) is easier to remember in the form d Fξ = 0. (E-Ls) Fu − dx Case F(x, u, ξ) = F(ξ) This is the simplest case: the integrand of (8.17) depends only on y ′ , the derivative of y, but not on y and x. Here, (E-Ls) becomes d [F ′ (y ′ (x))] = 0, dx
x ∈ [a, b],
8.1. The elementary problem of the Calculus of Variations
345
so that there exists a constant c1 ∈ R such that (8.18) is equivalent to ′ ′ F (y (x)) = c1 , x ∈ [a, b], y(a) = A, y(b) = B.
(8.19)
A solution of (8.19) is the affine linear function y0 ∈ M given by y0 (x) =
B−A (x − a) + A, b−a
x ∈ [a, b].
Depending on the properties of F , the function y0 could be a point of extremum of J. Case F(x, u, ξ) = F(x, ξ) In this special case the integrand of (8.17) depends on x and y ′ , the derivative of y, but not on y. Here, (E-Ls) becomes d [Fξ (x, y ′ (x))] = 0, dx
x ∈ [a, b].
so that there exists a constant c1 ∈ R such that (8.18) is equivalent to ′ Fξ (x, y (x)) = c1 , x ∈ [a, b], y(a) = A, y(b) = B.
(8.20)
In general, a solution of (8.20) has the form Z x y(x) = g(t; C) dt, a
where C ∈ R is a constant. Case F(x, u, ξ) = F(u, ξ) — Brachistochrone In this special case the integrand of (8.17) depends on y and y ′ but not on x. Here, (E-Ls) can be written as d [F (y(x), y ′ (x)) − y ′ (x) · Fξ (y(x), y ′ (x))] = 0, dx
x ∈ [a, b],
so that for some constant D ∈ R, F (y(x), y ′ (x)) − y ′ (x) · Fξ (y(x), y ′ (x)) = D,
x ∈ [a, b].
(8.21)
Example 8.6 (Brachistochrone) Let’s retake the problem of the brachistochrone that was presented early in this chapter. Finding the the nicest-for-the-children slide’s shape yields in finding the minimum-time path between two points that follows a point mass moving under the influence of gravity without friction, see Figure 8.1.
346
Chapter 8. The Euler-Lagrange equation and other classical topics
Figure 8.1.: A simple scheme of the slide.
The problem can be mathematically written as (
Find y0 ∈ M such that T (y0 ) = inf T (y),
(P)
y∈M
where M = y ∈ C1 ([0, xb ]) / y(0) = 0 ∧ y(xb ) = yb and T : M ⊆ C1 ([0, xb ]) −→ R is given by Z xb s 1 1 + y ′ (x)2 T (y) = dx. 2g 0 y(x) It’s clear that minimizing T is equivalent to miniming the functional J = 2g T . Taking into consideration (8.17), we have ξ 1 + ξ2 , Fξ (u, ξ) = 1/2 F (u, ξ) = , u u (1 + ξ 2 )1/2 so that, for some constant D ∈ R, (8.21) becomes 1/2 1 + y ′2 (x) y ′2 (x) = D, − 1/2 y 1/2 (x) y 1/2 (x) [1 + y ′2 (x)]
x ∈ [0, xb ],
i.e., y(x) [1 + y ′2 (x)] = 2µ,
x ∈ [0, xb ],
(8.22)
where 2µ = 1/D2 is also a constant. The solution of (8.22) is a cycloid which can be written in implicit form as y(x) = µ 1 − cos ψ −1 (x), , (8.23) with ψ(τ ) = µ [τ − sin(τ )]. Since the function given by (8.23) already satisfies y(0) = 0 it remains to choose µ such that y(xb ) = yb . We let the student verify that the found critical point is in fact a minimizer. Case F(x, u, ξ) = g(x, u)
p 1 + ξ 2 — Fermat’s principle
As it was shown in the example studied p in Section 7.1.2, the infinitesimal arc length (see Figure 8.2) is given by ds = 1 + y ′2 dx.
8.1. The elementary problem of the Calculus of Variations
347
Figure 8.2.: The infinitesimal arc length defined by an infinitesimal version of Pythagoras theorem.
Then, (8.17) could be written as Z b Z ′ J(y) = F (x, y(x), y (x)) dx = a
s2
G(s, Y (s))ds,
(8.24)
s1
where, intuitively, y(x) = Y (s) and g(x, y(x)) = G(s, Y (s)). Therefore J(y) corresponds to the total value, along the trajectory y, of some physical quantity having pointwise density g(x, y(x)) = G(s, Y (s)). The problem of minimizing J is usually called (generalized) Fermat’s principle. By using the equality d y ′ (x) g(x, y(x)) · = gu (x, y(x)) · 1 + y ′2 (x) , ′2 dx (1 + y (x)) it can be proved that (E-Ls) becomes g(x, y(x)) · y ′′ (x) + [gx (x, y(x)) · y ′ (x) − gu (x, y(x))] · [1 + y ′2 (x)] = 0. (8.25)
8.1.9. Examples Example 8.7 (A global strong minimum) In this example it’s proved that a functional has a global strong minimum. Let’s consider the functional J : M ⊆ C1 ([1, 2]) −→ R, given by Z 2 ′2 J(y) = y (x) − 2x y(x) dx, 1
where M = w ∈ C1 ([1, 2]) / w(1) = 0 ∧ w(2) = −1 . We shall prove that J has a point of global strong minimum and find it.
Proof. Here F ∈ C2 ([1, 2] × R × R) is given by F (x, u, ξ) = ξ 2 − 2xu. Then, Fx (x, u, ξ) = −2x,
Fξ (x, u, ξ) = 2ξ,
d d Fξ (x, y(x), y ′ (x)) = [2y ′ (x)] = 2y ′′ (x), dx dx so that the Euler-Lagrange equation is d Fξ = 0, dx −2x − 2y ′′ (x) = 0, Fu −
x ∈ [1, 2].
348
Chapter 8. The Euler-Lagrange equation and other classical topics
Then, the general extremal has the form 1 y(x) = − x3 + c1 x + c2 . 6 By using the boundary conditions, we get that c1 = unique critical point given by
1 6
and c2 = 0. Then J has a
1 1 y0 (x) = − x3 + x. 6 6 Let’s observe that for y ∈ M and h ∈ I 1 ([1, 2]) it holds Z 2 Z J(y + h) − J(y) = [2y ′ (x)h′ (x) − 2xh(x)] dx + 1
2
h′2 (x) dx
1
so that J is of class C 2 with J ′′ (y)h2 = 2
2
Z
h′2 (x) dx.
1
Then, it’s clear that for all r > 0, ∀y ∈ B∞ (y0 , r) ∩ M , ∀h ∈ I 1 ([1, 2]) :
J ′′ (y0 )h2 ≥ 0.
Therefore y0 is a point of global strong minimum for J.
■
Example 8.8 (The simplest geodesic)
In this example it's proved that the shortest way to go from one point in the plane to another is through the straight line that connects them. So, given a, b, A, B ∈ R, a < b, let's find the shortest C¹-curve connecting the planar points P = (a, A) and Q = (b, B).

Proof. The problem corresponds to minimizing J : M ⊆ C¹([a,b]) → R, given by
$$J(y) = \int_a^b \sqrt{1 + y'^2(x)}\, dx,$$
with M = {w ∈ C¹([a,b]) / w(a) = A ∧ w(b) = B}. We have the conditions of (8.25) with
$$F(x,u,\xi) = f(x,u)\cdot\sqrt{1+\xi^2}, \quad (x,u,\xi) \in [a,b] \times \mathbb{R} \times \mathbb{R}, \qquad f(x,u) = 1, \quad (x,u) \in [a,b] \times \mathbb{R}.$$
Then, the Euler-Lagrange equation can be rewritten as
$$f_u - f_x \cdot y' - f \cdot \frac{y''}{1 + y'^2} = 0,$$
which in our case is
$$y''(x) = 0, \qquad x \in [a,b].$$
Therefore, the general extremal is
$$y(x) = c_1 x + c_2, \qquad x \in [a,b].$$
By using the boundary conditions we find that J has a unique critical point y₀ ∈ M given by
$$y_0(x) = \frac{B-A}{b-a}(x-a) + A, \qquad x \in [a,b].$$
Now let's compute J''(y₀) to analyze the kind of critical point that y₀ is. For y ∈ M, h ∈ I¹([a,b]) and λ ∈ R, we have that
$$\frac{d}{d\lambda} J(y+\lambda h) = \int_a^b \frac{1}{2}\left[ 1 + y'^2(x) + 2\lambda\, y'(x)h'(x) + \lambda^2 h'^2(x) \right]^{-1/2} \cdot \left[ 2y'(x)h'(x) + 2\lambda h'^2(x) \right] dx,$$
and so
$$J'(y)h = \left.\frac{d}{d\lambda} J(y+\lambda h)\right|_{\lambda=0} = \int_a^b \frac{y'(x)h'(x)}{\sqrt{1+y'^2(x)}}\, dx.$$
In a similar way we have
$$\frac{d^2}{d\lambda^2} J(y+\lambda h) = \int_a^b \left\{ -\frac{1}{4}\left[ 1 + y'^2(x) + 2\lambda y'(x)h'(x) + \lambda^2 h'^2(x) \right]^{-3/2} \cdot \left[ 2y'(x)h'(x) + 2\lambda h'^2(x) \right]^2 + \frac{1}{2}\left[ 1 + y'^2(x) + 2\lambda y'(x)h'(x) + \lambda^2 h'^2(x) \right]^{-1/2} \cdot 2h'^2(x) \right\} dx,$$
$$J''(y)h^2 = \left.\frac{d^2}{d\lambda^2} J(y+\lambda h)\right|_{\lambda=0} = \int_a^b \frac{h'^2(x)}{\left(1+y'^2(x)\right)^{3/2}}\, dx.$$
Then, it's clear that for some (actually for all) r > 0, it holds
$$\forall y \in B_\infty(y_0, r) \cap M,\ \forall h \in I^1([a,b]) : \quad J''(y)h^2 \geq 0. \qquad (8.26)$$
By (8.26) it follows that y₀ is a point of local strong minimum for J. Actually, it's a point of global minimum. ■
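The pointwise computation behind (8.26) is easy to confirm symbolically. The fragment below is an illustration added here, not part of the book; it assumes SymPy, differentiates √(1 + (y' + λh')²) twice with respect to λ and evaluates at λ = 0, recovering the integrand h'²/(1 + y'²)^{3/2} of J''(y)h².

```python
# Sketch of the computation behind J''(y)h^2 in Example 8.8 (illustrative only).
import sympy as sp

lam, yp, hp = sp.symbols('lambda y_p h_p', real=True)   # y'(x) and h'(x) as symbols
integrand = sp.sqrt(1 + (yp + lam*hp)**2)

second = sp.diff(integrand, lam, 2).subs(lam, 0)
print(sp.simplify(second))                               # h_p**2/(y_p**2 + 1)**(3/2)
```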
8.1.10. Problems

Problem 8.1 Redo with all the details the derivation of the Euler-Lagrange equation presented in Section 8.1.3.

Problem 8.2 Consider
$$\begin{cases} \text{Find } y_0 \in M \text{ such that} \\ J(y_0) = \inf_{u \in M} J(u), \end{cases} \qquad (8.27)$$
where M = {u ∈ C¹([a,b]) / u(a) = A ∧ u(b) = B} and
$$J(y) = \int_a^b F(x, y(x), y'(x))\, dx.$$
1. Let a = −1, b = 1, A = 0, B = 1, and F(x,u,ξ) = u²(2x − ξ)².
a) Show that the solution of (8.27) is
$$y(x) = \begin{cases} 0, & \text{if } -1 \le x \le 0, \\ x^2, & \text{if } 0 < x \le 1. \end{cases}$$
b) Why is y ∉ C²([a,b])?
2. Let a = 1, b = 2, A = 0, B = 1, and F(x,u,ξ) = \(\dfrac{\sqrt{1+\xi^2}}{x}\).
a) Show that J has only one extremal, y₀, given by (y₀(x) − 2)² + x² = 5.
b) Is y₀ a point of extremum?

Problem 8.3 Consider the following problem: "Among all the curves joining two given points (x₀, y₀) and (x₁, y₁), find the one which generates the surface of minimum area when rotated about the x-axis."
1. Find the weak extremals of the associated functional.
2. Actually the functional could have one, two or no extremals. Discuss these situations.
Hint. Consider $J(y) = 2\pi \int_{x_0}^{x_1} y(x)\sqrt{1 + y'^2(x)}\, dx$ and the catenary $y(x) = \beta \cosh\dfrac{x+\alpha}{\beta}$.

Problem 8.4 Consider M = {u ∈ C¹([0,1]) / u(0) = 0 ∧ u(1) = 1} and the functional J_k : M ⊆ C¹([0,1]) → R.
1. Find the weak extremals of J_k. Are they of class C²?
2. Try to determine if the extremals are points of extremum.
$$J_1(y) = \int_0^1 y'(x)\, dx, \qquad J_2(y) = \int_0^1 y(x)\,y'(x)\, dx, \qquad J_3(y) = \int_0^1 x\, y(x)\, y'(x)\, dx.$$
Problem 8.5 Find the extremals of the functional J : C¹([a,b]) → R.
1. $J(y) = \int_a^b \left[ y^2(x) + y'^2(x) - 2y(x)\sin(x) \right] dx$;
2. $J(y) = \int_a^b \dfrac{y'^2(x)}{x^3}\, dx$;
3. $J(y) = \int_a^b \left[ y^2(x) - y'^2(x) - 2y(x)\cosh(x) \right] dx$;
4. $J(y) = \int_a^b \left[ y^2(x) + y'^2(x) + 2y(x)\, e^x \right] dx$;
5. $J(y) = \int_a^b \left[ y^2(x) - y'^2(x) - 2y(x)\sin(x) \right] dx$.
Problem 8.6 Consider the functional J_k : M ⊆ C¹([a,b]) → R.
1. Find the weak extremals of J_k. Are they of class C²?
2. Try to determine if the extremals are points of extremum.
$$J_1(y) = \int_{-1}^{0} \left[ 12x\, y(x) - y'^2(x) \right] dx, \qquad y(-1) = 1, \quad y(0) = 0;$$
$$J_2(y) = \int_{1}^{2} \left[ y'^2(x) + 2y(x)y'(x) + y^2(x) \right] dx, \qquad y(1) = 1, \quad y(2) = 0;$$
$$J_3(y) = \int_{0}^{1} \sqrt{y(x)\left(1 + y'^2(x)\right)}\, dx, \qquad y(0) = y(1) = 1/\sqrt{2};$$
$$J_4(y) = \int_{0}^{1} y(x)\, y'^2(x)\, dx, \qquad y(0) = 1, \quad y(1) = \sqrt[3]{4};$$
$$J_5(y) = \int_{0}^{\pi} \left[ 4y(x)\cos(x) + y'^2(x) - y^2(x) \right] dx, \qquad y(0) = 0, \quad y(\pi) = 0;$$
$$J_6(y) = \int_{0}^{1} \left[ y'^2(x) - y^2(x) - y(x)\, e^{2x} \right] dx, \qquad y(0) = 0, \quad y(1) = e^{-1};$$
$$J_7(y) = \int_{-1}^{1} \left[ y'^2(x) - 2x\, y(x) \right] dx, \qquad y(-1) = -1, \quad y(1) = 1;$$
$$J_8(y) = \int_{-1}^{0} \left[ y'^2(x) - 2x\, y(x) \right] dx, \qquad y(-1) = 0, \quad y(0) = 2;$$
$$J_9(y) = \int_{1}^{e} \left[ x\, y'^2(x) + y(x)y'(x) \right] dx, \qquad y(1) = 0, \quad y(e) = 1.$$
Problem 8.7 Consider the functional J_k : M ⊆ C¹([a,b]) → R.
1. Find the weak extremals of J_k. Are they of class C²?
2. Try to determine if the extremals are points of extremum.
$$J_1(y) = \int_a^b \left[ 2x\,y(x) + \left(x^2 + e^{y(x)}\right)y'(x) \right] dx, \qquad y(a) = A, \quad y(b) = B;$$
$$J_2(y) = \int_0^1 \left[ e^{y(x)} + x\,y'(x) \right] dx, \qquad y(0) = 0, \quad y(1) = \alpha;$$
$$J_3(y) = \int_0^{\pi/4} \left[ y'^2(x) - y^2(x) \right] dx, \qquad y(0) = 1, \quad y(\pi/4) = 1/\sqrt{2};$$
$$J_4(y) = \int_0^{\pi} \left[ y'^2(x) - y^2(x) \right] dx, \qquad y(0) = 1, \quad y(\pi) = -1;$$
$$J_5(y) = \int_0^1 \left[ x + y'^2(x) \right] dx, \qquad y(0) = 1, \quad y(1) = 2;$$
$$J_6(y) = \int_0^1 \left[ y^2(x) + y'^2(x) \right] dx, \qquad y(0) = 0, \quad y(1) = 1;$$
$$J_7(y) = \int_0^1 \left[ y'^2(x) + 4y^2(x) \right] dx, \qquad y(0) = e^2, \quad y(1) = 1;$$
$$J_8(y) = \int_0^1 \left[ 2e^{y(x)} - y^2(x) \right] dx, \qquad y(0) = 1, \quad y(1) = e;$$
$$J_9(y) = \int_0^1 e^{-y'^2(x)}\, dx, \qquad y(0) = 0, \quad y(1) = 0;$$
$$J_{10}(y) = \int_0^{\pi/2} \left[ y^2(x) - y'^2(x) - 8y(x)\cosh(x) \right] dx, \qquad y(0) = 2, \quad y(\pi/2) = 2\cosh(\pi/2).$$

Problem 8.8 Consider M = {y ∈ C¹([0,π/2]) / y(0) = A ∧ y(π/2) = B} and the functional J : M ⊆ C¹([0,π/2]) → R given by
$$J(y) = \int_0^{\pi/2} y(x)\left(2x - y(x)\right) dx.$$
For which values of A and B does J have critical points?

Problem 8.9 Use Bernstein's theorem to prove that one and only one extremal of the functional defined by the formula J_k(y) passes through any two points (a, A) and (b, B), a ≠ b.
$$J_1(y) = \int_a^b e^{-2y^2(x)}\left(y'^2(x) - 1\right) dx,$$
$$J_2(y) = \int_a^b \left[ \sqrt{1 + y^2(x)} - y'^2(x) \right] dx,$$
$$J_3(y) = \int_a^b \left[ y^2(x) + y'(x)\arctan(y'(x)) - \ln\sqrt{1 + y'^2(x)} \right] dx.$$

Problem 8.10 Find the general solution of the Euler-Lagrange equation corresponding to the functional given by
$$J(y) = \int_a^b f(x)\sqrt{1 + y'^2(x)}\, dx,$$
and investigate the special cases of f(x) = √x and f(x) = x.
Problem 8.11 Minimize the functional J : C¹([0,1]) → R given by
$$J(u) = \int_0^1 \left[ \frac{1}{2}u'^2(x) + u(x)u'(x) + u'(x) + u(x) \right] dx.$$

Problem 8.12 Let K ∈ C^∞([a,b] × [a,b]). Find the extremals of the functional J : C¹([a,b]) → R given by
$$J(u) = \int_a^b \int_a^b K(s,t)\, u(t)\, u(s)\, ds\, dt.$$

Problem 8.13 Find the extremals of the functional J : C¹([a,b]) → R given by
$$J(y) = \int_0^1 \sqrt{x^2 + y^2(x)}\, \sqrt{1 + y'^2(x)}\, dx.$$
Hint. Use polar coordinates to get x² cos(α) + 2xy sin(α) − y² cos(α) = β, for some constants α, β ∈ R.

Problem 8.14 Consider M = {y ∈ C¹([−1,1]) / y(−1) = −1 ∧ y(1) = 1} and the functional J : M ⊆ C¹([−1,1]) → R given by
$$J(y) = \int_{-1}^{1} x^2\, y'^2(x)\, dx.$$
1. Prove that J is bounded from below and find inf_{u∈M} J(u).
2. Prove that J has no point of minimum.
3. Consider (yₙ)_{n∈N} ⊆ C¹([−1,1]) given by yₙ(x) = arctan(nx)/arctan(n). Show that (yₙ)_{n∈N} is a minimizing sequence for J.
4. Show that if J is extended to the set of piecewise differentiable functions on [−1,1], verifying the boundary conditions, then the extension achieves the minimum at the sign function.

Problem 8.15 Consider M = {y ∈ C¹([0,1]) / y(0) = 0 ∧ y(1) = 0} and the functional J : M ⊆ C¹([0,1]) → R given by
$$J(y) = \int_0^1 \left[ y'^2(x) - 1 \right]^2 dx.$$
1. Prove that inf_{y∈M} J(y) = 0.
2. Prove that J has no global minimizer.
3. Try to prove that J*, the extension of J to M* = {y ∈ C¹_piec([0,1]) / y(0) = 0 ∧ y(1) = 0}, does have a global minimizer. Here C¹_piec([0,1]) denotes the space of piecewise C¹ functions with domain [0,1].

Problem 8.16 Consider M = {y ∈ C¹([0,1]) / y(0) = 1 ∧ y(1) = 0} and the functional J : M ⊆ C¹([0,1]) → R given by
$$J(y) = \int_0^1 x \cdot y'^2(x)\, dx.$$
1. Prove that inf_{y∈M} J(y) = 0.
2. Does J have a global or local extremum?

Problem 8.17 Consider M = {y ∈ C¹([−1,1]) / y(−1) = 0 ∧ y(1) = 1} and the functional J : M ⊆ C¹([−1,1]) → R given by
$$J(y) = \int_{-1}^{1} y^2(x) \cdot \left[ 1 - y'(x) \right]^2 dx.$$
1. Prove that J has no minimizer.
2. Try to prove that J*, the extension of J to M* = {y ∈ C¹_piec([−1,1]) / y(−1) = 0 ∧ y(1) = 1}, does have a minimizer:
$$\tilde{y}(x) = \begin{cases} 0, & \text{if } x \in [-1,0], \\ x, & \text{if } x \in\, ]0,1]. \end{cases}$$
Here C¹_piec([−1,1]) denotes the space of piecewise C¹ functions with domain [−1,1].

Problem 8.18 Consider M = {y ∈ C¹([0,1]) / y(0) = 0 ∧ y(1) = 1} and the functional J : M ⊆ C¹([0,1]) → R given by
$$J(y) = \int_0^1 |y'(x)|\, dx.$$
Show that J has infinitely many minimizers.

Problem 8.19 Redo with all the details the proof of Theorem 8.3.

Problem 8.20 Prove Theorem 8.5.
8.2. Some generalizations of the elementary problem of the Calculus of Variations.

8.2.1. The fixed end-point problem for n unknown functions

Let F ∈ C²([a,b] × Rⁿ × Rⁿ). A generic element of the domain of F shall be denoted by (x, u₁, u₂, ..., uₙ, ξ₁, ξ₂, ..., ξₙ) ∈ [a,b] × Rⁿ × Rⁿ, and the partial derivatives of F will be written
$$F_x = \frac{\partial F}{\partial x}, \quad F_{u_k} = \frac{\partial F}{\partial u_k}, \quad F_{\xi_k} = \frac{\partial F}{\partial \xi_k}, \quad F_{x u_k} = \frac{\partial^2 F}{\partial u_k\, \partial x}, \quad F_{\xi_j u_k} = \frac{\partial^2 F}{\partial u_k\, \partial \xi_j}, \quad \text{etc.}$$
Consider the functional J : [C¹([a,b])]ⁿ → R, given by
$$J(y) = \int_a^b F(x, y_1(x), ..., y_n(x), y_1'(x), ..., y_n'(x))\, dx,$$
where y = (y₁, ..., yₙ). The functional J is differentiable and
$$\forall y, h \in \left[\mathrm{C}^1([a,b])\right]^n : \quad J'(y)h = \sum_{k=1}^{n} \int_a^b \left[ F_{u_k} h_k + F_{\xi_k} h_k' \right] dx.$$
It can be proved, for example by using partial differentials, that a necessary condition for y = (y₁, ..., yₙ) ∈ [C¹([a,b])]ⁿ to be a critical point of J is
$$\forall w \in \mathrm{C}^1([a,b]) : \quad \int_a^b \left[ F_{u_k} w + F_{\xi_k} w' \right] dx = 0, \qquad k = 1, ..., n.$$
Let's now consider the following problem
$$\begin{cases} \text{Find } y_* \in M \text{ such that} \\ J(y_*) = \inf_{u \in M} J(u), \end{cases} \qquad (8.28)$$
where
$$M = \left\{ y = (y_1, ..., y_n) \in \left[\mathrm{C}^1([a,b])\right]^n \,/\, \forall k \in I_n : y_k(a) = A_k \wedge y_k(b) = B_k \right\}.$$
If y = (y₁, ..., yₙ) is a solution of (8.28), then y satisfies the Euler-Lagrange system,
$$F_{u_k} - \frac{d}{dx} F_{\xi_k} = 0, \qquad k = 1, ..., n.$$

Example 8.9 (Geometric optics)
Suppose that light moves through the region Ω ⊆ R³. Fermat's principle states that the light goes from a point P = (x₁, y₁, z₁) ∈ Ω to a point Q = (x₂, y₂, z₂) ∈ Ω, a = x₁ < x₂ = b, along the path for which the transit time T is the smallest. Suppose that Ω is filled with an optically inhomogeneous medium so that, as a consequence, light propagates with a velocity that depends on its position inside: Ω ∋ (x, y, z) ↦ v(x, y, z) ∈ R. Let's assume that the curve γ : [a,b] → R², given by γ(x) = (Y(x), Z(x)), is such that (x, Y(x), Z(x)) ∈ Ω for all x ∈ [a,b], γ(a) = (y₁, z₁) and γ(b) = (y₂, z₂). Then, the time that light takes traveling from P to Q through the path described by γ is given by
$$T(\gamma) = T(Y, Z) = \int_a^b F(x, Y(x), Z(x), Y'(x), Z'(x))\, dx, \qquad \text{where } F(x, u_1, u_2, \xi_1, \xi_2) = \frac{\sqrt{1 + \xi_1^2 + \xi_2^2}}{v(x, u_1, u_2)}.$$
Therefore, if v ∈ C¹(Ω), the light should travel through a curve (Y(·), Z(·)) for which the following differential equations hold:
$$\frac{\partial v}{\partial y}\, \frac{\sqrt{1 + Y'^2 + Z'^2}}{v^2} + \frac{d}{dx}\left[ \frac{Y'}{v\sqrt{1 + Y'^2 + Z'^2}} \right] = 0,$$
$$\frac{\partial v}{\partial z}\, \frac{\sqrt{1 + Y'^2 + Z'^2}}{v^2} + \frac{d}{dx}\left[ \frac{Z'}{v\sqrt{1 + Y'^2 + Z'^2}} \right] = 0.$$
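As a complement to Example 8.9, the Euler-Lagrange system for several unknown functions can be generated mechanically. The sketch below is added for illustration and is not the book's code; it assumes the Python library SymPy, uses a generic velocity function v (an undetermined symbol here), and asks euler_equations for the two coupled equations, which after simplification by hand take the form displayed above.

```python
# Illustrative sketch: the Euler-Lagrange system of Example 8.9 via SymPy (not from the book).
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
Y, Z = sp.Function('Y'), sp.Function('Z')
v = sp.Function('v')                                   # generic velocity v(x, Y, Z)

speed = sp.sqrt(1 + sp.Derivative(Y(x), x)**2 + sp.Derivative(Z(x), x)**2)
L = speed / v(x, Y(x), Z(x))                           # transit-time density

system = euler_equations(L, [Y(x), Z(x)], x)           # two coupled Euler-Lagrange equations
for eq in system:
    sp.pprint(eq)
```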
8.2.2. The isoperimetric problem

Let F and G be functions that belong to C²([a,b] × R × R). A generic element of the domain of F or G shall be denoted by (x,u,ξ) ∈ [a,b] × R × R and the partial derivatives will be written
$$F_x = \frac{\partial F}{\partial x}, \quad F_u = \frac{\partial F}{\partial u}, \quad G_\xi = \frac{\partial G}{\partial \xi}, \quad F_{\xi u} = \frac{\partial^2 F}{\partial u\, \partial \xi}, \quad G_{x u} = \frac{\partial^2 G}{\partial u\, \partial x}, \quad \text{etc.}$$
A number of applications require a functional J : M ⊆ C¹([a,b]) → R, given by
$$J(y) = \int_a^b F(x, y(x), y'(x))\, dx,$$
where the set of admissible functions, which is not a linear space, can be written as
$$M = \left\{ w \in \mathrm{C}^1([a,b]) \,/\, w(a) = A \wedge w(b) = B \wedge K(w) = l \right\},$$
with l ∈ R prescribed, and
$$K(w) = \int_a^b G(x, w(x), w'(x))\, dx.$$
We say that h ∈ C¹([a,b]) is an admissible increment of y ∈ M iff y + h ∈ M, which is equivalent to requiring it to satisfy
$$\begin{cases} h(a) = h(b) = 0, \\ K(y+h) = l. \end{cases}$$
The condition K(w) = l is usually referred to as a subsidiary condition or a constraint. As every maximization problem can be stated as a minimization one, we focus only on the latter case. The (generalized) isoperimetric problem can be mathematically written as
$$\text{(IP)} \qquad \begin{cases} \text{Find } y_0 \in M \text{ such that} \\ J(y_0) = \inf_{u \in M} J(u). \end{cases}$$
The next result provides a necessary condition for a constrained problem to have a solution.

Theorem 8.6 (Euler theorem. Lagrange multiplier)
Let's assume that
1. y ∈ M is a point of extremum of J, and
2. y ∈ M is not an extremal of K.
Then, there exists λ ∈ R, referred to as a Lagrange multiplier, such that y is an extremal of the functional L : M → R given by L(w) = J(w) + λK(w), i.e., y satisfies
$$\Phi_u(x, y(x), y'(x)) - \frac{d}{dx}\Phi_\xi(x, y(x), y'(x)) = 0, \qquad x \in [a,b], \qquad (8.29)$$
where Φ = F + λG. The Euler-Lagrange equation (8.29) can be rewritten as
$$(F_u + \lambda G_u) - \frac{d}{dx}(F_\xi + \lambda G_\xi) = 0, \qquad x \in [a,b]. \qquad (8.30)$$
To apply Theorem 8.6 we should obtain the general solution y = y(·; α, β, λ) of (8.30), which will contain two parameters, α, β ∈ R, in addition to the Lagrange multiplier, λ ∈ R. Then these three values shall be computed by using the boundary conditions and the constraint.

Example 8.10 (A functional with an infinite number of critical points)
Let's find the critical points of J : M ⊆ C¹([0,π]) → R given by
$$J(y) = \int_0^\pi y'^2(x)\, dx,$$
where
$$M = \left\{ w \in \mathrm{C}^1([0,\pi]) \,/\, w(0) = w(\pi) = 0 \wedge \int_0^\pi w^2(x)\, dx = 1 \right\}.$$

Proof. Here F, G ∈ C²([0,π] × R × R) are given, respectively, by
$$F(x,u,\xi) = \xi^2, \qquad G(x,u,\xi) = u^2.$$
For λ ∈ R, to be determined a posteriori, we consider H = F + λG, so that H(x,u,ξ) = ξ² + λu². Then,
$$H_u(x,u,\xi) = 2\lambda u, \qquad H_\xi(x,u,\xi) = 2\xi, \qquad \frac{d}{dx} H_\xi(x, y(x), y'(x)) = 2y''(x), \qquad x \in [0,\pi],$$
so that the Euler-Lagrange equation is
$$H_u - \frac{d}{dx} H_\xi = 0,$$
i.e.,
$$y''(x) - \lambda y(x) = 0, \qquad x \in [0,\pi]. \qquad (8.31)$$
The characteristic equation associated to (8.31) is
$$m^2 - \lambda = 0, \qquad m \in \mathbb{C}.$$
For the cases λ = 0 and λ > 0 we obtain, by using the boundary conditions, that
$$y(x) = 0, \qquad x \in [0,\pi],$$
which is incompatible with the integral condition. So let's assume that λ = −γ² with γ > 0. Then, the extremals would have the form
$$y(x) = c_1 \sin(\gamma x) + c_2 \cos(\gamma x), \qquad x \in [0,\pi].$$
By the condition y(0) = 0 it follows that c₂ = 0. Then, by the condition y(π) = 0 we get that γ ∈ {1, 2, 3, ...}. By the integral condition,
$$1 = \int_0^\pi y^2(x)\, dx = c_1^2 \int_0^\pi \sin^2(\gamma x)\, dx \qquad \text{and} \qquad c_1^2 = \frac{2}{\pi}.$$
Then, the critical points of J have the form
$$y_\gamma(x) = \sqrt{\frac{2}{\pi}}\, \sin(\gamma x), \qquad x \in [0,\pi], \qquad \gamma \in \{1, 2, 3, ...\}.$$
■
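The normalization found in Example 8.10 can be checked quickly with a computer algebra system. The fragment below is an editorial illustration using the Python library SymPy, not part of the text; it verifies that the constraint in M is satisfied by y_γ for any positive integer γ.

```python
# Quick check of the normalization in Example 8.10 (illustrative only).
import sympy as sp

x = sp.symbols('x')
gamma = sp.symbols('gamma', integer=True, positive=True)

y_gamma = sp.sqrt(2/sp.pi) * sp.sin(gamma*x)
print(sp.integrate(y_gamma**2, (x, 0, sp.pi)))   # prints 1, as required by the constraint
```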
8.2.3. A generalization of the isoperimetric problem

Let G₁, ..., G_k and F be functions that belong to C²([a,b] × Rⁿ × Rⁿ). A generic element of the domain of F and of the G_j shall be denoted (x, u₁, ..., uₙ, ξ₁, ..., ξₙ) ∈ [a,b] × Rⁿ × Rⁿ. Some applications require the use of a functional J : M ⊆ [C¹([a,b])]ⁿ → R given by
$$J(y_1, ..., y_n) = \int_a^b F(x, y_1(x), ..., y_n(x), y_1'(x), ..., y_n'(x))\, dx,$$
where the set of admissible functions, M, contains the elements (w₁, ..., wₙ) ∈ [C¹([a,b])]ⁿ such that
$$w_i(a) = A_i, \quad w_i(b) = B_i, \qquad i = 1, ..., n, \qquad (8.32)$$
and
$$K_j(w_1, ..., w_n) = l_j, \qquad j = 1, ..., k, \qquad (8.33)$$
with l_j ∈ R prescribed, and
$$K_j(w_1, ..., w_n) = \int_a^b G_j(x, w_1(x), ..., w_n(x), w_1'(x), ..., w_n'(x))\, dx.$$
In this case a necessary condition for an extremum of J is to satisfy the system
$$\frac{\partial}{\partial u_i}\left[ F + \sum_{j=1}^{k} \lambda_j G_j \right] - \frac{d}{dx}\, \frac{\partial}{\partial \xi_i}\left[ F + \sum_{j=1}^{k} \lambda_j G_j \right] = 0, \qquad i = 1, ..., n. \qquad (8.34)$$
The general solution of the system (8.34) has 2n parameters plus k Lagrange multipliers, which could be obtained by using the boundary conditions (8.32) and the subsidiary conditions (8.33).

8.2.4. Finite subsidiary conditions

Let F ∈ C²([a,b] × Rⁿ × Rⁿ) and g₁, g₂, ..., g_k ∈ C¹([a,b] × Rⁿ). A generic element of the domain of F shall be denoted by (x, u₁, ..., uₙ, ξ₁, ..., ξₙ) ∈ [a,b] × Rⁿ × Rⁿ. Some applications require the use of a functional J : M ⊆ [C¹([a,b])]ⁿ → R given by
$$J(y_1, ..., y_n) = \int_a^b F(x, y_1(x), ..., y_n(x), y_1'(x), ..., y_n'(x))\, dx,$$
where the set of admissible functions, M, contains the elements (w₁, ..., wₙ) ∈ [C¹([a,b])]ⁿ such that
$$w_i(a) = A_i, \quad w_i(b) = B_i, \qquad i = 1, ..., n, \qquad (8.35)$$
and
$$g_j(x, w_1(x), ..., w_n(x)) = 0, \qquad j = 1, ..., k. \qquad (8.36)$$

Theorem 8.7 (Necessary condition for the finite subsidiary problem)
Let's assume that
1. (y₁, ..., yₙ) ∈ M is a point of extremum of J, and
2. for any j ∈ {1, ..., k} and distinct i₁, i₂ ∈ {1, ..., n}, the functions ∂g_j/∂u_{i₁} and ∂g_j/∂u_{i₂} do not vanish simultaneously at any point of the surface determined by (8.36).
Then, there exist functions λ_j : [a,b] → R, j = 1, ..., k, referred to as generalized Lagrange multipliers, such that (y₁, ..., yₙ) is an extremal of the functional L : M → R given by
$$L(w) = \int_a^b \Phi\, dx, \qquad \text{where } \Phi = F + \sum_{j=1}^{k} \lambda_j g_j,$$
i.e., (y₁, ..., yₙ) satisfies
$$\Phi_{u_i} - \frac{d}{dx}\Phi_{\xi_i} = 0, \qquad i = 1, ..., n. \qquad (8.37)$$

Remark 8.7
Let's consider the case when n = 2 and k = 1. Let's write F = F(x,u,v,ξ,η), g = g(x,u,v),
$$J(y,z) = \int_a^b F(x, y(x), z(x), y'(x), z'(x))\, dx,$$
and Φ = F + λ(·)g. Then (8.37) becomes
$$F_u + \lambda g_u - \frac{d}{dx}F_\xi = 0 \qquad \text{and} \qquad F_v + \lambda g_v - \frac{d}{dx}F_\eta = 0.$$

8.2.5. Problems

Problem 8.21 Write explicitly the sets of admissible functions and increments for the functional defined by the formula
$$J(u) = \int_0^1 \left[ u'^2(x) + x^2 \right] dx,$$
subject to the conditions u(0) = 0, u(1) = 0 and $\int_0^1 u^2(x)\, dx = 2$. Find the extremals of J.

Problem 8.22 Find a curve y : [−a, a] → R, y(−a) = y(a) = 0, which together with the segment [−a, a], for a given length l > 2a, bounds a maximum area.

Problem 8.23 Write explicitly the sets of admissible functions and increments for the functional defined by the formula
$$J(y) = \int_0^\pi y'^2(x)\, dx,$$
subject to the conditions y(0) = y(π) = 0 and $\int_0^\pi y^2(x)\, dx = 1$. Find the extremals of J.

Problem 8.24 Write explicitly the sets of admissible functions and increments for the functional defined by the formula
$$J(y,z) = \int_0^1 \left[ y'^2(x) + z'^2(x) - 4x\, z'(x) - 4z(x) \right] dx,$$
subject to the conditions y(0) = 0, y(1) = 1, z(0) = 0, z(1) = 1 and
$$\int_0^1 \left[ y'^2(x) - x\, y'(x) - z'^2(x) \right] dx = 2.$$
Find the extremals of J.

Problem 8.25 Find the shortest distance between the points P = (1,−1,0) and Q = (2,1,−1) lying on the surface 15x − 7y + z − 22 = 0.
Hint. The distance between the points (x₀,y₀,z₀) and (x₁,y₁,z₁) on the surface φ(x,y,z) = 0 is found from the formula
$$l = \int_{x_0}^{x_1} \sqrt{1 + y'^2(x) + z'^2(x)}\, dx.$$

Problem 8.26 Write explicitly the sets of admissible functions and increments for the functional defined by the formula
$$J(y) = \int_0^1 y'^2(x)\, dx,$$
subject to the conditions y(0) = 1, y(1) = 6 and $\int_0^1 y(x)\, dx = 3$. Find the extremals of J.

Problem 8.27 Write explicitly the sets of admissible functions and increments for the functional defined by the formula
$$J(u) = \int_0^1 u'^2(x)\, dx,$$
subject to the conditions u(0) = 0, u(1) = 1/4 and $\int_0^1 \left[ u(x) - u'^2(x) \right] dx = 1/12$. Find the extremals of J.

Problem 8.28 (Functionals depending on derivatives of higher order)
Let F ∈ C²([a,b] × R × Rⁿ). A generic element of the domain of F shall be denoted by (x, u, ξ₁, ..., ξₙ) ∈ [a,b] × R × Rⁿ. Consider the functional J : M → R, given by
$$J(y) = \int_a^b F(x, y(x), y'(x), ..., y^{(n)}(x))\, dx,$$
where M = {w ∈ Cⁿ([a,b]) / w^{(k)}(a) = A_k ∧ w^{(k)}(b) = B_k, k = 0, 1, ..., n−1}. We are interested in the problem
$$\begin{cases} \text{Find } y_0 \in M \text{ such that} \\ J(y_0) = \inf_{u \in M} J(u). \end{cases} \qquad (8.38)$$
1. Develop the machinery necessary to deal with this situation and show that a critical point of J should satisfy the Euler equation,
$$F_u + \sum_{k=1}^{n} (-1)^k \frac{d^k}{dx^k} F_{\xi_k} = 0.$$
2. By using Taylor expansions try to find (necessary / sufficient) conditions of second order for a critical point to be a point of minimum.
3. Try to solve (8.38) for the functional defined by
$$J(y) = \int_0^1 \left( 1 + y''^2(x) \right) dx, \qquad y(0) = 0,\ y'(0) = 1,\ y(1) = 1,\ y'(1) = 1.$$
4. Try to solve (8.38) for the functional defined by
$$J(y) = \int_0^{\pi/2} \left( y''^2(x) - y^2(x) + x^2 \right) dx, \qquad y(0) = 1,\ y'(0) = 0,\ y(\pi/2) = 0,\ y'(\pi/2) = 1.$$
5. Try to solve (8.38) for the functional defined by
$$J(y) = \int_0^1 y''^2(x)\, dx, \qquad y(0) = 0,\ y'(0) = a,\ y(1) = 0,\ y'(1) = b.$$
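Once the Euler equation of Problem 8.28 is accepted, parts of the problem can be explored with a computer algebra system. The sketch below is an editorial illustration, not the intended pencil-and-paper solution; it assumes the Python library SymPy, whose euler_equations helper also handles Lagrangians with higher-order derivatives, and treats item 3 with its four boundary conditions.

```python
# Illustrative sketch for Problem 8.28, item 3 (not the book's solution).
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
y = sp.Function('y')

L = 1 + sp.Derivative(y(x), x, 2)**2                 # Lagrangian of item 3
el = euler_equations(L, y(x), x)[0]                  # Euler(-Poisson) equation: 2*y''''(x) = 0
y0 = sp.dsolve(el, y(x),
               ics={y(0): 0, y(1): 1,
                    y(x).diff(x).subs(x, 0): 1,
                    y(x).diff(x).subs(x, 1): 1})
print(y0)                                            # Eq(y(x), x)
```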
Problem 8.29 (Calculus of Variations and PDE)
Let p > 1, N ∈ N, Ω ⊆ R^N open and connected with smooth boundary, and T ∈ C(Ω) non-negative. The nonlinear Schrödinger equation
$$\begin{cases} -\Delta z(x) + T(x)z(x) - |z(x)|^{p-1}z(x) = 0, & x \in \Omega, \\ z(x) = 0, & x \in \partial\Omega, \end{cases} \qquad (8.39)$$
serves e.g. to model systems of a very large number of particles interacting at very low temperatures, like Bose-Einstein condensates. We look for solutions of (8.39).
1. Prove that on C₀^∞(Ω) the following formula defines an inner product:
$$(u,v)_T = \int_\Omega \left[ \nabla u(x) \cdot \nabla v(x) + T(x)u(x)v(x) \right] dx.$$
2. Define the Hilbert space (H_T, ∥·∥_T) as the completion of C₀^∞(Ω) with the inner product (·,·)_T. Prove that the functional I : H_T → R given by
$$I(u) = \frac{1}{2}\|u\|_T^2 = \frac{1}{2}\int_\Omega \left[ |\nabla u(x)|^2 + T(x)|u(x)|^2 \right] dx$$
is Gateaux differentiable.
3. Prove that I is twice differentiable and write the formulas for I'(u)h and I''(u)(h,h) for generic u, h ∈ H_T.
4. Consider the functional J = I|_M, that is, J is the restriction of I to the Nehari manifold
$$M = \left\{ u \in H_T \,/\, \|u\|_{L^{p+1}(\Omega)} = 1 \right\}.$$
Prove that a critical point of J, say y, weakly verifies
$$\begin{cases} -\Delta y(x) + T(x)y(x) - 2c\,|y(x)|^{p-1}y(x) = 0, & x \in \Omega, \\ y(x) = 0, & x \in \partial\Omega, \end{cases} \qquad (8.40)$$
where c > 0 is the corresponding critical value, c = J(y). Observe that the PDE appearing in (8.40) is the Euler-Lagrange equation associated to the functional J.
Hint. The notion of weak solution of a PDE can be found e.g. in [7] and [4].
5. Show that for an appropriate value of θ ∈ R, the function given by z(x) = y(x)/θ, x ∈ Ω, is a non-trivial weak solution of (8.39).
6. From now on assume that N = 1 and Ω = ]α, β[, for some α < β. Find the Euler-Lagrange equation corresponding to the functional Φ = I|_A, that is, the restriction of I to the admissible set
$$A = \left\{ u \in H_T \,/\, u(\alpha) = A \wedge u(\beta) = B \right\}.$$
7. Let u* be a critical point of Φ. Is it possible to determine if it is a point of maximum or minimum? Discuss.
8. Without using a computer, find explicitly all the extremals of Φ for the case T(x) = x², x ∈ [α, β].
9. Consider the case α = 0, β = 1, A = 0, B = 1 and T(x) = x.
a) Without using a computer, find explicitly all the critical points of Φ and decide of which kind they are.
b) Using Maxima, redo the previous step and make the corresponding graphics using the command plot2d.
c) Try to compute the critical values associated to the critical points of step a).
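A quick symbolic check is possible for item 9 of Problem 8.29. Assuming one has found, in item 6, that the extremals of Φ solve −y'' + T(x)y = 0, the case T(x) = x leads to the Airy equation. The sketch below is an editorial illustration with the Python library SymPy (not the Maxima session requested in item 9b); it obtains the general solution and fixes the constants with y(0) = 0, y(1) = 1.

```python
# Illustrative sketch for Problem 8.29, item 9a (assumes the Euler-Lagrange equation
# of Phi is -y'' + T(x)*y = 0, with T(x) = x on [0, 1], A = 0, B = 1).
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

general = sp.dsolve(sp.Eq(y(x).diff(x, 2) - x*y(x), 0), y(x))
print(general)                                   # y(x) = C1*airyai(x) + C2*airybi(x)

C1, C2 = sp.symbols('C1 C2')
expr = C1*sp.airyai(x) + C2*sp.airybi(x)
constants = sp.solve([expr.subs(x, 0), expr.subs(x, 1) - 1], [C1, C2])
print({c: sp.N(val, 6) for c, val in constants.items()})   # numerical values of C1, C2
```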
References

[1] A. Aguas-Barreno, J. Cevallos-Chávez, J. Mayorga-Zambrano, and L. Medina-Espinosa. Semiclassical asymptotics of infinitely many solutions for the infinite case of a nonlinear Schrödinger equation with critical frequency. Bulletin of the Korean Mathematical Society, 59:241–263, 2022.
[2] G. Allan and H. Dales. Introduction to Banach Spaces and Algebras. Oxford Graduate Texts in Mathematics. Oxford University Press, 2010.
[3] T. Apostol. Calculus. Vol. I: One-variable calculus, with an introduction to linear algebra. Second edition. Blaisdell Publishing Co. Ginn and Co., Waltham, Mass.-Toronto, Ont.-London, 1967.
[4] H. Brezis. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, New York, 2011.
[5] R. Coleman. Calculus on Normed Vector Spaces. Universitext. Springer, 2012.
[6] B. Dacorogna. Introduction to the Calculus of Variations. Imperial College Press, 2004.
[7] L. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 1998.
[8] W. Fairchild and C. Ionescu Tulcea. Topology. W. B. Saunders Co., Philadelphia, Pa., 1971.
[9] I. Gelfand and S. Fomin. Calculus of Variations. Dover Publications, 2017.
[10] P. Halmos. Naive Set Theory. D. Van Nostrand Company, Inc., New York, N.Y., 1960.
[11] K. Hrbacek and T. Jech. Set Theory. Marcel Dekker, USA, 1999.
[12] T. Jones. The Legendre and Laguerre polynomials & the elementary quantum mechanical model of the Hydrogen atom. Drexel University, http://www.physics.drexel.edu/ tim/open/hydrofin/, 2009.
[13] A. Knapp. Basic Real Analysis. Birkhäuser, 2005.
[14] A. N. Kolmogorov and S. V. Fomin. Elements of the Theory of Functions and Functional Analysis. Vol. 1: Metric and normed spaces. Graylock Press, Rochester, N.Y., 1957.
[15] A. N. Kolmogorov and S. V. Fomin. Elements of the Theory of Functions and Functional Analysis. Vol. 2: Measure. The Lebesgue integral. Hilbert space. Graylock Press, Albany, N.Y., 1961.
[16] M. Krasnov, G. Makarenko, and A. Kiseliov. Cálculo Variacional (ejemplos y problemas). Editorial Mir, Moscú, 1975.
[17] E. Kreyszig. Introductory Functional Analysis with Applications. Wiley, USA, 1989.
[18] J. Mayorga-Zambrano. Matemática Superior para Ciencias e Ingeniería. Colección de Matemáticas Universitarias. Ed. AMARUN, Francia, 2020.
[19] J. Mayorga-Zambrano. Matemática Superior para Ciencias e Ingeniería. Asociación AMARUN, París, Francia, 2020.
[20] J. Mayorga-Zambrano, J. Castillo-Jaramillo, and J. Burbano-Gallegos. Compact embeddings of p-Sobolev-like cones of nuclear operators. Banach Journal of Mathematical Analysis, 16, 2022.
[21] J. Reddy and M. Rasmussen. Análisis Matemático Avanzado. Editorial Limusa, México, 1992.
[22] V. Trenoguin, B. Pisarievsi, and T. Sóboleva. Problemas y Ejercicios de Análisis Funcional. Editorial Mir, Moscú, 1987.
Index Algebra - of Banach, 264 - of derivable functions Ck (I), 25 - of polynomials, 26 - of real functions, 25 - with unity, 25 commutative -, 25 Axiom of choice, 9 Ball center of a -, 69 closed -, 69 open -, 69 radius of a -, 69 Basis canonical -, 24, 26 Hamel -, 21–23, 25 Hilbert -, 22, 139, 144 orthogonal -, 139 orthonormal -, 139 Schauder -, 22, 139 topological -, 36, 40, 49, 70, 80 topological sub-, 36, 49 Boundedness Uniform - principle, 209 weakly and strong -, 232 Weakly-∗ and strong -, 238 Completeness - of L(V, W ), 181 - of the dual space, 182 Conjugate - exponent, 126 Continuity - and boundedness, 176 - and compacity, 62 - and metric spaces, 81 - and uniform convergence, 85 - at a point, 50, 51, 81 - at a region, 50 - in the initial topology, 56
- in weak topology, 233 - of a composite function, 51 - of a functional, 51 - of a metric, 81 - of a real function, 52 - of linear operators, 236 - of the absolute value function, 52 - of the addition, 57 - of the multiplication, 58 - of the norm, 117 - on a compact set, 107 - via fundamental systems, 51 - via inverse images, 50, 54 - via sequences, 54 equi-, 106 Lipschitz -, 95, 117 lower semi- (l.s.c.), 54, 235 upper semi- (u.s.c.), 54, 235 Convergence - almost everywhere, 94 - in a metric space, 80, 82 - in the initial topology, 56 - of a Cauchy sequence, 82 - of subsequences, 47 - via a fundamental system, 45, 80 absolute -, 139 characterization of the weak -, 231 characterization of weak ∗ -, 238 continuity and weak ∗ -, 241 definition of -, 44 point -, 85 properties of weak -, 233 propeties of weak ∗ -, 241 strong and weak -, 231 uniform -, 85 weak ∗ - and Schauder basis, 238 weak ∗ - and total sets, 238 weak - and Schauder basis, 231 weak - and total sets, 231 Covering 365
366
INDEX finite -, 59, 103 open -, 59
Delta Kronecker’s -, 118 Density - by a metric, 81, 89 - in a set, 42, 81 - in the space, 42 - of J(E), 245 nowhere -, 42, 87 transitivity of the - relation, 42 Differential C1 in infinite dimension, 288 C1 in finite dimension, 288 - of the inverse, 288 - partial -s, 303 Null- implies constant, 298 Fr´echet -, 269, 278, 281 Gateaux -, 268, 273, 281 Duality - map, 195 - product, 195
Function bijective -, 13 concave -, 127 contraction -, 95, 96 convex -, 126 definition of -, 8 extension of a -, 8 gamma -, 8 Gaussian -, 10 Hermite’s -, 162 injective -, 13 inverse -, 14 notation of a -, 8 projection -, 57 restriction of a -, 8 surjective -, 13 variable upper limit -, 27 Functional Minkowski -, 199 Functional space - of p-integrable functions Lp , 133 - of bounded functions B(A), 76 - of continuous functions C(I), 17, 77 - of derivable functions Ck (I), 18 - of polynomials P(I), 17, 24 definition of -, 16
Eigenvalue - of a differential operator, 31 definition of -, 29 eigenfunction associated to an -, 29, 30 Group eigenspace associated to an -, 29, Abelian -, 15 31 definition of -, 15 eigenvector associated to an -, 29 Embedding Hilbert space - of Lebesgue spaces, 185 RN , 84 - of Sobolev spaces, 185 L2 ([a, b]), 93 generalized -, 185 l2 (R), 75 Equality definition of -, 83 parallelogram -, 75 separable -, 144, 145 Parseval’s -, 140 Hyperplane Equation closed -, 198 Fredholm - of second kind, 97, 98 separating -, 199 integral -, 97, 98, 100 weakly ∗ closed -, 243 nonlinear -, 100 Inequality Family Bessel’s -, 140 - of sets, 10 Cauchy-Bunyakovsky-Schwarz -, 73– index set of a -, 10 75, 117 intersection of a - of sets, 12 Cauchy–Bunyakovsky–Schwarz -, 74 union of a - of sets, 12 Euler’s -, 292
INDEX
367
H¨ older - for finite sums, 128 Linear operator H¨ older - for infinite sums, 131 - and Schauder basis, 184 H¨ older - for integrals, 131 - in Quantum Mechanics, 173 Minkowski -, 126, 130, 131 adjoint -, 204, 205 triangle -, 69, 72, 73, 126, 130, 131 bounded-, 174 Young’s -, 127, 128 closed -, 215, 216 Infimum composition of bounded -, 178 - and adherence, 40 definition of -, 26 characterization of -, 13 derivation as a continuous -, 180 Inner-product derivation as a non-continuous -, L2 -, 77 180 - on RN , 75 differentiation as a -, 27, 29 - on l2 (R), 75 extension of a bounded -, 178 definition of -, 73 image of a -, 28 equality by using an -, 189 inverse of a -, 29 norm induced by an -, 75 invertible -, 26 Inner-product space kernel of a -, 28 R as an -, 74 space of -s, 26, 27 - as normed space, 74 Linear space completion of an -, 92, 93 - of p-summable sequences lp (R), definition of -, 73 130 subspace of an -, 76 - of linear operators, 27 Integral - of matrices, 16 - of step and regulated mappings, - of operators, 26 310 complex -, 16 differentiation under the - sign, 313 definition of -, 16 Lebesgue -, 93 dense -, 204 Riemann -, 93, 309 dimension of a -, 21, 22, 123 Isometry, 89 generated - (span), 19 intersection of -, 18 Lemma isomorphic -s, 26 Riesz -, 124 non-dense -, 204 Zorn’s -, 13, 23, 148, 193, 194 subspace of a -, 17, 76 Limit topological -, 19 - inferior, 48 trivial -, 17 - superior, 48 definition of -, 44 Metric notation of -, 44 - induced by a norm, 72 uniqueness of the -, 45, 54, 80 - on RN , 71 Linear combination - on l1 (R), 71 finite -, 19, 121 continuity of a -, 81 infinite -, 19 definition of -, 69 Linear functional product -, 81 continuous -, 192 standard - on R, 70 definition of -, 26 standard - on RN , 71 weakly ∗ continuous -, 242 topology induced by a -, 70, 71 Linear independence Metric space - of a set, 20 - as Hausdorff space, 80 - of eigenvectors, 30 compact -, 105
368
INDEX
complete -, 82, 83, 88, 89, 95, 96, definition of -, 72 105 finite dimensional -, 123 isomorphic -s, 185 completion of a -, 89 subspace of a -, 76 continuity between -s, 81 convergence in a -, 80, 82 Operation definition of -, 69 adherence -, 41 incomplete -, 87 external -, 15 isometric -s, 89 interior -, 40 open set in a -, 69 internal -, 15 separable -, 81, 82 Ordered pair, 7 subspace of a -, 76, 80, 83 topological basis in a -, 80 Point accumulation -, 47, 80 Neighborhood adherent -, 40, 80 - of a point, 38 boundary -, 42 fundamental system of -s, 38, 40, fixed -, 95, 96 44, 45, 49, 55, 70, 80 interior -, 39, 80 set of -s of a point, 38, 49 Polynomial Norm Hermite’s -, 162 p - on Rn , 126 Laguerre’s -, 170 - defined by an inner-product, 75 Legrende’s -, 154 - induced by an inner-product, 74, Property 75 Archimedean -, 46 - of a bounded operator, 174 Bolzano-Weierstrass -, 60, 103, 104 - of a definite integral, 181 finite intersection -, 63 - of the adjoint, 204 p - on L , 131 Reflexivity e 1 ([a, b]), 87 - on L - and finite dimension, 237 e 2 ([a, b]), 77 - on L - and separability of the dual, 247 - on C([a, b]), 77 - by sequences, 249 - on l1 (R), 76 - of a Hilbert space, 207 - on lp (R), 130 - of closed subspaces, 246 -s on a pivote space, 117 - of the dual, 208, 247 comparable -s, 119 definition of -, 206 continuity of the -, 117 Kakutani’s - criterion, 245 definition of -, 72 Relation dominated -, 119 antisymmetric -, 11 equivalence of -s, 124 definition of -, 7 equivalent -s, 119, 123 equivalence -, 11 Euclidean -, 74, 75 order -, 12 metric induced by a -, 72 reflexive -, 11 non-Euclidean -, 75–77 symmetric -, 11 product -, 215 transitive -, 11 s- on R2 , 250 semi-, 73 Sequence supremum -, 76, 77 bounded -, 81 Normed space Cauchy -, 82 completion of a -, 92, 93 convergence of a -, 44, 80
INDEX definition of -, 8 factorial -, 8 space of all real -s, 71 sub-, 46 Set - of neighborhoods of a point, 38 - of parts, 5 acumulation of a -, 47 adherence of a -, 49 adherence of a -, 40, 41, 164 boundary of a -, 42 bounded -, 79 cardinality of a -, 14 closed -, 35, 41, 49, 50, 83 compact -, 36, 60, 105, 124 complement of a -, 6 convex -, 234, 243 countable -, 14, 15 diameter of a -, 79 diference of -s, 6 direct image of a -, 9, 12 discrete -, 14 elementary -s, 57 elements of a -, 5 equality of -s, 5 idea of -, 5 inclusion of -s, 5 infimum of a -, 12 interior of a -, 39, 40 intersection of -s, 6 inverse image of a -, 9, 12 locally compact -, 60 lower bound of a -, 12 maximum of a -, 12 open -, 35, 40, 50, 69 ordered -, 12, 36 orthogonal -, 118 orthogonal of a -, 146 partition of a -, 10 pivote -, 35 proper sub-, 5 relatively compact -, 60, 106 sequentially compact -, 60 singleton -, 6, 44 supremum of a -, 12 total -, 150 totally bounded -, 103–105 totally ordered -, 12
369 union of -s, 6 universe -, 5 upper bound of a -, 12 void -, 5 Space e 1 ([a, b]), 87 L e L2 ([a, b]), 87 C(X, Y ), 52 C([a, b]), 85 C([a, b]), 77 L1 ([a, b]), 93 L2 ([a, b]), 93 l1 (R), 71 l2 (R), 75 e 2 ([a, b]), 77 L Banach -, 83 dual -, 174 Hilbert -, 83, 92, 103 inner-product -, 92 isometric -s, 89 metric -, 79 Supremum - and adherence, 40 characterization of -, 13 Theorem Ascoli-Arzel`a -, 106 Baire’s -, 88 Banach fixed point -, 95, 102 Closed graph -, 215 De Morgan’s -, 6 dominated convergence -, 94 generalized mean value -, 295 Hahn-Banach -, 192, 195, 199, 202 mean value - for functionals, 296 Milman-Pettis -, 250 monotone convergence -, 94 open mapping -, 214 Picard’s -, 101 Riesz -, 197 Weierstrass approximation -, 123 Topological space T1 -, 43, 44 compact -, 60, 103, 105 definition of -, 35 first countable -, 47 Hausdorff -, 43–45, 49, 54, 55, 57, 59, 80
370 locally compact -, 60 regular -, 43 sequentially compact -, 60, 103 subspace of a -, 49, 76 Topology - generated, 37 - in a metric space, 70 - of a dominated norm, 119 definition of -, 35 discrete -, 36 induced -, 49 initial -, 55, 56 product -, 57, 81 standard - on [0, 1], 50 standard - on R, U , 38, 39, 43, 70 standard - on RN , 57, 71 trivial -, 36 weak ∗ -, 236 weak -, 229 weak - for functionals, 236 weaker / stronger -, 36 Vector angle between -s, 118 orthogonal -s, 118
INDEX
List of Figures

1.1. The gamma function. . . . 9
1.2. The function R ∋ x ↦ f(x) = e^{−x²} ∈ R. . . . 10
1.3. The function [−10,10] ∋ x ↦ h(x) = sin(x) + x. . . . 18
2.1. The separation conditions T₁ and T₂. Source: https://mathstrek.blog . . . 43
3.1. The function [0,π] ∋ x ↦ |f(x) − g(x)| ∈ R. . . . 78
3.2. The function [0,π] ∋ x ↦ |f(x) − g(x)|² ∈ R. . . . 79
3.3. The functions x₁ (blue), x₅ (red), x₁₀ (green) and x₂₀ (violet). . . . 87
3.4. The intersection between the functions T and [1,+∞[ ∋ x ↦ Id(x) = x ∈ R determines a fixed point of T. . . . 97
3.5. The distance ∥x₃ − x₂∥∞ is equal to |x₃(1) − x₂(1)|. . . . 99
3.6. The distance ∥x₈ − x₇∥∞ is equal to |x₈(1) − x₇(1)|. . . . 99
3.7. An approximation to the solution of the Fredholm equation of the second kind (3.54). . . . 100
4.1. The functions exponential (blue) and logarithmic (red). . . . 127
4.2. The functions x₁ (blue), x₅ (red), x₁₀ (green) and x₂₀ (violet). . . . 133
4.3. Legendre polynomials P₀ (blue), P₁ (red), P₂ (green), P₃ (violet), P₄ (black) and P₅ (cyan). . . . 153
4.4. The function [−1,1] ∋ t ↦ u(t) = e^{−t} + sin(t) ∈ R (blue) together with the truncated Legendre-Fourier series u₂ (red) and u₃ (green). . . . 155
4.5. The value of ∥u − u_m∥₂ decreases as m increases. . . . 155
4.6. The functions C₁ (blue), C₂ (red) and C₃ (green). As k increases, so does the frequency of C_k. . . . 157
4.7. The function [−π,π] ∋ t ↦ f(t) = e^t sin(t) ∈ R (blue) together with the truncated Fourier series f₃ (red) and f₁₀ (green). . . . 161
4.8. The value of ∥f − f_m∥₂ decreases as m increases. . . . 162
4.9. The first Hermite functions: e₀ (blue), e₁ (red), e₂ (green). . . . 162
6.1. The functions φ₁ (blue), φ₂ (red) and φ₃ (green). . . . 240
6.2. The p norms in R². Source: https://wiki.math.ntnu.no . . . 250
7.1. A simplified scheme of a can. . . . 260
7.2. Several shapes for a slide. Source: http://www.etudes.ru . . . 262
7.3. A simple scheme for the slide. . . . 262
7.4. Scheme for the directional derivative. . . . 268
7.5. Scheme for the Fréchet differential. . . . 269
7.6. The situation of Corollary 7.2. . . . 293
8.1. A simple scheme of the slide. . . . 346
8.2. The infinitesimal arc length defined by an infinitesimal version of Pythagoras' theorem. . . . 347
A course of Functional Analysis with Calculus of Variations
A course of Functional Analysis with Calculus of Variations aims to bring the reader to the modern realm of applied mathematics. It introduces topological and metric spaces before presenting Banach and Hilbert spaces. It continues with the study of the fundamental theorems of Functional Analysis: the uniform boundedness principle, the closed graph theorem, and the theorems of Riesz-Fréchet and Hahn-Banach. With these tools at hand, a not-so-short study of weak topologies is developed to finish Part I. Part II is dedicated to topics of Calculus of Variations, taking advantage of what was done in Part I. A good part of Calculus in normed spaces is studied before getting into the classical topics of the Calculus of Variations, like the properties of the Euler-Lagrange equation.
Juan Mayorga-Zambrano is a titular professor at Yachay Tech University, Ecuador. He graduated as the best student of Escuela Politécnica Nacional, obtaining the title of Mathematician. He graduated with honors from Universidad de Chile, obtaining the degree of Doctor in Sciences for Engineering, mention in Mathematical Modeling (PhD). Before returning to work in Ecuador he did two postdocs: the first at Universidad de Talca (Chile) and the second at the Technion - Israel Institute of Technology (Israel).